20 Matching Annotations
  1. Last 7 days
    1. 320 neurons to 10,420 latent features

      What was the motivation for the value 10,420? Is the ability to extract features pretty consistent (not the actual weights, obviously, but the behavior of being able to extract meaningful features) as long as the space is large enough?

    2. visualizations primarily focus on features from the fourth layer

      Was the fourth layer chosen arbitrarily or did it have any particularly nice properties for visualization?

    1. assuring that the discriminator can well capture the sequence-function relationship

      Is the discriminator learning something much more complicated than, say, how to identify MDH domains?

    2. convolutional neural network (CNN)-based protein discriminator

      This seems to work well based on the results; were there previous works/concepts that led to this choice of architecture for the discriminator?

  2. Dec 2024
    1. Hyperparameter optimization was performed using a hyperband tune

      How computationally intensive was it to solve for all of the hyperparams for this model?

  3. Nov 2024
    1. masking ratio 0.15

      Do different values for percent of tokens masked have much of an impact on model performance?

    2. average cluster size is just 2.2

      Was there much variation around this?

    1. Guider1 and Guider2, were designed to improve the network's ability to distinguish between different sequence types. Guider1 consists of a multi-head self-attention mechanism with 8 heads and two fully connected layers, while Guider2 is a Gated Recurrent Unit (GRU) with 256 neurons

      Sorry if I missed this, but what was the motivation for choosing these particular discriminator models? They seem very reasonable given the results, but I'm curious how these two types of models were chosen based on the structure of the initial problem?
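
      Purely for my own notes, a minimal PyTorch sketch of how I'm reading these two guiders — the 8 attention heads, two fully connected layers, and 256 GRU units come from the quoted text, but the embedding size, pooling, and scalar output head are my assumptions:

      ```python
      import torch
      import torch.nn as nn

      EMBED_DIM = 128  # assumed token embedding size; not stated in the quote

      class Guider1(nn.Module):
          """Multi-head self-attention (8 heads) followed by two fully connected layers."""
          def __init__(self, embed_dim: int = EMBED_DIM):
              super().__init__()
              self.attn = nn.MultiheadAttention(embed_dim, num_heads=8, batch_first=True)
              self.fc1 = nn.Linear(embed_dim, embed_dim)
              self.fc2 = nn.Linear(embed_dim, 1)

          def forward(self, x):                 # x: (batch, seq_len, embed_dim)
              h, _ = self.attn(x, x, x)         # self-attention over the sequence
              h = h.mean(dim=1)                 # mean pooling is my assumption
              return self.fc2(torch.relu(self.fc1(h)))

      class Guider2(nn.Module):
          """Gated Recurrent Unit with 256 hidden units."""
          def __init__(self, embed_dim: int = EMBED_DIM):
              super().__init__()
              self.gru = nn.GRU(embed_dim, hidden_size=256, batch_first=True)
              self.out = nn.Linear(256, 1)

          def forward(self, x):                 # x: (batch, seq_len, embed_dim)
              _, h_n = self.gru(x)              # h_n: (1, batch, 256), last hidden state
              return self.out(h_n[-1])
      ```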

    1. recombination, or a haplotype switch, occurs between two consecutive vertices \(a_i.u\) and \(a_{i+1}.u\) in \(P\) if \(a_i.h \neq a_{i+1}.h\)

      Would an "ideal" path be one where a single haplotype path crosses every \(a_{i}.u\) vertex? Is that something that would generally exist for a given sample? And if present, would that be the best path?
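
      To check that I'm parsing the definition correctly, a tiny sketch where each step \(a_i\) of the path is modeled as a (u, h) tuple — the tuple representation is my assumption, not the paper's data structure:

      ```python
      # Each step a_i on the path P is modeled here as a (u, h) tuple:
      # u = the vertex, h = the haplotype covering it at that step.
      def count_switches(path):
          """Number of positions where consecutive steps use different haplotypes."""
          return sum(1 for a, b in zip(path, path[1:]) if a[1] != b[1])

      # A path covered end-to-end by one haplotype has zero switches, which is
      # what I'm reading as the "ideal" case in my question above.
      P = [("v1", "hap1"), ("v2", "hap1"), ("v3", "hap2"), ("v4", "hap2")]
      print(count_switches(P))  # -> 1 (switch between v2 and v3)
      ```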

    2. \(a_i.u\)

      Just to verify I'm understanding this notation, \(a_{i}.u\) is being used like an accessor for components of the tuple \(a_{i}\)?

  4. Oct 2024
    1. \(PED(X, Y) = \frac{1}{2}\,\mathbb{E}\left[\inf_{\pi}\left(\lVert d(X, X') - d(Y, X')_{\pi} \rVert_p\right)\right] + \frac{1}{2}\,\mathbb{E}\left[\inf_{\pi}\left(\lVert d(Y, Y') - d(X, Y')_{\pi} \rVert_p\right)\right]\)

      Sorry if this is clear, but I'm a little unclear on the notation. Is X the input data (so empirical results from a scRNA-seq experiment) and Y the generated dist? If so then are X' and Y' subsets of the respective distributions?

  5. Jul 2024
    1. \(Precision_s\)

      In this case does the s denote a tuning of how results are classified when calling true and false positives and negatives? Or a weight term in the F-measure score itself?

    2. \(x_{-i,j}(t)\), which is the expression of gene j (the expression of all other genes is masked)

      Is this a vector with only a value at position j? So a vector of size N with only position j having a value set, hence being different from \(x_j(t)\)?
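
      A toy example of what I think this masking means (my reading, not the paper's code): a full-length expression vector where only gene j keeps its value and every other gene is zeroed out:

      ```python
      import numpy as np

      x_t = np.array([0.7, 2.1, 0.0, 3.4, 1.2])  # toy expression vector at time t (N = 5 genes)
      j = 3                                       # the one gene left unmasked

      x_masked = np.zeros_like(x_t)
      x_masked[j] = x_t[j]
      print(x_masked)  # -> [0.  0.  0.  3.4 0. ], a length-N vector rather than the scalar x_j(t)
      ```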

  6. May 2024
    1. The black arrow highlights the longest barcode.

      It might be easier to see if the longest barcode was in a different color or had a dashed line overlaid on top of it.

    2. The green and light blue clusters are on one side, and the other colors (especially the dark blue and magenta) are on the other side of the hole.

      It's hard to tell exactly which part of the structure is being referenced (at least for me). It might be helpful to add an annotation like a circle to show which area is being discussed.

    1. \(x_{i,j} - \tilde{x}\)

      The indexing of \(x_{i,j}\) doesn't seem to refer to a single element but to a pair \((u_{i,j}, s_{i,j})\). Is this loss function calculating the difference between the spliced and unspliced differences combined?

  7. Feb 2024
    1. if \(d_i\) is equal to 1984-01-01, then U encompasses all papers published after \(d_i\) until 1989-01-01

      This part is assuming that t has been set to 5, correct?
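
      A quick sanity check of the window arithmetic as I read it, assuming t is a number of years and the window runs from \(d_i\) to \(d_i\) plus t years:

      ```python
      from datetime import date

      d_i = date(1984, 1, 1)
      t = 5  # years; my assumption, inferred from the 1989-01-01 end date in the quote

      window_end = d_i.replace(year=d_i.year + t)
      print(window_end)  # -> 1989-01-01, matching the example in the text
      ```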

    1. scGPT v1 outperformed the scGPT model overall, raising the issue of the need for increasing the size of pre-training datasets for this task

      Wasn't scGPT v1, which outperformed scGPT, trained on a smaller pre-training dataset?

    1. The fact that this narrative captured so much attention despite a complete lack of supporting evidence prompts us to reflect on how our biases shape our interpretation of data, and how extreme differences in believing people based on where they work can lead to incorrect and harmful conclusions. Here, we are reflecting on our experiences, and we invite readers to do the same.

      Really interesting article!

      Given the impact this had, do you feel there are changes or criticisms needed around the review and publication process of the Bloom results? I'm also curious if you have any thoughts on how pre-print and open science can do a better job with contentious results and discussions around them.

  8. Oct 2023
    1. We also found associations to p53, telomere maintenance, and cell fate within 1 Mbp of our top 25 loci of interest. Our top 25 loci also have links to cancer and height or body size, though these prevalent diseases and biomarkers are of course heavily studied and consequently commonly annotated, and so we cannot know whether their appearance is simply due to their frequency

      Is this 1 Mbp in either direction of a locus of interest? Just binning the human genome by 25 points gives about 1.7% of the genome within 1 Mbp of these uniform bins. Depending on what percent of genes are associated with the traits of interest, that could be very rare or fairly common. Is there a way of viewing how impactful this result is in comparison to the size of the genome annotated as relevant to these traits?
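
      For reference, the back-of-the-envelope calculation behind the ~1.7% figure above, assuming a roughly 3 Gbp genome, a window of 1 Mbp on each side of a locus, and no overlap between windows:

      ```python
      genome_bp = 3_000_000_000    # approximate haploid human genome size (assumption)
      window_bp = 2 * 1_000_000    # 1 Mbp on each side of each locus
      n_loci = 25

      covered_bp = n_loci * window_bp          # ignores any overlap between windows
      print(f"{covered_bp / genome_bp:.1%}")   # -> 1.7%
      ```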