- Last 7 days
-
arxiv.org
-
320 neurons to 10,420 latent features
What was the motivation for the value 10,420? Is the ability to extract features pretty consistent (not the actual weights, obviously, but the behavior of being able to extract meaningful features) as long as the latent space is large enough?
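(For my own reference, a minimal sketch of the kind of overcomplete sparse autoencoder I assume is being described; the 320 and 10,420 sizes come from the excerpt, but the ReLU activation, the L1 penalty, and its coefficient are generic assumptions rather than the paper's exact setup.)

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Generic sparse autoencoder sketch: the dictionary size (n_latent)
    is a free hyperparameter, often some multiple of the input width."""
    def __init__(self, n_input: int = 320, n_latent: int = 10420):
        super().__init__()
        self.encoder = nn.Linear(n_input, n_latent)
        self.decoder = nn.Linear(n_latent, n_input)

    def forward(self, x):
        z = torch.relu(self.encoder(x))   # sparse latent features
        return self.decoder(z), z

model = SparseAutoencoder()
x = torch.randn(8, 320)                   # batch of 8 activation vectors
x_hat, z = model(x)
l1_coeff = 1e-3                           # sparsity strength (assumed value)
loss = ((x_hat - x) ** 2).mean() + l1_coeff * z.abs().mean()
```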
-
visualizations primarily focus on features from the fourth layer
Was the fourth layer chosen arbitrarily or did it have any particularly nice properties for visualization?
-
-
www.biorxiv.org
-
assuring that the discriminator can well capture the sequence-function relationship
Is the discriminator learning something much more complicated than, say, how to identify MDH domains?
-
convolutional neural network (CNN)-based protein discriminator
This seems to work well based on the results. Was there previous work, or were there existing concepts, that led to this choice of architecture for the discriminator?
-
- Dec 2024
-
www.biorxiv.org
-
Hyperparameter optimization was performed using a hyperband tune
How computationally intensive was it to solve for all of the hyperparams for this model?
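(For a sense of scale, a rough sketch of how a hyperband schedule allocates training budget, following the bracket construction from Li et al. 2017; the values eta = 3 and R = 81 epochs are ones I picked for illustration, not the paper's settings.)

```python
import math

def hyperband_budget(R: int = 81, eta: int = 3):
    """Rough total-epoch count for one hyperband run.
    R = max epochs per configuration, eta = downsampling rate."""
    s_max = math.floor(math.log(R, eta) + 1e-9)   # number of brackets minus one
    B = (s_max + 1) * R                           # per-bracket budget
    total = 0
    for s in range(s_max, -1, -1):
        n = math.ceil(B / R * eta**s / (s + 1))   # configs entering this bracket
        r = R * eta**(-s)                         # initial epochs per config
        bracket = 0
        for i in range(s + 1):                    # successive-halving rounds
            n_i = math.floor(n * eta**(-i))
            r_i = r * eta**i
            bracket += n_i * r_i
        print(f"bracket s={s}: {n} configs, ~{bracket:.0f} epochs")
        total += bracket
    print(f"total ~= {total:.0f} training epochs")

hyperband_budget()
```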
-
- Nov 2024
-
www.biorxiv.org
-
masking ratio 0.15
Do different values for the percentage of masked tokens have much of an impact on model performance?
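(For comparison, a minimal sketch of the kind of random masking a 0.15 ratio implies; the toy sequence and the mask token are placeholders, not the paper's tokenization.)

```python
import random

MASK = "<mask>"

def mask_tokens(tokens, ratio=0.15, seed=0):
    """Replace a random `ratio` fraction of tokens with a mask token."""
    rng = random.Random(seed)
    n_mask = max(1, int(len(tokens) * ratio))
    idx = set(rng.sample(range(len(tokens)), n_mask))
    return [MASK if i in idx else t for i, t in enumerate(tokens)]

seq = list("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")   # toy protein sequence
print(mask_tokens(seq, ratio=0.15))               # ~15% of positions masked
print(mask_tokens(seq, ratio=0.40))               # a heavier ratio to compare
```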
-
average cluster size is just 2.2
Was there much variation around this?
-
-
www.biorxiv.org
-
Guider1 and Guider2, were designed to improve the network’s ability to distinguish between different sequence types. Guider1 consists of a multi-head self-attention mechanism with 8 heads and two fully connected layers, while Guider2 is a Gated Recurrent Unit (GRU) with 256 neurons
Sorry if I missed this, but what was the motivation for choosing these particular discriminator models? They seem very reasonable given the results, but I'm curious how these two types of models were chosen given the structure of the initial problem.
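(To make sure I'm picturing the description correctly, here is a minimal sketch of the two guiders as quoted, 8-head self-attention plus two fully connected layers, and a GRU with 256 hidden units; the embedding width, FC sizes, pooling, and output dimension are assumptions on my part since they aren't in the excerpt.)

```python
import torch
import torch.nn as nn

EMB = 128        # assumed embedding width (not given in the excerpt)
N_CLASSES = 2    # assumed number of sequence types to distinguish

class Guider1(nn.Module):
    """Multi-head self-attention (8 heads) followed by two fully connected layers."""
    def __init__(self):
        super().__init__()
        self.attn = nn.MultiheadAttention(EMB, num_heads=8, batch_first=True)
        self.fc1 = nn.Linear(EMB, 64)
        self.fc2 = nn.Linear(64, N_CLASSES)

    def forward(self, x):                      # x: (batch, seq_len, EMB)
        h, _ = self.attn(x, x, x)
        h = h.mean(dim=1)                      # pool over sequence positions
        return self.fc2(torch.relu(self.fc1(h)))

class Guider2(nn.Module):
    """GRU with 256 hidden units; the last hidden state feeds a linear head."""
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(EMB, 256, batch_first=True)
        self.out = nn.Linear(256, N_CLASSES)

    def forward(self, x):
        _, h_n = self.gru(x)                   # h_n: (1, batch, 256)
        return self.out(h_n[-1])

x = torch.randn(4, 100, EMB)                   # 4 toy sequences of length 100
print(Guider1()(x).shape, Guider2()(x).shape)  # both: torch.Size([4, 2])
```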
-
-
www.biorxiv.org
-
recombination, or a haplotype switch, occurs between two consecutive vertices \(a_i.u\) and \(a_{i+1}.u\) in \(P\) if \(a_i.h \neq a_{i+1}.h\)
Would an "ideal" path be one where there is a single haplotype path that crosses every single \(a_{i}.u\) vertex? This would be something that generally exists for a given sample, but if present this would be the best path?
-
\(a_i.u\)
Just to verify I'm understanding this notation, \(a_{i}.u\) is being used like an accessor for components of the tuple \(a_{i}\)?
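(To check my reading of the notation, a small sketch that treats each \(a_i\) as a (u, h) pair and counts haplotype switches exactly as the quoted definition states; the vertex and haplotype names are made up for illustration.)

```python
from collections import namedtuple

# Each path element a_i carries a vertex a_i.u and a haplotype a_i.h,
# so a_i.u and a_i.h read as field accessors on the tuple.
A = namedtuple("A", ["u", "h"])

def count_switches(path):
    """Number of recombinations: positions where a_i.h != a_{i+1}.h."""
    return sum(1 for a, b in zip(path, path[1:]) if a.h != b.h)

# Toy path: one switch (hap1 -> hap2) between v3 and v4.
P = [A("v1", "hap1"), A("v2", "hap1"), A("v3", "hap1"),
     A("v4", "hap2"), A("v5", "hap2")]
print(count_switches(P))   # 1; a path staying on a single haplotype would give 0
```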
-
- Oct 2024
-
-
\(\mathrm{PED}(X, Y) = \tfrac{1}{2}\,\mathbb{E}\!\left[\inf_{\pi}\left(\left\lVert d(X, X') - d(Y, X')_{\pi}\right\rVert_{p}\right)\right] + \tfrac{1}{2}\,\mathbb{E}\!\left[\inf_{\pi}\left(\left\lVert d(Y, Y') - d(X, Y')_{\pi}\right\rVert_{p}\right)\right]\)
Sorry if this is explained and I missed it, but I'm a little unclear on the notation. Is X the input data (so empirical results from an scRNA-seq experiment) and Y the generated distribution? If so, are X′ and Y′ subsets of the respective distributions?
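(To show how I'm currently parsing the inf over \(\pi\), here is a sketch that reads it as a best matching between two vectors of distances to a shared reference set, computed with scipy's linear_sum_assignment; the toy data, the Euclidean d, and this whole interpretation are my assumptions, not the paper's implementation.)

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def matched_profile_gap(ref, a, b, p=2):
    """Best-matching gap between the distance profiles d(a, ref) and d(b, ref):
    min over permutations pi of || d(a, ref) - d(b, ref)[pi] ||_p."""
    da = cdist(a, ref).ravel()            # distances from a to each reference point
    db = cdist(b, ref).ravel()            # distances from b to each reference point
    cost = np.abs(da[:, None] - db[None, :]) ** p
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum() ** (1.0 / p)

rng = np.random.default_rng(0)
X = rng.normal(size=(1, 50))              # a "real" cell profile (toy data)
Y = rng.normal(size=(1, 50))              # a "generated" cell profile (toy data)
Xp = rng.normal(size=(20, 50))            # X': reference draws from the real data
print(matched_profile_gap(Xp, X, Y))      # my reading of one of the two PED terms
```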
-
- Jul 2024
-
www.biorxiv.org
-
\(\mathrm{Precision}_s\)
In this case does \(s\) denote tuning of how results are classified when calling true and false positives and negatives, or weight terms in the F-measure score itself?
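(For reference on the second possibility, the weighted F-measure is usually written with a weight \(\beta\) that trades precision off against recall; whether the paper's \(s\) plays this role is exactly what I'm asking.)

```latex
% Weighted F-measure: \beta > 1 favors recall, \beta < 1 favors precision.
F_\beta = (1 + \beta^2)\,
          \frac{\mathrm{Precision}\cdot\mathrm{Recall}}
               {\beta^2\,\mathrm{Precision} + \mathrm{Recall}}
```
-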
\(x_{-i,j}(t)\), which is the expression of gene j (the expression of all other genes is masked)
Is this a vector with only a value at position j? So a vector of size N with only position j having a value set, hence being different from \(x_{j}(t)\)?
-
- May 2024
-
www.biorxiv.org
-
The black arrow highlights the longest barcode.
It might be easier to see if the longest barcode were in a different color or had a dashed line overlaid on top of it.
-
The green and light blue clusters are on one side, and the other colors (especially the dark blue and magenta) are on the other side of the hole.
It's hard to tell exactly which part of the structure is being referenced (at least for me). It might be helpful to add an annotation like a circle to show which area is being discussed.
-
-
-
\(x_{i,j} - \tilde{x}\)
The indexing of \(x_{i,j}\) doesn't seem to point to a unique element but to a pair \((u_{i,j}, s_{i,j})\). Is this loss function calculating the spliced and unspliced differences combined?
-
- Feb 2024
-
arxiv.org
-
if \(d_i\) is equal to 1984-01-01, then U encompasses all papers published after \(d_i\) until 1989-01-01
This part is assuming that t has been set to 5, correct?
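(Spelling out the date arithmetic behind that reading, assuming t is a window length in years and t = 5.)

```python
from datetime import date

d_i = date(1984, 1, 1)
t = 5                                    # assumed window length in years
print(d_i.replace(year=d_i.year + t))    # 1989-01-01, matching the excerpt
```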
-
-
www.biorxiv.org
-
scGPT v1 outperformed the scGPT model overall, raising the issue of the need for increasing the size of pre-training datasets for this task
Wasn't scGPT v1, which outperformed scGPT, trained on a smaller pre-training dataset?
-
-
-
The fact that this narrative captured so much attention despite a complete lack of supporting evidence prompts us to reflect on how our biases shape our interpretation of data, and how extreme differences in believing people based on where they work can lead to incorrect and harmful conclusions. Here, we are reflecting on our experiences, and we invite readers to do the same.
Really interesting article!
Given the impact this had, do you feel there are changes or criticisms needed around the review and publication process of the Bloom results? I'm also curious whether you have any thoughts on how pre-print and open science can do a better job with contentious results and the discussions around them.
-
- Oct 2023
-
www.biorxiv.org
-
We also found associations to p53, telomere maintenance, and cell fate within 1 Mbp of our top 25 loci of interest. Our top 25 loci also have links to cancer and height or body size, though these prevalent diseases and biomarkers are of course heavily studied and consequently commonly annotated, and so we cannot know whether their appearance is simply due to their frequency
Is this 1 Mbp in either direction of a locus of interest? Just binning the human genome by 25 points gives about 1.7% of the genome within 1 Mbp of these uniform bins. Depending on what percent of genes are associated with the traits of interest, that could be very rare or fairly common. Is there a way of viewing how impactful this result is in comparison to the size of the genome annotated as relevant to these traits?
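(The rough calculation behind my 1.7% figure, assuming a ~3.1 Gbp genome, a +/- 1 Mbp window around each of the 25 loci, and no overlap between windows.)

```python
genome_bp = 3.1e9              # approximate human genome size
n_loci = 25
window_bp = 2e6                # +/- 1 Mbp around each locus
frac = n_loci * window_bp / genome_bp
print(f"{frac:.1%}")           # ~1.6% (closer to 1.7% with a 3.0 Gbp genome)
```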
-