When
correct?
The article describes a reproducible method for synthesizing corpora of mental maps in the humanities and social sciences.
As part of a mixed-methods approach, the article presents a reproducible cartographic method for synthesizing corpora of mental maps in the social sciences.
between varied corpora that combine qualitative elements, drawings, and objective, reproducible measurements useful for analysis in the humanities and social sciences (SHS)
between varied corpora, making it possible to combine qualitative materials and approaches (mental and sensory maps, interviews, etc.) with quantitative ones (synthesis mapping, indicators, comparison, etc.).
standardized
(to be removed)
:
. (end the sentence here; otherwise it looks as if all the steps come after the " : ")
Keywords
suggestion: mixed methods
the mental maps converted into geographic data are aggregated on a regular grid to measure the density or overlap of the drawn lines on a single matrix
the mental maps are converted into geographic data and aggregated on a regular grid in order to measure the density or overlap of the drawn lines on a single matrix
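The grid aggregation described in this passage can be sketched in a few lines. This is only a minimal illustration, not the article's implementation: it assumes each mental map has already been reduced to a list of (x, y) points in a projected coordinate system, and the names `cell_of`, `overlap_grid`, and `cell_size` are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


def cell_of(point: Point, cell_size: float) -> Cell:
    """Return the (column, row) index of the grid cell containing a point."""
    x, y = point
    # Floor division also places negative coordinates in the correct cell.
    return (int(x // cell_size), int(y // cell_size))


def overlap_grid(maps: Iterable[List[Point]], cell_size: float) -> Counter:
    """Count, for each grid cell, how many maps have at least one point in it."""
    counts: Counter = Counter()
    for points in maps:
        # A set ensures each map contributes at most once per cell,
        # so the count measures overlap between maps, not point density.
        cells = {cell_of(p, cell_size) for p in points}
        counts.update(cells)
    return counts


# Two toy "maps" on a grid with cells of size 1 (illustrative data only).
example = overlap_grid(
    [[(0.5, 0.5), (0.7, 0.2), (1.5, 0.5)], [(0.1, 0.9)]],
    cell_size=1.0,
)
```

Counting each map once per cell gives a recouvrement (overlap) measure; updating the counter with every point instead of a set would give a point-density measure on the same grid.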
Designed to go beyond the qualitative interpretation of individual drawings
Designed as a complement to a qualitative approach, ...
Mapping spatial representations in aggregate
Suggestion: aggregated mapping of spatial representations
carte
cartes
reproducible
reproductible ?
use the quicksand rules
Quicksand A quicksand pit covers the ground in roughly a 10-foot-square area and is usually 10 feet deep. When a creature enters the area, it sinks 1d4 + 1 feet into the quicksand and becomes restrained. At the start of each of the creature’s turns, it sinks another 1d4 feet. As long as the creature isn’t completely submerged in quicksand, it can escape by using its action and succeeding on a Strength check. The DC is 10 plus the number of feet the creature has sunk into the quicksand. A creature that is completely submerged in quicksand can’t breathe (see the suffocation rules in the Player’s Handbook).
A creature can pull another creature within its reach out of a quicksand pit by using its action and succeeding on a Strength check. The DC is 5 plus the number of feet the target creature has sunk into the quicksand.
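The arithmetic in the quicksand rules above can be sketched as a few helpers. The function names are hypothetical and the sketch covers only the sinking rolls and the two DC formulas; whether a creature is completely submerged depends on its height, which the rules text does not give, so that check is left out.

```python
import random


def entry_sink(rng: random.Random) -> int:
    """Feet sunk when a creature enters the pit: 1d4 + 1."""
    return rng.randint(1, 4) + 1


def turn_sink(rng: random.Random) -> int:
    """Additional feet sunk at the start of each turn: 1d4."""
    return rng.randint(1, 4)


def escape_dc(feet_sunk: int) -> int:
    """Strength check DC for a creature freeing itself: 10 + feet sunk."""
    return 10 + feet_sunk


def rescue_dc(feet_sunk: int) -> int:
    """Strength check DC for pulling another creature out: 5 + feet sunk."""
    return 5 + feet_sunk
```

For a creature that has sunk 3 feet, `escape_dc(3)` is 13 and `rescue_dc(3)` is 8, so being pulled out by an ally is always 5 easier than escaping alone.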
💻/thinkpad/🧊/me/📓/2026/04/28
google.search?mother+of+all+demos

button
example yes
Create an account using the sidebar on the right of the screen.
example3
A mere violation of the metric is not yet a step into the topic: it may turn out to be a passage from one metric system into another. Moreover, a step from the metric into the topic does not require any external actions. I anticipate a question about examples of such a movement, but any example would trivialize this thought. It would mean laying out the topic, metrically laying it out in advance, whereas one can step into it only from one's own situation, only by one's own act, and this step can be sensed only with one's own life. Thus in Bakhtin (in "Toward a Philosophy of the Act") the act "knows" more than the one who acts. The a-mechanically automatic act is wiser than the one who acts and in a sense creates him.
need a solution



How to interpret Box’s M p-value
p > .05 → Assumption met: The covariance matrices (how the DVs relate to each other) are similar across all groups.
• You can confidently use the standard MANOVA test statistics: Wilks’ Lambda, Pillai’s Trace, etc.
p < .05 → Assumption violated: The covariance matrices differ between groups, which breaks one of MANOVA’s key assumptions.
• This makes certain MANOVA test results, especially Wilks’ Lambda, less reliable, because it assumes equal covariance matrices.
• Instead, rely on Pillai’s Trace, which is more robust and still valid even if this assumption is violated.
BULLSHIT should be p<0.001
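The decision rule in the quoted passage, with the stricter cutoff argued for in the note, can be sketched as a small helper. The function name and the default `alpha` of 0.001 are assumptions for illustration: the quoted passage uses .05, while .001 is a common convention because Box's M is very sensitive to departures from normality and to large samples.

```python
def manova_statistic_for(box_m_p: float, alpha: float = 0.001) -> str:
    """Suggest a MANOVA test statistic from the Box's M p-value.

    If p < alpha, the equal-covariance assumption is treated as violated
    and the more robust Pillai's Trace is suggested; otherwise the
    standard Wilks' Lambda is suggested.
    """
    if not 0.0 <= box_m_p <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if box_m_p < alpha:
        return "Pillai's Trace"
    return "Wilks' Lambda"
```

With p = .03, the .05 cutoff flags a violation while the .001 cutoff does not, which is exactly the disagreement between the quoted text and the note.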
pdweb.archive-hypothesis.faceted.search-bush
asked of God
Test
/💻/thinkpad/🧊/me/📓/2026/04/28 personal.web.archive
Understanding another, in whom we begin to recognize ourselves, differs from "seizing" ("po-imanie", Piama Gaidenko's expression) more than it differs from non-understanding. When Aristotle does not want to understand Plato, nor Giambattista Vico Descartes, nor Kierkegaard Hegel, nor Heidegger the metaphysics of Kant, this non-understanding leaves room for a sudden recognition of what had until then remained alien. By contrast, the all-understanding historian of philosophy, who uses method as a tool to build himself approaches to anything whatsoever, is able to explain even the contradictions of his subject, but in doing so blocks the path to genuine understanding. Knowledge and ignorance, understanding and non-understanding are not opposites but two sides of one and the same thing: friendship with wisdom, philosophy.
SURVEILLED - DOC NYC. The film's credits include: * **Director** Matthew O'Neill and Perri Peltz * **Producers** Perri Peltz, Matthew O'Neill, Ronan Far...
>> to
investigating the booming commercial spyware industry.
spyware industry
personal.dweb.archive-search.google-surveilled

InterPersonal Computing
https://hyp.is/ikhoBELqEfGJV98MPQslOg/x.com/JonErlichman/status/1280665777025187841

Open Access and Fair Principles ccafs.cgiar.org
personal.web.archiv-hypothes.is~faceted.search-gyuri-open+interoperable

Steve Jobs on InterPersonal Computing = the next (r)evolutionary step beyond Personal Computing

eLife Assessment
This study addresses an important question in gustatory neuroscience by developing a machine-learning classifier to identify distinct ingestive orofacial movement subtypes from electromyographic recordings and relating their dynamics to population-level activity in the gustatory cortex. The evidence that transitions in cortical ensemble firing are temporally associated with reorganization of ingestive movement patterns is convincing, though some aspects of the behavioral classification and neural analyses require further validation and clarification. The work provides a technically innovative framework for linking neural state dynamics to the motor expression of taste-guided decisions.
Reviewer #1 (Public review):
Summary:
This study investigates how ingestive behaviors are reflected in muscle activity and how these behaviors relate to neural dynamics in the brain. By combining muscle recordings with computational analysis, the authors identify patterns of mouth movements and show that these change over time and align with changes in brain activity. The work suggests that ingestion is not defined by a single action but by coordinated changes across multiple behaviors.
Strengths:
(1) Addresses an important and underexplored question about how ingestive behavior is organized.
(2) Combines behavioral, physiological, and computational approaches creatively.
(3) Provides a novel framework for quantifying complex ingestive movements.
(4) Demonstrates a clear temporal relationship between behavior and brain activity.
Weaknesses:
(1) Behavioral labels rely on video-based scoring, which may not fully capture subtle or hidden movements.
(2) The relationship between brain activity and behavior is correlational, but sometimes interpreted more strongly.
(3) The manuscript could be clearer and more accessible to readers outside the field.
Reviewer #2 (Public review):
Summary:
In this study, Baas-Thomas et al. aim to study the neural mechanisms underlying ingestive versus rejection responses to taste stimuli by developing an EMG-based approach to identify ingestion-related orofacial movements. Whereas prior work has focused primarily on detecting rejection-related gapes, the authors introduce a machine-learning classifier that uses waveform features extracted from anterior digastric (AD) EMG signals to detect mouth- and tongue-movement (MTM) events associated with ingestion. Clustering analyses further suggest that ingestive behavior consists of multiple MTM subtypes whose relative frequencies vary across trial time and taste conditions. Finally, simultaneous recordings indicate that shifts in MTM expression follow transitions in gustatory cortex (GC) population dynamics into palatability-related firing states, supporting a role for cortical ensemble activity in coordinating ingestive motor responses.
Strengths:
Overall, the scientific question addressed in this study is well motivated. A mechanistic understanding of ingestive decision-making requires a precise characterization of the motor patterns that implement ingestion, and these behaviors have remained insufficiently resolved in prior work. The authors take a reasonable and technically innovative approach by leveraging AD EMG recordings to classify distinct orofacial movement patterns. The extracted waveform features appear effective in separating gapes from ingestion-related mouth-tongue movements, and clustering analyses further suggest the presence of distinguishable MTM subtypes that show meaningful temporal structure and neural correlates. Taken together, the work provides a potentially useful framework for linking gustatory cortical dynamics to the motor expression of taste-guided decisions.
A particularly valuable aspect of this work is the attempt to move beyond a binary characterization of ingestive behavior and instead identify multiple subtypes of ingestion-related movements. This finer behavioral resolution has the potential to provide a more realistic account of how complex consummatory actions are organized. More broadly, the effort to relate structured behavioral motifs to population-level neural dynamics is conceptually interesting and could prove useful for future studies seeking to connect circuit dynamics with the motor implementation of motivated behaviors.
Weaknesses:
(1) I have several concerns regarding the methodological comparisons used to establish the superiority of the proposed XGBoost classifier. In particular, the comparison between the XGBoost classifier and previously used QDA approaches (Figure 3) may not be entirely well-matched. The QDA framework was originally designed primarily to detect gape events and does not explicitly assign labels to MTM movements. As a result, the apparent advantage of XGBoost in identifying MTMs may partly reflect differences in task formulation rather than intrinsic differences in classification performance. From visual inspection, gape detection performance appears broadly comparable across methods.
A more informative benchmark would involve comparing XGBoost to an extended pipeline in which QDA-based gape detection is combined with a secondary movement-detection stage, distinguishing MTMs from periods of no movement. Such a comparison would better isolate the contribution of classifier architecture per se. Without this control analysis, the strength of the claim that XGBoost provides superior performance for behavioral decoding remains somewhat uncertain.
(2) The presentation of the neural ensemble analyses is considerably less comprehensive and intuitive than that of the behavioral analyses. The manuscript would benefit from more direct visualization of inferred neural state transitions. For example, plotting predicted neural states in a manner analogous to the behavioral states illustrated in Figure 6B would improve interpretability and help readers understand how neural dynamics relate temporally to behavioral changes.
In addition, the interpretation that GC ensemble dynamics drive behavioral state transitions may require further clarification. If GC activity plays a causal role in initiating behavioral changes, one might expect a consistent brain-to-behavior lag across changepoints. However, Figure 6 appears to show such lag primarily at the second transition but not at the first. This raises questions about how uniformly the proposed causal interpretation applies across state boundaries, and additional analysis or discussion is needed.
(3) The neural ensemble analyses primarily focus on constructing higher-level behavioral state variables rather than directly testing how individual movement subtypes relate to neural activity. The behavioral interpretation of the inferred state structure, therefore, remains somewhat unclear. While this approach is consistent with previous work from the authors and with broader state-transition frameworks of gustatory processing, it is not immediately obvious that this is the most informative level of analysis for the present dataset.
In particular, it would strengthen the manuscript to examine whether GC neurons or ensembles also encode lower-level motor structure, such as the occurrence of gapes or specific MTM subtypes. Demonstrating selective or mixed encoding across hierarchical levels (movement motifs versus abstract behavioral states) would help clarify the functional interpretation of the reported neural dynamics. At present, the manuscript largely assumes that GC activity reflects higher-order behavioral states without directly testing alternative representational possibilities.
(4) Because direct behavioral ground truth for intra-oral ingestive movements is difficult to obtain, MTM subtypes are inferred primarily through clustering of EMG waveform features. Although the authors demonstrate statistical separability and cross-session stability of these clusters, it remains unclear whether they correspond to discrete motor programs or instead reflect a structured partitioning of a continuous behavioral space shaped by feature selection and preprocessing choices. Perhaps some additional robustness analyses or convergent validation (e.g., alternative clustering methods, feature perturbation tests, or stronger neural and behavioral dissociations) would help clarify the biological significance of the inferred subtype structure.
Reviewer #3 (Public review):
Summary:
This study examines how ingestive-related orofacial movements relate to ensemble dynamics in gustatory cortex (GC) during taste processing. Previous work has shown that GC activity evolves through a sequence of population states following taste delivery, culminating in a transition to palatability-related firing that precedes rejection-related orofacial movements (e.g., gaping). Importantly, perturbing GC activity around the time of this transition alters the timing of gaping, suggesting that these ensemble dynamics play a functional role in linking taste evaluation to behavioral responses. The present study asks whether similar neural dynamics are also associated with ingestive-related orofacial movements that occur during the consumption of palatable stimuli.
To address this question, the authors develop a machine-learning classifier to identify distinct orofacial movements from anterior digastric EMG recordings. Using a set of labeled EMG waveforms obtained from video-scored trials, a gradient-boosted (XGBoost) classifier is trained to detect gapes, mouth/tongue movements (MTMs), and periods of no movement. Applying this classifier to a larger EMG dataset reveals that ingestive-related MTMs cluster into three distinct movement subtypes whose frequencies change systematically within trials.
The authors then relate these behavioral dynamics to previously described GC ensemble transitions identified using changepoint modeling. They report that changes in MTM subtype frequencies tend to occur shortly after the transition to palatability-related activity in GC. These results suggest that GC population dynamics are temporally associated not only with rejection-related behaviors but also with ingestive motor patterns that occur as animals prepare to consume palatable tastants.
Strengths:
The study introduces an innovative framework for extracting intricate orofacial movement information from EMG recordings. The machine-learning classifier provides a scalable method for identifying specific orofacial movements and performs better than previously published algorithms designed for gape detection. This approach allows the authors to examine movement microstructure at a temporal resolution that cannot be achieved with video scoring in freely moving animals.
A second strength is the integration of orofacial movement analysis with neural population dynamics. By relating EMG-derived movement subtypes to ensemble state transitions in GC, the study builds on a substantial body of work examining the temporal evolution of taste responses in cortex. The finding that ingestive-related movement dynamics occur shortly after the emergence of palatability-related firing provides an interesting extension of previous observations linking GC state transitions to rejection behavior.
The manuscript is also commendable in its commitment to data accessibility. By providing clear information about how the datasets can be accessed and making training data for the classifier publicly available, the authors make it possible for other researchers to examine the analytical pipeline and apply similar approaches to their own datasets. This transparency provides a useful framework for extending and building upon the methods presented here.
Weaknesses:
Some aspects of the EMG-based movement classification pipeline warrant careful interpretation. The training dataset used for classifier development is relatively small and is derived from a subset of trials in which mouth movements were clearly visible in video recordings. While the classifier performs well on this labeled dataset, it is not entirely clear how representative these labeled examples are of the full range of EMG signals present in the larger dataset.
The interpretation of the three identified MTM subtypes also remains somewhat tentative. The study convincingly demonstrates that distinct waveform-defined clusters exist in the EMG data, but the functional significance of these clusters as ingestive "behaviors" is less clear. As acknowledged by the authors, the specific roles of these movement patterns in the ingestion process remain speculative.
Finally, several conclusions in the Discussion rely on relatively strong mechanistic language when describing the relationship between GC dynamics and ingestive behavior. The data clearly demonstrate a temporal association between GC state transitions and changes in the frequencies of the different MTM subtypes. However, the results primarily support the interpretation that similar cortical dynamics are associated with ingestive and rejection-related behaviors rather than definitively establishing that these behaviors "are governed by the same underlying neural mechanisms".
Author response:
Public Reviews:
Reviewer #1 (Public review):
(1) Behavioral labels rely on video-based scoring, which may not fully capture subtle or hidden movements.
This is very true; certainly, this work is only a starting point. But the techniques used for this manuscript, despite starting with video-based scoring, specifically did allow us to differentiate behaviors that were too subtle to recognize in the video. For the revision, we will describe how this work leads to future studies in which we will be able to explore other means of collecting behavioral labels, potentially directly from simultaneous recordings of multiple muscles.
(2) The relationship between brain activity and behavior is correlational, but sometimes interpreted more strongly.
We will comb through the manuscript and make edits to be more precise and technically correct in presenting this relationship, and clarify that our suggestion of a causal link is only indirect and related to previous work (Mukherjee et al. 2019).
(3) The manuscript could be clearer and more accessible to readers outside the field.
We will edit the manuscript in multiple places to make technical and field-specific aspects more accessible. As part of this, in appreciation of Reviewer 2’s comments, we will take additional care to elaborate on and clarify our need and interpretation of SHAP values and classifier structure.
Reviewer #2 (Public review):
(1) I have several concerns regarding the methodological comparisons used to establish the superiority of the proposed XGBoost classifier. In particular, the comparison between the XGBoost classifier and previously used QDA approaches (Figure 3) may not be entirely well-matched. The QDA framework was originally designed primarily to detect gape events and does not explicitly assign labels to MTM movements. As a result, the apparent advantage of XGBoost in identifying MTMs may partly reflect differences in task formulation rather than intrinsic differences in classification performance. From visual inspection, gape detection performance appears broadly comparable across methods.
A more informative benchmark would involve comparing XGBoost to an extended pipeline in which QDA-based gape detection is combined with a secondary movement-detection stage, distinguishing MTMs from periods of no movement. Such a comparison would better isolate the contribution of classifier architecture per se. Without this control analysis, the strength of the claim that XGBoost provides superior performance for behavioral decoding remains somewhat uncertain.
The revision will further clarify that, as the reviewer notes, the primary improvement in XGB classification compared to QDA (in multi-class aggregated metrics) comes specifically from its ability to classify MTMs, and that for gapes, both QDA and XGB perform on par. We will be more explicit about the fact that our goal in constructing the classifier is not to compare “classifier architecture”—not to find the very best classifier possible—but rather to take the next step by generating an instance of a classifier that performs demonstrably better on aggregated orofacial movements. We will update the manuscript to be more clear in our claims in this regard, and how the current XGB classifier can, once validated, be bootstrapped by future techniques (possibly using more informative data sources) to more fully characterize orofacial movements.
(2) The presentation of the neural ensemble analyses is considerably less comprehensive and intuitive than that of the behavioral analyses. The manuscript would benefit from more direct visualization of inferred neural state transitions. For example, plotting predicted neural states in a manner analogous to the behavioral states illustrated in Figure 6B would improve interpretability and help readers understand how neural dynamics relate temporally to behavioral changes.
In addition, the interpretation that GC ensemble dynamics drive behavioral state transitions may require further clarification. If GC activity plays a causal role in initiating behavioral changes, one might expect a consistent brain-to-behavior lag across changepoints. However, Figure 6 appears to show such lag primarily at the second transition but not at the first. This raises questions about how uniformly the proposed causal interpretation applies across state boundaries, and additional analysis or discussion is needed.
We are happy to update the figures (likely by adding another panel to Figure 6) to clearly show inference of neural state transitions, in a manner similar to how we have shown behavioral state transitions in Fig. 6B. In addition, we will do a more comprehensive job of describing and referencing earlier work in which we have unpacked these analyses in greater detail—work that makes it clear why we would predict a lag-relationship for one set of change points and not the other.
(3) The neural ensemble analyses primarily focus on constructing higher-level behavioral state variables rather than directly testing how individual movement subtypes relate to neural activity. The behavioral interpretation of the inferred state structure, therefore, remains somewhat unclear. While this approach is consistent with previous work from the authors and with broader state-transition frameworks of gustatory processing, it is not immediately obvious that this is the most informative level of analysis for the present dataset.
In particular, it would strengthen the manuscript to examine whether GC neurons or ensembles also encode lower-level motor structure, such as the occurrence of gapes or specific MTM subtypes. Demonstrating selective or mixed encoding across hierarchical levels (movement motifs versus abstract behavioral states) would help clarify the functional interpretation of the reported neural dynamics. At present, the manuscript largely assumes that GC activity reflects higher-order behavioral states without directly testing alternative representational possibilities.
The reviewer makes a good point. While previous work from the lab (Li et al. 2016) has assessed the relationship of GC activity with both the onset of gaping (i.e., the behavioral state transition) and individual gapes and found only a relationship with onset of gaping (findings that we now explicitly describe in the revision), we have not performed a similar analysis for MTMs. We will do so and add it to the paper.
(4) Because direct behavioral ground truth for intra-oral ingestive movements is difficult to obtain, MTM subtypes are inferred primarily through clustering of EMG waveform features. Although the authors demonstrate statistical separability and cross-session stability of these clusters, it remains unclear whether they correspond to discrete motor programs or instead reflect a structured partitioning of a continuous behavioral space shaped by feature selection and preprocessing choices. Perhaps some additional robustness analyses or convergent validation (e.g., alternative clustering methods, feature perturbation tests, or stronger neural and behavioral dissociations) would help clarify the biological significance of the inferred subtype structure.
We admit (in fact, we have done so in the text) that we are not yet to the point of being able to “split hairs” to this degree (although we, like R2, see that as a goal). In the meantime, we will expand the section of Results text in which we describe the fact that the clustering of behaviors is observed both in “waveform space” (Fig. 4E was generated using standardized waveforms) and “feature space” (Fig. 4 B,C, and F), and that as such the clusters are NOT simply a partitioning of continuous, unimodal behavioral space. We will report convergent results from alternative (k-means) clustering methods to further support that conclusion. Finally, we will describe (in the Discussion section) ways to more rigorously test and extend this claim in future work.
Reviewer #3 (Public review):
Some aspects of the EMG-based movement classification pipeline warrant careful interpretation. The training dataset used for classifier development is relatively small and is derived from a subset of trials in which mouth movements were clearly visible in video recordings. While the classifier performs well on this labeled dataset, it is not entirely clear how representative these labeled examples are of the full range of EMG signals present in the larger dataset.
Very good point. We will update the text to note this qualification to the reader. We will also, however, highlight the fact that our focus on a highly reliable and representative (i.e., agreed upon by 2 independent, blind scorers) subset of labels allows us to perform more targeted analyses and make more targeted interpretation in our results. And we will also be more pointed in the revision, as we have noted above, about the fact that this work is only scratching the surface of what can be accomplished in this domain, and that future work will involve STARTING with the waveforms that aren't accounted for in terms of gapes and MTMs.
The interpretation of the three identified MTM subtypes also remains somewhat tentative. The study convincingly demonstrates that distinct waveform-defined clusters exist in the EMG data, but the functional significance of these clusters as ingestive "behaviors" is less clear. As acknowledged by the authors, the specific roles of these movement patterns in the ingestion process remain speculative.
We share R3’s desire for clarity on this point—we do not wish to imply that we understand more than we understand—and will be sure to fine-tune our language to make clearer and more explicit the fact that the distinction in the roles of the MTM subtypes in ingestion at this point remains speculative.
Finally, several conclusions in the Discussion rely on relatively strong mechanistic language when describing the relationship between GC dynamics and ingestive behavior. The data clearly demonstrate a temporal association between GC state transitions and changes in the frequencies of the different MTM subtypes. However, the results primarily support the interpretation that similar cortical dynamics are associated with ingestive and rejection-related behaviors rather than definitively establishing that these behaviors "are governed by the same underlying neural mechanisms".
We will soften our language to clarify which of our Discussion suggestions are speculation, highlighting for the reader the fact that our data, while consistent with evidence suggesting a causal link between the GC transition and gaping (Li et al., 2016; Mukherjee et al., 2019), do not prove a causal neural-behavioral link for MTMs.
References:
Li, Jennifer X., et al. “Sensory Cortical Activity Is Related to the Selection of a Rhythmic Motor Action Pattern.” The Journal of Neuroscience, vol. 36, no. 20, May 2016, pp. 5596–607, https://doi.org/10.1523/JNEUROSCI.3949-15.2016.
Mukherjee, Narendra, et al. “Impact of Precisely-Timed Inhibition of Gustatory Cortex on Taste Behavior Depends on Single-Trial Ensemble Dynamics.” eLife, edited by Laura L. Colgin et al., vol. 8, June 2019, p. e45968, https://doi.org/10.7554/eLife.45968.
Teaching Agents to Write Testable Code
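This is exactly what we want to do: inject tools dynamically. For example, when some financial operations involve violations of determinism, we need to run tool computations dynamically and return the risk level.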
a Ralph Wiggum Loop where a hook forces
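This happens to be the core algorithm of our design: intercept through hook functions so that the agent does not directly execute erroneous operations.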
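The interception idea in this note can be sketched as a hook that runs before every tool call and can veto it. Everything here is hypothetical and for illustration only: `ToolCall`, `BlockedToolCall`, `deny_large_transfers`, `run_tool`, the `transfer` tool, and the 1000 limit are assumed names and values, not the API of any particular agent framework.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ToolCall:
    """A tool invocation requested by the agent."""
    name: str
    args: Dict[str, Any]


class BlockedToolCall(Exception):
    """Raised by a hook to stop a tool call before it executes."""


Hook = Callable[[ToolCall], None]


def deny_large_transfers(call: ToolCall) -> None:
    """Example hook: block transfers above an assumed limit of 1000."""
    if call.name == "transfer" and call.args.get("amount", 0) > 1000:
        raise BlockedToolCall("transfer of %s exceeds limit" % call.args["amount"])


def run_tool(call: ToolCall,
             tools: Dict[str, Callable[..., Any]],
             hooks: List[Hook]) -> Any:
    """Run every hook first; execute the tool only if none of them raised."""
    for hook in hooks:
        hook(call)
    return tools[call.name](**call.args)


# A stand-in tool so the sketch is runnable.
tools = {"transfer": lambda amount: "sent %s" % amount}
small = run_tool(ToolCall("transfer", {"amount": 50}), tools, [deny_large_transfers])
```

Because the check lives in `run_tool` rather than in the prompt, the agent cannot skip it; a blocked call surfaces as an exception that the surrounding loop can turn into feedback for the model.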
Fetch experiment traces from LangSmith
Spawn parallel error analysis agents → main agent synthesizes findings + suggestions
Aggregate feedback and make targeted changes to the harness.
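If it only obtains the inputs and outputs, that is fine. But the agent must never get access to the test data. Once it builds patterns from the test data, the optimization iterations will go wrong.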
System Prompt, Tools, and Middleware (our term for hooks around model and tool calls).
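Worth referencing: focus on three core pieces, namely the system prompt, tools, and middleware (which in this article specifically means the hook mechanism around model calls and tool calls).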
We use Harbor to orchestrate the runs. It spins up sandboxes (Daytona),
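The experiments use Harbor to orchestrate the whole process: it automatically starts Daytona sandbox environments, connects to the agent run loop, and handles result verification and scoring. It is worth looking up what these two English names are. Come back to this later.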
only tweaked the harness
How exactly was it tweaked here?
Design decisions include the system prompt, tool choice, and execution flow.
System prompt, tools, and the overall workflow: this is the scope of the harness. A definition is given here.
eLife Assessment
This fundamental study provides a major contribution to our understanding of Amyotrophic Lateral Sclerosis (ALS) pathogenesis by utilizing a primate model that overcomes the historical limitations of rodent paradigms. By demonstrating the retrograde and trans-synaptic spread of pathological TDP-43 from the periphery to the spinal cord and motor cortex, the authors propose a new model for the disease spreading. The evidence supporting these findings is compelling, characterized by rigorous post-mortem histological observations. This work will be of profound interest to neuroscientists and translational researchers seeking to decode the mechanisms of systemic disease progression in ALS.
Reviewer #1 (Public review):
Summary:
The authors have used a macaque (two animals only) to follow the migration of 'seeded' TDP-43 protein in neuronal pathways - thus mimicking the spread of ALS in the human CNS. Previous experiments in rodents failed to demonstrate this, posing interesting and important biological differences, possibly related to the UMN-LMN system in higher order apes and humans.
Strengths:
An important step forward.
Weaknesses:
No weaknesses were identified by this reviewer. Only 2 animals were used, but that is appropriate given the sensate status of the macaque. In the opinion of this reviewer, the results are entirely convincing.
Reviewer #2 (Public review):
Summary:
There are astonishingly few papers trying to reproduce the process of initiation and spreading that Braak's studies have suggested and postulated. The authors should be applauded for pioneering such a difficult experiment. They overexpressed the TDP-43 protein in the motor neuron pool of the brachioradialis muscle and showed that by this technique, motor neurons in this pool died, and the muscle got denervated. They had evidence of a spreading process from the spinal cord to the cortex, demonstrated by showing widespread deposits of phosphorylated TDP-43 bilaterally in the cervical cord and the motor cortex. By their experiment, they created a dying-backwards model, not a model of corticofugal spread, like that shown by Braak. No muscle weakness was observed, not even in the brachioradialis.
Strengths:
The strength of this innovative study is the fact that this spreading experiment uses the phylogenetically young connectome of primates (macaques). They also made the thought-provoking observation of spreading from the cord to the motor cortex, not the corticofugal spread model observed by Heiko Braak. This is thought-provoking because this enables the observer to compare their model with the findings in humans.
Weaknesses:
The following aspects are not a weakness but need to be better explained for the interested reader - and potentially improved in future studies for which the authors laid the foundation:
(1) Why do the authors use the brachioradialis motor neuron pool to overexpress TDP-43? More is known about other muscles and how they are embedded in the motor connectome of primates. Why not the biceps brachii or the hand extensors or - even better - the small muscles of the hand? These are known to be strongly monosynaptically connected with the motor cortex. The authors should explain this. I am unclear if there was a specific reason which I did not see or understand. In my view, the brachioradialis is not the best representative of the primate connectome, for example, to examine this model and compare it with the corticofugal spread.
(2) In Braak's experiment, only (seemingly soluble) non-phosphorylated TDP-43 "crossed" synapses. Phosphorylated TDP-43 did not do this. The authors of this study saw phosphorylated TDP-43 in motor neurons and the cortex. Is there any potential explanation for how it crosses synapses? If it really does, there is an obvious difference to the human situation which needs to be emphasized and explained (in the future).
(3) There were significant deposits of phosphorylated TDP-43 in oligodendrocytes in humans. Whilst I understand that one experiment cannot solve every question - I am curious about whether the authors saw anything in oligodendrocytes?
(4) What was the pattern of damage? Of course, this pattern is not likely to have a monosynaptic pattern - like in humans... but was there a pattern? Did it have a physiologically meaningful basis? Was there any relation to the corticofugal monosynaptic pattern? What are the differences? The authors speak of "multiple waves". Does this mean that if this were a corticofugal model, for example, oculomotor neurons would also degenerate?
Reviewer #3 (Public review):
Summary:
In this paper by Jones and colleagues, a non-human primate model is described in which wild-type TDP-43 is expressed in the cervical spinal cord. This gave rise to loss of motor neurons in the ventral horn at that level in the cervical spinal cord. MRI of the muscles showed increased intensity in the most affected brachioradialis muscle, suggesting this muscle becomes denervated. At the neuropathological level, TDP-43 and pTDP-43 staining in the cytoplasm is increased, not only at the specific level of the cervical spinal cord, but also at a distance.
Strengths:
A clear strength is the state-of-the-art expression of the TDP-43 transgene at a focal site in the cervical spinal cord. This is achieved by combining general expression of a flipped, loxP-flanked TDP-43 vector, delivered by intrathecal AAV9 administration, with a subsequent intramuscular injection of an AAV2 hSyn CRE-TdTomato vector into the brachioradialis muscle, in order to induce focal recombination and expression of TDP-43 in motor neurons innervating this muscle on one side.
Another strength is the non-human primate background, which is much closer to the human situation.
Weaknesses:
Given the complexity and cost of the model, the n is very low.
The design of the experiments and the results shown about the toxicity induced by this focal TDP-43 expression do not allow us to conclude that it is a good model for ALS for several reasons. It is not clear that the TDP-43 overexpression results in spreading weakness or in spreading motor neuron loss. The neuropathological changes described suggest that there is a kind of stress response, which extends to regions away from the site of primary damage, but more is needed to provide convincing evidence that there is spreading of disease pathology reminiscent of human ALS.
Reviewer #4 (Public review):
Summary:
In this manuscript, the authors present data describing the development of a model of ALS in rhesus macaques. They use a viral intersectional model to overexpress TDP-43 in a population of motor neurons and then study the spread of the pathology about 7 months later. They demonstrate that both the cervical spinal cord and motor cortex (new and old M1) are full of TDP-43, suggesting that the pathology spreads from the single motor pool to presumably related neurons.
Strengths:
This is a super-important study in two main ways:
(1) This could be the birth of a really important model, one that is really needed for making progress in understanding ALS and the development of therapeutics. There are shortfalls with all the rodent models. Models dependent on cell cultures are superb for understanding cell-autonomous processes, but miss out on connectivity, particularly the long-range connectivity. Organoids may ultimately prove to be beneficial, but they would need cortex, spinal cord, and muscle, and translatability from them is not assured. So a NHP model is needed, and this may be it. Furthermore, the Methods are meticulously described and will undoubtedly facilitate reproducibility.
(2) The concept of the spread of pathology has been proposed for some time, I think, based initially on the detailed clinical observations of Ravits and colleagues. The authors have looked at this directly and provide supporting evidence for this interesting hypothesis. They show spread locally and contralaterally in the spinal cord (although a figure would be nice) and to the motor cortex.
Taking only these 2 points into account is more than sufficient for me to be enthusiastic about this work.
Weaknesses:
I'd like to make a couple of points that if addressed, could, in my view, help the authors strengthen this work.
(1) We don't know how many MNs were transduced by the rAAV. There was no tdTom expression, for whatever reason. The authors show an image of a control experiment with a single MN transduced, but there should be a red motor pool, at least in the control experiments. The impression that I get is that very few were transduced, and, in my mind, this makes the findings even more interesting - maybe you don't need many "starter" MNs.
(2) Continuing on this point, this leads the authors to conclude that all BR MNs have died. They support this by the reduced MN count (see point 3). Firstly, do we know how many BR MNs there are in the rhesus macaque, and does the reduction seen correspond to this number? Secondly, and more importantly, the muscle looks normal on MRI at 28 weeks - it does not look like a denervated muscle. The authors state that it has maybe been reinnervated, but by what, if all the BR MNs are dead? This does not seem like a plausible explanation to me. Muscle histology, NMJs, and fibre typing would have been useful to understand what's going on with the MNs. (And electrophysiology would have been wonderful, but beyond the scope of this study.)
(3) Some MN biologists, like me, fuss a lot about how to count MNs, which is almost as difficult as counting the number of angels on the head of a pin. Every method has its problems. Focusing on the two methods here: (a) ChAT immunohistochemistry is pretty good in healthy states, but we don't know what happens to ChAT expression in different diseases, particularly when you have a new model. If its expression is decreased, then it is not a good marker for MNs; (b) Identifying MNs based on the size and morphology of neurons in the ventral horn is also insufficient. For example, ~30% of neurons in a typical pool are small gamma MNs, and a significant proportion (depending on the muscle) of the remainder will be small alpha MNs. So what one is counting is, at best, the large alpha MNs, not all the MNs in a pool. And in ALS, it's these largest MNs that are affected at the earliest stages. The small ones might be fine. So results will be skewed. (Hence, it would be interesting to see if the muscle had a higher proportion of Type I fibres after being reinnervated by S-type MNs.)
(4) Statistics. These are complex experiments looking at the spread of a disease. The experimental unit is therefore the monkey, n=2. In each monkey, multiple sections are analysed, which are key technical replicates and often summative. For example, do we care about the average cell number in Figures 4D, E, 5 I, J or 6G, H, or rather the total cell number? Do the error bars mean anything? To be clear, I am by no means minimising the importance of the overall convincing findings. But I do not think this statistical analysis is particularly meaningful.
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors have used a macaque (two animals only) to follow the migration of 'seeded' TDP-43 protein in neuronal pathways - thus mimicking the spread of ALS in the human CNS. Previous experiments in rodents failed to demonstrate this, pointing to interesting and important biological differences, possibly related to the UMN-LMN system in higher order apes and humans.
Strengths:
An important step forward.
Weaknesses:
No weaknesses were identified by this reviewer. Only 2 animals were used, but that is appropriate given the sensate status of the macaque. In the opinion of this reviewer, the results are entirely convincing.
Reviewer #2 (Public review):
Summary:
There are astonishingly few papers trying to reproduce the process of initiation and spreading that Braak's studies have suggested and postulated. The authors should be applauded for pioneering such a difficult experiment. They overexpressed the TDP-43 protein in the motor neuron pool of the brachioradialis muscle and showed that by this technique, motor neurons in this pool died, and the muscle became denervated. They had evidence of a spreading process from the spinal cord to the cortex, demonstrated by showing widespread deposits of phosphorylated TDP-43 bilaterally in the cervical cord and the motor cortex. By their experiment, they created a dying-backwards model, not a model of corticofugal spread, like that shown by Braak. No muscle weakness was observed, not even in the brachioradialis.
Strengths:
The strength of this innovative study is the fact that this spreading experiment uses the phylogenetically young connectome of primates (macaques). They also made the thought-provoking observation of spreading from the cord to the motor cortex, not the corticofugal spread model observed by Heiko Braak. This is thought-provoking because this enables the observer to compare their model with the findings in humans.
Weaknesses:
The following aspects are not a weakness but need to be better explained for the interested reader - and potentially improved in future studies for which the authors laid the foundation:
(1) Why do the authors use the brachioradialis motor neuron pool to overexpress TDP-43? More is known about other muscles and how they are embedded in the motor connectome of primates. Why not the biceps brachii or the hand extensors or - even better - the small muscles of the hand? These are known to be strongly monosynaptically connected with the motor cortex. The authors should explain this. I am unclear if there was a specific reason which I did not see or understand. In my view, the brachioradialis is not the best representative of the primate connectome, for example, to examine this model and compare it with the corticofugal spread.
The brachioradialis muscle was chosen primarily for reasons of animal welfare; our concern when designing the experiments was that the muscle we chose for injection might become very wasted and weak before the experiment had been completed. If we had injected a hand muscle, this would have affected manipulation, feeding and grooming behaviours, whereas had we injected biceps brachii or forearm extensors, this would have affected more important behaviours requiring strength for body support in the home cage (e.g. climbing, swinging, etc.). The advantage of choosing brachioradialis is that there is some functional redundancy; in macaques, compared to biceps brachii, brachioradialis has a relatively minor role in elbow flexion and supination of the forearm. We therefore reasoned that there should be physiological compensation for any weakness in brachioradialis, and thus minimal effects on normal behaviour.
A secondary practical consideration was the importance of good quality MR imaging of the injected muscle and the positioning of the focussing coil; because of the physical constraints related to the monkey sitting in our narrow-bore scanner, the forearm muscles were the optimal choice.
With reference to the ‘primate connectome’, whilst hand muscles are known to have strong cortico-motoneuronal connections, we have shown previously that monosynaptic corticomotoneuronal connections are as strong in muscles innervated by the deep radial nerve (like brachioradialis) as in intrinsic hand muscles (Witham et al, 2016).
Finally, for the purposes of these experiments, all we required was a method for inoculating TDP-43 into a motor neuron pool within the spinal cord, without direct surgical trauma to the spinal cord. Our aim was to test the hypothesis that extracellular TDP-43 is sufficient to cause spreading neuronal changes in macaque, similar to those observed in human ALS/MND; our aim was not to replicate the actual pattern of human MND observed clinically.
These points will be addressed in a revised version of the manuscript.
(2) In Braak's experiment, only (seemingly soluble) non-phosphorylated TDP-43 "crossed" synapses. Phosphorylated TDP-43 did not do this. The authors of this study saw phosphorylated TDP-43 in motor neurons and the cortex. Is there any potential explanation for how it crosses synapses? If it really does, there is an obvious difference to the human situation which needs to be emphasized and explained (in the future).
To clarify, there was no evidence of phosphorylated TDP-43 crossing synapses. It is more likely that excess non-phosphorylated TDP-43 crossed synapses, and that this then subsequently led to TDP-43 phosphorylation.
(3) There were significant deposits of phosphorylated TDP-43 in oligodendrocytes in humans. Whilst I understand that one experiment cannot solve every question - I am curious about whether the authors saw anything in oligodendrocytes?
We have not looked at this.
(4) What was the pattern of damage? Of course, this pattern is not likely to have a monosynaptic pattern - like in humans... but was there a pattern? Did it have a physiologically meaningful basis? Was there any relation to the corticofugal monosynaptic pattern? What are the differences? The authors speak of "multiple waves". Does this mean that if this were a corticofugal model, for example, oculomotor neurons would also degenerate?
The description of ‘multiple waves’ in paragraph 2 of the discussion section is entirely hypothetical, based on the assumption that there are different mechanisms by which TDP-43 spreads through the nervous system, from slow local spread by diffusion to more rapid long-range axonal spread to widely separated regions. For the neuropathological staging analysis, we therefore looked at different brain regions (hypoglossal nuclei, reticular formation, inferior olives, frontal cortex, temporal cortex and hippocampal formation). This analysis only showed loss of motor neurons in the spinal cord ipsilateral to the side of the muscle injections, in segments consistent with the location of brachioradialis motoneurons. We did not demonstrate a ‘pattern of damage’ as described in humans in our experiments because this is a pre-symptomatic pre-clinical model, with no established ‘damage’ from each wave. We speculate that this is because animals were terminated too early in the disease process.
However, whilst there was no established neuronal degeneration outside the cervical spinal cord, the observation that there were more pTDP-43 positive Betz cells in left (contralateral to the brachioradialis injection) New M1 than Old M1 (see Figure 6I and J) would support spread via monosynaptic connections to motoneurons; New M1 is where most monosynaptic cortico-motoneuronal connections originate.
Reviewer #3 (Public review):
Summary:
In this paper by Jones and colleagues, a non-human primate model is described in which wild-type TDP-43 is expressed in the cervical spinal cord. This gave rise to loss of motor neurons in the ventral horn at that level in the cervical spinal cord. MRI of the muscles showed increased intensity in the most affected brachioradialis muscle, suggesting this muscle becomes denervated. At the neuropathological level, TDP-43 and pTDP-43 staining in the cytoplasm is increased, not only at the specific level of the cervical spinal cord, but also at a distance.
Strengths:
A clear strength is the state-of-the-art expression of the TDP-43 transgene at a focal site in the cervical spinal cord. This is achieved by combining general expression of a flipped, loxP-flanked TDP-43 vector, delivered by intrathecal AAV9 administration, with a subsequent intramuscular injection of an AAV2 hSyn CRE-TdTomato vector into the brachioradialis muscle, in order to induce focal recombination and expression of TDP-43 in motor neurons innervating this muscle on one side.
Another strength is the non-human primate background, which is much closer to the human situation.
Weaknesses:
Given the complexity and cost of the model, the n is very low.
As is common in most studies in non-human primates, we have carried out all statistical analysis within one animal (e.g. the comparison of motoneuron numbers between left and right cord). We then show that results are reproducible in two animals. Although the number of animals is lower than in a typical rodent study, we see this as an advantage of the model, adhering to the 3Rs principle of ‘reduction’.
The design of the experiments and the results shown about the toxicity induced by this focal TDP-43 expression do not allow us to conclude that it is a good model for ALS for several reasons. It is not clear that the TDP-43 overexpression results in spreading weakness or in spreading motor neuron loss. The neuropathological changes described suggest that there is a kind of stress response, which extends to regions away from the site of primary damage, but more is needed to provide convincing evidence that there is spreading of disease pathology reminiscent of human ALS.
As already noted in our response to Reviewer 2 (point 1), animal welfare is an important consideration when designing these complex experiments in primates. We could not therefore justify allowing the animals to survive until extensive wasting and weakness were evident, recapitulating the human disease.
The model developed in these experiments is therefore a pre-symptomatic pre-clinical model, in which animals are terminated before pathology leading to widespread motor neuron loss is evident. At post mortem we do have evidence of motor neuron loss in the segments supplying brachioradialis (C4-C8).
Stress of various forms, including blunt trauma (e.g. Anderson et al, 2021), stab/electrode insertion injury (e.g. Zambusi et al, 2022), chemical (e.g. arsenite) exposure (e.g. Huang et al, 2024), or hypoxia (Marcus et al, 2021) can result in pathological nucleocytoplasmic translocation of TDP-43. In our model, there was no direct trauma to the brain or spinal cord ante mortem, excluding one major cause of tissue stress. Hypoxia during the process of euthanasia is possible, but we would expect there would not be enough time before death for this to manifest as TDP-43 translocation. In the literature TDP-43 translocation due to stress is diffuse; we have demonstrated that in our model the TDP-43 pathology is not diffuse but selective. For example, there was no evidence of disease in the oculomotor nuclei; in the primary motor cortex (M1) there are significantly more pathological changes in the evolutionarily younger ‘NewM1’ compared to the neighbouring ‘OldM1’.
It is therefore improbable that our findings could be explained by ‘a kind of stress response’. Our findings are better explained by spread of the TDP-43 protein.
Reviewer #4 (Public review):
Summary:
In this manuscript, the authors present data describing the development of a model of ALS in rhesus macaques. They use a viral intersectional model to overexpress TDP-43 in a population of motor neurons and then study the spread of the pathology about 7 months later. They demonstrate that both the cervical spinal cord and motor cortex (new and old M1) are full of TDP-43, suggesting that the pathology spreads from the single motor pool to presumably related neurons.
Strengths:
This is a super-important study in two main ways:
(1) This could be the birth of a really important model, one that is really needed for making progress in understanding ALS and the development of therapeutics. There are shortfalls with all the rodent models. Models dependent on cell cultures are superb for understanding cell-autonomous processes, but miss out on connectivity, particularly the long-range connectivity. Organoids may ultimately prove to be beneficial, but they would need cortex, spinal cord, and muscle, and translatability from them is not assured. So a NHP model is needed, and this may be it.
Furthermore, the Methods are meticulously described and will undoubtedly facilitate reproducibility.
(2) The concept of the spread of pathology has been proposed for some time, I think, based initially on the detailed clinical observations of Ravits and colleagues. The authors have looked at this directly and provide supporting evidence for this interesting hypothesis. They show spread locally and contralaterally in the spinal cord (although a figure would be nice) and to the motor cortex.
Taking only these 2 points into account is more than sufficient for me to be enthusiastic about this work.
Weaknesses:
I'd like to make a couple of points that if addressed, could, in my view, help the authors strengthen this work.
(1) We don't know how many MNs were transduced by the rAAV. There was no tdTom expression, for whatever reason. The authors show an image of a control experiment with a single MN transduced, but there should be a red motor pool, at least in the control experiments. The impression that I get is that very few were transduced, and, in my mind, this makes the findings even more interesting - maybe you don't need many "starter" MNs.
Unfortunately, we cannot know how many motoneurons were transduced.
However, the reviewer may be correct, that it is actually only a small fraction of the brachioradialis pool. This is supported by the evidence for rather focal denervation seen on MRI.
(2) Continuing on this point, this leads the authors to conclude that all BR MNs have died. They support this by the reduced MN count (see point 3). Firstly, do we know how many BR MNs there are in the rhesus macaque, and does the reduction seen correspond to this number? Secondly, and more importantly, the muscle looks normal on MRI at 28 weeks - it does not look like a denervated muscle. The authors state that it has maybe been reinnervated, but by what, if all the BR MNs are dead? This does not seem like a plausible explanation to me. Muscle histology, NMJs, and fibre typing would have been useful to understand what's going on with the MNs. (And electrophysiology would have been wonderful, but beyond the scope of this study.)
To clarify, we did not conclude that all brachioradialis motor neurons had died, but rather that all transfected motor neurons in the brachioradialis pool had died. As noted above, when these cells die and the muscle is denervated, the MRI signal changes occupy only a small volume of the muscle and are transient. We would not expect to see long-term MRI changes in muscle anatomy after this limited denervation-reinnervation event.
Analysis of muscle histology, including fibre typing, is outwith the scope of this initial paper reporting the model; we hope that this will form the basis of a future publication.
(3) Some MN biologists, like me, fuss a lot about how to count MNs, which is almost as difficult as counting the number of angels on the head of a pin. Every method has its problems. Focusing on the two methods here: (a) ChAT immunohistochemistry is pretty good in healthy states, but we don't know what happens to ChAT expression in different diseases, particularly when you have a new model. If its expression is decreased, then it is not a good marker for MNs; (b) Identifying MNs based on the size and morphology of neurons in the ventral horn is also insufficient. For example, ~30% of neurons in a typical pool are small gamma MNs, and a significant proportion (depending on the muscle) of the remainder will be small alpha MNs. So what one is counting is, at best, the large alpha MNs, not all the MNs in a pool. And in ALS, it's these largest MNs that are affected at the earliest stages. The small ones might be fine. So results will be skewed. (Hence, it would be interesting to see if the muscle had a higher proportion of Type I fibres after being reinnervated by S-type MNs.)
This is an interesting point, and we agree that each method used to quantify MN number carries its own limitations. The problem of MN identification is heightened in an MND-like pathological state, especially when considering evidence of reduced ChAT activity in spinal motoneurons in end-stage disease in post mortem human samples (Oda et al, 1995), and more recent evidence from Casas et al. (2013), who demonstrated early presymptomatic reduction in ChAT expression in SOD1G93A mice. It is important to note that this was a modest reduction (to 76% of control levels), not complete abolition of signal. ChAT immunoreactivity was still present and motor neurons were still identifiable as ChAT-positive at this pre-clinical stage of disease. As counts in our study were based on detecting ChAT in cells, it seems unlikely that we would miss cells. However, we cannot rule this out. If this did occur, it would mean that the reduced motoneuron counts which we observed reflect not only cell death, but also profound motoneuron dysfunction, which is presumably the proximal precursor to cell death.
We acknowledge that size-based criteria applied to ChAT-positive neurons will preferentially capture large alpha motor neurons, and that gamma motor neurons and small alpha motor neurons are likely underrepresented in our counts. Our counts therefore reflect the large alpha motor neuron population rather than the total motor neuron pool. We believe that this is not a critical limitation in the context of the present study. Large alpha motor neurons are the population of primary pathological interest in ALS and related MND, being the earliest and most severely affected subtype. The selective vulnerability of fast-fatigable large alpha motor neurons in ALS is well established, and their preferential loss is the defining feature of disease progression in both human post mortem tissue and rodent models (Lalancette-Hébert et al., 2016). In this respect, our size threshold selects for precisely the population whose degeneration is most relevant to the disease phenotype we are modelling.
We intend to include comments on these important points in the revised version of the manuscript.
In response to the final point regarding muscle histology and proportions of Type I fibres, as stated above, reporting of muscle histology, including fibre typing, is planned for a separate publication.
(4) Statistics. These are complex experiments looking at the spread of a disease. The experimental unit is therefore the monkey, n=2. In each monkey, multiple sections are analysed, which are key technical replicates and often summative. For example, do we care about the average cell number in Figures 4D, E, 5 I, J or 6G, H, or rather the total cell number? Do the error bars mean anything? To be clear, I am by no means minimising the importance of the overall convincing findings. But I do not think this statistical analysis is particularly meaningful.
Here, the experimental unit is the tissue slice, mounted on a slide for histological analysis, and not the monkey. All statistical comparisons are made within a single animal. We then show that the findings can be replicated in two animals, both of which show significant results. This is the standard approach taken in primate neuroscience, given the need to reduce animal numbers to the minimum consistent with producing convincing results.
Abstract
Background: Downloading and reanalyzing existing single-cell RNA sequencing (scRNA-seq) data is an efficient way to gain clues and new insights. However, no tool can fetch the diverse scRNA-seq data types (raw data, count matrix, and processed object) distributed in various repositories, process and load the downloaded data to R, convert formats between scRNA-seq objects, and benchmark the format conversion tools.
Findings: Here, we present GEfetch2R, an R package with a Docker image to (i) download diverse scRNA-seq data types, including raw data (SRA and ENA), count matrices (GEO, UCSC Cell Browser, and PanglaoDB), and processed objects (Zenodo, CELLxGENE, and HCA); (ii) process the downloaded data, load output/downloaded count matrices and annotations to R (SeuratObject/DESeqDataSet), filter the SeuratObject based on cell metadata and genes, and merge multiple SeuratObjects if applicable; (iii) convert formats between the widely used scRNA-seq objects, including SeuratObject, AnnData, SingleCellExperiment, CellDataSet/cell_data_set, and loom, and benchmark format conversion tools in terms of information kept, usability, running time, and scalability to guide tool selection. Furthermore, GEfetch2R can also download, process, and load bulk RNA-seq raw data (SRA and ENA) and count matrices (GEO) to R (DESeqDataSet).
Conclusions: GEfetch2R is an R package designed to help researchers access and explore existing gene expression data from various public repositories. It can function as a data downloader (supports all three scRNA-seq and two bulk RNA-seq data types), a data processor (processes and loads the output/downloaded count matrices and annotations to R), and an object format converter (between the widely used scRNA-seq objects).
This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag039), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:
Reviewer 2:
General Comments
This manuscript introduces a tool named HVRLocator, designed to address the issue of missing or non-standard metadata in 16S rRNA sequencing data found in public databases such as the SRA. The tool identifies amplicon regions by aligning sequences to a reference genome and attempts to detect the presence of primers using a machine learning model. This is a subject with significant practical value, particularly for conducting large-scale meta-analyses. However, there are still many issues regarding methodological rigor, the depth of validation, and comparisons with existing tools that require further clarification by the authors.
Major Comments
1. Concerns regarding the singularity of the reference sequence
The authors mention aligning sequences to a single Escherichia coli (J01859.1) reference genome to determine start and end positions. Is a single E. coli reference sufficient to cover Archaea or bacterial phyla that are distantly related to Proteobacteria, which may be present in environmental samples (e.g., soil, ocean)? For taxa with significant length variations or insertions/deletions (Indels), could forced alignment to the E. coli reference lead to misjudgment of start/end positions? Have the authors evaluated the impact on accuracy if a more universal reference database (such as representative sequences from SILVA or Greengenes) were used?
2. Rationality of the primer detection model (Random Forest based on Quality Scores)
The authors developed a Random Forest model to predict primer presence by analyzing the quality score distribution of the first 1,000 reads. Primer detection is typically based on the sequence itself rather than quality scores. Can the authors explain why quality scores were chosen as features? Sequencing quality scores are influenced by technical factors such as sequencer status, reagent batches, and run cycles, which have no direct biological correlation with the presence of primers. Is there a risk that this model is "overfitting" specific sequencing platforms or datasets? Since the reads are already downloaded, why not directly use degenerate primer sequence matching (e.g., using Cutadapt or SeqKit logic) to determine primer presence? This seems to be a more direct and accurate method.
3. Verification of accuracy claims
In the validation section, the authors claim to achieve 100% accuracy on certain datasets. In bioinformatics tool development, a claim of 100% accuracy is often a red flag. Have the authors manually checked those samples marked as "correct" by the model that might suffer from edge effects or borderline cases?
4. Dataset imbalance in the Random Forest model
For the Random Forest model, the authors used 882 samples with primers and 8,940 samples without primers for training. Such an extremely imbalanced dataset, even with stratified sampling, may cause the model to be biased towards the majority class.
5. Comparison with existing tools
The manuscript mentions that no tool has been designed for this specific purpose, but this may overlook some existing general-purpose tools or scripts. Many pipelines (such as certain plugins in QIIME 2, USEARCH, etc.) possess functionalities to identify primers or evaluate amplicon regions. The authors should discuss how their tool compares to these existing workflows.
Minor Comments
1. Confusion regarding processing speed metrics
The abstract mentions a processing speed of "0.147 samples per minute", but later the text mentions "6.5 samples per minute" and "one sample every 0.147 minutes". There is confusion regarding units and values in these three descriptions (is it samples per minute or minutes per sample?). Please unify and correct these data to ensure consistency.
2. Usage of fastq-dump
The use of fastq-dump is mentioned. The SRA Toolkit's fastq-dump is relatively slow and has largely been superseded by fasterq-dump for efficiency. Why did the authors not use the more efficient fasterq-dump?
3. Definition of "Standardized metadata"
The term "standardized metadata" is used frequently. Please explicitly define what constitutes "standard" metadata in the context of this tool within the text.
4. Robustness and error handling
The results section mentions that some samples failed due to "NCBI portal-related issues". Does this imply the tool lacks breakpoint resumption or retry mechanisms? Given that network fluctuations are common during large-scale downloads, how is the tool's robustness demonstrated?
5. Output confidence intervals
The output file contains "TRUE/FALSE" and a probability score. For samples where the probability score is at a critical threshold (e.g., around 0.5), does the tool provide an "uncertain" tag, or does it force a classification? It is suggested to add an indicator for ambiguous ranges.
Abstract
Background: Downloading and reanalyzing the existing single-cell RNA sequencing (scRNA-seq) data provides an efficient choice to gain clues and new insights. However, no tool can fetch the diverse scRNA-seq data types (raw data, count matrix, and processed object) distributed in various repositories, process and load the downloaded data to R, convert formats between scRNA-seq objects, and benchmark the format conversion tools.
Findings: Here, we present GEfetch2R, an R package with Docker image to (i) download diverse scRNA-seq data types, including raw data (SRA and ENA), count matrices (GEO, UCSC Cell Browser, and PanglaoDB), and processed objects (Zenodo, CELLxGENE, and HCA); (ii) process the downloaded data, load output/downloaded count matrices and annotations to R (SeuratObject/DESeqDataSet), filter the SeuratObject based on cell metadata and genes, and merge multiple SeuratObjects if applicable; (iii) convert formats between the widely used scRNA-seq objects, including SeuratObject, AnnData, SingleCellExperiment, CellDataSet/cell_data_set, and loom, and benchmark format conversion tools in terms of information kept, usability, running time, and scalability to guide the tool selection. Furthermore, GEfetch2R can also download, process, and load bulk RNA-seq raw data (SRA and ENA) and count matrices (GEO) to R (DESeqDataSet).
Conclusions: GEfetch2R is an R package dedicated to facilitating researchers to access and explore the existing gene expression data from various public repositories. It can function as a data downloader (supports all three scRNA-seq and two bulk RNA-seq data types), a data processor (processes and loads the output/downloaded count matrices and annotations to R), and an object format converter (between the widely used scRNA-seq objects).
This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag039), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:
Reviewer 1:
The manuscript presents GEfetch2R, an R package (with a Docker image) that fetches scRNA-seq and bulk RNA-seq data from multiple repositories, loads the data into R objects, and benchmarks format-conversion tools. The problem addressed is real and important; the implementation appears practical and well documented. I see strong potential for adoption.
Major comments
1) Robust cross-repository support for .RData files While GEfetch2R lists rdata among supported extensions for Zenodo and HCA, many GEO submissions and other archives still provide processed data exclusively as .RData, often bundling matrices and metadata in heterogeneous objects. Please add an explicit, repository-agnostic .RData ingestion path with: (i) automatic object introspection, (ii) standardized extraction of matrices/metadata, (iii) graceful fallbacks with clear diagnostics for non-standard objects, and (iv) reproducible examples. This materially increases real-world coverage.
2) Large-scale, automated evaluation on ~100 scRNA-seq datasets Beyond the single COVID-19 application and the conversion benchmark, please include a systematic "fetch success-rate" study across ~100 GEO scRNA-seq datasets. Provide a Dockerized workflow (publicly available) that periodically attempts end-to-end retrieval (raw / count / processed) and reports success/failure rates stratified by repository and file type, with resource/time footprints and categorized failure causes. Given heterogeneous deposition practices, even ~50% overall success would be informative.
3) Another very important point is to provide a Dockerfile together with the Docker image.
Minor revisions
"altas" → atlas (COVID-19 section title/caption).
"Count maatrix" → Count matrix (Figure 3 caption/table column).
"PanglanDB" → PanglaoDB (tables).
Consistency: keep SeuratObject (not "Seurat object"); keep rds lowercase;
Abstract
Background: Amplicon sequencing of the 16S rRNA gene is widely used to assess microbial diversity due to its cost-effectiveness and efficiency. However, public 16S rRNA datasets often lack standardized metadata, particularly information on the sequenced hypervariable regions or primers used, which are critical for accurate analysis and data reuse. To address this, we present the HVRLocator, a computational tool that reliably identifies sequenced hypervariable regions, enhancing metadata quality and enabling more robust large-scale microbiome studies.
Results: The HVRLocator tool processed samples at an average rate of 0.147 per minute. Validation confirmed 100% accuracy in predicting alignment positions, correctly matching sequences to the expected primer regions based on literature. We demonstrated how to use the tool to select appropriate and comparable sequences for building a global bacterial database from V4 region amplicons of the 16S rRNA gene. Using HVRLocator, we selected 36,217 valid samples out of 45,882 runs, enabling us to identify cases where metadata incorrectly labeled sequences as targeting the V4 region.
Conclusion: Even when metadata is available, it can be inaccurate or misleading. HVRLocator offers a reliable and efficient method to identify the exact hypervariable sequenced region, ensuring accurate processing of large-scale 16S rRNA amplicon data. By bypassing inconsistent metadata and literature, it streamlines data curation and enhances the reliability of microbial studies, syntheses, and meta-analyses. Its use is essential for critically evaluating published data and enabling accurate and reproducible research in microbial ecology.
This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag040), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:
Reviewer 2:
General Comments
This manuscript introduces a tool named HVRLocator, designed to address the issue of missing or non-standard metadata in 16S rRNA sequencing data found in public databases such as the SRA. The tool identifies amplicon regions by aligning sequences to a reference genome and attempts to detect the presence of primers using a machine learning model. This is a subject with significant practical value, particularly for conducting large-scale meta-analyses. However, there are still many issues regarding methodological rigor, the depth of validation, and comparisons with existing tools that require further clarification by the authors.
Major Comments
1. Concerns regarding the singularity of the reference sequence. The authors mention aligning sequences to a single Escherichia coli (J01859.1) reference genome to determine start and end positions. Is a single E. coli reference sufficient to cover Archaea or bacterial phyla that are distantly related to Proteobacteria, which may be present in environmental samples (e.g., soil, ocean)? For taxa with significant length variations or insertions/deletions (Indels), could forced alignment to the E. coli reference lead to misjudgment of start/end positions? Have the authors evaluated the impact on accuracy if a more universal reference database (such as representative sequences from SILVA or Greengenes) were used?
2. Rationality of the primer detection model (Random Forest based on Quality Scores). The authors developed a Random Forest model to predict primer presence by analyzing the quality score distribution of the first 1,000 reads. Primer detection is typically based on the sequence itself rather than quality scores. Can the authors explain why quality scores were chosen as features? Sequencing quality scores are influenced by technical factors such as sequencer status, reagent batches, and run cycles, which have no direct biological correlation with the presence of primers. Is there a risk that this model is "overfitting" specific sequencing platforms or datasets? Since the reads are already downloaded, why not directly use degenerate primer sequence matching (e.g., using Cutadapt or SeqKit logic) to determine primer presence? This seems to be a more direct and accurate method.
3. Verification of accuracy claims. In the validation section, the authors claim to achieve 100% accuracy on certain datasets. In bioinformatics tool development, a claim of 100% accuracy is often a red flag. Have the authors manually checked those samples marked as "correct" by the model that might suffer from edge effects or borderline cases?
4. Dataset imbalance in the Random Forest model. For the Random Forest model, the authors used 882 samples with primers and 8,940 samples without primers for training. Such an extremely imbalanced dataset, even with stratified sampling, may cause the model to be biased towards the majority class.
5. Comparison with existing tools. The manuscript mentions that no tool has been designed for this specific purpose, but this may overlook some existing general-purpose tools or scripts. Many pipelines (such as certain plugins in QIIME 2, USEARCH, etc.) possess functionalities to identify primers or evaluate amplicon regions. The authors should discuss how their tool compares to these existing workflows.
Minor Comments
1. Confusion regarding processing speed metrics. The abstract mentions a processing speed of "0.147 samples per minute", but later the text mentions "6.5 samples per minute" and "one sample every 0.147 minutes". There is confusion regarding units and values in these three descriptions (is it samples per minute or minutes per sample?). Please unify and correct these data to ensure consistency.
2. Usage of fastq-dump. The use of fastq-dump is mentioned. The SRA Toolkit's fastq-dump is relatively slow and has largely been superseded by fasterq-dump for efficiency. Why did the authors not use the more efficient fasterq-dump?
3. Definition of "Standardized metadata". The term "standardized metadata" is used frequently. Please explicitly define what constitutes "standard" metadata in the context of this tool within the text.
4. Robustness and error handling. The results section mentions that some samples failed due to "NCBI portal-related issues". Does this imply the tool lacks breakpoint resumption or retry mechanisms? Given that network fluctuations are common during large-scale downloads, how is the tool's robustness demonstrated?
5. Output confidence intervals. The output file contains "TRUE/FALSE" and a probability score. For samples where the probability score is at a critical threshold (e.g., around 0.5), does the tool provide an "uncertain" tag, or does it force a classification? It is suggested to add an indicator for ambiguous ranges.
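To make the sequence-based alternative raised in Major Comment 2 concrete, the following is a minimal Python sketch of degenerate primer matching. It is not HVRLocator's method and not the Cutadapt or SeqKit implementation. The function names, the `max_offset` parameter, and the example reads are hypothetical. The primer shown is the widely used 515F (Parada) V4 forward primer, chosen only for illustration and not taken from the manuscript.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_to_regex(primer):
    """Compile a degenerate primer into a regex (exact match, no mismatches)."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def fraction_with_primer(reads, primer, max_offset=3):
    """Return the fraction of reads whose 5' end contains the primer.

    The primer may start at most `max_offset` bases into the read
    (hypothetical tolerance for a short leading spacer).
    """
    reads = list(reads)
    if not reads:
        return 0.0
    pattern = primer_to_regex(primer)
    window = len(primer) + max_offset  # only inspect the start of each read
    hits = sum(1 for read in reads if pattern.search(read.upper()[:window]))
    return hits / len(reads)


# Illustrative primer and made-up reads (assumptions, not data from the study).
PRIMER_515F = "GTGYCAGCMGCCGCGGTAA"
example_reads = [
    "GTGCCAGCAGCCGCGGTAATACGGAGG",   # primer present (Y=C, M=A)
    "GTGTCAGCCGCCGCGGTAATACGTAGG",   # primer present (Y=T, M=C)
    "TACGGAGGGTGCAAGCGTTAATCGGAA",   # primer already trimmed
]
primer_fraction = fraction_with_primer(example_reads, PRIMER_515F)
```

A sample could then be flagged as primer-containing when the fraction exceeds a chosen cutoff. This sketch requires an exact match to the degenerate pattern, whereas dedicated trimming tools typically tolerate a configurable number of mismatches.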
This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag040), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:
Reviewer 1:
Metabarcoding data are accumulating rapidly. This paper makes a very valuable contribution to the automated extraction and curation of metabarcoding data and should be of great value in facilitating the re-use of existing data and the construction of custom databases based on these. I have not tested or tried to install the software myself, as the manuscript provided sufficient detail to enable me to assess the tool.
General comments
The manuscript is written entirely in terms of "bacteria" and aligns amplicons to an E. coli model sequence. This is reasonable, but there should certainly be some acknowledgement of Archaea and ideally some mention of Eukaryotes too. These are probably things for the discussion section of this manuscript, but the authors may wish to consider whether a future version of the program could contain options to use model Archaea and Eukaryote sequences as alternatives to the E. coli model. It would also be helpful to assess how the program with its E. coli model deals with sequence data from Archaea, Eukaryotes (including mitochondria) and bacteria that are very divergent from E. coli.
The methods section does not contain details of the software used to generate the figures, or whether these figures are produced by "the pipeline" or by separate analysis of the .txt file that the pipeline produces. I suspect that it is the latter, in which case the authors should make the scripts used available: as well as providing complete documentation of what has been done, this is likely to increase use made of the tool. It would also be helpful to include an output file in the supplementary materials.
Specific comments
Line 64 "however the integration of these data in light of processing metadata" - not clear
Line 67-8 "though bacterial diversity increases linearly with amplicon length". Needs re-wording. The number of ASVs will increase with amplicon length, but the actual bacterial diversity in a sample is constant.
Line 79 "Wasimuddin and colleagues" should be "Wasimuddin et al". More generally, check that citations conform with journal house style
Line 79-82 "For example, Wasimuddin and colleagues [8] found that compared to three other primer sets targeting different regions, the primer pair targeting the V4 hypervariable region of the 16S rRNA gene produced the highest estimates of species richness and diversity across various sample types" There are three issues here: 1) different primer pairs vary in their coverage and bias, so different primers targeting the same variable region will produce different numbers of ASVs 2) Even with complete coverage and the absence of bias, different variable regions will generate different numbers of ASVs as a result of differences in length and rate of evolution between variable regions (and differences in the number of ASVs that are clustered into OTUs at a particular sequence similarity threshold) 3) The relationship between ASVs or OTUs and "species" is not straightforward (Edgar, 2018). At minimum "species" should be replaced with ASV or OTU (whichever Wasimuddin et al used)
Line 89-90 "as bacterial diversity and taxonomic resolution linearly increase with target sequence length [12]." Overlaps with statement made in line 67-8, and the same issue applies here.
Lines 167-170. The output file contains (amongst other things) "Predicted HV region Start/End: Predicted hypervariable (HV) region based on the median alignment start and end positions across all reads, inferred from literature on conserved and hypervariable regions of the 16S rRNA gene (Brosius et al., 1978; Yang et al., 2016)". This implies that the program predicts a single variable region for each study - I am not clear what this column will contain for amplicons that contain more than one variable region, although columns 11-19 indicate that the program identifies the presence/absence of each of the 9 HV regions. My guess is that the authors are using "HV region" in two different senses: 1) Its usual meaning of one region out of V1 to V9 2) The sequence from the beginning of the first of the nine variable regions the amplicon includes to the end of the last. It would also be helpful to indicate whether the sequence positions here are relative to the E. coli model or refer to sequence positions in the amplicon.
Edgar, R. C. (2018). Updating the 97% identity threshold for 16S ribosomal RNA OTUs. Bioinformatics, 34(14), 2371-2375. doi:10.1093/bioinformatics/bty113
AbstractObtaining chromosomally complete genome assemblies across the tree of life is a major goal of biodiversity genomics. However, some lineages remain recalcitrant to assembly despite recent advances in sequencing technologies and assembly tools. Birds present a substantial genome assembly challenge due to the presence of tiny, hard to assemble microchromosomes that are often highly fragmented or even missing in draft genome assemblies. As such, bird genomes require a large amount of expert manual curation effort via manipulation of genome-wide Hi-C contact maps and many current chromosome-level bird genome assemblies do not resolve the known karyotype. Microchromosomes have distinct genetic and epigenetic features. They are GC-biased, gene-rich, highly methylated, and have distinct spatial organisation in the centre of the nucleus. Importantly, they are conserved across avian evolution. Here, using a reference set of expert curated bird genomes, we have identified a set of conserved microchromosome genes and developed MicroFinder, a pipeline that uses this gene set to find small microchromosome fragments in draft genome assemblies to act as anchors for manual curation of microchromosomes. We demonstrate how MicroFinder can be used to improve the speed and accuracy of bird genome curation. Furthermore, we highlight the usefulness of MicroFinder by carrying out MicroFinder-enabled re-curation of 12 previously released chromosome-scale bird genome assemblies, increasing the sequence content of microchromosome models.
This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag036), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:
Reviewer 2:
I had the privilege of reviewing the manuscript titled "MicroFinder: conserved gene-set mapping and assembly ordering for manual insertion of bird microchromosomes" by Mathers et al. The manuscript presents a conserved gene set linked to bird microchromosomes for identifying putative contigs/scaffolds. Subsequently, microchromosome contigs/scaffolds can be made into their corresponding chromosome models using orthogonal evidence from HiC data. MicroFinder utilises the current knowledge of microchromosome conservation across birds. This approach is similar to the assembly evaluation method using BUSCO genes.
One of the major limitations of the manuscript is the lack of validation or supportive evidence to show that manual curation results after applying MicroFinder hints are valid and robust. Authors can perform local synteny or chromosome-scale alignment analyses and conservation property evaluation to demonstrate that results of assembly curation are valid. Authors can also report metrics of HiC contact maps before and after curation for inter- and intra-chromosome contacts to demonstrate improvements. If this is not done, authors may have to remove results and methods corresponding to manual curation so as to focus on genes that are found in "putative" microchromosomes.
Manuscript is generally well written with some minor concerns. Analyses presented are generally robust.
It was confusing to read the difference between micro and dot chromosomes. I encourage authors to avoid "dot" chromosome term. Although it has been used in literature in the past, we can do without that term. There is no strong evidence to suggest if micro and dot chromosomes have any significant functional or system level differences. Best to avoid the term.
If authors insist on using the dot nomenclature, a justification and explanation would be required with clear definitions for both. Also, the name of the workflow may need to change as well. I leave it up to authors to make that call.
Similarly, I encourage authors to avoid using the term shrapnel for small unplaced contigs. Just use small unplaced contigs instead.
The findings section contains a lot of information that belongs in the methods section. For example, line numbers 109-117, 122-125, 135-137, 154-156, 160-164, 167-172, 187-192. Please revise the text so that the findings section doesn't have any methods description.
A definition of what an orthogroup and a fuzzy orthogroup are is required.
Result/findings section needs significant improvements. Authors have relegated much of the results to tables in supplementary information. I insist that authors summarise those results in a meaningful descriptive way and refer to supplementary information for extra details.
Lines 176-177 mentions about the manual curation of micro chromosomes. I would like to see the rules and decisions that were employed to join or break or reorder contigs/scaffolds into a chromosome model.
Authors have mentioned that 216 kb-4.3 Mb of additional content per assembly was added. This is incorrect as the sequence content was already present in the assembly. It is just reorganised into microchromosome scaffolds. Please correct the text to say that unplaced scaffolds are organised into putative microchromosomes.
Lines 108-199 mention errors in the original assembly. A description of the type of errors would be required.
Authors should discuss the property of eagles, falcons and parrots with rearranged/fused micro chromosomes. The proposed method may not be effective in such instances.
Authors suggest the use of a 5 Mbp cut-off. However, this cut-off may miss instances where a microchromosome is incorrectly placed with a macrochromosome. Authors discuss this as paralog or misalignment related issues. I suggest that authors provide a metric for the success/failure of identifying genes, similar to BUSCO. Authors can run the software on all available bird genomes to define the property of such a metric for each gene. The results section can explain the proportions of the 9400 found on macro vs micro, and the proportions of the 14k fuzzy genes on micro vs macro and their copy status. 9400 + 14514 doesn't add up to 16,589 orthogroups. Something is not clearly described about those numbers. Please improve the text to make meaningful assessments of conserved gene sets on microchromosomes for it to be useful for the research community.
Methods: Lines 233-234: what is taxon in this context? Please clarify. There is also a mention of taxa with missing data. What data were missing? Please clarify.
Lines 236-237: do authors mean that chromosomes identified by the submitter of primary assembly? Please clarify.
For each species, authors should refer to refseq version of the assembly for posterity as well. Common names of species may be useful too for broad readership.
Line 254: please modify the section header to remove assembly version as they are not useful
Methods describing the orthogroup clustering should include details about how alignments were filtered and processed. This is currently missing.
The significance of the phylogenetic analyses in the context of the manuscript is not very clear. Maybe remove that section. Perhaps authors can utilise the phylogenetic distance as a way to discuss how conserved gene sets are behaving between species based on distance.
Results section can include run time and compute resource usage metrics for others to estimate resource requirements for such analyses.
Updated assemblies can be submitted to NCBI. Authors should consider this.
This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag036), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:
Reviewer 1:
I am very happy to see that MicroFinder is going to be published! Last year I used it very often to curate bird assemblies. I found no major issues, only minor ones.
The only crucial (but still technical) issue is that your protein dataset is from dot microchromosomes, i.e. not from all microchromosomes. So I highly recommend using "dot microchromosomes" where relevant, including in the title of the manuscript.
Minor issues:
row 19 (Abstract background) change "major goal" to a softer statement. Generation of the assemblies is a very important task of biodiversity genomics but not a major one
row 54-55 Do you imply that a typical bird genome contains 37-41 chromosome pairs? There are a lot of birds with a lower number of chromosomes, so I am not sure that it is typical. Also, a reference to a publication from 1981 looks outdated
row 109 - why only eleven assemblies were selected?
row 111 - 112 Please, highlight how many orders/families were not covered
rows 129 - 137 These lines are in some contradiction with the rest of the text, including the abstract. Your dataset is focused on dot chromosomes and not on all microchromosomes. I suggest replacing "microchromosomes" with "dot microchromosomes" nearly everywhere, including the title
row 173 - 185 I am very skeptical about extending the results obtained on a single genome assembly to the whole family, especially considering that your dataset covers less than half of bird orders. My experience with MicroFinder is that it sometimes selects contigs/scaffolds belonging to macrochromosomes. However, these are few and usually short. Please soften the statements
row 429 Reference 13 is in French and doesn't have an English translation of the title
The frontispiece drawing from Burges’s original Castell Coch proposal, 1874 © Bute Archive at Mount Stuart
foundation drawing of castell coch
Good suggestion: the strategy needs added logic so that it cannot buy at limit-up and cannot sell at limit-down
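One way the suggested rule might look in a backtest is sketched below in Python. Everything here is an assumption made for illustration: the function names, the 10% daily price limit, the two-decimal rounding, and the simplification that an order cannot fill whenever the bar price sits at the limit. None of it comes from the annotated strategy.

```python
def limit_prices(prev_close, limit_pct=0.10):
    """Return (limit_down, limit_up) for the day, rounded to 2 decimals.

    The 10% band is an assumed default; real limits vary by market and board.
    """
    limit_down = round(prev_close * (1 - limit_pct), 2)
    limit_up = round(prev_close * (1 + limit_pct), 2)
    return limit_down, limit_up


def can_fill(side, price, prev_close, limit_pct=0.10):
    """Decide whether a simulated order may fill at `price`.

    Buys are rejected when the price is at (or above) limit-up, and sells
    are rejected when the price is at (or below) limit-down.
    """
    limit_down, limit_up = limit_prices(prev_close, limit_pct)
    if side == "buy":
        return price < limit_up
    if side == "sell":
        return price > limit_down
    raise ValueError(f"unknown side: {side!r}")
```

In a backtest loop, an order would be skipped (or carried to the next bar) whenever `can_fill` returns `False`, which keeps the simulated fills from being more optimistic than the market would allow.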
RRID:SCR_015872
DOI: 10.7554/eLife.108071
Resource: UCSF ChimeraX (RRID:SCR_015872)
Curator: @evieth
SciCrunch record: RRID:SCR_015872
RRID:SCR_017344
DOI: 10.7554/eLife.108071
Resource: Cell Ranger (RRID:SCR_017344)
Curator: @scibot
SciCrunch record: RRID:SCR_017344
RRID:SCR_016341
DOI: 10.7554/eLife.108071
Resource: Seurat (RRID:SCR_016341)
Curator: @scibot
SciCrunch record: RRID:SCR_016341
MMRRC (Stock# 005557
DOI: 10.3390/nu17243947
Resource: RRID:MMRRC_005557-UCD
Curator: @AleksanderDrozdz
SciCrunch record: RRID:MMRRC_005557-UCD
RRID:BDSC_24461
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_24461
Curator: @evieth
SciCrunch record: RRID:BDSC_24461
RRID:BDSC_29445
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_29445
Curator: @evieth
SciCrunch record: RRID:BDSC_29445
RRID:BDSC_23896
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_23896
Curator: @evieth
SciCrunch record: RRID:BDSC_23896
RRID:BDSC_9981
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_9981
Curator: @evieth
SciCrunch record: RRID:BDSC_9981
RRID:BDSC_32194
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_32194
Curator: @evieth
SciCrunch record: RRID:BDSC_32194
RRID:BDSC_9968
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_9968
Curator: @evieth
SciCrunch record: RRID:BDSC_9968
RRID:BDSC_42750
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_42750
Curator: @evieth
SciCrunch record: RRID:BDSC_42750
RRID:BDSC_91810
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_91810
Curator: @evieth
SciCrunch record: RRID:BDSC_91810
RRID:BDSC_41734
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_41734
Curator: @evieth
SciCrunch record: RRID:BDSC_41734
RRID:BDSC_23137
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_23137
Curator: @evieth
SciCrunch record: RRID:BDSC_23137
RRID:BDSC_76042
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_76042
Curator: @scibot
SciCrunch record: RRID:BDSC_76042
RRID:BDSC_80908
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_80908
Curator: @scibot
SciCrunch record: RRID:BDSC_80908
RRID:BDSC_9984
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_9984
Curator: @scibot
SciCrunch record: RRID:BDSC_9984
RRID:BDSC_23911
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_23911
Curator: @scibot
SciCrunch record: RRID:BDSC_23911
RRID:BDSC_9973
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_9973
Curator: @scibot
SciCrunch record: RRID:BDSC_9973
RRID:BDSC_23125
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_23125
Curator: @scibot
SciCrunch record: RRID:BDSC_23125
RRID:BDSC_9951
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_9951
Curator: @scibot
SciCrunch record: RRID:BDSC_9951
RRID:BDSC_23897
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_23897
Curator: @scibot
SciCrunch record: RRID:BDSC_23897
RRID:BDSC_42748
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_42748
Curator: @scibot
SciCrunch record: RRID:BDSC_42748
RRID:BDSC_26818
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_26818
Curator: @scibot
SciCrunch record: RRID:BDSC_26818
RRID:BDSC_3605
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_3605
Curator: @scibot
SciCrunch record: RRID:BDSC_3605
RRID:BDSC_24617
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_24617
Curator: @scibot
SciCrunch record: RRID:BDSC_24617
RRID:BDSC_41735
DOI: 10.1523/JNEUROSCI.2132-25.2026
Resource: RRID:BDSC_41735
Curator: @scibot
SciCrunch record: RRID:BDSC_41735
RRID:BDSC_80141
DOI: 10.1523/JNEUROSCI.1623-25.2026
Resource: RRID:BDSC_80141
Curator: @evieth
SciCrunch record: RRID:BDSC_80141
RRID:SCR_021391
DOI: 10.1523/JNEUROSCI.1623-25.2026
Resource: DeepLabCut (RRID:SCR_021391)
Curator: @scibot
SciCrunch record: RRID:SCR_021391
RRID:BDSC_79032
DOI: 10.1523/JNEUROSCI.1623-25.2026
Resource: RRID:BDSC_79032
Curator: @scibot
SciCrunch record: RRID:BDSC_79032
RRID:SCR_001622
DOI: 10.1523/JNEUROSCI.1623-25.2026
Resource: MATLAB (RRID:SCR_001622)
Curator: @scibot
SciCrunch record: RRID:SCR_001622
RRID:BDSC_27390
DOI: 10.1523/JNEUROSCI.1623-25.2026
Resource: RRID:BDSC_27390
Curator: @scibot
SciCrunch record: RRID:BDSC_27390
RRID:MMRRC_011015-UCD
DOI: 10.1523/ENEURO.0386-25.2026
Resource: (MMRRC Cat# 011015-UCD,RRID:MMRRC_011015-UCD)
Curator: @AleksanderDrozdz
SciCrunch record: RRID:MMRRC_011015-UCD
RRID: SCR_001905
DOI: 10.1371/journal.pone.0346239
Resource: R Project for Statistical Computing (RRID:SCR_001905)
Curator: @evieth
SciCrunch record: RRID:SCR_001905
RRID: SCR_002285
DOI: 10.1371/journal.pone.0346239
Resource: Fiji (RRID:SCR_002285)
Curator: @evieth
SciCrunch record: RRID:SCR_002285
RRID:AB_3717327
DOI: 10.1371/journal.pone.0346239
Resource: RRID:AB_3717327
Curator: @scibot
SciCrunch record: RRID:AB_3717327
RRID:SCR_002798
DOI: 10.1371/journal.pone.0346239
Resource: GraphPad Prism (RRID:SCR_002798)
Curator: @scibot
SciCrunch record: RRID:SCR_002798
Rax-Cre line is a mouse BAC transgenic, made by the GENSAT project and cryobanked at the Mutant Mouse Resource and Research Center (MMRRC)
DOI: 10.1167/iovs.67.4.33
Resource: (MMRRC Cat# 034748-UCD,RRID:MMRRC_034748-UCD)
Curator: @AleksanderDrozdz
SciCrunch record: RRID:MMRRC_034748-UCD
Addgene #12260
DOI: 10.1158/0008-5472.CAN-25-4701
Resource: RRID:Addgene_12260
Curator: @evieth
SciCrunch record: RRID:Addgene_12260
RRID: CVCL_2478
DOI: 10.1155/humu/3675889
Resource: (KCB Cat# KCB 2011103YJ, RRID:CVCL_2478)
Curator: @areedewitt04
SciCrunch record: RRID:CVCL_2478
RRID:CVCL_0021
DOI: 10.1155/humu
Resource: (RCB Cat# RCB0461, RRID:CVCL_0021)
Curator: @areedewitt04
SciCrunch record: RRID:CVCL_0021
IMSR_JAX:000671
DOI: 10.1111/jne.70182
Resource: (IMSR Cat# JAX_000671,RRID:IMSR_JAX:000671)
Curator: @areedewitt04
SciCrunch record: RRID:IMSR_JAX:000671
RRID:CVCL_0026
DOI: 10.1111/bph.70455
Resource: (KCLB Cat# 30080, RRID:CVCL_0026)
Curator: @areedewitt04
SciCrunch record: RRID:CVCL_0026
Tg25109: RRID:MMRRC_075940
DOI: 10.1093/nar/gkag287
Resource: RRID:MMRRC_075940-JAX
Curator: @AleksanderDrozdz
SciCrunch record: RRID:MMRRC_075940-JAX
C57BL/6J-Tgfb1em2Lutzy/Mmjax, JAX stock #:065809 (obtained from the Mutant Mouse Resource and Research Center [MMRRC]
DOI: 10.1084/jem.20240801
Resource: RRID:MMRRC_065809-JAX
Curator: @AleksanderDrozdz
SciCrunch record: RRID:MMRRC_065809-JAX
MMRRC
DOI: 10.1038/s42003-025-09221-2
Resource: Mutant Mouse Regional Resource Center (RRID:SCR_002953)
Curator: @AleksanderDrozdz
SciCrunch record: RRID:SCR_002953
RRID: RRRC_00168
DOI: 10.1038/s41598-026-41666-1
Resource: RRID:RRRC_00168
Curator: @evieth
SciCrunch record: RRID:RRRC_00168
MMRRC #0654240UCD
DOI: 10.1038/s41586-026-10313-0
Resource: (MMRRC Cat# 065424-UCD,RRID:MMRRC_065424-UCD)
Curator: @AleksanderDrozdz
SciCrunch record: RRID:MMRRC_065424-UCD
RRID: CVCL_0023
DOI: 10.1016/j.jare.2026.04.055
Resource: (CCLV Cat# CCLV-RIE 1035, RRID:CVCL_0023)
Curator: @areedewitt04
SciCrunch record: RRID:CVCL_0023
MMRRC 000259-UNC
DOI: 10.1016/j.devcel.2024.12.014
Resource: (MMRRC Cat# 000259-UNC,RRID:MMRRC_000259-UNC)
Curator: @AleksanderDrozdz
SciCrunch record: RRID:MMRRC_000259-UNC
CCL-2
DOI: 10.1016/j.crmeth.2026.101345
Resource: (BCRC Cat# 60005, RRID:CVCL_0030)
Curator: @evieth
SciCrunch record: RRID:CVCL_0030
CRL-1580
DOI: 10.1016/j.crmeth.2026.101345
Resource: (CLS Cat# 400118/p8830_P3X63Ag8653, RRID:CVCL_4032)
Curator: @evieth
SciCrunch record: RRID:CVCL_4032
TIB-202
DOI: 10.1016/j.crmeth.2026.101345
Resource: (RRID:CVCL_0006)
Curator: @evieth
SciCrunch record: RRID:CVCL_0006
Pou2af1tm1a(KOMP)Wtsi
DOI: 10.1016/j.celrep.2025.116470
Resource: Mutant Mouse Regional Resource Center (RRID:SCR_002953)
Curator: @AleksanderDrozdz
SciCrunch record: RRID:SCR_002953
LS174T
DOI: 10.1016/j.biopha.2026.119399
Resource: (KCLB Cat# 10188, RRID:CVCL_1384)
Curator: @areedewitt04
SciCrunch record: RRID:CVCL_1384
HCT116
DOI: 10.1016/j.biopha.2026.119399
Resource: (RRID:CVCL_0291)
Curator: @areedewitt04
SciCrunch record: RRID:CVCL_0291
RRID: CVCL_0062
DOI: 10.1002/cbdv.71229
Resource: (RRID:CVCL_0062)
Curator: @areedewitt04
SciCrunch record: RRID:CVCL_0062
RRID: CVCL_0027
DOI: 10.1002/cbdv.71229
Resource: (KCLB Cat# 88065, RRID:CVCL_0027)
Curator: @areedewitt04
SciCrunch record: RRID:CVCL_0027
RRID: CVCL_0023
DOI: 10.1002/cbdv.71229
Resource: (CCLV Cat# CCLV-RIE 1035, RRID:CVCL_0023)
Curator: @areedewitt04
SciCrunch record: RRID:CVCL_0023
MMRRC Stock No. 034829‐JAX
DOI: 10.1002/alz.71423
Resource: (MMRRC Cat# 034829-JAX,RRID:MMRRC_034829-JAX)
Curator: @AleksanderDrozdz
SciCrunch record: RRID:MMRRC_034829-JAX
RRID:CVCL_1629
DOI: 10.1002/1878-0261.70235
Resource: (NCI-DTP Cat# OVCAR-8, RRID:CVCL_1629)
Curator: @areedewitt04
SciCrunch record: RRID:CVCL_1629
RRID:MMRRC_034840-JAX
DOI: 10.5607/en25047
Resource: (MMRRC Cat# 034840-JAX,RRID:MMRRC_034840-JAX)
Curator: @scibot
SciCrunch record: RRID:MMRRC_034840-JAX
RRID:SCR_002798
DOI: 10.3892/or.2026.9114
Resource: GraphPad Prism (RRID:SCR_002798)
Curator: @scibot
SciCrunch record: RRID:SCR_002798
RRID:SCR_015899
DOI: 10.3892/or.2026.9114
Resource: rna-star (RRID:SCR_004463)
Curator: @scibot
SciCrunch record: RRID:SCR_004463
RRID:SCR_002472
DOI: 10.3892/or.2026.9114
Resource: WoLF PSORT (RRID:SCR_002472)
Curator: @scibot
SciCrunch record: RRID:SCR_002472
RRID:SCR_014555
DOI: 10.3892/or.2026.9114
Resource: cBioPortal (RRID:SCR_014555)
Curator: @scibot
SciCrunch record: RRID:SCR_014555
RRID:SCR_017344
DOI: 10.3892/or.2026.9114
Resource: Cell Ranger (RRID:SCR_017344)
Curator: @scibot
SciCrunch record: RRID:SCR_017344
RRID:SCR_016620
DOI: 10.3892/or.2026.9114
Resource: Metascape (RRID:SCR_016620)
Curator: @scibot
SciCrunch record: RRID:SCR_016620
RRID:SCR_011848
DOI: 10.3892/or.2026.9114
Resource: Trimmomatic (RRID:SCR_011848)
Curator: @scibot
SciCrunch record: RRID:SCR_011848
RRID:SCR_014597
DOI: 10.3892/or.2026.9114
Resource: Cufflinks (RRID:SCR_014597)
Curator: @scibot
SciCrunch record: RRID:SCR_014597
RRID:SCR_003070
DOI: 10.3892/or.2026.9114
Resource: ImageJ (RRID:SCR_003070)
Curator: @scibot
SciCrunch record: RRID:SCR_003070
RRID:SCR_014345
DOI: 10.3892/or.2026.9114
Resource: Scaffold Proteome Software (RRID:SCR_014345)
Curator: @scibot
SciCrunch record: RRID:SCR_014345
RRID:SCR_013035
DOI: 10.3892/or.2026.9114
Resource: TopHat (RRID:SCR_013035)
Curator: @scibot
SciCrunch record: RRID:SCR_013035
RRID:AB_2336817
DOI: 10.3892/or.2026.9114
Resource: (Vector Laboratories Cat# PK-4010, RRID:AB_2336817)
Curator: @scibot
SciCrunch record: RRID:AB_2336817
RRID:AB_2798819
DOI: 10.3892/or.2026.9114
Resource: (Cell Signaling Technology Cat# 19495, RRID:AB_2798819)
Curator: @scibot
SciCrunch record: RRID:AB_2798819
RRID:AB_880418
DOI: 10.3892/or.2026.9114
Resource: (Abcam Cat# ab51608, RRID:AB_880418)
Curator: @scibot
SciCrunch record: RRID:AB_880418
RRID:AB_10694544
DOI: 10.3892/or.2026.9114
Resource: (Cell Signaling Technology Cat# 9292, RRID:AB_331419)
Curator: @scibot
SciCrunch record: RRID:AB_331419
RRID:AB_2734735
DOI: 10.3892/or.2026.9114
Resource: (Cell Signaling Technology Cat# 19245, RRID:AB_2734735)
Curator: @scibot
SciCrunch record: RRID:AB_2734735
RRID:AB_11058910
DOI: 10.3892/or.2026.9114
Resource: RRID:AB_11058910
Curator: @scibot
SciCrunch record: RRID:AB_11058910
RRID:CVCL_0504
DOI: 10.3892/or.2026.9114
Resource: (ATCC Cat# CRL-2577, RRID:CVCL_0504)
Curator: @scibot
SciCrunch record: RRID:CVCL_0504
RRID:AB_2098834
DOI: 10.3892/or.2026.9114
Resource: RRID:AB_2098834
Curator: @scibot
SciCrunch record: RRID:AB_2098834
RRID:AB_2174535
DOI: 10.3892/or.2026.9114
Resource: RRID:AB_2174535
Curator: @scibot
SciCrunch record: RRID:AB_2174535
RRID:SCR_008452
DOI: 10.3892/or.2026.9114
Resource: Thermo Fisher Scientific (RRID:SCR_008452)
Curator: @scibot
SciCrunch record: RRID:SCR_008452
RRID:SCR_001905
DOI: 10.3892/or.2026.9114
Resource: R Project for Statistical Computing (RRID:SCR_001905)
Curator: @scibot
SciCrunch record: RRID:SCR_001905
RRID:CVCL_0320
DOI: 10.3892/or.2026.9114
Resource: (RRID:CVCL_0320)
Curator: @scibot
SciCrunch record: RRID:CVCL_0320
RRID:SCR_006431
DOI: 10.1515/biol-2025-1309
Resource: Parkinson's Progression Markers Initiative (RRID:SCR_006431)
Curator: @scibot
SciCrunch record: RRID:SCR_006431
Addgene_73178
DOI: 10.1371/journal.ppat.1014056
Resource: RRID:Addgene_73178
Curator: @scibot
SciCrunch record: RRID:Addgene_73178
RRID:Addgene_59702
DOI: 10.1371/journal.ppat.1014056
Resource: RRID:Addgene_59702
Curator: @scibot
SciCrunch record: RRID:Addgene_59702
RRID:Addgene_143950
DOI: 10.1371/journal.ppat.1014056
Resource: RRID:Addgene_143950
Curator: @scibot
SciCrunch record: RRID:Addgene_143950
RRID:Addgene_12260
DOI: 10.1371/journal.ppat.1014056
Resource: RRID:Addgene_12260
Curator: @scibot
SciCrunch record: RRID:Addgene_12260
RRID:Addgene_52962
DOI: 10.1371/journal.ppat.1014056
Resource: RRID:Addgene_52962
Curator: @scibot
SciCrunch record: RRID:Addgene_52962
RRID:Addgene_8454
DOI: 10.1371/journal.ppat.1014056
Resource: RRID:Addgene_8454
Curator: @scibot
SciCrunch record: RRID:Addgene_8454
RRID:Addgene_96917
DOI: 10.1371/journal.ppat.1014056
Resource: RRID:Addgene_96917
Curator: @scibot
SciCrunch record: RRID:Addgene_96917
RRID:Addgene_48138
DOI: 10.1371/journal.ppat.1014056
Resource: RRID:Addgene_48138
Curator: @scibot
SciCrunch record: RRID:Addgene_48138
RRID:Addgene_145026
DOI: 10.1371/journal.ppat.1014056
Resource: RRID:Addgene_145026
Curator: @scibot
SciCrunch record: RRID:Addgene_145026
RRID:CVCL_0123
DOI: 10.1371/journal.pone.0346976
Resource: (ATCC Cat# CL-173, RRID:CVCL_0123)
Curator: @scibot
SciCrunch record: RRID:CVCL_0123
RRID:CVCL_4383
DOI: 10.1371/journal.pone.0346976
Resource: (IZSLER Cat# BS CL 132, RRID:CVCL_4383)
Curator: @scibot
SciCrunch record: RRID:CVCL_4383
RRID:AB_3678889
DOI: 10.1186/s40478-026-02240-y
Resource: RRID:AB_3678889
Curator: @scibot
SciCrunch record: RRID:AB_3678889
RRID:SCR_022632
DOI: 10.1186/s12883-026-04908-3
Resource: Cincinnati Children's Hospital Discover Together Biobank Core Facility (RRID:SCR_022632)
Curator: @scibot
SciCrunch record: RRID:SCR_022632
RRID:AB_1107769
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (Bio X Cell Cat# BE0089, RRID:AB_1107769)
Curator: @scibot
SciCrunch record: RRID:AB_1107769
RRID:AB_10949609
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (Bio X Cell Cat# BE0164, RRID:AB_10949609)
Curator: @scibot
SciCrunch record: RRID:AB_10949609
RRID:AB_10949464
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (Bio X Cell Cat# BE0115, RRID:AB_10949464)
Curator: @scibot
SciCrunch record: RRID:AB_10949464
RRID:AB_1107791
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (Bio X Cell Cat# BE0086, RRID:AB_1107791)
Curator: @scibot
SciCrunch record: RRID:AB_1107791
RRID:CVCL_7256
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (ATCC Cat# CRL-2638, RRID:CVCL_7256)
Curator: @scibot
SciCrunch record: RRID:CVCL_7256
RRID:AB_2687845
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BD Biosciences Cat# 564217, RRID:AB_2687845)
Curator: @scibot
SciCrunch record: RRID:AB_2687845
RRID:SCR_008520
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: FlowJo (RRID:SCR_008520)
Curator: @scibot
SciCrunch record: RRID:SCR_008520
RRID:AB_2810399
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BioLegend Cat# 134017, RRID:AB_2810399)
Curator: @scibot
SciCrunch record: RRID:AB_2810399
RRID:AB_2750277
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BioLegend Cat# 406711 (also 406712), RRID:AB_2750277)
Curator: @scibot
SciCrunch record: RRID:AB_2750277
RRID:AB_572016
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BioLegend Cat# 109109, RRID:AB_572016)
Curator: @scibot
SciCrunch record: RRID:AB_572016
RRID:AB_313254
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BioLegend Cat# 106305, RRID:AB_313254)
Curator: @scibot
SciCrunch record: RRID:AB_313254
RRID:SCR_013726
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: G*Power (RRID:SCR_013726)
Curator: @scibot
SciCrunch record: RRID:SCR_013726
RRID:SCR_002798
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: GraphPad Prism (RRID:SCR_002798)
Curator: @scibot
SciCrunch record: RRID:SCR_002798
RRID:AB_2294995
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BioLegend Cat# 515405, RRID:AB_2294995)
Curator: @scibot
SciCrunch record: RRID:AB_2294995
RRID:AB_2566162
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BioLegend Cat# 104447, RRID:AB_2566162)
Curator: @scibot
SciCrunch record: RRID:AB_2566162
RRID:AB_2564214
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BioLegend Cat# 103057, RRID:AB_2564214)
Curator: @scibot
SciCrunch record: RRID:AB_2564214
RRID:AB_394657
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BD Biosciences Cat# 553142, RRID:AB_394657)
Curator: @scibot
SciCrunch record: RRID:AB_394657
RRID:AB_2563684
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BioLegend Cat# 100565, RRID:AB_2563684)
Curator: @scibot
SciCrunch record: RRID:AB_2563684
RRID:AB_830744
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BioLegend Cat# 102025, RRID:AB_830744)
Curator: @scibot
SciCrunch record: RRID:AB_830744
RRID:AB_2651770
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (Miltenyi Biotec Cat# 130-111-603, RRID:AB_2651770)
Curator: @scibot
SciCrunch record: RRID:AB_2651770
RRID:AB_2632693
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BioLegend Cat# 652425, RRID:AB_2632693)
Curator: @scibot
SciCrunch record: RRID:AB_2632693
RRID:AB_10679575
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: RRID:AB_10679575
Curator: @scibot
SciCrunch record: RRID:AB_10679575
RRID:AB_394598
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BD Biosciences Cat# 553065, RRID:AB_394598)
Curator: @scibot
SciCrunch record: RRID:AB_394598
RRID:AB_3713452
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: RRID:AB_3713452
Curator: @scibot
SciCrunch record: RRID:AB_3713452
RRID:AB_2737976
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BD Biosciences Cat# 563053, RRID:AB_2737976)
Curator: @scibot
SciCrunch record: RRID:AB_2737976
RRID:CVCL_B288
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (RRID:CVCL_B288)
Curator: @scibot
SciCrunch record: RRID:CVCL_B288
RRID:AB_2565884
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BioLegend Cat# 103151, RRID:AB_2565884)
Curator: @scibot
SciCrunch record: RRID:AB_2565884
RRID:AB_2737972
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BD Biosciences Cat# 563046, RRID:AB_2737972)
Curator: @scibot
SciCrunch record: RRID:AB_2737972
RRID:AB_2561389
DOI: 10.1158/2326-6066.CIR-25-0256
Resource: (BioLegend Cat# 100751, RRID:AB_2561389)
Curator: @scibot
SciCrunch record: RRID:AB_2561389
RRID:SCR_026692
DOI: 10.1158/1541-7786.MCR-25-0970
Resource: Mutect2 (RRID:SCR_026692)
Curator: @scibot
SciCrunch record: RRID:SCR_026692
RRID:SCR_002798
DOI: 10.1158/1541-7786.MCR-25-0970
Resource: GraphPad Prism (RRID:SCR_002798)
Curator: @scibot
SciCrunch record: RRID:SCR_002798
RRID:SCR_020240
DOI: 10.1158/1541-7786.MCR-25-0970
Resource: Life Technologies QuantStudio 5 Real Time PCR System (RRID:SCR_020240)
Curator: @scibot
SciCrunch record: RRID:SCR_020240
RRID:SCR_001876
DOI: 10.1158/1541-7786.MCR-25-0970
Resource: GATK (RRID:SCR_001876)
Curator: @scibot
SciCrunch record: RRID:SCR_001876
RRID:SCR_014782
DOI: 10.1158/1541-7786.MCR-25-0970
Resource: OncoKB (RRID:SCR_014782)
Curator: @scibot
SciCrunch record: RRID:SCR_014782
RRID:SCR_026402
DOI: 10.1158/1541-7786.MCR-25-0970
Resource: RRID:SCR_026402
Curator: @scibot
SciCrunch record: RRID:SCR_026402
RRID:SCR_015618
DOI: 10.1158/1541-7786.MCR-25-0970
Resource: RRID:SCR_015618
Curator: @scibot
SciCrunch record: RRID:SCR_015618
RRID:AB_10002714
DOI: 10.1158/1541-7786.MCR-25-0970
Resource: (Novus Cat# NB100-904, RRID:AB_10002714)
Curator: @scibot
SciCrunch record: RRID:AB_10002714
RRID:SCR_025111
DOI: 10.1158/1541-7786.MCR-25-0970
Resource: Leica Aperio CS2 scanner (RRID:SCR_025111)
Curator: @scibot
SciCrunch record: RRID:SCR_025111
RRID:AB_10562712
DOI: 10.1158/1541-7786.MCR-25-0970
Resource: (Thermo Fisher Scientific Cat# A10524, RRID:AB_10562712)
Curator: @scibot
SciCrunch record: RRID:AB_10562712
RRID:AB_2755003
DOI: 10.1158/1541-7786.MCR-25-0970
Resource: (Millipore Cat# 05-636-I, RRID:AB_2755003)
Curator: @scibot
SciCrunch record: RRID:AB_2755003
RRID:AB_10979488
DOI: 10.1158/1541-7786.MCR-25-0970
Resource: (Thermo Fisher Scientific Cat# MA5-14520, RRID:AB_10979488)
Curator: @scibot
SciCrunch record: RRID:AB_10979488
RRID:AB_143165
DOI: 10.1158/1541-7786.MCR-25-0970
Resource: (Thermo Fisher Scientific Cat# A-11008, RRID:AB_143165)
Curator: @scibot
SciCrunch record: RRID:AB_143165