  1. Last 7 days
    1. Reviewer #1 (Public review):

      Summary:

      Forbes et al. developed an integrated approach to identify cis-regulatory elements (CREs) in the large (3.6 Gbp) genome of the crustacean Parhyale hawaiensis, addressing the challenge of pinpointing these regions among large regions of non-coding sequences. They combined ATAC-seq chromatin accessibility profiling (both bulk and single-nucleus) across embryonic and adult tissues with low-coverage genome sequencing of three congeneric species (P. aquilina, P. darvishi, P. plumicornis). Without assembling congener genomes, they mapped reads with low stringency to the P. hawaiensis reference, identifying about 55k conserved islands that overlap ATAC peaks more than expected by chance. This dual filter was used to select CRE candidates for transgenic reporter validation, yielding 6 functional elements (out of 11 tested) driving ubiquitous, neuronal, or muscle-specific expression, a major advance for non-model systems with large genomes.

      Strengths:

      Forbes et al. generated high-quality ATAC data across multiple scales. Using bulk ATAC-seq (from whole embryos, developing and adult legs), they identified tens of thousands of open chromatin peaks across the assembled P. hawaiensis large genome. Moreover, using single-nucleus ATAC-seq from adult legs, they could resolve differentially accessible chromatin profiles across over 15 cell types previously identified by scRNA-seq, enabling cell-type-specific candidate selection.

      Furthermore, their innovative low-coverage comparative genomics method mapped 0.46-6.4% of congener reads to P. hawaiensis without genome assembly, revealing hundreds of thousands of conserved non-coding islands, including about 55k showing conservation in all four species, far exceeding random expectation.
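
      The claim that conserved islands overlap ATAC peaks more often than expected by chance can be illustrated with a simple permutation test. The sketch below is not the authors' pipeline: the function names, the single-chromosome coordinate system, the toy intervals, and the uniform shuffling scheme are all assumptions made for illustration, and the peaks are assumed not to overlap one another.

```python
import random
from bisect import bisect_right

def count_overlapping(islands, peaks):
    """Count islands that overlap at least one peak.

    Intervals are half-open (start, end) tuples on one chromosome.
    Assumes the peaks do not overlap each other.
    """
    peaks = sorted(peaks)
    starts = [p[0] for p in peaks]
    hits = 0
    for s, e in islands:
        # Last peak that starts before the island ends.
        i = bisect_right(starts, e - 1) - 1
        if i >= 0 and peaks[i][1] > s:
            hits += 1
    return hits

def permutation_test(islands, peaks, genome_len, n_perm=1000, seed=0):
    """Empirical p-value for the observed island/peak overlap.

    Each permutation places every island at a uniformly random
    position, preserving its length (hypothetical null model).
    """
    rng = random.Random(seed)
    observed = count_overlapping(islands, peaks)
    lengths = [e - s for s, e in islands]
    at_least_as_extreme = 0
    for _ in range(n_perm):
        shuffled = []
        for ln in lengths:
            start = rng.randrange(0, genome_len - ln + 1)
            shuffled.append((start, start + ln))
        if count_overlapping(shuffled, peaks) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the p-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_perm + 1)
    return observed, p_value

# Toy data (invented coordinates, not taken from the manuscript).
toy_peaks = [(100, 200), (1000, 1100), (5000, 5200)]
toy_islands = [(150, 180), (1050, 1060), (5100, 5150), (9000, 9050)]
observed, p_value = permutation_test(toy_islands, toy_peaks, genome_len=100_000)
```

      A real analysis would shuffle within chromosomes or scaffolds and would probably exclude coding sequence and unmappable regions from the null; this sketch only shows the logic of comparing an observed overlap count with a shuffled background.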

      Using the developed approach, the authors could validate 6 (out of 11 candidates) reporter constructs, driving robust ubiquitous and tissue-specific expression, succeeding where prior promoter-only screening failed and providing immediately useful genetic tools for the Parhyale community.

      Weaknesses:

      The primary limitation is that functional CRE testing was performed only in P. hawaiensis. While conservation maps are valuable resources, the manuscript lacks functional validation in congener species, limiting claims about broad applicability across related genomes/species.

      The approach also failed to validate developmental CREs. None of the candidates from combined ATAC and conservation filtering drove reporter expression matching endogenous patterns. The authors appropriately hypothesize technical limits (low expression) or biological factors (long-range enhancers, shadow enhancers).

      Overall Assessment:

      Forbes et al. fully succeed with their integrated approach, which (1) generates an ATAC-seq atlas that supports functional CRE discovery and (2) introduces low-coverage sequencing for conservation mapping in the large 3.6 Gbp genome of Parhyale hawaiensis. Their combination of ATAC-seq chromatin accessibility profiling (bulk and single-nucleus) across embryonic and adult tissues with low-coverage genome sequencing of three congeneric species (P. aquilina, P. darvishi, P. plumicornis), without congener genome assembly, drastically shrank the CRE search space. Using this approach, the authors could validate 6 out of 11 candidate transgenic reporters (ubiquitous, neuronal, and muscle-specific), where prior promoter-only screening failed.

      The low-coverage mapping innovation cuts cost and labour while snATAC-seq provides cell-type resolution, making these resources valuable for building new genetic and imaging tools in Parhyale.

      This compelling method also has the potential to enable labs with limited resources to identify and characterize regulatory elements in more non-model organisms, advancing our understanding of their evolution while establishing a scalable pipeline for large-genome systems.

    2. Reviewer #2 (Public review):

      The manuscript by Forbes, Skafida, Karapidaki et al. concerns the in silico identification of cis-regulatory elements (CREs) in large genomes using chromatin accessibility (ATAC-seq) and sequence conservation (genomic DNA sequencing) data. They exemplify this method by applying it to identify novel CREs in Parhyale hawaiensis, which they validated using reporter constructs.

      The results are convincing and are well supported by the data and validations. Identified CREs are valuable for researchers interested in the regulation of the expression of genes they control.

      The methodology on the whole is also valid, as suggested by the results and previous publications on various taxa. Sequence conservation, as stated by the authors, was long used as a method to identify regions of non-coding DNA with functional and evolutionary constraints. The same applies to ATAC-seq data, which has also been used as a proxy for functional regions in different animals such as sea urchins and amphioxus. The methodology proposed is likely to be successfully used by researchers working on a variety of experimental organisms.

      The authors do not use existing genome assemblies and use short-read sequencing to identify conserved regions, and while it is not conceptually novel, such an approach is becoming more and more viable and useful considering the recent advances in next-generation sequencing technology and the decrease in price of short-read sequencing.

      Two major weaknesses are:

      (1) The novelty of the approach and its advantages should be more explicitly stated.

      (2) The authors do not discuss in depth the strength of using a combination of two methods rather than either of the two, especially considering that previously known CREs do not overlap with conserved sequences.

    3. Reviewer #3 (Public review):

      Summary:

      Forbes et al. present a new approach for identifying cis-regulatory elements in large genomes. Using Parhyale hawaiensis, a crustacean with a large genome (~3.6 Gb, comparable in size to the human genome), the authors show that current methods for identifying cis-regulatory elements, effective in smaller genomes, are markedly inefficient in organisms with large genomes. To address this limitation, they combine bulk ATAC-seq and single-cell (sc) ATAC-seq to identify chromatin regions that are either ubiquitously accessible or specifically accessible in particular cell types. They further integrate comparative genomics across multiple Parhyale species (P. hawaiensis, P. aquilina, and P. darvishi), selected at appropriate phylogenetic distances (20-95 million years divergence), to pinpoint conserved open chromatin regions likely under functional constraint.

      Using this strategy, the authors predict a set of ubiquitous and cell-type-specific cis-regulatory elements. Importantly, they validate these predictions using rigorous transgenic reporter assays, convincingly demonstrating that their approach can successfully identify functional regulatory elements where previous methods had failed.

      Strengths:

      The approach introduced by Forbes et al. is conceptually straightforward, efficient, and readily transferable to other organisms. The validation experiments show not only that a substantial proportion of the predicted elements are functional, but also that the method is capable of identifying both ubiquitous and cell-type-specific regulatory elements. Given that the identification of regulatory regions remains a major bottleneck in understanding the molecular mechanisms underlying processes of development and regeneration, this work has the potential to make a significant impact in developmental and regeneration biology, particularly for studies involving non-model organisms with large genomes.

      An additional strength is the demonstration that only the genome of the focal species requires high-quality sequencing and assembly. In contrast, species used solely for comparative analysis can be sequenced at low coverage without assembly, substantially reducing costs and increasing the accessibility of the approach.

      Weaknesses:

      While the method is effective in identifying regulatory elements that are active ubiquitously or in differentiated cell types, it failed to detect elements associated with developmentally regulated genes. This may be due to trivial reasons, such as a very low level of expression of the selected genes. However, as acknowledged by the authors, it may also indicate inherent challenges in identifying regulatory elements associated with developmentally dynamic gene regulation, compared to those associated with genes expressed in differentiated cell types.

      A second limitation, also acknowledged by the authors, is the absence of chromatin conformation capture data, which would help link distal regulatory elements to their target genes. This limitation may be particularly relevant for developmentally regulated genes, where long-range regulatory interactions may be critical.

      Addressing these limitations will be an important direction for future work. Nonetheless, the approach as presented in this manuscript represents a key contribution that sets the stage for further methodological advances in the identification of cis-regulatory elements in large genomes.

    1. Joint Public Review:

      Summary:

      This study uses state-of-the-art imaging approaches to show that membrane contact site (MCS) markers and the ER-resident tyrosine phosphatase PTP1B accumulate on phagocytic membranes within actin-devoid zones during frustrated phagocytosis in RAW264.7 macrophages. The authors convincingly show that PTP1B interacts with Syk, an Fcγ receptor-associated tyrosine kinase that plays a critical role in phagocytosis, and that ablation of PTP1B results in hyperphosphorylation of Syk and increased superoxide production, without impacting phagocytic efficiency. Using a phosphoproteomic approach, the authors identify the adaptor protein Shc1 as a strongly phosphorylated protein during stimulation of immunoglobulin receptors by aggregated IgG. In the absence of PTP1B, the authors demonstrate an increased interaction between Shc1 and the NADPH oxidase NOX2 subunit p47phox, suggesting that PTP1B controls superoxide production by inhibiting a Syk-Shc1-NOX2 axis.

      Strengths:

      This is a well-reasoned and cogently developed study that uses contemporary methods, including high-quality TIRF microscopy combined with MAPPER (Membrane-Attached Peripheral ER) or SPLICS (split-GFP-based contact site sensors), to describe how membrane contact site markers and the ER-resident tyrosine phosphatase PTP1B accumulate in the phagocytic cup as cortical actin depolymerizes. The genetic data also convincingly show that PTP1B ablation increases Syk and Shc1 phosphorylation, enhances the Shc1/p47phox interaction, and elevates superoxide production, whereas depletion of Shc1 reduces superoxide levels. Overall, the work outlines an interesting interplay between membrane contact sites, signaling, and the phagocytic machinery of broad interest.

      Weaknesses:

      While the authors indicate that the PTP1B phosphatase downregulates superoxide production via the Syk-Shc1-NOX2 axis and present a summary model depicting the proposed sequence of events, the supporting data are currently mostly circumstantial. For example, although it is clear that PTP1B depletion increases superoxide production as well as Syk and Shc1 phosphorylation in vivo, there are no data directly demonstrating that the effects of PTP1B depletion on superoxide production require enhanced Syk or Shc1 phosphorylation. Likewise, although PTP1B depletion increases the interaction between Shc1 and p47phox, a soluble component of NOX2, there is no compelling demonstration that superoxide production in PTP1B-depleted cells truly depends on the NOX2 complex or on the Shc1/p47phox interaction.

      In addition, while the authors elegantly demonstrate the formation of ER-PM contact sites during frustrated phagocytosis within the actin clearance zone, as well as the localization of the PTP1B phosphatase in the same region, it remains unclear whether the presence of the phosphatase at membrane contact sites is required for its regulatory effect on superoxide production.

      Finally, it would be interesting to investigate these phenomena in other macrophage cell lines and perhaps also in more physiological contexts than frustrated phagocytosis. This would help evaluate the broader generalizability of the results and conclusions.

    1. Reviewer #1 (Public review):

      Summary:

      The authors use a gambling task with momentary mood ratings from Rutledge et al. and compare computational models of choice and mood to identify markers of decisional and affective impairments underlying risk-prone behavior in adolescents with suicidal thoughts and behaviors (STB). The results show that adolescents with STB show enhanced gambling behavior (choosing the gamble rather than the sure amount), and this is driven by a bias towards the largest possible win rather than insensitivity to possible losses. Moreover, this group shows a diminished effect of receiving a certain reward (in the non-gambling trials) on mood. The results were replicated in a general online sample where participants were divided into groups with or without STB based on their self-report of suicidal ideation on one question in the Beck Depression Inventory self-report instrument. The authors suggest, therefore, that adolescents diagnosed with depression or anxiety with decreased sensitivity to certain rewards may need to be monitored more closely for STB due to their increased propensity to take risky decisions aimed at (expected) gains (such as relief from an unbearable situation through suicide) regardless of the potential losses. However, such a result was only found in the clinical sample and cannot be generalized more broadly based on the current findings.

      Strengths:

      ● The study uses a previously validated task design and replicates previously found results through well-explained model-free and model-based analyses.

      ● Sampling of adolescents at high risk can help target early preventative diagnoses and treatments for suicide.

      ● Replication of the results in an online cohort increases confidence in the findings.

      ● The models considered for comparison are thorough and well-motivated. The chosen models allow for teasing apart which decision and mood sensitivity parameters relate to risky decision-making across groups based on their hypotheses.

      ● Novel finding of mood (in)sensitivity to non-risky rewards and its relationship with risk behavior in STB.

      Weaknesses:

      ● Sample size of 25 for S- group is low-powered, which is explicitly mentioned as a study limitation.

      ● Modeling in the mediation analysis focused on predicting risk behavior in this task from the model-derived bias for gains and suicidal symptom scores. Thus, the implications of this work are more relevant to a basic-science understanding of the etiology of suicidal behavior than they are useful as a predictor of suicidal behavior, and it is not clear that a psychiatrist or psychologist could use this task to determine who is at higher risk of attempting suicide and must be more closely monitored. Indeed, the relationships between task parameters, task behavior, and suicidal behavior were limited to the clinical sample with a diagnosis of depression or anxiety disorder, and did not extend to the online sample. Therefore, the claim that these findings provide "computational markers for general suicidal tendency among adolescents" is unwarranted.

    2. Reviewer #2 (Public review):

      Summary:

      This article addresses a very pertinent question: what are the computational mechanisms underlying risky behaviour in patients who have attempted suicide? In particular, it is impressive how the authors find a broad behavioral effect whose mechanisms they can then explain and refine through computational modeling. This work is important because currently, beyond previous suicide attempts, there has been a lack of predictive measures. This study is the first step towards that: understanding the cognition on a group level, before then being able to include it in future predictive studies (based on the cross-sectional data, this study by itself cannot assess the predictive validity of the measure).

      Strengths:

      - Large sample size

      - Replication of their own findings

      - Well-controlled task with measures of behaviour and mood + precise and well-validated computational modeling

      Questions, based on revised manuscript and replies to other reviewers:

      (1) Replies to reviewers in general: Bayes Factors have been added, it would be good to also use common verbal terms to describe them (e.g. 'anecdotal', 'moderate' etc). For example, my reading of table S8 would be that for gambling rate there is only anecdotal evidence that it does not relate to PSWQ, BDI, and moderate evidence it does not relate to TAI.

      (2) Reply to reviewer 1 Q2 (Predicting STB):

      For the regression predicting suicidal ideation, it seems to me that what you did was a regression STB ~ gambling behaviour + approach + mood? Could you clarify? As a test of whether the task can predict STB risk, I had expected something slightly different: a cross-validation (LOO or maybe 5-fold in the large sample) of STB ~ gambling behaviour + approach [parameter from model] + mood [parameter from model], then computing the predicted STB in the left-out participants, and then checking the correlation between STB and predicted STB. This would allow testing whether the diverse task measures together predict STB (with the caveat that it is cross-validated rather than tested on a hold-out sample, unless you could train on one sample (in lab) and test on the other (online)).

      (3) Reply to reviewer 2 Q1 (parameter recovery): I'm looking at S3, it seems to still show only the scatter plots and not the correlation matrices, which are now added as text notes. Can you actually show these matrices? An off-diagonal correlation of 0.63 appears quite high. I think it needs to be discussed exactly which parameters those are, and whether that impacts the interpretation of the results.

      (4) Reply to reviewer 3 Q3 (mood model): I would have imagined that the response would involve changing the mood equations (equation 8 main text) to include a term for whether the participant gambled or not, independent of the gamble value.
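
      The cross-validation requested in point (2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis the authors ran: the data are synthetic, the three predictor columns merely stand in for gambling rate, the approach parameter, and the mood parameter, the helper name cv_predict is hypothetical, and ordinary least squares is used for simplicity even though a binary STB outcome would more naturally call for logistic regression.

```python
import numpy as np

def cv_predict(X, y, k=5, seed=0):
    """Return out-of-fold predictions from an ordinary least-squares fit.

    Every participant is predicted by a model that never saw
    that participant during fitting.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    idx = rng.permutation(n)
    folds = np.array_split(idx, k)
    design = np.column_stack([np.ones(n), X])  # prepend an intercept column
    y_pred = np.empty(n, dtype=float)
    for test in folds:
        train = np.setdiff1d(idx, test)
        beta, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
        y_pred[test] = design[test] @ beta
    return y_pred

# Synthetic stand-in data (invented, not from the manuscript).
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 3))  # columns: gambling rate, approach, mood (hypothetical)
y = 0.5 * X[:, 0] + 0.8 * X[:, 1] - 0.6 * X[:, 2] + rng.normal(size=n)

y_pred = cv_predict(X, y, k=5)
# Correlation between observed and out-of-fold predicted scores.
r = np.corrcoef(y, y_pred)[0, 1]
```

      Setting k equal to the number of participants gives leave-one-out cross-validation. Fitting on the in-lab sample and predicting the online sample would replace the fold loop with a single train/test split.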

    3. Reviewer #3 (Public review):

      This manuscript investigates computational mechanisms underlying increased risk-taking behavior in adolescent patients with suicidal thoughts and behaviors. Using a well-established gambling task that incorporates momentary mood ratings and previously established computational modeling approaches, the authors identify particular aspects of choice behavior (which they term approach bias) and mood responsivity (to certain rewards) that differ as a function of suicidality. The authors replicate their findings on both clinical and large-scale non-clinical samples.

      The main problem, however, is that the results do not seem to support a specific conclusion with regard to suicidality. The S+ and S- groups differ substantially in the severity of symptoms, as can be seen by all symptom questionnaires and the baseline and mean mood, where S- is closer to HC than it is to S+. The main analyses control for illness duration and medication but not for symptom severity. The supplementary analysis in Figure S11 is insufficient as it mistakes the absence of evidence (i.e., p > 0.05) for evidence of absence. Therefore, the results do not adequately deconfound suicidality from general symptom severity.

      The second main issue is that the relationship between an increased approach bias and decreased mood response to CR is conceptually unclear. In this respect, it would be natural to test whether mood responses influence subsequent gambling choices. This could be done either within the model by having mood moderate the approach bias or outside the model using model-agnostic analyses.

      Additionally, there is a conceptual inconsistency between the choice and mood findings that partly results from the analytic strategy. The approach bias is implemented in choice as a categorical value-independent effect, whereas the mood responses always scale linearly with the magnitude of outcomes. One way to make the models more conceptually related would be to include a categorical value-independent mood response to choosing to gamble/not to gamble.
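
      One way to write the suggested extension is shown below. It assumes that the manuscript's mood model follows the general form of the Rutledge et al. momentary-happiness model (a baseline plus exponentially decaying influences of certain rewards, expected values, and reward prediction errors); the manuscript's own equation is not reproduced here, so the symbols, including the added weight $w_{\mathrm{G}}$ and the indicator $G_j$, are illustrative assumptions.

```latex
\mathrm{Mood}(t) = w_0
  + w_{\mathrm{CR}}  \sum_{j=1}^{t} \gamma^{\,t-j}\, \mathrm{CR}_j
  + w_{\mathrm{EV}}  \sum_{j=1}^{t} \gamma^{\,t-j}\, \mathrm{EV}_j
  + w_{\mathrm{RPE}} \sum_{j=1}^{t} \gamma^{\,t-j}\, \mathrm{RPE}_j
  + w_{\mathrm{G}}   \sum_{j=1}^{t} \gamma^{\,t-j}\, G_j ,
\qquad
G_j =
\begin{cases}
1 & \text{if the participant gambled on trial } j,\\
0 & \text{otherwise.}
\end{cases}
```

      Here $w_{\mathrm{G}}$ would capture a mood response to the act of gambling that does not scale with outcome magnitude, which parallels the categorical approach bias in the choice model.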

      The manuscript requires editing to improve clarity and precision. The use of terms such as "mood" and "approach motivation" is often inaccurate or not sufficiently specific. There are also many grammatical errors throughout the text.

      Claims of clinical relevance should be toned down, given that the findings are based on noisy parameter estimates whose clinical utility for the treatment of an individual patient is doubtful at best.

      Comments on revisions:

      The authors adequately addressed my comments and I find the manuscript substantially strengthened.

    1. Reviewer #2 (Public review):

      Summary:

      The authors present a computational framework for generating "cell-specific" digital twins of human iPSC-CMs from a single optimized voltage clamp recording. Using deep learning trained on > 1 million artificial cells, the authors demonstrate that the model can infer 52 biophysical parameters governing 6 major ionic currents, and the resulting digital twins can reproduce experimentally recorded action potentials.

      Comments on revised version:

      The authors propose an interesting platform for digital twin construction of iPSC-CMs using an AI-based approach. However, regarding the fundamental concerns raised in the previous review round ("lack of experimental validation" and "overstatement of the claims"), the authors have merely added text to the "Limitations" in the Discussion, without providing any new wet-lab experimental data. This cosmetic revision fails to demonstrate the scientific validity of the platform, and the core issues remain completely unresolved.

      I think the authors need to either provide substantial additional experimental data or drastically tone down the claims throughout the manuscript based on the following three major concerns.

      (1) Lack of wet validation

      The authors show that their AI model can infer 52 parameters from a single patch-clamp recording and reproduce the overall action potential waveform. However, the most critical validation (whether the individual ion channel parameters, such as IKr/ICaL, inferred by the AI actually match the true parameters of that specific cell) is still missing. Without a direct head-to-head comparison between the parameters inferred by the model and the exact values measured using conventional wet experiments, it is impossible to determine whether the platform is providing accurate predictions or merely performing curve fitting.

      (2) Absence of experimental validation for drug response simulations (Cell 1 vs. Cell 2)

      In Figure 6, the authors present a simulation result where the administration of an IKr blocker (E-4031) induces EADs in the digital twin of Cell 1, but not Cell 2. However, there is absolutely no wet-lab validation for this prediction. Unless the authors actually administer the same drug to the live Cell 1 and Cell 2 from which the recordings were taken, this "computational drug response prediction" remains purely hypothetical. There is no evidence provided that the prediction accurately reflects real biological responses.

      (3) Significant overstatement regarding "inter-individual variability" and "personalized medicine"

      The authors state in the very first sentence of the Abstract: "Individual variability shapes how diseases manifest, how patients respond to therapy, and how rare phenotypes arise". However, this opening sentence is severely disconnected from the actual conclusions and data presented in this study. The platform can capture only "cell-to-cell variability within the same dish" (which is not even validated), and thus claiming "patient-to-patient differences" is an overstatement.

    2. Reviewer #3 (Public review):

      Summary:

      This work uses a convolutional neural network to optimize a voltage-clamp protocol to identify features and parameters from human pluripotent stem cell-derived cardiomyocytes.

      Strengths:

      The major strength is the methodology used to bridge in silico prediction of cell behavior and mechanistic insights from experimental datasets.

      Comments on revised version:

      As highlighted by the authors, given the variability of the hPSC-CM model, additional experimental datasets from different hPSC-CM lines would increase the applicability and translatability of this approach.

      I personally found that the detailed description of the methods, including the rationale of including/excluding some parameters, is extremely helpful to whoever would like to use this approach in their research.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      In the manuscript by Winke et al., the authors present evidence that fear-induced analgesia is mediated by somatostatin projection cells from the vlPAG to the RVM. This study uses a mouse model of fear-induced analgesia and combines optogenetic circuit manipulation with behaviour and electrophysiology to gain meaningful insight into a novel circuit involved in fear-induced analgesia.

      Strengths:

      (1) This is a well-constructed study with appropriate controls and analyses.

      (2) Alternative interpretations of the data are systematically considered and eliminated via rational experiments. The authors are commended for a nice piece of experimental work.

      (3) The vlPAG is a known region of pain modulation, and this study adds valuable insight to the circuit involved in fear-associated analgesia.

      Weaknesses:

      Only male mice are included in this study. [This has been explained and noted as a limitation.]

    2. Reviewer #2 (Public review):

      Summary:

      Winke et al. investigated the role of vlPAG somatostatin-expressing neurons in the mediation of analgesia during defensive states. A newly developed paradigm of cued fear-conditioned analgesia, which consists of a combination of an auditory fear retrieval session and a pain test, was used to evaluate this cell population's contribution to fear-mediated analgesia. Optogenetic manipulation of vlPAG SST+ neurons modulated the responses to a nociceptive cue (Hot Plate) presented concomitantly with an aversively conditioned tone. At the same time, alterations in the freezing levels could be observed during optogenetic activation of vlPAG SST+ neurons. In order to disentangle the impact of these cells on analgesia from their impact on the expression of defensive behaviors, the authors performed electrophysiological recordings from the dorsal horn in the spinal cord of anesthetized mice. A vlPAG-RVM-DH pathway was identified to trigger nociceptive C-fibers upon optogenetic activation of the RVM. Finally, pathway-specific activation of SST+ vlPAG-RVM neurons could abolish CS-induced analgesia.

      Strengths:

      The study addresses a relevant topic, that is, brainstem circuits for pain-modulatory mechanisms as part of defensive states evoked by threat. This is important because the circuit mechanisms underlying pain are still not fully understood, and defining molecular markers of cellular circuit substrates may support the identification of potential pharmaceutical targets in treating pain. The authors confirm a previous study in that a somatostatin-positive cellular population represents a crucial vlPAG circuit element mediating anti-nociceptive effects. A key novel aspect of the present study is the demonstration that these neurons seem to play a role specifically in threat-induced analgesia. This was made possible by the elegant design and application of a novel fear analgesia paradigm, combined with cell- and pathway-specific optogenetics.

    3. Reviewer #3 (Public review):

      Summary:

      Conditioned analgesia refers to the ability of a learned fear cue to suppress pain-related behavior and neural activity. Because this phenomenon is understudied, the authors developed a novel conditioned analgesia procedure in which a cue that had been paired or unpaired with shock was played while a hot plate increased in temperature. Compared to several control conditions, the authors found increased latency to a nociceptive response (paw licking). The authors identified somatostatin neurons in the periaqueductal gray as a likely mediator of the behavior. They then showed that: (1) stimulating vlPAG-SST neurons blocked nociceptive response latency increases to the CS+, (2) stimulating vlPAG-SST neurons suppressed fear retrieval freezing, (3) stimulating vs. inhibiting vlPAG-SST neurons drove opposing modulation of C-fibers and Aδ-fibers, (4) direct-projecting vlPAG SST neurons modulate freezing while RVM-projecting vlPAG SST neurons modulate conditioned analgesia.

      Strengths:

      These experiments have many strengths. The behavioral assay is chief among them. The assay is robust and controls for confounding factors to reveal a repeatable effect of a shock-paired cue to delay nociceptive responding. The optogenetic experiments provide the correct level of temporal precision, given the authors' time-specific interest in cued responding. Combining neuronal manipulations with spinal recordings is particularly innovative, especially in the context of more behavioral neuroscience-based assays. All-in-all, I found this to be an exceptionally strong set of experiments.

      Weaknesses:

      No obvious weaknesses were identified by this reviewer.

    1. Joint Public Review:

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers.]

      The major strengths of the manuscript are in the Plasmodium falciparum genetic and phenotyping approaches. PfMSP2 knockouts are made in two different strains, which is important as it is known that invasion pathways can vary between strains, but is a level of comprehensiveness that is not always delivered in P. falciparum genetic studies. The knockout strains are characterised very thoroughly using multiple different assays and the authors should be commended for publishing a good deal of negative data, where no phenotype was detected. This is not always done but is very helpful for the field and reduces the potential for experimental redundancy, i.e., others repeating work that has already been performed but never published. The quality of the writing, referencing and figures is also generally strong.

      There are certainly some areas of the manuscript that would benefit from deeper exploration, such as electron microscopy/other imaging approaches to explore whether deletion of PfMSP2 has a visible impact on merozoite surface structure, further replicates of the video microscopy assays to see whether trends in the data could reach significance (although these are very time-consuming and technically difficult assays), and follow up of some of the genes where expression is changed by PfMSP2 knockout (as the authors point out, there are no candidates that have a very obvious link to invasion suggesting that they may be compensating for PfMSP2 function, although several are expressed in schizont stages). However, there is already a substantial amount of data in the manuscript, and more detailed follow-up is reasonable to leave to future work. Overall, with the modifications made through the review process, including the addition of new controls for key experiments, the claims and conclusions are justified by the data, and the manuscript generates important new information about a highly studied Plasmodium falciparum merozoite surface protein. The studies are important and have potential for directing vaccine design targeting erythrocyte invasion, a critical step in bloodstream expansion of malaria parasites.

    1. Reviewer #1 (Public Review):

      Summary:

      This study aims to understand how cell fusion contributes to wound healing using a laser-induced injury in the notum epithelium of a developing fruit fly. The authors meticulously characterize the epithelial fusion events using a live imaging approach and report that syncytia arise by 'border breakdown' and 'cell shrinking'. The syncytial epithelial cells also appear to outcompete mononucleated cells and preferentially dissolve their tangential borders, which correlates with the accumulation of actin at the leading edge.

      Strengths:

      The strength of this study is the authors' live imaging approach to capture these dynamic fusion events that are a fundamental yet poorly understood biological process.

      Comments on revised version:

      The manuscript overall is significantly improved and the authors addressed the majority of my concerns. The addition of the computational vertex model (Figure 7) as well as Atg1 RNAi (Figure 4) to inhibit cell fusion provides more mechanistic insight to their study. However, the analysis of the Atg1 RNAi wound assay falls short, as it does not directly measure changes in syncytium frequency or size to confirm that cell fusion is reduced. The authors should quantify the number of nuclei per syncytium over the 2 h wound healing period, as performed for WT in Figure 1C. It would have been ideal if they could have also performed the Act-GFP spreading assay in WT and Atg1 RNAi strains to determine whether Act-GFP movement is dependent on cell fusion, as proposed. At the least, further quantification of the Atg1 RNAi phenotype is warranted to support their conclusions.

    2. Reviewer #2 (Public Review):

      Summary:

      Overall, this study provides a thorough description of the formation of syncytia following wounding of the proliferation-competent diploid epithelium of the pupal notum. While this phenomenon has already been described briefly for this particular tissue by the Galko lab in Wang et al 2015, the authors provide a much more detailed description and characterisation of the process providing some novel insights (radial versus tangential border breakdown, cell shrinkage, timings, syncytia outcompeting mononucleated cells, etc.).

      Strengths:

      This paper provides an elegant, thorough, descriptive characterisation of syncytia-driven wound closure using state-of-the-art confocal live imaging of the pupal notum. The authors show that laser-induced wounding of this diploid, proliferation-competent epithelium results in the formation of syncytia of various sizes in the first few cell rows around the wound edge, which progressively become bigger as healing proceeds. This results in ~50% of cells becoming part of these syncytia. The cell fusion events were convincingly demonstrated by showing the disappearance of p120ctnRFP and E-Cadherin-GFP from cell-cell borders as well as cytoplasmic GFP mixing of GFP-positive cells with a GFP-negative cell.

      Apart from cell-cell fusion by border breakdown that mostly happens in the first 2h following wounding, the authors also found that at later stages of wound healing cell shrinkage following cytoplasmic mixing contributed to syncytia formation.

      Next, the authors provided some convincing evidence that syncytia outcompete mononuclear cells for being positioned in the first cell row around the wound.

      The authors then show that radial border breakdown occurs much less frequently than tangential border breakdown. They suggest that radial border breakdown reduces the requirement for cell-cell intercalations. They also hypothesise that tangential border breakdown might allow fused cells to share resources and provide more resources to be used near the wound edge, e.g. for actomyosin cable formation. To test this, the authors generate single-cell clones that overexpress Actin-GFP. They then show convincingly how a single Actin-GFP-positive cell in the second cell row fuses with one GFP-negative cell in the first cell row. The Actin-GFP signal then spreads in the fused cell and labels some previously unlabelled actin-rich structure near the wound edge which most likely is the actomyosin cable. This provides some evidence for resource sharing by cytoplasmic mixing following fusion.

      Comments on revised version:

      The authors have extended their original manuscript by adding two key parts. First, they show a role of Atg1 in mediating cell fusion (Figure 4). Second, they provide additional evidence for a contribution of radial border fusions to wound closure through its effect on tissue fluidity and through computational modelling (Figure 7).

      This new version of the manuscript is greatly improved and provides significant new insights into the role of syncytia in aiding wound repair. There are just a few minor, yet important, additions needed to back up Figure 4 which should not require new experiments.

      Minor but important points:

      The authors show a role of Atg1 in mediating syncytia formation in Figure 4. However, since the Pnr>+ side of the wound closes more slowly than the non-Pnr side (control side), a few additions to this figure would be important and should not require additional experiments.

      (1) Alongside the Figure 4D data showing wound radius over time for control versus Pnr>Atg1RNAi, the authors should show the same type of data for control versus Pnr>+.

      (2) Since Pnr>+ also slows down wound healing, albeit to a lesser extent than Pnr>Atg1RNAi, the authors should also show an extra graph that provides evidence that Pnr>Atg1RNAi reduces syncytia formation more than Pnr>+ does. For example, two graphs could be added that show individual cell size at 4 or 5 h post wounding for control versus Pnr>Atg1RNAi as well as for control versus Pnr>+, and another graph with the same data but comparing cell size between Pnr>+ and Pnr>Atg1RNAi. Otherwise, if the expected minimum cell size for a syncytium is easy to estimate, a graph could be added that shows the percentage of cells that are above this threshold (e.g. above 100 µm²) for control versus Pnr>Atg1RNAi, control versus Pnr>+, and Pnr>+ versus Pnr>Atg1RNAi.
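
      The threshold-based comparison suggested in point (2) can be sketched as follows. This is only an illustration: the function name, the 100 µm² default, and the example areas are assumptions made for the sketch and are not taken from the manuscript.

```python
from typing import Sequence


def fraction_above_threshold(areas_um2: Sequence[float],
                             threshold_um2: float = 100.0) -> float:
    """Return the fraction of cells whose area exceeds the threshold.

    The 100 µm² default mirrors the example threshold in the review; the
    appropriate minimum syncytium size would have to be estimated from data.
    """
    if len(areas_um2) == 0:
        raise ValueError("no cell areas supplied")
    n_large = sum(1 for area in areas_um2 if area > threshold_um2)
    return n_large / len(areas_um2)


# Hypothetical, made-up areas (µm²) purely to show the call; NOT manuscript data.
example_areas = {
    "control": [40.0, 150.0, 220.0, 90.0],
    "Pnr>+": [60.0, 130.0, 80.0, 95.0],
    "Pnr>Atg1RNAi": [50.0, 70.0, 110.0, 65.0],
}

for genotype, areas in example_areas.items():
    percent = 100 * fraction_above_threshold(areas)
    print(f"{genotype}: {percent:.0f}% of cells above threshold")
```

      Reporting one percentage per genotype at a fixed time point (e.g. 4 or 5 h post wounding) would allow the three pairwise comparisons requested above to be read from a single graph.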

    3. Reviewer #3 (Public Review):

      In this revised manuscript, White et al. aimed to understand wound-induced syncytia formation during wound repair in the Drosophila melanogaster pupal notum. For this purpose, the authors characterized two different outcomes for adherens junctions during syncytia formation around the wound region, border breakdown versus apical shrinking, which appear to happen at different time points and over different durations. The authors characterized cell-cell fusion events using cytoplasmic, junctional and nuclear markers. They determined that about half of the cells within a 70 µm radius of the wound undergo cell-cell fusion. They studied wound induction on the border between control epithelia and the pnr domain, suggesting that Atg1 is required for post-wound syncytia formation and wound closure. They showed that during wound closure syncytia gradually invade the wound leading edge, mostly by radial fusion events. The data suggest that intercalation of cells from the leading edge slows down the wound closure process. They propose that cell fluidity of syncytial cells plays a role in wound closure speed. Finally, the authors showed that actin is concentrated at the front edge of syncytia located in the wound leading edge. The authors described some aspects of syncytia formation during wound closure using different approaches. Some clarifications are needed as described below.

      Major suggestions:

      (1) Introduction, page 4. The examples of developmental syncytia formation of invertebrates and vertebrates are confusing. The authors may want to make the examples clear and add additional examples. Currently, readers may assume that C. elegans cell fusions occur only in the hypodermis - other structures can be mentioned like the vulva, pharyngeal muscles, glia, tail. In addition, the authors may want to add injury-induced fusions like the C. elegans' PLM and PVD neurons (Ghosh-Roy et al., 2010; Newman et al., 2015; Oren-Suissa et al., 2017).

      (2) In cases where it is not clear whether fusion has occurred or whether mononucleated cells were ejected from the leading edge, membrane markers can be used. Page 6. Lines 96-99. The authors may want to use a membrane marker like RFP-PH driven by the epithelial cell promoter.

      (3) Pages 8-10. The authors may want to clearly explain that apical junction shrinking is a post-fusion event, and that the apical shrinking is caused by the expansion of fusion pores and the migration of apical junctions towards the basolateral domain. This is something that was clearly shown during physiological epidermal cell-cell fusion in C. elegans by Mohler et al., 1998 and 2002. A cartoon showing the process of cell-cell fusion, pore expansion and apical junction dynamics would make the manuscript much clearer.

      (4) Page 9. Line 170. "...as these cells represent fusion initiation events (fusion pore) but were unable to productively stabilize and expand the site of fusion and so returned to the diploid state." The authors may want to make clear that this is an assumption that needs to be tested. Live imaging using a membrane marker may resolve whether a reversible fusion pore was generated.

      (5) Page 11. It is not clear whether Atg1 is directly required for cell fusion, or that autophagy is required for efficient cell fusion or both Atg1 and autophagy participate in the fusion process.

      (6) Page 12. Line 235. "Indeed, we observed that several hours after wounding, the entire leading edge was occupied by syncytia." This observation is based only on the adherens junction marker. Can the authors test a basal cell membrane marker? Is it possible that the mononucleate cell in the leading edge is under the two syncytia?

    1. Reviewer #1 (Public review):

      Summary:

      This study presents a systematic behavioral characterization of object classification abilities in macaque monkeys using a high-throughput touchscreen-based paradigm. The work shows that monkeys can learn and generalize many binary object classification rules, and compares their behavior with humans and computational models. A key finding is that monkey behavior is more closely aligned with visual deep neural networks, whereas human behavior is better captured by language-informed models. The study provides a useful benchmark for understanding visually grounded object categorization in nonhuman primates.

      Strengths:

      The study introduces a scalable and well-controlled behavioral paradigm for testing many object classification rules in macaques. The comparison across monkeys, humans, and computational models is a major strength and makes the work broadly relevant to visual neuroscience, comparative cognition, and computational modeling. The results provide an informative framework for distinguishing categorization based primarily on visual representations from categorization supported by semantic or language-based knowledge.

      Weaknesses:

      Some aspects of the interpretation would benefit from clarification. In particular, it remains somewhat unclear what stimulus-level factors drive image difficulty, how much training performance reflects general rule learning versus repeated reinforcement of specific images, and whether monkeys and humans apply the same category rules. The link between macaque IT representations and monkey behavior is also suggestive but not yet fully resolved, given the limited and separate neural dataset.

    2. Reviewer #2 (Public review):

      Summary:

      The paper tackles a very interesting question and provides a solid and systematic piece of data that may be useful for numerous NeuroAI works in the future. The question is how well can macaque monkeys with a "pretrained" visual system without human knowledge learn to categorize images based on different kinds of (sometimes arbitrary) category definitions. In general, I love the paper, and I think both the data and presentation of it are beautiful.

      Strengths:

      (1) The authors developed a scalable method for training and studying this behavior, and did an exhaustive evaluation of monkeys' behavior and learning process.

      (2) Beyond the behavior result, they performed extensive analysis and control experiments to isolate the cue monkeys are using to perform the categorization.

      (3) The extensive comparison of behavior with deep neural networks is also super interesting.

      (4) The authors performed a very careful examination of generalization behavior in monkeys, similar to standard practise in machine learning.

      (5) The presentation of the data is very beautiful and deliberately designed, kudos to the authors for their efforts!

      (6) I really enjoyed the further categorization task based on human knowledge, and the arbitrary rule task; this really pushes our understanding of the visual categorization and learning capability of monkeys.

      (7) The examination of *learning dynamics* in human vs monkey is also quite interesting, i.e., humans can "understand the rule" and learn much faster versus monkeys learning across a few days.

      Weaknesses:

      (1) Though all results are pretty cool, the organization of results, figures, and sections can be modified to flow even better.

      (2) Maybe provide DNN categorization and generalization results for the non-main monkey experiments (Figures 2, 3); those comparisons can be really interesting too!

    1. Reviewer #1 (Public review):

      Summary:

      This study constructed engineered NK-92 cell extracellular vesicles displaying CD19 single-chain variable fragment and evaluated their therapeutic efficacy in MRL/lpr mouse models of systemic lupus erythematosus, demonstrating that these vesicles could deplete B cells, alleviate lupus nephritis, and improve mouse survival. However, this strategy lacks significant innovation compared to existing research. The current results are not sufficient to provide strong support for the experimental hypotheses.

      Weaknesses:

      (1) This study proposes using engineered EVs displaying CD19 scFv to target B cells for SLE treatment. However, similar core therapeutic strategies have been reported in previous studies. For instance, recent studies have reported engineered EVs for SLE therapy (J Control Release. 2025, 384:113886; Ann Rheum Dis. 2025, 84(11):1811-1821; J Nanobiotechnology. 2026, 24(1):203). Another research team from China also constructed engineered EVs displaying anti-CD19 scFv for SLE treatment, which is highly consistent with the present work in targeting strategy, delivery vehicle, and disease model (Mol Ther. 2026:S1525-0016(26)00080-8). Moreover, the human trial of allogeneic CD19-targeted CAR-NK therapy for SLE has been published (Lancet. 2026, 406(10522):2968-2979). This study has not made original improvements in therapeutic vectors, targeting modules, therapeutic mechanisms, or indications, and may therefore struggle to meet the requirements of high-level journals for originality and novelty.

      (2) Numerous core experiments are missing, including the validation of CD19 scFv fusion protein expression on EVs, systematic characterization of engineered EVs, verification of EVs functions and therapeutic mechanisms, and in vitro and in vivo safety assessments. The available data are insufficient to support complete conclusions.

      (3) The stable expression of CD19 scFv on EVs should be further verified by Western blot or flow cytometry. The anchoring of CD19 scFv on the outer membrane surface of EVs must be confirmed. In addition, the loading capacity of CD19 scFv on exosomes should be quantified for the dosage selection in SLE treatment.

      (4) In vitro experiments are required to confirm the specific targeting ability of CD19 scFv-EVs to B cells and clarify the precise mechanism of B cell depletion, particularly whether it is mediated by effector molecules carried by exosomes such as perforin and granzyme B.

      (5) The key quality control parameters, such as the stability, purity, buoyant density, and particle/protein ratio of engineered exosomes, should be characterized and identified.

      (6) For the in vivo treatment experiments, the authors need to explain how the treatment dose of CD19scFv-EVs was determined in order to clarify the dose-effect relationship.

      (7) It is necessary to supplement with in vivo imaging and tissue distribution data to prove that the CD19 scFv-EVs can specifically accumulate in B-cell organs such as the spleen or lymph nodes.

      (8) The authors need to clarify the mechanism by which CD19 scFv-EVs reduce B cells in vivo and verify the caspase apoptosis pathway.

      (9) For the in vivo therapeutic experiments, clinical first-line drugs and free CD19scFv should be added as control groups to highlight the advantages of the engineered EVs.

      (10) Safety assessment in this manuscript is completely absent. Routine toxicity examinations, including hepatic and renal function tests, routine blood tests, and histopathological analysis of major organs in mice, must be supplemented. In addition, the systemic inflammatory cytokine profile and anti-drug antibody levels should be determined to rule out critical safety risks such as cytokine release syndrome and immunogenicity. The authors only focused on alterations in B cells; the impacts of the treatment on T cell subsets, NK cells, and monocytes/macrophages should be further investigated.

    2. Reviewer #2 (Public review):

      Summary:

      Sun and colleagues report the development of an engineered extracellular vesicle platform derived from NK-92 cells that display an anti-CD19 single-chain variable fragment (scFv) on their surface via fusion with LAMP-2B (V-CD19-Exo). In an MRL/lpr mouse model of SLE, the authors demonstrate that intraperitoneal administration of V-CD19-Exo reduces splenic CD19+CD20+ B cells, attenuates proteinuria and lupus nephritis pathology, downregulates pro-inflammatory cytokines (IL-17A, IFN-γ) and autoantibodies (anti-dsDNA, ANA), and improves survival from approximately 25% to 80%. The authors propose that this "cell-free" targeted extracellular vesicle strategy offers advantages over conventional cell therapies, including lower immunogenicity, scalable production, and no requirement for lymphodepletion.

      The study addresses an important question in autoimmune disease therapeutics: how to achieve targeted B cell depletion while avoiding the complexities and safety risks associated with CAR-T/CAR-NK cell therapies. The concept is novel, and the initial in vivo efficacy data are encouraging. However, several significant limitations in experimental design, mechanistic depth, and evidence rigor temper the strength of the conclusions.

      Strengths:

      (1) Novel conceptual approach.

      The adaptation of CAR targeting principles to extracellular vesicles represents a creative and potentially impactful strategy. By displaying CD19 scFv on NK-92-derived vesicles, the authors successfully confer B cell-targeting capability while retaining the cytotoxic effector functions of the parental NK cells. This "cell-free" concept addresses genuine limitations of live cell therapies, including the need for lymphodepletion, risks of cytokine release syndrome, and manufacturing complexity.

      (2) Comprehensive in vivo efficacy readouts.

      The study evaluates therapeutic effects across multiple clinically relevant endpoints: B cell depletion (flow cytometry), renal function (proteinuria, UPCR), renal histopathology (HE staining with semi-quantitative scoring), systemic inflammation (IgE, IL-17A, IFN-γ), autoantibody production (anti-dsDNA, ANA), and survival. This multi-dimensional characterization strengthens the phenotypic evidence for efficacy.

      (3) Appropriate control groups.

      The inclusion of non-targeted NK92-Exo as a control allows attribution of the observed effects to CD19-mediated targeting rather than non-specific vesicle-associated activities.

      (4) Significant survival benefit.

      The improvement in survival from 25% to approximately 80% in V-CD19-Exo-treated mice is substantial and represents arguably the most compelling evidence for therapeutic potential in this model.

      Weaknesses:

      (1) Mechanism of B-cell reduction remains unclear.

      The manuscript reports a dramatic reduction in splenic CD19+CD20+ B cells (from 10.53% to 1.51%) following V-CD19-Exo treatment. However, the authors do not establish whether this results from direct cytotoxicity (e.g., perforin/granzyme-mediated killing, apoptosis induction) or from functional suppression/downregulation of CD19 expression. The authors speculate that the effect is likely mediated by cytotoxic proteins carried by NK-92-derived vesicles, but no data are provided to support this mechanism. Essential experiments would include the detection of apoptosis markers (Annexin V, activated caspase-3/7) in B cells, assessment of perforin/granzyme B content within V-CD19-Exo, or in vitro co-culture assays demonstrating direct B cell killing.

      (2) Small sample sizes.

      Most experimental endpoints were assessed with n=5 per group, which is marginal for detecting modest effect sizes and may amplify the influence of individual biological variation. While the survival study had n=10 per group, the main mechanistic and endpoint analyses would benefit from larger cohorts (n=8-10) to increase statistical power and robustness.

      (3) No dose-response or dosing optimization studies.

      All experiments used a single dose (10⁹ particles per injection) and a fixed schedule (twice weekly for three weeks). The absence of dose-response data leaves unclear whether the observed effects represent maximal efficacy or could be achieved with lower doses, and whether alternative dosing regimens could improve outcomes or reduce potential off-target effects.

      (4) Lack of safety assessment.

      The authors emphasize the theoretical safety advantages of extracellular vesicles over cell therapies, but no systematic safety evaluation is presented. Key missing data include: histopathological examination of non-target organs (liver, lung, heart, gastrointestinal tract), assessment of off-target immune activation (T cell responses, cytokine profiles beyond those measured), and evaluation of potential accumulation or toxicity with repeated dosing.

      (5) Incomplete characterization of the engineered vesicles beyond targeting.

      While the manuscript successfully demonstrates CD19scFv display and vesicle enrichment of exosomal markers, it does not characterize whether V-CD19-Exo retains the full spectrum of NK-92 effector molecules (perforin, granzymes, FasL, TRAIL, cytokines such as IFN-γ) at functional levels. Quantitative or semi-quantitative comparison of cargo between V-CD19-Exo and parental NK-92 cells or non-engineered NK92-Exo would help contextualize the observed in vivo effects.

      (6) Sex as a biological variable is not systematically addressed.

      The authors note in the Discussion that the same treatment showed more significant efficacy in male mice compared to females (data not shown), yet all main experiments were conducted exclusively in female mice. Given the strong sex bias in SLE epidemiology (approximately 9:1 female-to-male ratio) and potential differences in immune responses between sexes, this observation warrants systematic investigation rather than a footnote. Presenting the sex-differential data or, alternatively, conducting adequately powered sex-stratified analyses would substantially strengthen the manuscript.

      (7) Translational claims are premature.

      The manuscript repeatedly emphasizes advantages over cell therapy (low immunogenicity, scalable production, no requirement for lymphodepletion) as if these are established properties of V-CD19-Exo. However, no experiments directly compare V-CD19-Exo to CAR-NK or CAR-T cells in terms of efficacy, immunogenicity, or safety. Similarly, claims of "scalable production" and "high batch-to-batch consistency" are not supported by any manufacturing or quality control data. These statements should be toned down or supported with empirical evidence.
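
      To put the sample-size concern in Weakness (2) in context, the following is a rough sketch of the textbook normal-approximation formula for a two-group comparison of means, n ≈ 2(z₁₋α/₂ + z_power)² / d² per group. The function name, the two-sided α = 0.05, the 80% power target, and the use of Cohen's d are assumptions chosen for illustration; they are not taken from the manuscript, and an exact t-test-based calculation would give slightly larger numbers for small groups.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d: float, alpha: float = 0.05,
                power: float = 0.80) -> int:
    """Approximate animals per group for a two-sided, two-sample comparison.

    Uses the normal approximation n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2,
    where d is the standardized effect size (Cohen's d). Assumed defaults:
    alpha = 0.05 and power = 0.80.
    """
    if effect_size_d <= 0:
        raise ValueError("effect size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size_d ** 2)


# Illustrative effect sizes only; the true effect sizes are unknown here.
for d in (0.5, 1.0, 1.5, 2.0):
    print(f"d = {d}: about {n_per_group(d)} per group")
```

      Under these assumptions, small cohorts are adequate only for very large standardized effects, which is consistent with the reviewer's point that n=5 is marginal for detecting modest differences.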

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript describes the development of engineered NK-92-derived extracellular vesicles (EVs) displaying CD19scFv for targeted treatment of systemic lupus erythematosus (SLE). Using a CD19scFv-LAMP2B fusion strategy, the authors generated EVs intended to selectively target pathogenic B cells in the MRL/lpr lupus mouse model. The study reports reductions in CD19⁺CD20⁺ B-cell populations, improvements in proteinuria and renal histopathology, decreased inflammatory cytokines and autoantibody levels, reduced splenomegaly, and improved survival outcomes following treatment. The work aims to position engineered EVs as a cell-free alternative to CAR-T/CAR-NK therapies for autoimmune disease treatment. While the concept is interesting and potentially translational, the study currently lacks sufficient methodological rigor, EV purification standards, mechanistic validation, and comprehensive characterization to fully support many of the claims presented.

      Strengths:

      (1) The study addresses an important unmet clinical need in systemic lupus erythematosus and explores an innovative cell-free therapeutic strategy.

      (2) The concept of combining CAR-like targeting approaches with engineered EVs is interesting and potentially translational.

      (3) The manuscript includes both in vitro and in vivo experiments, including functional renal assessments, immune profiling, histopathology, and survival studies.

      (4) The authors attempt to evaluate multiple disease-associated readouts, including proteinuria, cytokines, autoantibodies, splenomegaly, and survival outcomes, which strengthens the overall biological relevance of the work.

      (5) The use of engineered NK92-derived vesicles as a scalable alternative to CAR-NK therapy represents a potentially attractive therapeutic platform.

      (6) The in vivo therapeutic observations in the MRL/lpr lupus model are encouraging and warrant further mechanistic investigation.

      Weaknesses:

      (1) The EV isolation strategy is not sufficiently rigorous for defining the isolated particles as "exosomes" according to current International Society for Extracellular Vesicles/MISEV guidelines. The precipitation-based workflow without density gradient purification or SEC raises major concerns regarding EV purity and identity.

      (2) No direct validation was provided demonstrating successful surface localization or functional accessibility of CD19scFv on EV membranes.

      (3) The characterization of EVs is incomplete and insufficient. Additional positive/negative EV markers, purity metrics, and orthogonal characterization methods are required.

      (4) The absence of density gradient ultracentrifugation is particularly concerning, given the systemic injection of EV preparations into mice, as contaminating soluble factors and non-vesicular particles may contribute to the observed therapeutic effects.

      (5) The manuscript lacks adequate mechanistic studies explaining how engineered EVs mediate B-cell depletion or immune modulation.

      (6) The in vitro functional assays are weakly designed, particularly the use of A549 cells for evaluating CD19-targeted vesicle function.

      (7) Important methodological details are missing, including EV normalization strategies, flow cytometry gating controls, blinding procedures, and randomization approaches.

      (8) Several figures, particularly TEM and western blot images, are of low quality and difficult to interpret.

      (9) The study does not sufficiently exclude the possibility that observed therapeutic effects result from contaminating soluble immune mediators rather than EV-specific activity.

      (10) Broader immune profiling is lacking despite the systemic immune complexity of SLE.

      (11) The statistical analysis section includes tests that are not reflected in the Results section, creating concerns regarding data presentation and consistency.

      (12) Overall, while the concept is interesting, the manuscript currently falls short of the experimental rigor expected for high-impact translational EV studies.

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript from Ali Guler's lab intends to test the impact of an integrated lifestyle around the timing of food, exercise, and light on circadian rhythm, metabolic health, and sleep in wild-type mice. After observing positive outcomes from short-term studies, they applied this integrated chronobiologically anchored lifestyle to mouse models of neurodegenerative diseases. They found some encouraging trends of health improvement that largely did not reach statistical significance.

      Strengths:

      Good experimental design to systematically test the effects of shorter day, timed voluntary exercise, and time-restricted feeding in rodents. The authors started with an experimental design that incorporated some findings from published papers. They used a shorter photoperiod of 8 h, which was shown to improve SCN synchrony and amplitude of the molecular clock. The use of time-restricted feeding with feeding aligned with the dark phase also has precedence. The late-night access to the running wheel is based on published data showing that treadmill exercise in the late active phase imparts greater metabolic benefits. No other study has systematically integrated all three interventions into a single study. This is one of the unique aspects of the study.

      Weaknesses:

      Since the B6 strain of mice on normal chow does not show many health impairments, the choice of this strain and diet did not enable fine-grained analyses of each intervention on health outcomes. Although the authors used male and female mice, sex differences (if any) should have been explicitly addressed.

    2. Reviewer #2 (Public review):

      Summary:

      The LiFE protocol provides shortened light exposure, as well as timed food availability and exercise (running wheel) availability. It causes mice to sleep for the first half of the active phase and to be active during the second portion, thus consolidating activity. This has some positive effect on metabolic markers and on some (but not all) behavioral markers. In two AD models, there is the suggestion of a protective effect, though most of the data are not significant.

      Strengths:

      The concept is important and builds on previous studies showing cognitive benefits and decreased brain pathology in mice with time-restricted feeding or shortened light exposure. The comparison to multiple different light, food, and exercise timing regimens in Figure 1 is quite interesting and informative. The use of 2 different mouse models (5xFAD and 5xFAD::PS19) is a strength, as this latter model is rarely used. The pathological endpoints are appropriate.

      Weaknesses:

      The LiFE protocol is strange in that it induces sleep during the first several hours of the active phase. The mice seem to show food anticipatory activity, then suddenly go to sleep for a few hours during what should be their most active time of day. Is this good? Would we want such a thing in humans? Why does this happen? What is the real-life implication? How do the mice eat if they are sleeping so much during their food period?

      While many of the cognition and brain pathology experiments seem to trend in a positive direction, most are not significant, which calls into question the value of the intervention. There are a few that are significant, but the overall effect seems weak. The experiments with AD mouse models are generally underpowered and not controlled for sex, as female mice get pathology much faster in the 5xFAD model, and males have more severe pathology in the PS19 model. Combining them may mask effects.

      In all, it is an interesting and thought-provoking study which shows striking effects of the LiFE intervention on activity patterns and sleep, with modest/inconclusive effects on cognition and brain pathology. While it feels very preliminary, the study does provide some valuable information for planning future studies of circadian interventions in neurodegenerative models, even if the protective effects here are not fully solidified.

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript presents a multimodal circadian intervention ("LiFE") that combines short photoperiod exposure, time-restricted feeding, and scheduled exercise and examines its effects on circadian activity structure, SCN rhythmicity, sleep, glucose regulation, cognition, and Alzheimer's disease-related phenotypes in mice. The study is ambitious in scope and conceptually appealing. In wild-type mice, the authors report that LiFE consolidates activity rhythms, enhances SCN PER2::LUC amplitude, increases sleep, lowers baseline glucose, reduces glycemic variability, and improves novel object recognition. They then extend the paradigm to 5xFAD and 5xFAD/PS19 mice, where the effects are more modest and mostly trend-level, with limited evidence for improved behavior or reduced pathology.

      Strengths:

      Overall, the work is interesting and potentially important because it moves beyond single-zeitgeber manipulations and tests the idea that combining multiple entrainment cues may produce broader physiological benefits than light, feeding, or exercise alone. The WT dataset is the strongest part of the paper and provides evidence that the combined intervention changes circadian organization and metabolic physiology.

      Weaknesses:

      Alzheimer's disease claims are considerably less convincing than the title and framing suggest. The manuscript would be stronger if the authors more clearly separated the robust conclusions in WT animals from the preliminary, underpowered, and largely non-significant findings in the disease models. In its current form, the paper contains substantial merit, but several interpretive and methodological issues should be addressed before publication.

    1. Reviewer #2 (Public review):

      Summary:

      The authors perform confirmation studies of Paul Basch's seminal schistosome work from 1981, demonstrating the development of transformed schistosomules into sexually dimorphic adult parasites, albeit without successful egg production. In addition to the findings from Basch's earlier work, the authors add some new molecular data in the form of analysis of proliferative cells in in-vitro derived animals.

      Strengths:

      The authors successfully confirm experimental results from earlier schistosome researchers, providing a potential new tool for studying schistosome biology without the need for vertebrate hosts.

      Weaknesses:

      The display of data from the authors is sometimes difficult to follow/understand where it comes from. For example:

      (1) Line 136: the authors state that parasites in HS and FBS conditions have substantially different mortality rates (11.3 +/- 2.7 vs 5 +/- 2.3) but a quite high p-value (0.8). Analyzing the raw data, this reviewer obtained a mean of 8.2 +/- 1.7% vs 4.8 +/- 4.3% with a p-value of 0.15. Either the data are not clearly presented, and this reviewer did not follow them, or the data presented in the text do not match the raw data in the supplemental files.

      (2) Line 187/Figure 4: though it is not clearly stated, it appears that the authors treat their EdU counts as an ordinal data set of 61 steps (from 0 to >60) rather than a continuous measure of EdU+ cells per animal. In this reviewer's opinion, the graph strongly suggests a continuous data set, and the fact that this reviewer had to dig through poorly labeled raw data to discover the nature of the data is problematic. The authors should either switch to a continuous data set or make it explicit that the data shown are ordinal. If counting EdU+ cells is too arduous, the authors could consider comparing the amount of EdU+ area to the amount of DAPI+ area in maximum intensity projections of their confocal images, as this would roughly approximate the number of proliferative cells in the animals.
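
      The area-based alternative suggested in point (2) could be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the nested-list image representation (`stack[z][y][x]`), and the fixed intensity thresholds are all assumptions made for the example, and real confocal data would need proper background subtraction and threshold selection.

```python
def max_projection(stack):
    """Maximum intensity projection of a z-stack given as stack[z][y][x]."""
    # zip(*stack) groups the rows at the same y across all z-slices;
    # zip(*rows) then groups the pixels at the same (y, x) across z.
    return [[max(z_values) for z_values in zip(*rows)] for rows in zip(*stack)]


def positive_area(image, threshold):
    """Number of pixels whose intensity is strictly above the threshold."""
    return sum(1 for row in image for pixel in row if pixel > threshold)


def edu_to_dapi_area_ratio(edu_stack, dapi_stack, edu_threshold, dapi_threshold):
    """Ratio of EdU+ area to DAPI+ area in the two projections.

    Thresholds are hypothetical, user-chosen values.
    """
    edu_area = positive_area(max_projection(edu_stack), edu_threshold)
    dapi_area = positive_area(max_projection(dapi_stack), dapi_threshold)
    if dapi_area == 0:
        raise ValueError("no DAPI+ pixels above threshold")
    return edu_area / dapi_area


# Tiny made-up stacks (2 slices of 2 x 2 pixels), for illustration only.
example_edu = [[[0, 5], [0, 0]], [[0, 0], [9, 0]]]
example_dapi = [[[8, 8], [8, 0]], [[0, 0], [0, 8]]]
example_ratio = edu_to_dapi_area_ratio(example_edu, example_dapi, 4, 4)
```

      A ratio computed this way is a continuous per-animal measure, which would match the way the graph in Figure 4 reads.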

      There are some minor issues as well:

      (1) Line 122: it is perhaps incorrect to refer to humans as "the" definitive host of schistosomes, as S. japonicum is primarily considered a zoonotic infection with water buffalo/cows being the primary definitive host.

      (2) Line 185/298: the authors refer to EdU pulse-chase experiments, but the experiments described here are EdU pulse experiments.

      Comments on revised version.

      Following the initial submission of the manuscript and a round of peer review, the authors updated the manuscript and addressed all of this reviewer's concerns. As such, this reviewer believes that the manuscript is substantially clearer and will serve as useful literature in the field of schistosome research.

    2. Reviewer #3 (Public review):

      Summary:

      This study is significant as it established a protocol for the long-term culture of Schistosoma mansoni newly transformed cercariae which developed in vitro into sexually dimorphic forms. The impact of two different sera, Fetal Bovine Serum (FBS) and Human Serum (HS), added to the culture medium supplemented with human red blood cells was evaluated. The authors demonstrated that HS-cultured parasites were able to digest red blood cells, a critical step for long term parasite development. Furthermore, while most FBS-cultured parasites did not progress beyond an early liver stage, sexual dimorphism was clearly evident in the HS-cultured worms, albeit delayed compared to in vivo development.

      Strengths:

      This study could contribute to further in vitro studies for a better understanding of the unique sexual biology of Schistosoma mansoni and for screening novel schistosomicidal compounds. By increasing parasite development in in vitro studies this protocol could have a positive impact on the principles of the 3Rs (Replacement, Reduction and Refinement) for animal research.

      Weaknesses:

      As the authors mentioned, "pairing between male and female parasites was rare." Pairing was rarely observed, and only after day ~80 in culture. Egg production was also not achieved with this protocol.

      Comments on revised version.

      Some data presentation has been improved in the revised manuscript, as suggested by other reviewers. The authors have also clarified the limitations of their long-term culture protocol for Schistosoma mansoni newly transformed cercariae, which develop in vitro into sexually dimorphic forms, with regard to male and female pairing. Additionally, they addressed my specific question regarding the culture conditions used for ex vivo/in vitro mating. The experimental conditions tested for in vitro developed parasites were the same as those for the pairing experiments. The factors that negatively influence pairing during the long-term in vitro culture of Schistosoma remain to be investigated.

    1. Reviewer #1 (Public review):

      [Editors’ note, July 1, 2026: An Author Response to the reviews below will be provided in the near future.]

      Summary:

      The authors used a large dataset evaluating gut carriage of Enterobacterales and ESBL organisms from children aged 6-24 months as the basis for a modeling study to investigate what factors are most important for determining the prevalence of ESBL resistance. The modeling incorporated travel, a simple model of carriage duration (short and long), fitness cost of resistance on transmission and clearance, and antibiotic use. They found that antibiotic use is the primary driver of resistance prevalence, with transmissibility of resistant strains also important for setting the prevalence. Travel, while important when prevalence is very low, plays less of a role in maintaining prevalence once it is established (in keeping with other recent work). They estimated the fitness cost of resistance (terming a reduction of 14% on the rate of transmission and an increase of 23% on the rate of clearance as "low"). While the extent of assumptions and simplifications makes me skeptical of the quantitative conclusions, the qualitative ones seem reasonable and reinforce the long-held principles of the field--reducing antibiotic pressure and interrupting transmission--and highlight the importance of understanding the biological factors that shape the duration of carriage and the likelihood of colonization.

      Strengths:

      This study incorporates many of the factors that might influence the carriage prevalence of ESBL Enterobacterales. This builds on the work led by this group, both in primary data collection and in theory. Overall, it's such a tough problem that I commend the authors for trying to tackle it. The authors take a thoughtful, rigorous approach, acknowledging simplifications and assumptions where they need to, so as to evaluate the various factors shaping ESBL prevalence.

      Weaknesses:

      Part of the reason it's such a tough problem is that we have limited data to structure and parameterize a complex model.

      (1) The data are not sufficiently described.

      The primary data source for this modeling exercise comes from a study of 6-24-month-old children who underwent rectal swabs and evaluation of the carriage prevalence of Enterobacterales, and then whether these Enterobacterales were ESBL; moreover, the study included data on travel and on antibiotic use. Could the authors please direct us to these primary data? Could the authors also justify the parameters in their models from these data--for example, could they please provide the distribution of antibiotic use and the associated timing? Could they also explain why they decided to treat all Enterobacterales as if they were E. coli (line 307)? Is there evidence that all Enterobacterales occupy the same niche and compete with each other?

      (2) The model should be more fully described and the limitations explored/explained.

      - The authors should point to the code and the ODEs.
      - I understand the focus on the pediatric population; the authors argue that this is reasonable because ESBL colonization is similar across age groups. But presumably, antibiotic use differs across age groups, and there is colonization pressure from within households.
      - The authors only consider resistance to extended-spectrum beta-lactams and use of beta-lactam antibiotics, but ESBL Enterobacterales are often resistant to other antibiotics as well. How much does the use of other antibiotics also select for Enterobacterales that happen to carry ESBL resistance? "One bug/one drug" modeling, as done here, neglects the complexities of the actual patterns of resistance and range of antibiotic use.
      - Do the data support the T3 or S3 compartments, which, if I understand correctly, means no exposure to antibiotics can happen during three months after either treatment or travel? What do the data say about the patterns of antibiotic use? I'd imagine that the likelihood of antibiotic use is not homogeneous, but instead, there are some who use repeated rounds of antibiotics.
      - Why do the authors exclude individuals who used antibiotics in the prior 7 days? What justifies that cutoff? The authors speculate that the impact of excluding these individuals is likely to be minimal; why exclude them, then? Did the authors evaluate the results if they were included?
      - What is the basis of "niche differentiation", as described starting on line 221? Why should clearance of one strain be slower when the strain co-occurs in a host with a strain of another type?

    2. Reviewer #2 (Public review):

      Overview:

      This study integrates several datasets into a unified modeling framework that incorporates several mechanisms thought to impact the spread of ESBL-resistant bacterial strains. The model accounts for tradeoffs between persistor and colonizer strains, travel rates, antibiotic treatment and strain clearance, direct competitive interactions, and, most importantly, a series of distinct costs associated with the carriage of ESBL resistance. The resulting 75-compartment model is internally consistent and structurally neutral. However, the parameter estimation is flawed in many ways, compromising the interpretations of the model.

      On the usage of the Swedish infant data set to estimate colonization and persistence:

      First, while other papers have taken similar approaches, the Swedish infant data set is fundamentally inadequate to estimate colonization and persistence rates. This is because very few colonies were typed per sampling event (2 to 6 colonies per event). The original authors themselves argued that strains of indistinguishable morphology would not be able to be differentiated by this method. They also provided data showing that strain identity was not directly related to colony morphology (same strain often displaying distinct morphologies).

      The consequence of this is that strains present in low abundance would be missed with a high likelihood. However, if they were to be stochastically sampled, this would count as a "colonization" event, and if they were missed in subsequent samplings, this would count as a "loss" event. In other words, the statistical methods described conflate within-host dynamics (which might lead to distinct within-host abundances) with between-host dynamics (colonization and loss).
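
      The size of this sampling effect can be illustrated with a simple calculation. As a sketch, assume that each typed colony is an independent draw and that a strain making up a fraction `f` of the culturable population is picked with probability `f` per colony (both are simplifying assumptions, not taken from the manuscript). The chance that the strain is missed entirely when `n` colonies are typed is then `(1 - f) ** n`.

```python
def miss_probability(strain_fraction, n_colonies):
    """Probability that none of n independently picked colonies is the strain.

    Assumes independent picks with per-colony detection probability equal
    to the strain's relative abundance (an idealization).
    """
    if not 0.0 <= strain_fraction <= 1.0:
        raise ValueError("strain_fraction must be between 0 and 1")
    if n_colonies < 0:
        raise ValueError("n_colonies must be non-negative")
    return (1.0 - strain_fraction) ** n_colonies


if __name__ == "__main__":
    # Range of colonies typed per sampling event quoted in the review: 2 to 6.
    for n in (2, 6):
        print(n, round(miss_probability(0.10, n), 3))
```

      Under these assumptions, a strain at 10% relative abundance would be missed about 81% of the time with two colonies and about 53% of the time with six, so apparent "colonization" and "loss" events could arise from sampling alone.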

      Beyond this conceptual issue, some technical aspects aren't particularly sound. The means of the inferred posteriors for the lambda and mu parameters are then used to calculate the beta, gamma, d, and epsilon parameters through a linear regression. The more technically correct way of doing this would be to directly infer these parameters from the data and obtain a full posterior for these parameters.

      This highlights another issue: these parameters are passed down to the next statistical model as point estimates, with no associated uncertainty. This artificially inflates the (already low) confidence of the estimates for the cost parameters.
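
      This loss of uncertainty follows from the law of total variance. Writing $\theta$ for the first-stage parameters (colonization and loss) and $\delta$ for a second-stage cost parameter (notation chosen here for illustration, not taken from the manuscript), with the outer expectation and variance taken over the posterior of $\theta$:

```latex
\operatorname{Var}(\delta \mid \text{data})
  = \mathbb{E}_{\theta}\!\left[ \operatorname{Var}(\delta \mid \theta, \text{data}) \right]
  + \operatorname{Var}_{\theta}\!\left( \mathbb{E}[\delta \mid \theta, \text{data}] \right)
```

      Plugging in a point estimate $\hat{\theta}$ reports only $\operatorname{Var}(\delta \mid \hat{\theta}, \text{data})$, which roughly corresponds to the first term. The second term, the spread in $\delta$ induced by uncertainty in $\theta$, is dropped, so the reported posterior is narrower than a fully propagated one whenever the conditional mean of $\delta$ depends on $\theta$.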

      Finally, when this procedure generated parameters that were inconsistent with their expectations (clearance is too high to explain prevalence in France), they adjusted the parameters by discarding and recalculating their beta parameters to artificially enforce neutrality between their strains and enforce the expected prevalence. This is problematic because beta and gamma were jointly estimated, and there is no particular reason why some of them should be discarded. The more natural interpretation would be that parameters inferred from Swedish infants do not translate well to French adults, which should preclude their usage in this context.

      On the estimation of costs of ESBL resistance:

      The core of the second statistical model is to use prevalence data, travel data, and treatment data in conjunction with the previously inferred colonization and loss parameters to infer the costs of carrying antibiotic resistance. Therefore, the accuracy of this section is contingent on an accurate estimation of the previous parameters. However, these colonization and loss parameters are inherited with no uncertainty (just point estimates are passed down), which, as previously mentioned, generates an artificially precise posterior distribution for the resistance parameters.

      However, the most severe issue with the statistics lies in the choice of priors for the cost parameters. All of them are uniform in a positive range that implies a positive cost. Importantly, the average over a positive range will always be positive; therefore, this method will ALWAYS estimate a positive mean for the costs. Note that the posterior distribution of some cost parameters seems to peak around zero and abruptly decays with no mass to the left of zero. This is caused by the choice of prior. Had delta been allowed to be negative (i.e., antibiotic resistance carried a benefit, having the prior be uniform between -1 and 1), the posterior distribution would likely be much more symmetrical, and the confidence interval would have included 0.

      Restating, because the prior is a continuous function between 0 and 1, it contains infinitely more mass in the region that represents there being a cost (delta>0) than in the region representing no cost (delta=0). This means that it is a mathematical impossibility for this model to infer the absence of a cost.
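
      The effect of the prior's support can be illustrated numerically. The sketch below is not the authors' model: it assumes a hypothetical Gaussian likelihood for delta centered exactly on zero (`estimate=0.0`, `std_error=0.2`, both made-up values) and compares the posterior mean under a uniform prior on [0, 1] with that under a uniform prior on [-1, 1], using a simple grid approximation.

```python
from math import exp


def posterior_mean(lower, upper, estimate, std_error, n_grid=20001):
    """Posterior mean of delta under a uniform prior on [lower, upper].

    With a uniform prior, the posterior is proportional to the likelihood
    restricted to the prior's support. The Gaussian likelihood used here
    is an assumption made for illustration only.
    """
    step = (upper - lower) / (n_grid - 1)
    grid = [lower + i * step for i in range(n_grid)]
    weights = [exp(-0.5 * ((d - estimate) / std_error) ** 2) for d in grid]
    total = sum(weights)
    return sum(d * w for d, w in zip(grid, weights)) / total


# Hypothetical data that point to no cost at all (estimate of zero).
mean_positive_prior = posterior_mean(0.0, 1.0, estimate=0.0, std_error=0.2)
mean_symmetric_prior = posterior_mean(-1.0, 1.0, estimate=0.0, std_error=0.2)
```

      With data centered on no cost, the [0, 1] prior still returns a clearly positive posterior mean in this toy setting, whereas the symmetric prior returns a mean of essentially zero. Rerunning the inference with a prior that allows negative values would therefore be an informative check.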

      Therefore, the main finding of the paper ("We found that resistance is costly") is a mathematical artifact of the prior choice and of the model structure.

    3. Reviewer #3 (Public review):

      Cotto and colleagues integrated data analysis with mathematical modeling to examine extended-spectrum beta-lactamase (ESBL)-producing E. coli in France. While ESBL prevalence has risen globally, it has stabilized at approximately 6-8% across Europe. Established risk factors for ESBL carriage include prior antibiotic exposure and travel to high-prevalence regions, most notably South-East Asia. The dataset incorporated information on ESBL-producing E. coli and travel history in young children, and the model was calibrated to ECDC surveillance data on ESBL across Europe, supplemented by literature-derived parameters on antibiotic use, E. coli biology, and transmission dynamics. The authors report that ESBL-carrying strains exhibit a 14% fitness cost in community transmission relative to susceptible bacteria, yet are cleared 23% less frequently. ESBL carriage was strongly associated with factors that prolong gut colonization. Both antibiotic treatment rates and transmission efficiency were identified as key determinants of community-level ESBL prevalence.

      Strengths:

      The study addresses a clinically and epidemiologically important topic. The integrated modeling approach is methodologically sound and well-suited to disentangling the relative contributions of transmission and antibiotic selection pressure.

      Weaknesses:

      Several concerns regarding the data used in this study warrant consideration. First, model calibration relied on ECDC surveillance data pooled across multiple European countries, several of which have substantially lower antibiotic consumption than France (ECDC ESAC-Net Annual Epidemiological Report, 2024). Given that antibiotic use is a primary driver of ESBL selection, ESBL prevalence is likely to be heterogeneous across these settings. Calibrating to a geographically diverse dataset risks introducing systematic bias into parameter estimates that may not be representative of the French context. The authors should repeat the analysis using France-specific data, or, where this is not feasible, restrict the calibration dataset to countries with comparable antibiotic consumption profiles. Second, the travel exposure data may be insufficient to adequately capture importation dynamics from South-East Asia, as the cohort consisted exclusively of young children, a demographic less likely to travel to high-prevalence regions than older age groups. This may result in an underestimation of travel-associated importation as a contributor to community ESBL prevalence, and the generalizability of these findings to the broader population should be interpreted with caution.

    1. Reviewer #1 (Public Review):

      Zeng et al.'s work links several key issues in Cryo Electron Tomography in ways that reinforce each other, inspired by the cycleGAN model, leading to very positive results across several benchmark datasets. The related topics include tomogram cleaning and simulations (two crucial areas in the field), with "spin-off" outcomes in automatic annotation and the completion of the missing wedge. The manuscript covers nearly all essential topics in Tomography, making it very comprehensive and potentially critical in the field. The generalization capabilities on the SHREC 2021 data set are very interesting, although difficult to quantify. I appreciate the approach, but I have serious concerns about some of the limitations of the results presented by the authors.

      1. Simplified data versus today's challenging tomography data. The difficulty of making general tests is acknowledged. In this work, the method shows excellent results on potentially simple data sets (SHREC 2021, which was used as a benchmark in ET several years ago but has not been used much since, and, even more so, the old RELION data set for picking).

      2. Reproducibility by the average user. I have found many cases in which a specific software produces excellent results when run by the authors. Still, the average user is lost with the parameters and cannot reproduce these promising results. I propose that the authors address this issue by involving some experimental colleagues and asking them to repeat the work. This is a general concern that applies not only to this work but to many others. I think this consideration is crucial for a field that is growing very quickly and where method development happens at an extraordinary pace... but are all of them generally useful?

    2. Reviewer #2 (Public Review):

      This study introduces DUAL (Deep Unsupervised simultAneous denoising and simuLation), an unsupervised deep learning framework that jointly addresses denoising and realistic data simulation for cryo-electron tomography (cryo-ET). By leveraging a cyclic, unpaired learning strategy, DUAL avoids reliance on paired clean ground-truth tomograms, which represents a practical advantage over many existing supervised approaches.

      Through extensive quantitative evaluations on benchmark datasets, together with qualitative and downstream analyses on diverse experimental tomograms, the authors show that DUAL performs robustly across both denoising and simulation tasks. For denoising, DUAL outperforms several widely used methods on the SHREC 2021 benchmark and achieves the highest particle-picking accuracy on the RELION benchmark, indicating strong downstream utility.

      For tomogram simulation, the study presents an unsupervised framework that jointly denoises experimental tomograms and generates synthetic volumes that closely resemble experimental data. These simulated tomograms outperform existing approaches in downstream tasks such as particle picking and enable additional applications, including missing-wedge compensation and cross-domain adaptation, without requiring labeled training data.

      Overall, this work represents a substantial contribution to the cryo-ET field by providing a versatile unsupervised tool that reduces dependence on labor-intensive manual annotation, enables realistic data augmentation for training downstream models, and facilitates artifact mitigation. As such, DUAL has the potential to accelerate methodological development and progress toward comprehensive in situ structural biology.

    3. Reviewer #3 (Public Review):

      The paper is titled "DUAL: Deep Unsupervised Simultaneous Simulation and Denoising for Cryo-Electron Tomography." The authors provided two closely related code branches: one for denoising and one for missing-wedge correction. However, I did not find the simulation component. This is important, as the authors state that "the simulation branch provides learning-based cryo-ET simulation to generate synthetic tomograms indistinguishable from experimental ones."

      In addition, no pre-trained models were provided. Given that the authors indicate that all training data are publicly available, sharing trained models together with references to the corresponding datasets would significantly facilitate evaluation of the reported performance.

      The provided instructions are quite minimal and do not currently support reproduction of the reported findings. Compared with other cryo-ET software packages, the documentation is insufficient for installation and practical use. The software also does not consistently support standard cryo-ET file formats, particularly during inference for denoising and missing-wedge correction. In particular, volume preparation (in the first notebook of either pipeline) expects MRC input, whereas inference requires NPZ input. This inconsistency makes me believe that the shared code has not been tested and is likely a new repackaging that does not correspond to the version used to generate the results in the paper.

      I also found the denoising workflow difficult to interpret. The notebooks require a "clean" target volume as input, but it is not explained how such a volume should be obtained. It is unclear whether any clean volume may be used or whether this should be simulated based on what the user expects the input to contain. The logic behind this introduced prior is not clear. Additionally, it is not clear whether the default configuration parameters provided in the notebooks correspond to those used in the paper or are intended as illustrative examples. I had requested the exact configurations used to produce the reported results to avoid ambiguity.

      After many hours of trial, debugging, and experimentation, I was able to train a model for missing-wedge correction using the default parameters, although the process was slow and memory-intensive. However, despite sustained effort over two days, I was not able to perform inference using the trained model. Full-volume inference fails due to shape mismatches, as the network is trained on fixed-size 3D patches but does not support whole-volume inputs. Patch-based inference also fails at the stitching stage due to incompatible output dimensions, even when using standard volume sizes (e.g., 1024 × 1024 × 400 voxels) that work correctly during patch preparation.
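
      One common way to avoid this kind of stitching mismatch is to choose patch start positions so that the patches tile each axis exactly, shifting the last patch back so that it ends at the volume boundary. The sketch below illustrates that idea only; it is not DUAL's code, and the patch size (128), the stride (64), and the axis ordering of the example volume are assumed values.

```python
from itertools import product


def patch_starts(axis_len, patch, stride):
    """Start indices along one axis so that patches cover it completely."""
    if patch <= 0 or stride <= 0:
        raise ValueError("patch and stride must be positive")
    if patch > axis_len:
        raise ValueError("patch is larger than the axis; pad the volume first")
    starts = list(range(0, axis_len - patch + 1, stride))
    # If the regular grid stops short of the end, add one final patch
    # that is shifted back to finish exactly at the boundary.
    if starts[-1] + patch < axis_len:
        starts.append(axis_len - patch)
    return starts


def patch_origins(shape, patch, stride):
    """All patch origins for a volume of the given shape (cubic patches)."""
    per_axis = [patch_starts(n, patch, stride) for n in shape]
    return list(product(*per_axis))


def covers_axis(axis_len, patch, starts):
    """Check that every index along the axis falls inside some patch."""
    covered = [False] * axis_len
    for start in starts:
        for i in range(start, start + patch):
            covered[i] = True
    return all(covered)


# Volume size quoted in the review; patch and stride are hypothetical.
example_origins = patch_origins((400, 1024, 1024), patch=128, stride=64)
```

      Because every patch then lies fully inside the volume, the stitched output has the same shape as the input. Overlapping regions still need to be averaged or blended when the patches are written back.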

      While less central, I also found the training time to be close to prohibitive. The notebook sets the number of epochs to two for a toy example and notes that more epochs are required for real experiments. In practice, training for a single tomogram required approximately 16 hours of computation on two high-end GPUs to reach only six epochs, and likely more would be required (100s?). Due to the inference issues described above, I was not able to evaluate the trained model.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      The authors provide a resource to the systems neuroscience community by offering their Python-based CLoPy platform for closed-loop feedback training. In addition to using neural feedback, as is common in these experiments, they include a capability to use real-time movement extracted from DeepLabCut as the control signal. The methods and repository are detailed for those who wish to use this resource. Furthermore, they demonstrate the efficacy of their system through a series of mesoscale calcium imaging experiments. These experiments use a large number of cortical regions for the control signal in the neural feedback setup, while the movement feedback experiments are analyzed more extensively. The revised preprint has improved substantially upon the previous submission.

      Strengths:

      The primary strength of the paper is the availability of their CLoPy platform. Currently, most closed-loop operant conditioning experiments are custom built by each lab, and carry a relatively large startup cost to get running. This platform lowers the barrier to entry for closed-loop operant conditioning experiments, in addition to making the experiments more accessible to those with less technical expertise.

      Another strength of the paper is the use of many different cortical regions as control signals for the neurofeedback experiments. Rodent operant conditioning experiments typically record from the motor cortex, and maybe one other region. Here, the authors demonstrate that mice can volitionally control many different cortical regions not limited to those previously studied, recording across many regions in the same experiment. This demonstrates the relative flexibility of modulating neural dynamics, including in non-motor regions.

      Finally, adapting the closed-loop platform to use real-time movement as a control signal is a nice addition. Incorporating movement kinematics into operant conditioning experiments has been a challenge due to the increased technical difficulties of extracting real-time kinematic data from video data at a latency where it can be used as a control signal for operant conditioning. In this paper, they demonstrate that the mice can learn the task using their forelimb position, at a rate that is quicker than the neurofeedback experiments.

    2. Reviewer #2 (Public review):

      Summary:

      In this work, Gupta & Murphy present several parallel efforts. On one side, they present the hardware and software they use to build a head-fixed mouse experimental setup that they use to track in "real-time" the calcium activity in one or two spots at the surface of the cortex. On the other side, they present another setup that they use to take advantage of the "real-time" version of DeepLabCut with their mice. The hardware and software that they used/developed are described at length, both in the article and in a companion GitHub repository. Next, they present experimental work that they have done with these two setups, training mice to max out a virtual cursor to obtain a reward, by taking advantage of auditory tone feedback that is provided to the mice as they modulate either (1) their local cortical calcium activity, or (2) their limb position.

      Strengths:

      This work illustrates that body movement tracking and calcium imaging can be carried out using readily available components, including imaging the brain using an incredibly cheap consumer electronics RGB camera (RGB Raspberry Pi Camera). It is a useful source of information for researchers who may be interested in building a similar setup, given the highly detailed overview of the system. Finally, it further confirms previous findings regarding the operant conditioning of the calcium dynamics at the surface of the cortex (Clancy et al. 2020) and suggests an alternative based on DeepLabCut to the motor tasks that aim to image the brain at the mesoscale during forelimb movements (Quarta et al. 2022).

    3. Reviewer #3 (Public review):

      The study demonstrates the effectiveness of a cost-effective closed-loop feedback system for modulating brain activity and behavior in head-fixed mice. The authors tested a real-time closed-loop feedback system in head-fixed mice with two types of graded feedback: 1) closed-loop neurofeedback (CLNF), where feedback is derived from neuronal activity (calcium imaging), and 2) closed-loop movement feedback (CLMF), where feedback is based on observed body movement. It is a Python-based open-source system, which the authors call CLoPy. The authors also claim to provide all software, hardware schematics, and protocols needed to adapt it to various experimental scenarios. The system is capable and can be adapted to a wide range of use cases.

      The authors have shown that their system can control both positive (water drop) and negative (buzzer-vibrator) reinforcement. The study also shows that, using the closed-loop system, mice showed better performance, learned arbitrary tasks, and adapted to changes in the rules. By integrating real-time feedback based on cortical GCaMP imaging and behavior tracking, the authors have provided strong evidence that such closed-loop systems can be instrumental in exploring the dynamic interplay between brain activity and behavior.

    1. Reviewer #1 (Public review):

      Summary:

      This article presents a study consisting of two experiments, which aim to dissociate and quantify the distinct motivational functions of phasic and tonic pain within a naturalistic and immersive VR setting. Specifically, the Authors test two hypotheses: (i) that phasic pain acts as a punishment signal that drives avoidance learning; (ii) that tonic pain reduces motivational vigor, promoting energy conservation and recuperation. In both experiments, participants performed a free-operant foraging task, where they collected virtual pineapples to earn points.

      In Experiment 1, phasic pain was delivered as a brief electric shock to the grasping hand when picking up green pineapples. As phasic pain intensity increased, participants were less likely to choose painful fruits. A reinforcement learning model that incorporated reward, pain cost and effort cost was able to successfully capture behavior.

      Experiment 2 combined the effects of phasic and tonic pain. Tonic pain was induced by a pressure cuff on the non-dominant arm, simulating sustained discomfort. Interestingly, tonic pain did not affect the perceived intensity or avoidance of phasic pain. However, it significantly reduced movement velocity and pineapple collection rate, which was interpreted as a reduction of motivational vigor. A temporal decision model incorporating vigor cost successfully captured these effects.

      Concomitant EEG recordings showed that tonic pain was associated with reduced alpha and beta power in parietal and temporal areas. Phasic pain ratings and decision values distinctively correlated with skin conductance responses.

      Overall, these findings indicate that phasic and tonic pain have distinct and dissociable motivational effects.

      Strengths:

      This is an ambitious study that provides a quantitative dissociation of the roles of phasic and tonic pain in adaptive behavior, by integrating ecological neuroscience, motivational theory, and computational modeling. The use of immersive VR combined with a free-operant foraging task offers a more ecologically valid context to study pain-related behavior compared to traditional paradigms. Furthermore, the study employs a multimodal approach by combining behavioral data, computational frameworks, physiological signals and EEG. In particular, one of the main strengths of the study is the use of sophisticated computational modeling to capture phasic and tonic pain effects. The experiment codes are available on GitHub, increasing reproducibility.

      Weaknesses:

      As recognized by the Authors, there is no control condition involving an innocuous salient stimulus to rule out non-specific effects of distraction.

    2. Reviewer #2 (Public review):

      Summary:

      The study investigated the distinct roles of phasic and tonic pain in adaptive behavior. Phasic pain was proposed to function as a teaching signal, promoting avoidance of further injury, while tonic pain was hypothesized to support recuperative behavior by reducing motivational vigor. This hypothesis was tested using an immersive virtual reality (VR) EEG foraging task, in which participants harvested fruit in a forest environment. Some fruits triggered brief phasic pain to the grasping hand, which in turn reduced the likelihood of choosing those fruits. Concurrently, tonic pressure pain applied to the contralateral upper arm was associated with reduced action velocities. The authors employed a free-operant computational framework to quantify how phasic and tonic pain modulate motivational vigor and decision value. Importantly, model parameters were found to correlate with EEG responses, providing neurophysiological support for the hypothesized functional distinctions.

      Comments on revised version.

      All my comments have been well addressed.

    3. Reviewer #3 (Public review):

      Summary:

      This study investigates how phasic and tonic pain modulate behaviour in a free-operant foraging paradigm. The authors apply a computational modeling approach to the behavioural data to quantify the decision value of phasic pain, as well as the degree to which tonic pain reduces motivational vigour. EEG assessments showed, e.g., reduced signal power at alpha and beta frequencies in tonic pain conditions compared to no-tonic-pain conditions, but no association between these neural measures and motivational vigour. The authors conclude that tonic and phasic pain serve different motivational functions, with phasic pain acting as a punishment signal promoting avoidance and tonic pain reducing motivational vigour.

      Strengths:

      The experimental paradigm is highly innovative. Assessing human behaviour in a naturalistic yet highly controlled setting represents a promising approach to pain research. Notably, assessing pain magnitude implicitly, via its motivational value, offers insights about the overall pain experience that are not usually accessible via common pain ratings.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript is an excellent follow-up to your 2022 study, in which Sox17 expression was localized to the rete testis and shown to be required for proper formation of the Sertoli cell valve (transition region). By using Nr5a1-Cre to drive conditional deletion of Sox17 specifically in rete testis cells, you demonstrate that testis weights remain normal at 2 weeks of age but become significantly reduced by 8 weeks in Sox17-cKO males. At the later time point, the seminiferous epithelium is severely disrupted, with apparent arrest of spermiogenesis: the epididymal lumen is essentially devoid of sperm, and most tubules lack elongated spermatids.

      Strengths:

      The study clearly shows the role of Sox17 in Sertoli cells as being important to SV function. The SV (transition region) between the rete testis and seminiferous tubules remains an understudied domain of testicular biology. The present work, together with the authors' prior study, highlights intriguing mechanisms operating in this specialized niche.

      Weaknesses:

      At the same time, the available data do not yet fully explain either the developmental assembly of the Sertoli valve or the precise consequences of its functional disruption. These studies are nonetheless valuable precisely because they raise more questions than they answer; the conceptual implications are thought-provoking.

    2. Reviewer #2 (Public review):

      This manuscript investigates the role of SOX17 in the formation and function of the Sertoli valve (SV) at the interface between seminiferous tubules and the rete testis (RT). Building on previous work showing that rete testis-specific deletion of Sox17 disrupts SV formation, leading to defective spermiogenesis and male infertility, the authors explore how SOX17 overexpression in Sertoli cells regulates the SV of rodent testes.

      Using transgenic mouse models with ectopic Sox17 expression in Sertoli cells, the study demonstrates that SOX17 is not only required but can also modulate SV formation. Ectopic expression in Sertoli cells induces expansion of the SV structure and partially rescues SV defects and spermatogenesis in RT-specific Sox17 conditional knockout animals. The data support a model in which SOX17 acts through paracrine signaling to regulate SV formation, although the precise mechanisms remain to be clarified.

      Overall, this is a well-executed study with novel and significant findings. The ability to experimentally manipulate SV size is particularly compelling and provides a valuable framework to study fluid dynamics and epithelial interactions in the testis. This work will be of broad interest to the reproductive biology and developmental biology communities.

    3. Reviewer #3 (Public review):

      Summary:

      These studies are based on previously published work showing that deleting Sox17 expression in the testis essentially abolished formation of the Sertoli valve in the rete testis. The authors extended this work by constructing a vector that resulted in increased Sox17 expression by Sertoli cells and enhanced formation of the Sertoli valve in both wild-type and Sox17 knockout mice. The work provides strong evidence supporting the requirement for Sox17 expression to allow formation of the Sertoli valve.

      Strengths:

      The general approach was to express Sox17 from a Tg mouse that expressed Sox17 in Sertoli cells. This Tg mouse was bred into both the WT and the Sox17 KO mouse. The Sertoli valve was enhanced in both the WT/Tg mouse and the KO/Tg mouse, showing that ectopic Sox17 could compensate in the Sox17 KO and act in a concentration-dependent manner in the WT mouse. The results are strong and support the authors' conclusions. The results were as expected from the original paper describing the KO of Sox17. These results strengthen those conclusions and provide ideas for additional conclusions. These studies were technically challenging, and the authors provided a very solid manuscript.

      Weaknesses:

      The authors refer several times to high or low expression, but it all appears to be based on immunohistochemistry, and there is no real quantification using PCR, for example. The process used for cell quantification lacks a rationale for why certain numbers were assigned.

    1. Reviewer #1 (Public review):

      This study by Li and colleagues examines how defensive responses to visual threats during foraging are modulated by both reward level and social hierarchy. Using a semi-naturalistic paradigm, the authors test how the availability of water or sucrose, with sucrose being more rewarding than water, shapes escape behavior in mice exposed to looming stimuli of different intensities, which are used to probe perceived threat level and defensive responses. In parallel, the study compares dominant and subordinate animals to assess how social rank biases the trade-off between reward seeking and threat avoidance. By combining behavioral analyses with computational modeling, the work addresses how reward level and social context jointly influence escape decisions in an ethological setting.

      Across the different experimental conditions, perceived threat level is the main determinant of behavior. The authors show that looming stimuli associated with higher threat (contrast) consistently elicit faster and more robust escape responses than lower threat stimuli. This effect is particularly evident during early exposures, when animals are highly vigilant and have not yet habituated to the looming stimulus (i.e., have not yet learned that it is not dangerous). The authors then describe that, as animals gain experience and habituate, behavior becomes more flexible, and reward level begins to exert a graded modulation of the escape response. Importantly, the authors show that under high threat conditions, increasing reward value leads to more frequent and faster escape rather than greater reward pursuit, specifically in dominant mice. This finding is particularly relevant, as it suggests that highly valued rewards can heighten vigilance and thereby enhance responsiveness to threat. It highlights that reward does not simply compete with defensive behavior but can also reshape it depending on the perceived level of danger, in contrast to low threat conditions, where threat can be more easily outweighed by reward. However, it is worth noting that the authors use an extremely low contrast for the low threat condition (20%), which may to some extent be insufficient to reliably trigger escape responses. An important conceptual contribution of the study is the introduction of vigilance as a useful framework to interpret these effects. Vigilance is treated as a behavioral state reflecting heightened attention to potential danger. In line with what is known from natural foraging, mice initially maintain high vigilance when confronted with an innate threat. This perspective helps clarify a finding that might otherwise appear counterintuitive. One might expect higher rewards to motivate animals to tolerate risk, explore more, and habituate faster in any scenario. Instead, the data suggest that highly rewarding outcomes can elevate vigilance, making animals more responsive to threat and leading to faster or more frequent escape under high threat conditions. In this sense, reward does not simply compete with threat but can also amplify sensitivity to it, depending on the internal state of the animal.

      The social results are particularly interesting in this context as well. Dominant mice consistently prioritize avoidance over reward, showing stronger escape responses and slower habituation than subordinates. This behavior is well captured by the vigilance framework proposed by the authors: dominant animals appear to maintain higher vigilance, which biases decisions toward threat avoidance. The authors further suggest that stable social relationships sustain high vigilance and slow habituation, framing this as an evolutionarily conserved strategy that may enhance survival. This interpretation provides a valuable perspective on how social structure shapes defensive behavior beyond immediate physical interactions. At the same time, there are important limitations to this interpretation. All experiments were conducted in male mice, and it is possible that the relationship between social hierarchy, vigilance, and defensive behavior would differ substantially in females. In addition, the idea that stable social relationships sustain elevated vigilance should be interpreted carefully, as it does not fully align with broader views of social stability as protective against anxiety and stress and generally beneficial for mental health and resilience. These points do not undermine the findings but suggest that the social effects described here should be interpreted with caution and within the specific context of the task and sex studied.

      Another important limitation is that the neural mechanisms underlying these effects remain highly speculative. Although the manuscript includes an extensive discussion of candidate circuits, particularly involving the superior colliculus and downstream structures, these interpretations go far beyond the data presented in the study and are not directly supported by experimental evidence within the paper itself. The discussion gives substantial weight to potential circuit mechanisms based primarily on previous literature rather than on findings from the current study. Given the complexity and distributed nature of the circuits likely involved in integrating vigilance, reward, social context, and defensive behavior, the present work is better viewed as providing a strong behavioral framework rather than direct mechanistic insight into the underlying neural substrates. In this context, some references discussing how animals learn to suppress defensive responses to repeated looming threats, and the neural mechanisms supporting this process, could further strengthen the discussion (Salay et al. 2021; Fratzl et al. 2021; Conway et al. 2025; Mederos et al. 2025).

      Methodologically, the behavioral paradigm is well suited for studying escape decisions in socially housed animals, and the machine learning based classification of defensive responses is a strength. The computational model provides a useful formalization of how threat level, reward level, and vigilance interact and may be valuable for other laboratories studying escape, approach avoidance, or conflict situations, particularly as a way to classify behavioral outcomes after pose estimation. More generally, the work will be of interest to the neuroethology community for its detailed characterization of escape behavior under naturalistic conditions. At the same time, some statements in the discussion slightly overstate the novelty of the methodological approach. For example, the claim that the study differs from earlier work by using machine learning rather than manual annotation overlooks that several previous studies have already implemented automated or semi-automated strategies to classify looming evoked defensive behaviors beyond manual scoring alone.

      Given the ethological nature of the study and the high inter individual variability reported by the authors, clarity and precision in the methods are especially important for reproducibility. While the revised manuscript addresses many earlier concerns, some aspects remain slightly difficult to follow. For example, the main text states that animals were not water deprived to minimize differences in internal state across conditions, whereas parts of the methods describe experiments in which animals were water deprived. This distinction is not always clearly explained across the different experimental sections, despite internal state being central to the interpretation of the behavioral findings. A clearer separation and description of these conditions would further strengthen confidence in the work. In addition, it was somewhat surprising that the low contrast (20%) looming condition was still sufficient to trigger robust escape responses, and additional clarification or discussion regarding stimulus saliency at this contrast level could help readers better contextualize these findings.

      Overall, this study provides a rich analysis of how reward level and social hierarchy modulate defensive behavior through changes in vigilance. It offers a useful conceptual advance for thinking about escape behavior in semi-naturalistic settings and lays a solid foundation for future work aimed at linking these behavioral states to underlying neural circuits.

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript uses large-scale existing datasets that span almost the full range of human life (5-100 years) to identify two distinct architectural cortical gradients within visual cortex. These gradients are distinct in that, in one, cytoarchitecture and myeloarchitecture converge, whereas in the other, they diverge. The authors tested whether these gradients map onto known functional properties of visual cortex and whether they account for visual behaviours that are impacted throughout the lifespan. The manuscript also reports the identification of a hitherto unknown cluster of visual field maps in the anterior temporal lobe.

      Strengths:

      A major strength of the current manuscript is the use of large-scale measurements of human brain structure throughout the lifespan, courtesy of the Human Connectome Project Initiative. The scope of this cross-sectional analysis would be rare, if not impossible, to achieve through an individual project.

      The approach employed holds promise for assessing the link between large-scale anatomical gradients in the brain and functional/behavioural properties. The current manuscript focuses on visual cortex, but the approach could easily be implemented across the brain in general.

      Weaknesses:

      While the evidence for a new topographic visual field map cluster in the anterior temporal lobe is less convincing than for clusters in posterior cortex, new analyses strengthen the claim for a visuospatially tuned cluster that shares signatures of topographically organised clusters (e.g., contralateral representations) but might lack clear evidence, at present, for such topography. The investigation of how age-related and SNR confounds contribute to the gradients and their lifespan development could be expanded.

      Comments on revised version.

      The authors have taken the comments on board and performed a number of analyses that strengthen the argument for these clusters being visuospatial in nature. I appreciate the additional analyses and effort. It may be helpful to discuss the evidence for contralateral biases in the absence of clear topographic maps in cortex in the context of what others have termed visuospatial coding (Groen et al., 2021, TiCS), where just such a mechanism is described.

    1. Reviewer #1 (Public review):

      This manuscript addresses how PGCs migrate towards SGPs in the Drosophila embryo. It has been shown that Hh produced by SGPs acts as an attractive cue, and that Wunen(s) act as repulsive cues. In this work, the authors propose that Wun and Wun2 refine PGC guidance by attenuating Hedgehog signalling coming from other tissues.

      Overall, the study is potentially interesting and could make an important contribution to the field. The data shown support the idea that Wun/Wun2 negatively regulate Hh signalling and produce PGC migration phenotypes associated with Hh. However, in my opinion, there are two major questions that should be addressed.

      (1) What is the mechanism by which Wun/Wun2 attenuate Hh signalling? The authors propose that Wun/Wun2 block Hh ligand transmission, but their data could also be explained by other possibilities, such as altered Hh production, uptake, retention or degradation, among others. The authors should either demonstrate mechanistically the effect of Wun/Wun2 on Hh transmission or attenuate their claim.

      (2) How do Wun/Wun2 attenuate Hh signalling in PGCs? The authors propose that Wun/Wun2 function both in somatic tissues and in PGCs, but these two sites of action may have very different mechanistic implications. In the soma, Wun/Wun2 could affect Hh transmission, but a PGC-autonomous role cannot be explained simply by reduced Hh ligand transmission from producing cells; it would more likely involve ligand uptake, receptor trafficking, intracellular degradation or altered PGC responsiveness. This distinction should be central to the interpretation of the data.

    2. Reviewer #2 (Public review):

      Summary:

      In this submission, Roy et al. examine the process of Drosophila PGC migration. Directed cell migration requires the concerted activities of chemoattractants and repellents to guide cells to the correct locale. In their submission, the authors describe a role for regulated Hedgehog (Hh) signaling to inform PGC migration. In prior work, the authors reported that Hmgcr potentiates Hh signaling, providing a permissive axis. A gap in the field, however, was the identification of the repulsive cues that guide PGCs out of the midgut and toward the future gonad. In the current work, the authors report that two wunen genes (wunen and wunen 2) inhibit Hh signaling, thereby repressing Hh activity. The model is that Hmgcr and wunen(s) balance the transmission of Hh signals to enable effective PGC migration.

      Strengths:

      A strength of this work is the comprehensive genetic analysis performed by the authors. The authors examine zygotic versus maternal contributions, autonomous versus non-autonomous requirements, and use a variety of RNAi and mutant allele combinations to examine genetic requirements and interactions. Another strength is that the data presented are generally clear and well quantified. Insets are provided to enhance visualization, and relevant data are quantified through replicated experiments.

      Weaknesses:

      Weaknesses of the work include a lack of biochemical data to validate some of the proposed interactions. Although the authors do report lipidomics data, little is done with these findings to validate or place the results in the context of a mechanistic model. Despite these issues, the conclusions stated are generally well supported by the results.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors present a method to detect natural selection on transcription factor binding sites (TFBSs), which is an upgraded version of a previously published method (Liu and Robinson-Rechavi, 2020). This upgraded version of the test implements more explicit models of evolution and is shown to outperform its predecessor in terms of both power and false positive rate. I think this method can be a valuable resource for the community and can be helpful not only to studies of TFBSs but also broader evolutionary questions related to genotype-phenotype maps or fitness landscapes.

      Major comments:

      (1) Questions related to Figure 1

      Figure 1, along with the first section of the Results, shows that the SVM score and its sensitivity to mutations are generally correlated with the strength of ChIP-seq signals. It is not very clear to me, however, what the motivation is behind this part of the paper. It seems that the model used to predict binding strength is a pre-existing one, and it is unclear what is new in this section. Was the prediction model retrained using different data? Was its validity confirmed using new data? I would appreciate some more elaboration on how these results differ from what was presented in the previous study of Liu and Robinson-Rechavi (2020).

      The existence of weak or negative correlations between SVM and coverage, which reportedly reflects low-quality peaks, seems applicable not only to this paper, but also to previous ones, so I would like to have it confirmed whether the question and the authors' answers apply to previous studies as well.

      It is reported that SVM scores capture TF binding signals better than conservation-based statistics do. My intuitive interpretation is that both ChIP-seq peaks and SVM scores are supposed to reflect binding strength, whereas conservation is supposed to reflect selection (i.e., different definitions of "function" as mentioned above). It is not explicitly explained in the Results, however, what the difference indicates, leaving only an impression that the SVM score is "better" than the conservation statistics.

      In summary, I think further elaboration on the above problems would make the flow of thought of this paper easier to follow.

      (2) Lack of directional selection for low binding affinity

      In the analysis of Drosophila melanogaster ChIP-seq peaks, there were more cases of directional selection for higher binding affinity than directional selection for lower binding affinity. The authors suggested that this observation is "likely biological" because the same pattern was not seen in simulations (lines 412-413). I wonder if this could have resulted from a difference in the distribution of ancestral binding affinity across TFBSs between real and simulated data. If binding affinity was generally low in the common ancestor of D. melanogaster and D. simulans, selection for low binding affinity would manifest mainly as purifying selection against mutations that increase affinity instead of directional selection. Ancestral sequences for simulations, if I understood correctly, are observed peaks in D. melanogaster (lines 715-719), which would include a high fraction of high-affinity sequences that could be rarer among the real ancestral sequences.

      The description of this particular result does not refer to a figure or table, nor is it revisited in the Discussion. Figure 5 treats peaks under directional selection as a single category. Taken together, it is hard to tell how this observation should be interpreted. If the authors consider this result biologically meaningful, I would suggest adding more details (e.g., the number of peaks under selection in each direction).

      (3) Selection in non-focal lineages

      Regarding the detected signals of directional selection for stronger binding in certain tissues (Figure 6), I wonder if it is the focal species or those very tissues that are "special": did the human lineage undergo more adaptive regulatory evolution than the chimpanzee lineage, or do nervous and male reproductive systems have a high "propensity" for adaptive regulatory evolution? Assuming that the binding preference of the same TF has not changed significantly since the human-chimpanzee split (which, I believe, is a built-in assumption in both RegEvo and the permutation test), it should be possible to perform the same test using chimpanzee sequences that are homologous to the human ChIP-seq peak regions. In the case of coding sequences, for example, Bakewell et al. (2007) found that the chimpanzee lineage had more genes under positive selection than the human lineage; I wonder if TFBSs show the same or a different pattern.

      (4) Comments on terminology

      a) Meaning of "function"

      The word "function" has had different meanings in the biology literature, with some authors using "functional" to refer to anything with a phenotypic effect and some using it only for targets of selection. A (putative) TFBS would be considered "functional" as long as it has TF binding affinity if we follow the effect-based definition, but only if its binding affinity is under selection if we follow the selection-based definition. In this manuscript, the term "function" appears to have been used to refer to TF binding but not selection, most notably in the first Results section. There are also places where it is less clear what "function" means exactly (e.g., "deeply conserved elements that are likely to be functionally important" of line 61). Since this paper is about evolution, it is likely that many readers prefer the selection-based definition or assume that the selection-based definition would be used. Thus, using "function" to refer to just TF binding could be confusing. To this end, I would suggest that the authors drop the word "function" or give an explicit definition early in this paper.

      b) Directional selection in different directions

      In this paper, selection for increased TF binding affinity is referred to as "positive directional selection", and selection in the opposite direction is called "negative directional selection" (as exemplified in Figure 2). I understand that using such shorthand names would make the text less clumsy, but these two terms could potentially be confusing, as "positive selection" and "negative (purifying) selection" are also terms referring to specific types of selection and have some connection to directional and stabilizing selection. Therefore, I suggest that the authors use something like "selection for increased/decreased binding affinity" instead, or note explicitly in the text that "positive/negative directional selection" would be used as shorthand.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript by Laverre et al. provides an interesting new test of selection on TF binding. Rather than focusing on sequence changes, this test is specifically for changes in predicted TF binding affinity. The authors report directional selection on 5.1% of tested regions in Drosophila, as well as a signal of selection on CTCF binding in the human CNS and male reproductive system.

      Strengths:

      Overall, I think this represents an important direction for the field of molecular evolution: now that TF binding can be predicted fairly well from sequence, it can be a very useful focus for tests of selection.

      Weaknesses:

      As mentioned several times in the manuscript, Jiang and Zhang (2024) pointed out some issues with a previous permutation-based version of this test. Foremost among these was the issue of ascertainment bias: when testing only experimentally supported TF binding sites from a focal species, and then asking what type of selection (or lack of selection) led to those sites, one is guaranteed to find more substitutions that increase affinity, simply because the sites were selected in the first place as those with maximum (empirically measured) affinity.

      To address this issue, the authors simulated Drosophila CTCF peaks evolving neutrally and then tested different ascertainment cutoffs in Figure 4D. It was not entirely clear to me what is shown in Figure 4D: the text says the bins were stratified by derived delta-SVM, whereas the figure says SVM, and the legend says derived SVM (both without the delta). I was unable to find any clarification of this in the Methods section. In any case, I am not really convinced by this, for two main reasons. First, when analyzing empirical ChIP-seq data, I would guess that only a tiny fraction of the genome is bound (far less than 1%, especially in mammalian genomes). However, the most extreme bin in Figure 4D is taking the top 10% of (delta?) SVM values. What would Figure 4D look like at bins of the highest 0.1%, 0.001%, etc? My guess is there would be a strong uptick in the FPR. The second reason is actually more important and fundamental than the first. As long as this method is working as described, I cannot see any way that it would ‘not’ be impacted by ascertainment bias. As an extreme case, imagine that all TF binding sites tested had the maximum possible SVM scores; then none of them would have any chance of showing directional selection against binding, while even those that evolved neutrally would appear to have directional selection in favor of binding. Of course, real empirical data are not as extreme as this, but the same concept applies in less extreme scenarios.
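
      The thought experiment above can be made concrete with a minimal toy sketch. It is not the authors' RegEvo model and does not use real SVM or delta-SVM scores: the Gaussian ancestral score, the symmetric neutral change, and the function name `positive_shift_fraction` are all assumptions chosen purely for illustration. Each simulated site evolves neutrally, sites are then ascertained by their derived score, and the fraction of retained sites whose change increased the score is reported.

```python
import random


def positive_shift_fraction(n_sites=200_000, top_fraction=1.0, seed=0):
    """Fraction of ascertained sites whose neutral change raised the score.

    Toy model (assumption, not the method under review): the ancestral
    score and the neutral change are independent standard normal draws.
    """
    rng = random.Random(seed)
    sites = []
    for _ in range(n_sites):
        ancestral = rng.gauss(0.0, 1.0)  # hypothetical ancestral affinity score
        delta = rng.gauss(0.0, 1.0)      # neutral change, symmetric around zero
        sites.append((ancestral + delta, delta))

    # Ascertainment step: keep only the sites with the highest derived score,
    # mimicking the choice of experimentally supported binding sites.
    sites.sort(key=lambda site: site[0], reverse=True)
    kept = sites[: max(1, int(n_sites * top_fraction))]

    return sum(1 for _, delta in kept if delta > 0) / len(kept)


if __name__ == "__main__":
    for fraction in (1.0, 0.1, 0.01, 0.001):
        share = positive_shift_fraction(top_fraction=fraction)
        print(f"top {fraction:.3%} of derived scores: {share:.3f} increased")
```

      In this toy setting, no selection is applied at any point, so any excess of score-increasing changes among the retained sites arises from the ascertainment step alone. Whether the same effect is of comparable size under the authors' substitution model is a separate question that the sketch does not address.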

      This bias could explain patterns observed in the real data. For example: "We observe much more positive than negative directional selection, a pattern likely biological rather than methodological, since it is absent from simulations." This is exactly the pattern predicted under ascertainment bias (in the extreme-scenario thought experiment above). I suspect it is absent from simulations simply because the authors did not properly account for this bias in their simulations.

      If the main result reported by the authors had been a lack of any directional selection in favor of binding, and instead only neutrality or directional selection against binding, then this ascertainment bias would not be an issue; it would only have made their results conservative. Unfortunately, this is not the case, and the directional selection in favor of binding, which is the main result emphasized from the empirical analysis, could be inflated by this bias.

      Minor point:

      The following statement: "In contrast, phastCons and phyloP scores lack such enrichment and have a lower dynamic range, suggesting that the conservation scores are less sensitive to fine-scale variation of TF occupancy and thus regulatory region function" is only true if one assumes that TF binding is the only function of this region. One could even turn this around and say the fact that the sites affecting TF binding are not the most conserved is actually evidence that TF binding is not a good indicator of these regions' entire function. I suggest the authors soften this claim that conservation scores are less sensitive to regulatory region function.

    1. Reviewer #1 (Public review):

      Summary

      The authors aim to understand, in the context of leaf shape, how the constraints imposed by development inform evolution. Leaf shape is a good place to study the influence of development on evolution because it is a trait that exhibits a lot of diversity, and the developmental mechanisms that give rise to leaf shapes are apparently rather conserved across angiosperms.

      As part of the motivation for their work, the authors cite a previous study (Geeta et al), which found that in angiosperm phylogenies, transitions from complex to simple leaf shapes occur through evolution more often than transitions in the opposite direction. Is this due to developmental constraints or adaptation?

      The authors undertake two parallel lines of work:

      (1) Extending the study of Geeta et al with more data, consisting of both phylogenies and a shape classification dataset. The conclusion from this line of inquiry is that transitions from lobed to unlobed leaves are more common than transitions away from unlobed leaves.

      (2) The authors conduct evolution simulations in a computational model of leaf development. Here, they look at neutral mutations and whether simply neutral evolution is sufficient to drive the observed trend.

      The conclusion of the second part of the work is that the driver of the evolution toward simple leaf shape is entropy: there are more ways to make unlobed leaves than to make lobed leaves (at least in terms of gene regulation parameters that will produce the two leaf types). The argument is that random gene regulatory networks are more likely to produce unlobed leaves than lobed leaves; therefore, neutral evolution drives this trend.

      Data Analysis

      Roughly 9000 images of leaves were classified into 4 categories: unlobed, lobed, dissected, and compound. These labels were applied to the tips of 5 phylogenetic trees of angiosperms (3 resolved at the genus level and 2 at the species level). By fitting a continuous-time Markov chain to the labelled trees, the authors claim that there is a significantly higher rate of transition to the unlobed leaf shape compared to transitions to more complex shapes.

      Simulation

      First, the authors validate a computational model (Runions et al) for leaf growth on an experimental dataset. By changing parameters in the model, they can recapitulate the morphological changes in the shapes of Arabidopsis leaves engendered by expression of two particular genes.

      Then the authors run an evolutionary model (without selection, just random mutations) on top of the computational leaf development model. As the random walk in parameter space reaches a stationary distribution, they look at both the proportions of the leaf categories in the steady state as well as the transition rates between different categories. The result is that transitions to unlobed leaves are more common than from unlobed leaves.

      General Comments

      The authors use angiosperm phylogenies from other works as the basis for the data analysis part of their work. Given the centrality of these phylogenies for their conclusions, more information is needed about how these phylogenies were constructed and what they mean. What is the timescale that they span? What method is used to infer them? What regions of DNA were sequenced in order to build the phylogenies? Also, maybe some more discussion of angiosperm evolution (e.g., when was the most recent common ancestor of all angiosperms?) would help put the study in context.

      We also need a more in-depth discussion of the computational model. What are all the more than 100 parameters doing, and what informs the seemingly strange mutational model that changes parameters by 3 orders of magnitude?

      I am confused about how the rates of transitions were inferred from the phylogeny. Here, one has a phylogeny inferred by some method (which needs to be described in more detail), and just the leaves are labelled. It is stated in the methods that BayesTraits was used to infer the transition rates. I realize this method is probably documented elsewhere, but a bit of a summary of how it works and how to interpret its results would (1) make the paper more self-contained and (2) if the algorithm is credible, make the results firmer.

      I am a bit skeptical of the authors' interpretation of the biological trend (of complex to simple leaf shapes) as being driven by neutral evolution. Why does one expect that the mutations generated by the random walk models described in the work are in fact neutral mutations?

      - If the entropy of simple leaf shapes is higher than that of complex leaf shapes, why did we have complex leaves at all? I suspect the authors might argue that this is due to selection. In that case, what allows these complex shapes to become simpler? Wouldn't they be losing the selective advantage that drove them to be more complex in the first place? Or maybe the idea is that the rates are inferred assuming some steady state that generates the phylogeny? I did not understand this point.

      Are the rates of transitions between leaf types inferred for the phylogeny assuming that the phylogeny is generated by the steady state of some Markov process? (I think the answer is no: in that case, how does one explain the initial condition?) If I take the mutation model (random walk) seriously, then shouldn't I expect that this steady state obeys detailed balance? In that case I should have $p_i r_{i\to j} = p_j r_{j\to i}$ for each of the occupancies $\{ p_i\}$ and transition rates $r_{i\to j}$ for the shape categories. How close are the rates inferred from the phylogenies to obeying detailed balance? Presumably, the Markov chain fitted to the simulation data obeys detailed balance because the mutation model itself does?
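
      The detailed-balance question can be made concrete with a short check. The sketch below is illustrative only: it takes a matrix of transition rates $r_{i\to j}$ between categories (the example rates in the test are made up, not taken from the manuscript), finds the stationary occupancies $p_i$ by iterating a uniformized version of the chain, and reports the largest imbalance $|p_i r_{i\to j} - p_j r_{j\to i}|$. The function names are hypothetical.

```python
def stationary_distribution(rates, n_iter=5_000):
    """Stationary occupancies of a continuous-time Markov chain.

    rates[i][j] is the transition rate from state i to state j (diagonal ignored).
    """
    n = len(rates)
    exit_rates = [sum(rates[i][j] for j in range(n) if j != i) for i in range(n)]
    scale = 2.0 * max(exit_rates)        # uniformization constant (keeps the chain lazy)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        new_p = [0.0] * n
        for i in range(n):
            new_p[i] += p[i] * (1.0 - exit_rates[i] / scale)   # probability of staying
            for j in range(n):
                if j != i:
                    new_p[j] += p[i] * rates[i][j] / scale     # probability of moving i -> j
        p = new_p
    return p


def detailed_balance_violation(rates):
    """Largest |p_i r_ij - p_j r_ji| over all pairs; zero if detailed balance holds."""
    p = stationary_distribution(rates)
    n = len(rates)
    return max(
        abs(p[i] * rates[i][j] - p[j] * rates[j][i])
        for i in range(n) for j in range(n) if i != j
    )
```

      Applied to the rates inferred from the phylogenies and to those fitted to the simulations, a value near zero would indicate that the fitted chain is consistent with detailed balance, whereas a clearly non-zero value would indicate net probability currents between shape categories.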

      I find it hard to take the discussion of development seriously without some consideration of mechanics. Presumably, the mechanics are hidden in the computational leaf development model, but this model is not discussed in enough detail for the reader to know. It seems to me that the interesting question is: what are the mechanical constraints on development that drive the apparent trend in evolution towards simpler leaf shapes? Maybe it is something about the type of differential growth needed to make complex leaf shapes less robust to mutation. But in this case, I would assume that selection plays a role in the complexity of shape. In any case, a better understanding (or explanation) of the computational model is needed to make this interpretation.

      Some discussion of timescales is needed, especially when invoking neutral evolutionary arguments. If a neutral mutation occurs, its time to fix in a population of size $N$ is $\sim N$ generations. What are the relevant angiosperm population sizes and the number of mutations that separate branches on the tree? Are timescales remotely consistent with e.g., the age of angiosperms on Earth?

    2. Reviewer #2 (Public review):

      Strengths:

      The paper's underlying question is interesting, extending the authors' prior work on RNA along similar conceptual lines. The paper combines both image analysis of leaves and a computational analysis of a simple model of leaf development.

      Weaknesses:

      The entire paper is based on the Runions model. More intuition about the Runions model would be useful for a broader readership that cares about the evolutionary aspect of this, but may not know the developmental model in question. Obviously, this is prior well-established work, but 2-3 sentences highlighting the key structural aspects of such a model would be great. Currently, that intuition is found implicitly in a sentence on page 2 ("complex leaf shapes need more specificity in their GRNs than their simpler unlobed leaf shape"), but the reader is left wondering: is the Runions model a detailed mechanistic one with multiple interacting genes/proteins? If so, how many? Or is it just 2-3 genes but with complexity entirely in how long they are each expressed, when they are turned off, etc.?

      The Runions model has nearly 100 free parameters. Random walks in 100-dimensional spaces have generic properties like a tendency to move toward regions of larger volume that have nothing to do with leaf biology. How do you disentangle the geometry of high-dimensional random walks from genuinely biological developmental bias? Would a toy model with 100 parameters and arbitrary phenotype categories also show "bias toward simplicity" if "simple" phenotypes occupy more volume?
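
      The toy control suggested here could look something like the sketch below. Everything in it is an assumption made for illustration and has no connection to leaf biology or to the Runions model: parameters live in a unit box, a single randomly chosen parameter is perturbed per step, and a phenotype is arbitrarily labelled "complex" only when a few designated parameters all fall inside a narrow window. The default dimensionality is set below 100 purely to keep the run short.

```python
import random


def toy_neutral_walk(n_params=20, n_constrained=3, window=0.5,
                     step=0.5, n_steps=200_000, seed=1):
    """Unbiased random walk in a unit box with an arbitrary two-category phenotype map."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n_params)]

    def is_complex(v):
        # "Complex" requires every constrained parameter to sit inside the window,
        # so it occupies a fraction window**n_constrained of the box volume.
        return all(v[i] < window for i in range(n_constrained))

    state = is_complex(x)
    steps_in = {True: 0, False: 0}
    exits_from = {True: 0, False: 0}
    for _ in range(n_steps):
        i = rng.randrange(n_params)                 # mutate one parameter at a time
        proposal = x[i] + rng.uniform(-step, step)
        if 0.0 <= proposal <= 1.0:                  # reject moves that leave the box
            x[i] = proposal
        new_state = is_complex(x)
        steps_in[state] += 1
        if new_state != state:
            exits_from[state] += 1
        state = new_state
    return {
        "fraction_complex": steps_in[True] / n_steps,
        "rate_complex_to_simple": exits_from[True] / max(1, steps_in[True]),
        "rate_simple_to_complex": exits_from[False] / max(1, steps_in[False]),
    }
```

      With these defaults the "complex" category fills one eighth of the volume, and the per-step rate of leaving it is several times higher than the rate of entering it, although the walk itself has no directional bias. If the authors' results differ qualitatively from such a volume-only baseline, that difference would be the part attributable to development.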

      The discussion of Figure 4 (PCA of parameter space) uses "area" loosely when what's actually being measured is bin count in a 2D projection of a high-dimensional space. I would think that, in general, PCA projections can be misleading about volume in the full parameter space, but I can't tell if that's an issue in this case. Some comments/thoughts here would be useful.

      The classifier validation section is in the Methods section, but it seems critical to the whole story. The < 80% agreement with manual classification could propagate to the rest of the estimates in the paper. Again, some comments/thoughts here would be useful.

      The authors should explain Mut2 and Mut5 in the main paper with a sentence or two, at least schematically, because how you mutate is obviously very relevant to interpreting a paper about biases in variation.

      The two mutational schemes use additive perturbations to individual parameters. Real mutations presumably affect regulatory networks in more structured ways (e.g., changing binding affinities that affect multiple parameters simultaneously). How sensitive are the results to the assumption of independent single-parameter mutations?

      The connectedness argument is made using a 2D PCA projection. Is there a way to check this statement in the full parameter space or perhaps in higher-dimensional projections to test the robustness of this result? Connected components can merge/split under different projections.

    1. Reviewer #1 (Public review):

      This study presents a new model of phenotypic variation incorporating direct and indirect genetic effects, as well as a new implementation (RAINBOWR) for quantification, genomic prediction and GWAS. It includes a simulation study to test the model and implementation, and three applications to plant species.

      The abstract describes the main novelty and significance of the study as follows: "Recent studies have utilized high-resolution polymorphism data to enable genomic prediction (GP) and genome-wide association study (GWAS) of IGEs, but unified methods remain limited". I disagree with this statement (e.g., using ASREML: https://doi.org/10.1186/s12711-018-0409-7, using LIMIX: https://doi.org/10.1186/s13059-021-02415-x; etc.).

      The parameterisation of genetic effects in the model is non-standard and complex. Hence, the simulation study is key, and the results need to be presented in a very rigorous manner. I have several points to make on this:

      (1) L172 says the estimated parameters are "close to" the real parameters. The results of the simulation study need to be quantitative (see https://www.biorxiv.org/content/10.64898/2026.03.10.710784v1.supplementary-material for example).

      (2) Figure 2h: the estimates seem to be biased, no?

      (3) Figure 2 in general: why isn't there a difference between cov and noncov? Do we not expect the inclusion or non-inclusion of a covariance term to affect the other genetic parameters and the results presented in Figure 2?

      (4) Does "total BLUPs were highly correlated between models with and without 𝜌" really validate the model?

      (5) As far as the GWAS is concerned, the results of the simulation study should include a figure showing whether the p-values are inflated (as observed in the grape application), and not just a ROC curve.
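
      One common way to summarise p-value inflation alongside a QQ plot is the genomic inflation factor. The sketch below is a generic illustration, not the RAINBOWR implementation: it converts two-sided p-values to 1-df chi-square statistics and divides their median by the median expected under the null (about 0.455). The function name is hypothetical, and p-values are assumed to lie in (0, 1].

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation_factor(p_values):
    """Median observed 1-df chi-square statistic divided by its null expectation."""
    normal = NormalDist()
    # A two-sided p-value p corresponds to |z| = -inv_cdf(p / 2); chi-square = z**2.
    chi2 = [normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

      Values close to 1 indicate well-calibrated p-values, while values well above 1 indicate inflation of the kind seen in the grape application. Reporting this quantity for the simulated GWAS, under both null and causal scenarios, would complement the ROC curves.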

      The model only includes IID residuals, whereas the importance of including non-genetic social effects (IEE) has been demonstrated in many settings, and other IGE plant studies have used sophisticated spatially structured residuals (e.g. 10.1111/nph.12035). Can the authors justify why they considered only IID residuals? In the three applications presented, wouldn't it be appropriate to include spatially structured residuals and potentially other relevant covariates?

      It remains unclear why the authors chose such an unconventional parameterisation of the DGE-IGE models for the questions asked in this study. It seemed appropriate to study frequency-dependent selection (previous paper), but for this study, focused on IGE quantification and GWAS, the classical models (e.g. early models by P. Bijma but also more recent models that allow for distance-dependent IGE) seem appropriate; they are much simpler and easier to interpret, and have been validated in many settings. The Discussion paragraph L274-284 only strengthens my doubts.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, Sato and Hamazaki have expanded upon previous work, describing quantitative genetic models for direct and indirect genetic effects and applied this to both simulated and real plant datasets of three different tree species. The methods are clearly described and accompanied by a number of R packages freely available to the wider community.

      Strengths:

      The main strength lies in the joint modelling of DGE, IGE and their covariance while also simultaneously modelling single-SNP fixed effects (including SNP interactions across neighbours) and a polygenic effect that goes beyond a simple kinship correction as found in many traditional GWAS models, to a compound kinship structure that accounts for DGE, IGE and their interaction.

      Weaknesses:

      There were some aspects that deserved more attention from the authors. For example, the authors found that a very large amount of phenotypic variation in citric acid content in grapes was explained by neighbour identity, along with over 1000 significant SNPs, yet there was little to no discussion of this result and how it could have arisen (apart from some mention of volatiles and ethylene - but without being explicit on the mechanism here). The simulation study also only considered the scenario of equal direct and indirect genetic variances, while previous studies, as well as the 3 real datasets presented in this study, show that DGE variance is almost always larger than IGE variance. A simulation study cannot be exhaustive, of course, but it seems more likely that in reality and for most traits, IGE will be more difficult to detect than DGE.

    3. Reviewer #3 (Public review):

      Summary:

      The authors aimed at studying the genetics of interactions between individuals, notably the genetic architecture of indirect genetic effects. For that, they mobilized a technique known as "genome-wide association" study. GWASs are typically formalized as linear mixed models (LMMs) with fixed effects to identify the oligogenic component of the genetic architecture (usually SNPs tested one by one, as done here), and with random effects to quantify the overall contribution of the polygenic component of the genetic architecture (using a kinship matrix). They used an LMM with a few corrections and improvements from one of their already-published models, assessed it on data they had already simulated in a previous work, and applied it to three datasets generated and originally analyzed by others, focusing only on direct genetic effects. The results on simulated data confirmed that it was necessary to adapt their previous model. The results on real data confirmed the presence of negative correlation between direct and indirect genetic effects (for two out of three species), as was already known from other studies. They found a few SNPs with significant, indirect effects, which led them to identify candidate genes, but they did not validate them.

      Strengths:

      The main strength of the manuscript lies in the question tackled by the authors, i.e., related to indirect genetic effects, with the ambition to go beyond the estimation of overall effects towards the distinction between polygenic and oligogenic components of genetic architecture. They also found, in an apple dataset, a significant IGE SNP that also happens to be in a DGE-associated region.

      Weaknesses:

      (1) Overall, the authors do not engage sufficiently with the existing literature, and do not provide strong evidence that their approach is more powerful or more interpretable than others. Hence, this work seems rather incremental.

      (2) The authors used an LMM that corresponds to a previous LMM they already published in 2021, with a few changes that appeared more like corrections than improvements. Their model raised several questions.

      (3) First of all, their previous model included the polygenic component of direct genetic effects (modeled as random with a kinship matrix), but not the polygenic component of indirect genetic effects. As a consequence, the initial model did not allow both direct and indirect genetic effects to be correlated, although this correlation is the hallmark of the topic: a negative correlation can lead to selection on direct effects only to deliver a negative genetic gain (Griffing, 1967). This was corrected in their new model here, so that it is similar in this respect to the other models. They highlighted that, on simulated data, their new model could "infer a trade-off between DGEs and IGEs", but that was the very goal of introducing the correlation parameter, so it was reassuring at least to know that they could estimate it on simulated data. On real data, they found evidence for it being negative, which was already the case in Cappa and Cantet (2008) for a tree species, in Haug et al (2023) for annual crops, in Montazeaud et al (2023) for A. thaliana, etc. They tested for significance but did not provide any confidence interval. They showed the proportion of variance explained by the covariance, but did not discuss the sign or magnitude of this correlation.

      (4) Although the authors included a correlation parameter between DGE and IGE in their updated model, they did not specify if the residual errors were correlated, too. In fact, they did not even specify a distribution for them. It is already known that allowing for correlated errors may not change the estimates (Haug et al, 2021), but in some settings it can be important (Bergsma et al, 2008).

      (5) In appendix S4, they say that the "ordinal" model (I am not sure of what they meant by this word) "defines polygenic DGE and IGE by random effects without fixed effects for each SNP". However, this is not correct; see Baud et al (2021), for instance. In any LMM, it is straightforward to include a single fixed effect for a given SNP, and to do it one SNP at a time. Moreover, they claimed that "compared to the ordinal model (Equation S4), the proposed model (Equation 1) is more extensible to incorporate SNP-wise fixed effects while distinguishing variance-covariance matrices", without providing more evidence than this statement.

      (6) The authors seemed keen to convince us that the fact that their model is analogous to the Ising model of ferromagnetism was an advantage in itself. But why would it be? Beyond the mere analogy, it should be a matter of modelling choice, and thus be clearly motivated. For instance, they chose to assess the strength of the association between the trait in the focal individual (y_{k_i}) and the average (dis)similarity between the focal individual and all its neighbors (in neighborhood k), calling the latter "indirect genetic effect". Moreover, it is not clear if what they called "IGE" is \beta_{q,2}, u_2, both, or also \beta_{q,12}, etc. Furthermore, they should have used another term as this is not the same as the "indirect genetic effects" of the other models. In these models, what is called the indirect genetic effects can be modeled as depending on group size (see Hadfield and Wilson, 2007; Bijma, 2010). In which sense would the approach of the authors be better? How does it relate to the other models? Do they have more power? Is their term more interpretable?

      (7) Another way in which the authors' model may be different from the other models is in the way it models interactions between direct genetic effects and aggregate (dis)similarity between focal and neighbors. At the level of the polygenic components, other models simply have a (DGExIGE) term capturing the deviations from the additivity of DGE + IGE (e.g., Wright, 1985, in the multispecific context). Here, the authors indeed mentioned "interactions between polygenic DGEs and IGEs" and introduced the K_12 matrix, but it is not clear how different (or similar) it is from the more classical (DGExIGE) term. At the level of the oligogenic component, the authors introduced \beta_{q,12}, but it is not clear, to me at least, how it relates to K_12 and K_21.

      (8) The authors checked their model on simulated data for various levels of correlation between u_1 (DGE) and u_2.

      (9) It is not clear why they have higher absolute errors with negative covariance than with a positive one.

      (10) As a causative IGE SNP, the authors considered one with a beta_{q,2} significantly different from 0. However, they also have two other coefficients, beta_{q,1} and beta_{q,12}, for each SNP q. How is the FDR in RAINBOW controlled in such a case? This is not detailed.

      (11) In their simulations, the causative IGE SNPs were also causative DGE SNPs. However, this may increase power. From the manuscript title, one could assume that the authors' goal was to distinguish between the SNPs that are both DGE and IGE, versus the ones that are IGEs only.

      (12) From what I understood, the authors first estimated the (co)variance components once and for all on the model without any SNP, and they then used the values to fit the GWAS model one SNP at a time. This assumes that the inclusion of SNP effects modeled as fixed would not change anything regarding the (co)variance components, but this is not warranted.

      (13) The authors applied their model to three datasets of perennial plants.

      (14) They only used their model and did not provide evidence that their model gave a significant improvement compared to other models, such as the one of Baud et al (2021).

      (15) In Figures 3, 4 and 5, having an indication of which cases have a significant correlation between u1 and u2 would have helped.

      (16) Concerning the Aspen dataset, it is not clear why the authors claimed that "the negative effects of neighboring genotypes were amplified as trees matured" as the PVE_cov in Figure 3 in 2015 are not systematically more negative than those of Figure 3 in 2014.

      (17) When discussing their results, the authors should engage more with the literature estimating DGE-IGE correlations (see some of the references above).

      (18) Concerning the apple dataset, they mentioned that "metabolite accumulation in ripening fruits may be facilitated by volatile chemicals, such as ethylene", but they did not find any evidence for significant IGE SNPs localized close to a gene involved in ethylene production. Claiming that these are testable hypotheses should have been done earlier, in the introduction, rather than a posteriori in the discussion.

    1. Reviewer #1 (Public review):

      Marconcini et al. report results of an ambitious study on the genetic mechanisms that contribute to resistance of Drosophila flies to the toxin octanoic acid (OA). This study was motivated by two observations: first, Drosophila sechellia, a close relative of D. melanogaster, has evolved specialized feeding on fruits of Morinda citrifolia, which contain high concentrations of OA; and second, artificial selection on Drosophila simulans, a sister species of D. melanogaster, can generate higher resistance to OA. Previous genetic mapping studies between D. simulans and D. sechellia implicated certain genomic regions in resistance to OA and, in particular, several Osiris gene paralogs, though the molecular mechanisms of resistance remain unclear. In this study, Marconcini et al. performed two major experiments. First, they performed evolve-and-resequence on Drosophila simulans populations exposed to OA for 50 generations and identified regions with excessive shifts in allele frequencies as candidates for containing OA resistance genes in D. simulans. Second, they performed a CRISPR knock-out screen in a D. melanogaster cell line to identify genes that contribute to OA resistance and susceptibility.

      Evolve-and-resequence yielded many candidate genomic regions with extreme allele frequency shifts, which may be regions containing OA resistance genes, or linked genes, or regions that happen to show a strong shift in all replicate populations by chance. As the authors note, detecting significant shifts in allele frequencies is a challenging problem, and the authors use two measures of allele frequency shifts (the Cochran-Mantel-Haenszel method and Bait-ER) and perform simulations under neutrality to estimate a reasonable significance threshold. I am not entirely convinced by this method of estimating significance levels, because the simulations involve assumptions that may not be met by the real populations. I would think that a permutation test would provide an assumption-free method of estimating significance levels. I have tried to think whether there is something about the design of these experiments that would preclude the use of permutation tests (which are used widely for genome-wide studies, such as QTL), but I can't think of one. Perhaps the authors are aware of a reason permutation tests would be invalid here, and if so, they should state this reason.

      There is overlap between regions detected by the two methods, but the methods disagree for many regions. The authors state that a "majority of prominent peaks were found by both methods," but I am unclear on what "prominent" means here. It would be more helpful to be more quantitative about the extent of overlap.

      The authors hypothesized that the response would be at similar genomic loci in all populations (line 222). It seems at least possible that epistatic interactions would lead to different combinations of alleles evolving in each population. I wonder if it would be possible to test whether there is heterogeneity in the responses across the replicate populations.

      The evolve-and-resequence method yielded many possible regions contributing to OA resistance in D. simulans, but perhaps too many regions to test directly or even to build sensible hypotheses about the genes involved. Thus, the authors performed a second experiment to try to narrow down the list of possible candidate genes. They performed a CRISPR knockout screen in a D. melanogaster cell line for genes that contribute to resistance or susceptibility to OA. The authors identify several limitations of this experiment, but they nonetheless identified several genes where knockouts contribute to OA susceptibility or resistance. Intersecting top hits with regions that experienced selection identified two "resistance" genes: kraken and Alkbh7. The selection hit at kraken is quite compelling, whereas the evidence at Alkbh7 is less strong because only two SNPs were marginally significant. Further functional assays, including gene knockouts in D. melanogaster and D. sechellia, provide some support for the claim that both of these genes can contribute to resistance to OA in flies.

      Beyond the few issues raised above, I do not have significant questions about methodology or the results. I do think, however, that the authors should be more conservative about the implications and significance of their results. For example, on line 139, the authors claim that this intersection approach provides a "powerful paradigm to investigate ecotoxicology." I am not sure I agree that the identification of two genes that may contribute to OA resistance, after a seemingly heroic selection experiment and CRISPR screen, suggests that this method is all that powerful. It seems that most of the genes that contribute to the selection response remain unidentified.

      Finally, given that one motivation of this project was to identify genes that contribute to evolved resistance to OA, I am surprised that the authors did not generate CRISPR alleles of kraken and Alkbh7 in D. simulans and then use these together with the existing alleles in D. sechellia to perform reciprocal hemizygosity tests to determine if these two genes actually contribute to evolved resistance in D. sechellia. This test is simpler to perform and may be more sensitive than the allelic replacement that the authors propose (lines 446-449).

    2. Reviewer #2 (Public review):

      Summary:

      The authors studied the resistance against octanoic acid, a compound of the noni fruit, in D. simulans, using experimental evolution and resistance/susceptibility in D. melanogaster cells. They identified novel candidate genes and performed functional tests.

      Strengths:

      The idea of using experimental evolution of a non-resistant species to develop resistance is interesting, and the idea of narrowing down a large list of candidate loci by CRISPR-based gene knockout in cell culture is innovative. The reviewer also liked the (easy) follow-up experiments to validate the results.

      Weaknesses:

      The reviewer is not convinced of the conceptual idea behind their approach: the intersection of the two approaches implicitly assumes that null alleles (or at least compromised alleles) should be selected during experimental evolution. The reviewer considers this unlikely, and the authors made no attempt to test this implicit hypothesis in their data. Along the same lines, it is not clear how to reconcile an upregulation of candidate genes in resistant flies with the knockout experiments.

      The experiments to validate the effect of candidate genes did not match the experimental evolution conditions.

      The statistical analysis suffers from some problems and an insufficient description of the analyses performed.

      Although D. simulans GWAS data are available, the authors did not make an attempt to estimate the effect of selected variants in the candidate genes in the GWAS data set.

      The reviewer would have liked to see more connection between the experimental evolution and the GWAS data. As some D. simulans genotypes have similar resistance to D. sechellia, it would have been interesting to test whether this genotype contributed to the observed resistance.

      At several places, the authors discuss the challenge of studying a polygenic trait, but at the same time, they claim to have detected and validated candidate genes. It would be helpful if the authors could discuss why they consider that their assays could really detect the contribution of single loci to the polygenic trait, in particular given that GWAS did not detect their candidate genes.

      It is not clear to the reviewer why the authors did not pay more attention to the highly significant peaks emerging from the experimental evolution study. Their functional validation would have been biologically more plausible.

      Impact:

      Given the obvious challenges of functional testing of polygenic traits and the clear limitations of the interpretation of the results, the study will be helpful for future studies aiming to characterize polygenic traits. Unfortunately, the results are just another piece of controversial results regarding resistance against octanoic acid, a trait that is rather easy to evaluate.

    1. Reviewer #1 (Public review):

      Summary:

      Regional differences in the brain's waste-clearance system may interact with neural activity to influence where amyloid-β accumulates. Using intrathecal GBCA administration to produce "Glymphatic MRI" in 96 subjects, the authors mapped cortical glymphatic influx and clearance and found distinct spatial patterns, with transcriptomic analyses linking better glymphatic function to neuronal cell types (through genes). In a subgroup with resting-state fMRI, regions with stronger resting-state activation generally showed higher contrast clearance, indicating a positive coupling between these processes. Notably, cortical regions where neural activity and glymphatic clearance were mismatched showed greater amyloid-β burden in a separate, publicly available PiB-PET dataset, suggesting that activity-clearance decoupling may contribute to regional vulnerability and neurodegeneration.

      Strengths:

      This is a rare and valuable dataset. Intrathecal contrast injection in ~100 subjects is quite a remarkable accomplishment alone, but the addition of resting-state fMRI, a correlative PiB cohort, and gene-expression pattern data is impressive.

      Weaknesses:

      This is a cross-sectional study, and we can't determine whether neural activity drives glymphatic clearance, whether glymphatic dysfunction alters neural activity, or whether both are shaped by a third factor. Language describing "flow", "influx", and "clearance" could be made more specific so the reader can more easily follow the methodological approach.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, Li et al. investigated the relationships among regional cortical tracer dynamics following intrathecal gadolinium administration, neural activity, and amyloid-β deposition in humans. Using serial MRI acquisitions after intrathecal gadodiamide administration in 96 participants, the authors characterized regional signal enhancement and clearance patterns across the human cortex. They integrated these imaging measures with transcriptomic data (Allen Human Brain Atlas), resting-state fMRI outcomes, and an external amyloid PET dataset. The authors report that regions with more efficient tracer clearance are enriched for genes related to synaptic organization and neuronal cell types, that tracer clearance patterns are in parts spatially coupled to spontaneous neural activity, and that regional mismatch between neural activity and tracer clearance is associated with increased amyloid burden according to the PET dataset.

      Strengths:

      The study addresses an important and very timely question about the interaction among neural activity, cerebrospinal fluid dynamics (waste clearance), and regional vulnerability to neurodegeneration. Integrating serial post-contrast MRI, transcriptomics, resting-state fMRI, and amyloid imaging is ambitious and conceptually very interesting. The spatial characterization of cortical tracer dynamics is potentially valuable for the field, particularly given the increasing interest in human glymphatic imaging approaches and intrathecal contrast MRI, which provides an opportunity to assess CSF tracer dynamics without confounding tracer signal from the blood. The imaging preprocessing pipeline includes normalization of regional cortical signal intensity to a reference region within each session before calculation of longitudinal percentage change, which helps reduce inter-session variability within individuals for conventional T1-weighted imaging. The transcriptomic analyses linking tracer dynamics to neuronal and synaptic gene expression patterns are also interesting. In addition, the manuscript addresses recent literature on neurovascular coupling, glymphatic function, and amyloid vulnerability.

      Weaknesses:

      Several issues limit the strength of the conclusions. One concern relates to the interpretation of repeated post-intrathecal contrast MRI measurements as direct indicators of glymphatic influx and clearance. The approach presented by the authors measures regional signal changes following intrathecal gadodiamide administration, but does not directly visualize paravascular flow or establish that the observed signal dynamics specifically reflect glymphatic transport mechanisms. Although it is widely accepted that CSF influx occurs primarily along periarterial spaces as part of the glymphatic system, and the terminology "glymphatic MRI" is increasingly used in the literature, the physiological processes contributing to delayed parenchymal enhancement, including CSF-interstitial exchange mediated by convective bulk flow and/or extracellular diffusion, as well as transient and, in the case of linear gadolinium agents, even long-term tracer retention, remain incompletely resolved. Importantly, tracer kinetics may not directly reflect interstitial fluid kinetics, as solute transport may also be influenced by compartmental and extracellular barriers, diffusion constraints, and tissue retention effects. As currently written, several sections of the manuscript appear to overstate what can be directly inferred from the imaging data. This issue may be particularly relevant given the intrathecal use of gadodiamide (Omniscan), a linear gadolinium-based contrast agent with known long-lasting tissue retention due to lower kinetic stability compared to macrocyclic agents. Sustained signal at later imaging time points may therefore not only reflect impaired glymphatic clearance dynamics but may also be influenced by tissue retention of contrast material, particularly in the context of neurological disease.
In addition, the participant cohort is heterogeneous and includes individuals with neuroinflammatory and neurodegenerative diseases, peripheral neuropathy, and motor neuron disease. Although the authors argue that the spatial tracer patterns are relatively preserved across neurodegenerative groups, this heterogeneity complicates interpretation of imaging data and raises the possibility that disease-related factors and altered tracer-tissue interactions contribute to the observed effects. Thus, the rationale for interpreting a greater tracer signal at 39h as evidence of impaired glymphatic clearance should be explained more carefully, particularly given the highly heterogeneous patient population.

      In addition, the analyses linking spontaneous neural activity and tracer clearance are based on a very small rs-fMRI subgroup (n = 15), limiting the generalizability. The interpretation of the "mismatch" analysis also requires caution. The mismatch index was computed from z-scored fALFF and tracer clearance and is subsequently associated with amyloid burden derived from the external PET dataset rather than from the studied participants themselves. Therefore, the observed spatial associations should be interpreted with greater caution rather than as evidence for a direct mechanistic relationship. The cross-sectional nature of the analyses also limits conclusions regarding the directionality and temporal sequence of the relationships between neural activity, tracer dynamics, and amyloid burden. Several statements in the Discussion currently imply stronger causal or biological conclusions than are directly supported by the data.

      Despite these limitations, the study presents an interesting dataset and proposes a framework for understanding regional vulnerability to protein accumulation in neurodegeneration. This work hopefully motivates further investigation into the important relationships among neural activity, CSF dynamics, and neurodegeneration in humans.

    3. Reviewer #3 (Public review):

      This manuscript addresses an interesting and timely question: whether regional glymphatic clearance in the human cortex is spatially coupled to neural activity and whether a mismatch between activity and clearance may help explain regional vulnerability to amyloid-β deposition. The authors use intrathecal gadolinium-based glymphatic MRI in 96 participants, derive cortical influx and clearance maps, integrate these with Allen Human Brain Atlas transcriptomic data, and then relate regional clearance to resting-state fMRI measures in a smaller subgroup. They further compare the resulting activity-clearance mismatch map with an open-source ¹¹C-PiB amyloid PET dataset. The overall concept is attractive because it attempts to connect glymphatic physiology, neuronal activity, and proteopathy at the regional level of the human brain, an important and understudied area.

      The main strength of the study is the use of direct intrathecal contrast-enhanced MRI to generate cortical maps of glymphatic tracer dynamics. This is a technically demanding approach and provides a richer spatial readout than indirect MRI proxies of glymphatic function. The authors show that the cortical tracer signal increases from 4.5 h to 15 h and then decreases by 39 h, allowing them to interpret the early signal as reflecting influx and the persistent signal at 39 h as impaired clearance. They further identify regional patterns, with faster influx in medial prefrontal/insular areas and slower clearance in dorsal prefrontal and parietal surface regions. The analysis is visually clear, and the use of cortical gradients is a useful way to reduce complex regional data into interpretable spatial axes.

      The multimodal integration is also interesting. The transcriptomic analysis suggests that regions with faster glymphatic clearance are enriched for synaptic organisation and neuronal activity-related pathways, while regions with slower clearance show enrichment for metabolic and mitochondrial pathways. The cell-type enrichment analysis further implicates excitatory and inhibitory neurons, oligodendrocyte lineage cells, microglia and, to a lesser extent, astrocytes. This provides a plausible biological bridge between regional neural activity and clearance function, and the sensitivity analysis using ReHo in addition to fALFF is a useful robustness check.

      However, the manuscript should be more careful in its causal interpretation. The study is cross-sectional and largely correlative in space. The finding that regions with higher spontaneous neural activity tend to show better glymphatic clearance is intriguing, but it does not establish that neural activity drives clearance in these participants. Conversely, it remains possible that better tissue integrity, vascular function, CSF access, cortical geometry, vascular density, or disease composition jointly influence both fMRI measures and tracer clearance. The authors do acknowledge some of these limitations, but the abstract and discussion should more consistently frame the findings as associations rather than evidence of an activity-clearance mechanism in humans.

      The most important limitation is the small size of the fMRI subgroup. Although the whole glymphatic MRI cohort includes 96 participants, the key activity-clearance analysis is based on only 15 individuals, including 11 with peripheral neuropathy and 4 with motor neuron disease. This is a very small and clinically heterogeneous sample on which to build a central conclusion about regional neural activity and glymphatic clearance. The authors show that the 39 h PC map in the fMRI subgroup resembles the whole-cohort map, which is helpful, but this does not address whether the fALFF-clearance relationship is robust at the individual level. The paper would be strengthened by reporting subject-level stability, leave-one-out analyses, and whether the association persists after excluding the four motor neuron disease cases.
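      The leave-one-out check suggested above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the per-subject parcel maps are toy numbers, and `pearson`, `group_mean` and `leave_one_out_r` are hypothetical helper names. Each subject is dropped in turn, the remaining maps are averaged per parcel, and the regional correlation is recomputed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def group_mean(maps):
    """Average a list of per-subject maps parcel by parcel."""
    return [sum(col) / len(col) for col in zip(*maps)]


def leave_one_out_r(falff_maps, pc_maps):
    """Regional correlation recomputed with each subject left out in turn."""
    rs = []
    for i in range(len(falff_maps)):
        f = group_mean(falff_maps[:i] + falff_maps[i + 1:])
        p = group_mean(pc_maps[:i] + pc_maps[i + 1:])
        rs.append(pearson(f, p))
    return rs


# Toy data (assumption): 3 subjects x 4 parcels, not values from the study.
falff = [[1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 2.5, 4.5], [0.5, 2.0, 3.5, 3.5]]
pc39 = [[4.0, 3.0, 2.0, 1.0], [4.5, 2.5, 2.0, 1.5], [3.5, 3.0, 2.5, 0.5]]
loo = leave_one_out_r(falff, pc39)
print(loo)
```

      If the sign and rough size of the correlation are stable across all left-out subjects, the group-level association is less likely to be driven by a single participant. The same loop could drop the four motor neuron disease cases as a block.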

      A second major concern is the interpretation of the amyloid analysis. The ¹¹C-PiB map is derived from an external open-source Alzheimer's disease dataset, not from the same participants who underwent glymphatic MRI and fMRI. Therefore, the association between activity-clearance mismatch and amyloid burden is a spatial correspondence across group-average maps, not an individual-level relationship. This is valuable for hypothesis generation, but should not be presented as evidence that a mismatch in the present cohort predicts amyloid deposition. The authors should clearly state that this analysis tests whether mismatch regions overlap with known amyloid-prone cortical regions, rather than directly linking mismatch to amyloidosis in individual participants.

      The definition of "mismatch" also needs clarification. The text defines the mismatch index as the negative absolute difference between z-fALFF and z-39h PC, and states that higher scores indicate greater mismatch. Because the index is negative, values closer to zero would normally indicate a smaller absolute difference rather than a greater mismatch. This should be checked carefully and corrected if necessary. More broadly, because a higher 39 h PC indicates worse clearance, the interpretation of match and mismatch categories is not intuitive. The authors should provide a clearer schematic and ensure that the mathematical definition, biological interpretation and figure labelling are fully aligned.
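      The sign issue can be checked with a two-line calculation. The sketch below assumes the index is exactly the negative absolute difference of the two z-scores, as the text describes; `mismatch_index` and the example z-scores are illustrative, not taken from the manuscript.

```python
def mismatch_index(z_falff, z_pc39):
    """Negative absolute difference of two z-scores, as described in the text."""
    return -abs(z_falff - z_pc39)


# Hypothetical parcels: one where the two z-scores nearly agree, one where they diverge.
well_matched = mismatch_index(0.5, 0.4)
poorly_matched = mismatch_index(1.5, -1.0)

# Under this definition the poorly matched parcel gets the LOWER (more negative) score.
print(well_matched > poorly_matched)
```

      With this definition, a higher score means a smaller absolute difference, so "higher scores indicate greater mismatch" only holds if the sign is flipped somewhere else in the analysis.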

      Several technical confounds require more attention. Intrathecal gadolinium MRI is influenced by CSF dynamics, posture, sleep, circadian timing, renal clearance, age, intracranial pathology, and potentially diagnosis-specific differences. The authors acquired scans at fixed time points and noted that patients slept as usual, but individual sleep duration, sleep quality, posture, and daytime activity were not objectively measured. Given that the central claim concerns glymphatic clearance, these are not minor confounders. The authors should consider adjusting for age, sex, diagnosis, vascular risk factors, and relevant clinical variables where possible, and be more explicit about how heterogeneous disease indications may influence cortical tracer kinetics.

      The statistics are generally good. However, many correlations are performed across 400 cortical parcels, which are not independent biological samples. The paper would benefit from clearer separation between participant-level inference and region-level spatial inference. For example, the fALFF-clearance and mismatch-amyloid analyses are regional map correlations, not correlations across individuals. This should be clearly stated throughout. The authors should also report effect sizes and confidence intervals more consistently, and explain how multiple comparisons were controlled across transcriptomic, cell-type, fMRI, ReHo and amyloid analyses.
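      One common way to control the family-wise error rate across several analyses is the Holm step-down adjustment. The sketch below is illustrative only: the p-values are made up, `holm_adjust` is a hypothetical helper, and nothing here reflects the procedure the authors actually used. It also does not address the separate problem of spatial dependence between parcels.

```python
def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Made-up p-values standing in for, e.g., four separate map-level tests.
raw = [0.01, 0.04, 0.03, 0.20]
adj = holm_adjust(raw)
print(adj)
```

      Reporting both raw and adjusted values, together with the number of tests in each family, would make it clear which findings survive correction.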

      The transcriptomic analysis is useful but should be presented as indirect. AHBA data come from six post-mortem brains; only the left hemisphere was used, and the donors were healthy and younger than the clinical cohort. Therefore, these data capture intrinsic regional gene-expression patterns rather than disease-state expression in the same individuals. The authors should avoid implying that the transcriptomic findings directly explain glymphatic function in their participants. The current discussion partly acknowledges this, but the framing in the abstract and results could be more cautious.

      There are also several points of presentation that should be improved. The manuscript should consistently distinguish glymphatic influx, glymphatic clearance, CSF tracer retention, and waste clearance. A 39 h residual gadolinium signal is a useful proxy for delayed clearance, but it is not the same as direct measurement of amyloid or tau clearance. The language around "waste clearance" and "amyloidosis" should therefore be precise. The authors should also clarify whether "higher clearance" corresponds to lower 39 h PC across all analyses, as this inversion is easy for readers to misinterpret.

    1. Reviewer #2 (Public review):

      Summary:

      The authors have used 1477 sequenced trios with available gene expression data in the offspring to discover eQTLs that act in a parent-of-origin specific manner. The associated SNPs are classified and tested for enrichment for GWAS hits, drug target genes, etc.

      Strengths:

      The manuscript presents an impressive analysis of a very rich data set of parent-of-origin eQTLs. To my knowledge, it is one of the largest studies of its kind and most analyses are sound and the results are of interest to many in the field and potentially beyond. The different ideas of follow-up analyses are useful and make sense.

      Weaknesses:

      While in general the analyses are well-conducted, I noticed a major issue with the POE eQTL classification, which puts into question most of the downstream analysis. In the light of this problem, all claims of individual discoveries (apart from those in Table 1) should be removed. The enrichment analyses remain valid and are useful.

    1. Reviewer #1 (Public review):

      Summary:

      The authors tackle a long-standing question in developmental theory: given a gene-regulatory network that includes extracellular signaling, which topologies are even capable of transforming an initial spatial profile into a genuinely new pattern? Building on the classical reaction-diffusion framework in one dimension, but imposing biologically motivated constraints, they prove that every one-signal sub-network must be either Hierarchical (H), self-activating (L+), or self-inhibiting (L-). They further demonstrate that only three composite classes of full networks - pure H, a coupled L+ L- "Turing" pair, and an L- module fed by an intracellular positive loop ("noise-amplifying") - can create non-trivial spatial transformations. Analytical criteria and illustrative simulations are provided, together providing a closed taxonomy, which is supposed to be relevant for real systems.

      Strengths:

      (1) Useful classification framework. Reducing a vast number of possible gene circuits to three canonical pattern-forming motifs is a valuable organizing insight for both theorists and experimentalists.

      (2) Practical interpretability. Given a reaction network diagram, one can now decide (assuming the model applies to real systems) whether spatial patterning is even possible, saving experimental effort on in silico screens that could never succeed.

      Weaknesses:

      (1) After the resubmission, I still have concerns regarding the formal definition of "non-trivial transformations" (P1/P2) and its application to noisy or multi-dimensional systems. The criteria rely on counting "new" critical points (maxima/minima). In their response, the authors argue that the diffusion operator instantly smooths discontinuous white noise, allowing critical points to be properly defined. However, this very smoothing process passively generates a landscape of new, smooth local extrema from the initial noise. Consequently, trivial diffusive regularization could inadvertently fulfil the criteria for a "non-trivial" transformation, leaving the definition conceptually problematic. Furthermore, when extending the framework to 2D/3D, the manuscript assumes that starting from a central "spike" will robustly preserve radial symmetry, yielding concentric rings or shells. This overlooks the fundamental nature of macroscopic mean-field models like reaction-diffusion equations. The realization of the final multidimensional pattern depends strictly on the stability of the solution against ubiquitous perturbations (including angular modes) rather than solely on the deterministic symmetry of the initial condition. It remains unclear how the current framework accounts for spontaneous symmetry breaking in cases where these angular modes become unstable, challenging the assumption that radial symmetry will strictly dictate the outcome. We note that the authors' use of noise as an initial condition does not resolve this fundamental issue. Reaction-diffusion equations inherently describe mean-field dynamics, meaning that microscopic fluctuations are continuously present in any real system, regardless of whether explicit stochastic terms are written into the equations. Ultimately, if a symmetric mean-field solution is structurally unstable to these inherent fluctuations, it simply cannot be realized in nature.

      (2) Theoretical limitations in the application of Linear Stability Analysis (LSA): I remain uncertain about the framework's reliance on LSA to categorize macroscopic transformations, especially those arising from large initial perturbations (spikes). In their rebuttal letter, the authors justify this by assuming the perturbation remains small over a short time interval. However, because the study aims to describe stationary, asymptotic states, applying a linear approximation that relies on transient t->0 conditions to predict long-term global stability is not fully resolved.

      (3) In the previous round of the review, I suggested that a bimolecular sink, such as an A+B -> AB reaction, could break the approach. In their response letter, the authors defend their approach by arguing that such reactions can be accommodated by their abstract constraints (R1-R5) as long as the signs of the Jacobian elements remain invariant. However, the problem I see here is not the sign of the interactions, but the severe loss of spatial homogeneity.

      When a macroscopic initial perturbation (a "spike" of morphogen) is introduced into a domain with a strong bimolecular sink, it will inevitably cause massive local depletion of the consumed substrate near the source. Consequently, the background state of the system will rapidly evolve into a profile with macroscopic spatial gradients long before any spontaneous pattern-forming instability takes over. Mathematically, this dictates that the system no longer possesses a homogeneous steady state, and the Jacobian matrix becomes explicitly space-dependent, which should break the classical LSA approach.
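      The point can be made concrete with a minimal two-species sketch. It assumes mass-action kinetics for A+B -> AB in one dimension; the symbols k, D_A, D_B and the background profiles a_0, b_0 are illustrative and are not taken from the manuscript.

```latex
\begin{aligned}
\partial_t a &= D_A\,\partial_x^2 a - k\,a\,b, \\
\partial_t b &= D_B\,\partial_x^2 b - k\,a\,b, \\[4pt]
J(x,t) &=
\begin{pmatrix}
 -k\,b_0(x,t) & -k\,a_0(x,t) \\
 -k\,b_0(x,t) & -k\,a_0(x,t)
\end{pmatrix}.
\end{aligned}
```

      For positive concentrations every entry of J stays negative, so the sign constraints are respected. The magnitudes, however, follow the background profiles a_0 and b_0. Once local depletion makes these profiles spatially non-uniform, the linearized operator has space-dependent coefficients, and the usual decomposition into independent Fourier modes around a homogeneous steady state no longer applies directly.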

      Discussion:

      The study offers a solid conceptual organization of pattern-forming networks. However, the theoretical bridge between infinitesimal linear stability and macroscopic, non-linear pattern emergence still presents some uncertainties. The way the current framework formally treats noise, multi-dimensional symmetry breaking, and large initial perturbations leaves some questions open regarding its broad analytical applicability to real biological tissues.

    2. Reviewer #3 (Public review):

      Pattern formation is responsible for generating the spatial organization of cells, tissues, and organs during embryogenesis. It operates within a multifactorial system including initial conditions, gene regulatory networks, extracellular signals, mechanical forces, stochastic noise and environmental inputs, and finally ensures the functional anatomy of an organism.

      This study focuses on one central aspect of pattern formation: how spatial heterogeneity arises from an initial condition and evolves into a more complex or distinct spatial pattern (non-trivial pattern formation, as the authors term it). The authors made efforts to explore and characterize all possible ways to achieve pattern formation by discussing how extracellular signals spread, how individual cells respond to those signals, and how those responses, in turn, modulate signal propagation.

      Finally, their comprehensive analysis summarizes that there are three classes of interactions between extracellular signal and intracellular responses, corresponding to previously known mechanisms that can generate spatial patterns: Difference in morphogen concentrations in space, noise-amplification, and Turing pattern.

    1. Reviewer #1 (Public review):

      The manuscript by Lux et al. addresses how T-cell acute lymphoblastic leukemia (T-ALL) cells migrate into the central nervous system (leptomeninges), specifically through VLA-4 and LFA-1 integrins. VLA-4 and LFA-1 are important regulators of normal T-cell migration into the CNS, so the authors tested whether they also mediate T-ALL infiltration. They generated an intracellular NOTCH1 T-ALL mouse model and then used CRISPR/Cas9 gene targeting to delete VLA-4 and LFA-1. They show that integrin-deficient T-ALL cells accumulate in the CNS compared to control T-ALL cells. The authors performed a time course experiment and found that although WT T-ALL cells accumulated in the CNS before DKO T-ALL cells, over time, DKO T-ALL cells outgrew the WT T-ALL cells. Subsequently, they performed bulk RNA-sequencing and revealed that Integrin beta 7 (Itgb7) was upregulated in the DKO T-ALL cells. To test whether Itgb7 was compensating for the loss of VLA-4 and LFA-1, the authors generated a triple KO (TKO). The TKO T-ALL cells migrated to the CNS; however, CNS accumulation between the TKO and the DKO was not significantly different. To evaluate if there is reduced exit of T-ALL DKO cells from the meninges, they inhibited T-ALL exit via the dorsal meningeal lymphatics by generating an AAV VEGF-trap encoding the binding domain of VEGFR3, and then co-injected WT: DKO cells weeks later. There was no effect on the WT:DKO T-ALL ratio or on the overall number of T-ALL cells in the CNS with meningeal lymphatics regression, suggesting that the DKO does not preferentially accumulate in the CNS, or that delayed exit results in DKO T-ALL accumulation in the CNS.

      Additionally, the authors tested whether DKO affected immune surveillance by injecting DKO:WT T-ALL cells into NRG mice. DKO T-ALL cells localized in the dura mater and were spread throughout the tissue, whereas WT T-ALL cells clustered near blood vessels. These observations lead the authors to hypothesize that differential access to nutrients or other signals may influence leukemic cell proliferation. However, EdU labeling revealed no differences, leading the authors to hypothesize that the unique stromal cell layer in the meninges supports the DKO proliferative advantage. Finally, the authors tested whether integrin blockade and chemotherapy might chemosensitize T-ALL cells in the CNS. After a single treatment with 5FU, DKO cells were depleted faster than the WT cells; however, a single treatment with integrin blockade was toxic. After combining 5FU with the integrin antibodies, the authors showed that T-ALL cells in the CNS were significantly more depleted than in treatment with either single therapy.

      These data highlight how challenging it is to identify regulators of T-ALL migration and adherence. This study highlights the importance of these experiments and the clinical need to identify the molecules that influence leukemic infiltration into the CNS.

      Overall, this study was well performed with appropriate statistical power to implicate integrins in T-ALL CNS infiltration and proliferation.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, the authors set out to understand how T cell leukemia cells enter and persist in the CNS, with a particular focus on the role of adhesion molecules known to regulate normal immune cell trafficking. Contrary to expectations, they find that loss of two key adhesion molecules does not impair CNS entry but instead leads to increased accumulation of leukemia cells, which is associated with enhanced cell proliferation in this environment. These findings challenge prevailing assumptions about how leukemia cells interact with tissue niches and suggest a potential therapeutic strategy combining adhesion blockade with chemotherapy.

      Strengths:

      The study addresses an important and longstanding question in leukemia biology using well-designed in vivo models and multiple complementary approaches. The key observation is robust and consistently supported across genetic models and experimental systems. The authors systematically test alternative explanations, including altered entry, exit, and immune evasion, which strengthens the interpretation that proliferation differences underlie the phenotype. The work has potential translational relevance, particularly in highlighting a possible strategy to enhance the efficacy of anti-proliferative therapies in the CNS.

      Weaknesses:

      While the central phenotype is clear, the mechanistic basis remains incompletely defined. Addressing the following points would strengthen the manuscript.

      Major critiques:

      (1) The central claim that integrin loss enhances CNS accumulation via increased proliferation is not mechanistically resolved; current data are correlative (EdU incorporation, distribution patterns) and do not establish that integrin-mediated signaling directly restrains cell cycle progression in the CNS niche. The authors should perform functional perturbation of candidate pathways identified (e.g., TGF-β) using pharmacologic inhibitors or genetic approaches (dominant-negative receptor or CRISPR knockdown) in vivo or in ex vivo CNS-derived T-ALL co-culture systems to test whether blocking this pathway rescues the WT proliferation phenotype; if not feasible, the mechanistic claims should be toned down and clearly presented as hypotheses.

      (2) The relationship between altered spatial distribution and proliferation is suggestive but not directly demonstrated. The imaging data indicate differences in localization, but these observations are not quantitatively linked to cell cycle status. The authors could strengthen this point by incorporating spatially resolved proliferation analyses, such as combining EdU labeling with imaging or quantifying proximity to stromal or vascular niches, or alternatively by providing additional quantitative analysis of the existing imaging data.

      (3) The conclusion that CNS accumulation is not due to altered trafficking (entry/exit) is suggestive but not definitive, as early seeding dynamics are not directly assessed. Authors should perform short-term homing or early time-point competitive trafficking assays (e.g., CNS quantification at 6-48h post-transfer) to rigorously exclude differences in entry kinetics; if such experiments are not feasible, this limitation should be explicitly acknowledged in the discussion.

      (4) The therapeutic claim that integrin blockade synergizes with chemotherapy is promising but underdeveloped, as it lacks survival outcomes and a broader translational context. The authors should include survival analyses and, if possible, test combination treatment in a more clinically relevant setting (e.g., delayed intervention or alternative standard-of-care agents), or otherwise temper translational conclusions and discuss risks such as inducing proliferation in the absence of chemotherapy.

    1. Reviewer #1 (Public review):

      The manuscript by the Deppmann group is an important contribution to understanding how growth factor signaling is controlled at a per-cell basis, in contrast to bulk biochemistry results. Their system uses cell culture and single-cell signalling proteomics methods to measure responses of cells of different developmental stages (from E14 rat) with complex but relatively clear-cut phenotypes, allowing the effects of BDNF to be compared. This work validates the method for the discovery of future insights from less well-studied ligand-receptor investigations.

      Strengths include:

      (1) The methods are cutting-edge and powerful.

      (2) Clearly written. It leads the reader through the rationale of methodological steps.

      (3) Step-by-step data interrogation rather than leaping into complex models of analysis.

      (4) "Sanity check" controls, e.g., mimicking the signaling/expression changes expected in bulk culture.

      (5) Biological testing of certain findings within the presentation of the results (e.g., progenitors that do not respond to BDNF also do not internalise TrkB).

      (6) Effort to make complex figures/data as understandable as possible.

      (7) Not overstating conclusions.

      (8) The important conclusion that receptor stoichiometry sets the potential for BDNF sensitivity, and that the intrinsic environment allows a cell to engage that potential, something possibly thought but not demonstrated previously.

      Major points:

      (1) Apply appropriate statistics: Student's t-tests are used throughout. It would be more appropriate to use ANOVA, at least one-way, to compare across timepoints for a given phospho-protein within one treatment condition (e.g., pERK following BDNF stim), or even multiple t-tests. Also, multiple-testing adjustments are likely needed (not my expertise).

      (2) Some data points are n=2; for statistical rigour, n≥3 would be appropriate.

      (3) They measured pTrkB with an antibody targeting site Y816, which couples to PLCγ/PKC/Ca2+, but not Shc (for PI3K/MEK pathways). Why? Did they get any measurements using an antibody targeting the phosphorylation sites in the activation loop of the kinase? Could this explain the relatively low abundance of active TrkB, compared to the measured TrkB-dependent signalling outcomes, especially considering the "unresponsive" cells? E.g., https://doi.org/10.1016/S0896-6273(00)00035-0.

      (4) Was TrkC (or TrkA) expressed in any TrkB population that could potentially mediate BDNF signaling?
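
      As a minimal sketch of the multiple-testing adjustment raised in major point (1), the block below applies Holm's step-down correction to a set of per-timepoint p-values. Holm's method is only one common choice and is an assumption here, not something the review or the manuscript specifies; the function name `holm_adjust` and the p-values are hypothetical and are not taken from the study.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices sorted by ascending raw p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k starting at 0) is scaled by (m - k).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for one phospho-protein across four timepoints
# (illustrative numbers only, not data from the manuscript).
raw = [0.010, 0.040, 0.030, 0.005]
adjusted = holm_adjust(raw)
for p, q in zip(raw, adjusted):
    print(f"raw={p:.3f}  holm={q:.3f}")
```

      With n=2 for some data points, any such adjustment would leave very little power, which is consistent with the request in point (2) for at least three replicates.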

    2. Reviewer #2 (Public review):

      In this study, Sewell et al. use a novel approach to understand cell-specific BDNF signaling in the developing spinal cord. Using cultured E14 spinal cord, the authors used a mass cytometry approach to identify the levels of TrkB and p75NTR receptor expression, as well as 19 signaling markers and cell identification markers, to delineate activation of BDNF signaling in different cell types within a complex population. They identified that the level of receptor expression, while necessary, is not sufficient to determine the activation of signaling cascades. It has been known for some time that TrkB, indeed all RTKs, have the capacity to activate certain canonical signaling pathways; however, not all these pathways are always activated upon ligand treatment. This study begins to identify the conditions under which specific signaling pathways are activated by ligand. Specifically, the type of cell and maturation state are critical for determining signaling. The cytometry approach allows the clustering of cell types according to expression of specific markers, and overlaying those clusters onto the expression status of TrkB and p75 receptors, as well as specific activated signaling proteins. This study provides greater insight into when specific signaling events can be activated by BDNF than was previously known.

      The comparison of levels of expression of TrkB and p75NTR is interesting to demonstrate which pathways may require one or both receptors for specific signaling responses.

      It is very interesting that progenitors do not respond to BDNF despite abundant expression of TrkB, although they responded to the rescue treatment with phosphorylation of Erk and Akt. The development of competence to respond to BDNF is an interesting question for future analysis, and the authors suggest some possibilities in their Discussion.

      The responses of glial cells in their culture preparation are also interesting. They see signaling responses to BDNF in astrocytes and "laden" microglia (presumably phagocytic). E14 spinal cord would not be expected to have a large population of glia at this stage of development, although the serum in their plating media would allow for the proliferation of the progenitors. Astrocytes are generally considered to have the truncated TrkB receptor, yet they see P-Erk, P-Akt, etc. in these cells in response to BDNF. This raises the question of which receptors are expressed in the glial populations and whether the responses in these cells are also maturation dependent, since the glia in their culture conditions are also likely to be immature.

      Some specific comments:

      (1) The authors should specify what is meant by "rescue" in the text. What is rescuing the cells from trophic deprivation when no BDNF is added? Is it the B27 and GlutaMax in the Maintenance media, and does this actually rescue the cells?

      (2) Figure 3 - K252a blocked activation in most, but not all, lineages, especially in mature neurons. Is some component of the P-Erk activation in these cells TrkB independent?

      (3) Figure 5 E, F - The correlation between receptor surface depletion and signaling is based on "surface-specific staining". Does the staining allow you to see internalized receptors to confirm that the receptors are internalized?

      (4) The drawbacks of the study, particularly capturing snapshots in time to represent signaling cascades, are fully acknowledged in the Discussion. The interplay between TrkB-T1, TrkB-FL, and p75NTR cannot be elucidated from this study, but again, that is acknowledged and will require a different approach.

    3. Reviewer #3 (Public review):

      This study addresses a fundamental and long-standing question in neurotrophin biology, how cellular context shapes the interpretation of a single trophic message, and tackles it with a technically demanding and well-executed single-cell mass cytometry approach. By simultaneously measuring 19 signaling effectors and 18 identity markers across a developmental gradient of spinal cord cell types, the authors substantially expand our understanding of BDNF signaling and provide a compelling demonstration of the limitations inherent to bulk biochemical readouts, which average across heterogeneous populations and obscure the discrete subpopulation behavior that the present data reveal.

      The finding that only 47-75% of cells respond at peak activation, that maturation state dictates both the magnitude and the qualitative "signature" of the response, and that identical receptor stoichiometries can yield divergent outcomes across cell types collectively constitute an important conceptual advance. The proposed framework of "prepared competence" is thought-provoking and likely to stimulate follow-up work.

      That said, several aspects of the data interpretation deserve more critical discussion. My specific comments are detailed below.

      (1) Interpretation of TrkB-independent ERK activation (lines 194-196).

      The authors state that the residual pERK induction observed in TrkB-negative ("None") cells and the incomplete suppression of pERK by K252a support the established notion that BDNF signaling is not mediated solely through TrkB. This interpretation is presented without sufficient mechanistic detail and, in its current form, is difficult to follow. If BDNF-induced ERK activation is not mediated by TrkB, which alternative receptors could account for it? Does this reflect signaling through p75NTR, transactivation of other receptor tyrosine kinases, or another mechanism altogether? Likewise, the partial resistance of pERK to K252a is interpreted as evidence of an additional regulatory layer, but the underlying activity is not specified. Is the authors' hypothesis that a distinct pool of ERK is engaged independently of Trk activity? If so, what kinase activity is proposed to drive it? These results are intriguing yet puzzling and merit a more critical and explicit discussion of the candidate mechanisms.

      (2) The "progenitor paradox" in light of prior work on PC12 cells (lines 207-208).

      The observation that TrkB-expressing progenitors remain insensitive to BDNF is presented as a paradox and interpreted through the lens of impaired internalization. This interpretation would benefit from explicit discussion in the context of the classical work on PC12 cells (Segal and colleagues, among others), which established that plasma membrane-restricted Trk receptors engage the Ras-MAPK pathway with rapid, short-duration kinetics that drive proliferation rather than differentiation, whereas internalized Trk receptors sustain MAPK signaling and promote differentiation. Under this framework, the apparent signaling silence of progenitors could, in fact, reflect transient plasma membrane signaling that the time points sampled in the present study (5 min onward) may not capture. The single-cell mass cytometry approach used here is, in principle, well-suited to resolving such rapid kinetics, and the authors are encouraged to address this possibility, both as an alternative interpretation of their data and as a potential extension of the study.

      (3) Astrocyte responsiveness and the TrkB isoform issue.

      The authors report that astrocytes are highly responsive to BDNF and exhibit robust ligand-induced depletion of surface TrkB, which they interpret as evidence of signaling-competent full-length TrkB (TrkB-FL) on these cells. However, it is well established that astrocytes predominantly express the truncated isoform TrkB-T1, which lacks the intracellular kinase domain and is thought to function in BDNF capture, clearance, and recycling at synapses rather than in canonical downstream signaling. The robust phosphorylation events observed in astrocytes are therefore difficult to reconcile with TrkB-T1-mediated signaling alone. Could these responses instead reflect transactivation of other receptors through neuron-astrocyte crosstalk, for instance, via ligands released by neurons in response to BDNF? Because the authors explicitly state that their antibody cannot distinguish TrkB-FL from TrkB-T1, this limitation directly impacts the interpretation of the astrocyte data and of the proposed isoform-switch hypothesis for progenitors. This caveat is briefly acknowledged but deserves more thorough discussion, ideally with explicit consideration of the alternative interpretations outlined above.

      (4) Pathways resistant to K252a inhibition.

      The authors note that K252a fails to fully abolish pERK induction in several lineages, but the specific pathways, differentiation states, and receptor stoichiometries that remain K252a-resistant are currently insufficiently described. A more systematic description would strengthen this section. In addition, it would be helpful to discuss whether the residual signal could reflect the proximity of the response to the detection threshold rather than a genuinely K252a-insensitive pool of activity. More broadly, K252a is a broad-spectrum tyrosine kinase inhibitor with well-documented off-target effects, and the present study relies on this single pharmacological tool to define Trk-dependence. The limitations of this approach, and the desirability of complementary inhibitors or genetic perturbations in future studies, should be acknowledged in the Discussion.

      (5) The 12-hour trophic deprivation paradigm as a potential confounder.

      All cells in the present study are trophically deprived for 12 hours prior to stimulation. This is a methodologically convenient choice, but sustained deprivation is not a neutral starting point: it activates stress-responsive pathways (JNK, p38, autophagy), alters receptor surface trafficking, and can sensitize cells to subsequent stimulation. Several of the reported observations - including the apparent synergy of p75NTR with TrkB on stress markers (p-c-Jun, p38) and the strong induction of trophic effectors immediately upon BDNF addition - could be amplified, or qualitatively altered, by the prior deprivation state, which does not reflect baseline in vivo physiology. The Rescue control, with complete medium, partially addresses this concern but is non-specific. The authors should explicitly acknowledge this limitation and, ideally, discuss the extent to which their conclusions about cell-type-specific signaling competence depend on the deprivation paradigm.

      (6) Direct comparison of pseudobulk data with conventional bulk biochemistry.

      The pseudobulk reconstruction of the single-cell data is presented as recapitulating canonical BDNF responses, but this comparison relies on general agreement with the published literature rather than on a direct, parallel measurement in the same cultures. Given that the central conceptual contribution of the manuscript rests precisely on departures from the bulk biochemical view of BDNF signaling, an explicit side-by-side comparison of the pseudobulk profile against a parallel bulk Western blot from sister cultures - for at least a subset of key markers such as pERK, pAkt, and pCREB - would substantially strengthen the validation of the platform. Such a comparison would reassure the reader that the discrete subpopulation behavior reported here is genuinely biological, and not in part a consequence of methodological differences between mass cytometry and conventional biochemistry (e.g., differences in fixation kinetics, epitope accessibility, or sensitivity to low-abundance phosphoproteins).

      (7) Manuscript organization and balance between main and supplementary figures.

      The manuscript presents an exceptionally rich dataset, but the current organization - seven main figures supported by thirteen supplementary figures, several of which are explicitly labeled as extensions of main-text figures - makes it difficult to follow the argument without continuous cross-referencing between documents. I would encourage the authors to consider a substantive reorganization with the following suggestions: (i) Figure S2 and Figure S3, which respectively define the threshold-based "responsiveness" criterion and assess its robustness, are foundational to the central 47-75% responsiveness claim and would be better integrated into the main text, for example as additional panels of Figure 2; (ii) the methodological and quality-control components of Figure S1 and Figure S2 would be more naturally placed within the Methods section; and (iii) the four "Extension" figures (S4, S7, S12, S13) contain considerable redundancy with the corresponding main figures and could be consolidated, with only the most diagnostic panels retained. Concurrent trimming of the denser main figures (Fig. 4, 5, and 6 each carry six or seven panels) would further improve readability.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      In this review paper, the authors describe the concept of neural correlates of consciousness (NCC) and explain how noninvasive neuroimaging methods fall short of being able to properly characterise an unconfounded NCC. They argue that intracranial research is a means to address this gap and provide a review of many intracranial neuroimaging studies that have sought to answer questions regarding the neural basis of perceptual consciousness.

      Strengths:

      The authors have provided an in-depth, timely, and scholarly contribution to the study of NCCs. First and foremost, the review surveys a vast array of literature. The authors synthesise findings such that a coherent narrative of what invasive electrophysiology studies have revealed about the neural basis of consciousness can be easily grasped by the reader. The authors also succeed in describing how single-cell recordings can interface with task-design to help mitigate the impact of confounded neural activity when searching for NCCs.

      The review is also, to the best of my knowledge, the first review to specifically target intracranial approaches to consciousness and to describe their results in a single article. This is a credit to the authors - as it becomes ever harder to apply strict tests to theories of consciousness using methods such as fMRI and M/EEG, it is important to have informative resources describing the results of human intracranial research so that theorists will have to constrain their theories further in accordance with such data. Additionally, the authors provide a compelling case for single-cell research in consciousness science, despite the dominance of theories situated at the system and circuit level of analysis. As far as the authors were aiming to provide a complete and coherent overview of intracranial approaches to the study of NCCs, I believe they have achieved their aim.

      Weaknesses:

      Overall, I feel positive about this paper. The authors have addressed my comments from my previous review and I see no significant weaknesses in the current version.

      Comment on previous version:

      No comments - congratulations to the authors!

    2. Reviewer #2 (Public review):

      Summary:

      In this work, the authors review the study of the neural correlates of consciousness (NCCs). They discuss several of the difficulties that researchers must face when studying NCCs, and argue that several of these difficulties can be alleviated by using intracranial recordings in humans.

      They describe what constitutes an NCC, and the difficulty of distinguishing an NCC proper from the prerequisites and consequences of conscious processing.

      They also describe the two main types of experimental designs used to study NCCs. These are the contrastive approach (with its report and non-report variants), and the supraliminal approach, each with their own merits and pitfalls.

      They discuss the limitations of non-invasive methods, such as fMRI, EEG and MEG, as well as the limitations of the use of invasive recordings in non-human animals.

      After setting the stage in this way, the authors provide an extensive review of the knowledge acquired by using invasive recordings in humans. This includes population-level measurements in vision and in other sensory modalities, as well as single-neuron-level studies. The authors also discuss studies of subcortical NCCs.

      The second half of this work discusses the theoretical insights gained through the use of intracranial recordings, as well as their limitations, and a perspective for future work.

      Strengths:

      This work offers an impressive review, which will serve as a useful reference document, both for newcomers to the study of NCCs and for experienced researchers. The inclusion of non-visual and subcortical NCCs is of particular merit, as these have been understudied.

      Besides serving as a review, this work includes a perspective, exploring several directions to pursue for the progress of the field.

      Weaknesses:

      No major weaknesses.

      Appraisal of whether the authors achieved their aims:

      In this work, the authors have gathered an impressive review, and have discussed several important problems in the field of study of NCCs, as well as provided a perspective on how the field could move forward.

      Discussion of the likely impact of the work on the field:

      This work has the potential of becoming a must read for anyone working in the field of consciousness research.

      Comment on previous version:

      The authors have addressed all my concerns. Once again, my compliments for a nice piece of work.

    3. Reviewer #3 (Public review):

      Summary:

      This narrative review provides a clear, well-structured, and comprehensive synthesis of intracerebral recording work on the neural correlates of consciousness. It is written in an accessible manner that will be useful to a broad community of researchers, from those new to iEEG to specialists in the field.

      Strengths:

      The manuscript successfully integrates methodological and theoretical perspectives and offers a balanced overview of current, sometimes contradictory, evidence. As such, the manuscript is important as a call for a concerted, better exploration of NCCs using iEEG in the future.

      Comments on latest version:

      The current version of the manuscript is clear and complete. Kudos to the authors for their thorough revisions.

    1. Reviewer #1 (Public review):

      Summary:

      This study uses optogenetics to activate CA3 while recording from CA1 neurons and characterizing the excitation/inhibition (E/I) balance. They observe use-dependent alterations in the E/I balance as a result of STP and they develop a model to describe these observations. This is a very ambitious paper that deals with many issues using both experimental and modeling approaches.

      Strengths:

      This paper examines important principles regarding the manner in which synaptic circuitry and use-dependent synaptic plasticity can transform inputs and perform computations.

      Weaknesses:

      There are three issues that cause concern regarding the applicability of their slice recordings to physiological conditions and that make some aspects of their results difficult to interpret. First, they state that 2 mM added external calcium mimics calcium levels in CSF, but this is not the case. This will influence the plasticity they observe. Second, they indicate that there is a 2% decrease in activated fibers per stimulus and attribute this to ChR2 desensitization. Such use-dependent decreases in fiber activation are expected to build during their repetitive activation experiments and artifactually influence their results. Third, they do not know the responses of individual CA3 cells to stimulation. They do not know if each cell fires reliably during repetitive activation and whether each cell only fires once.

    2. Reviewer #3 (Public review):

      Summary:

      This work shows experimentally and computationally that single CA1 neurons can perform mismatch detection on patterned CA3 inputs and that STP and EI balance underlie this detection.

      Strengths:

      It has been known that STP can enhance the EPSP when the corresponding presynaptic input exhibits abrupt changes in firing rate. This work provides experimental evidence and further computational support for the hypothesis that the basic computation through STP is useful for detecting abrupt changes in the spatial pattern of synaptic inputs at the Schaffer collaterals. Further, their results indicate the novel view that mismatch detection is most efficient when gamma-frequency bursting inputs exhibit mismatches between theta cycles. The authors included novel results in the revised manuscript to show that the effective frequency range of gamma oscillation is broad, including both slow and fast gamma bands.

      In the initial submission, the dependence of mismatch detection performance on model parameters and experimental settings, such as pattern overlaps and other network parameters, was not sufficiently explored. In the revised manuscript, the authors extensively studied these points and summarized the novel results in Fig. 9. Furthermore, the authors clarified that jitters in input spikes can improve detection performance in some cases. These results show the robustness of their results against variations in external and internal conditions.

      Weaknesses:

      While this study shows an intriguing example of combined experimental and computational studies, some analytic results, for instance, regarding the complex contributions of jitters to detection performance, could have clarified the underlying mechanism more deeply and further strengthened the manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      Using a computational modeling approach based on the Drift and Diffusion Model (DDM) introduced by Ratcliff and McKoon in 2008, the article by Shevlin and colleagues investigates whether there are differences between neutral and negative emotional states in:

      (1) The timings of the integration in food choices of the perceived healthiness and tastiness of food options in individuals with bulimia nervosa and healthy participants

      (2) The weighting of the perceived healthiness and tastiness of these options.

      Strengths:

      By looking at the mechanistic part of the decision process, the approach has potential to improve the understanding of pathological food choices.

      Comments on revised version:

      I went carefully through the answers of the authors to my last concerns - they answered all my points. I am grateful that they obtained consistent results with the different analyses.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript examines the frequency-dependent effects of transcutaneous tibial nerve stimulation (TTNS) on bladder function in healthy volunteers, supported by a conductance-based computational model of lower urinary tract (LUT) neural circuitry. The authors show that 1 Hz TTNS modestly hastens the urge to void, while 20 Hz TTNS delays it - a finding with potential therapeutic relevance for underactive bladder (UAB). A computational model incorporating spinal, brainstem, and peripheral circuit elements provides a mechanistic framework suggesting brainstem-mediated pathways underlie these frequency-dependent effects. The revised manuscript addresses the majority of concerns raised in the initial review.

      Strengths:

      Novelty. Demonstrating a low-frequency excitatory effect of TTNS in humans is genuinely new. The possibility of inverting the therapeutic effect of an established neuromodulation intervention by simply adjusting stimulation frequency is clinically meaningful and opens a plausible treatment avenue for UAB.

      Integrated approach. Combining a controlled human pilot study with a systems-level neural model is a notable strength. The model is physiologically grounded and serves well as a proof-of-concept tool for exploring mechanistic hypotheses.

      Improved reproducibility. The addition of a public GitHub repository with documented code, supplementary figures detailing electrode placement and stimulation parameters, and removal of the externally derived Figure 3 all meaningfully improve transparency.

      Improved statistics. The shift to Bayesian modelling with ROPE analysis is well-justified given the small sample size and more appropriate than frequentist testing in this context.

      Improved presentation. Unit standardization, figure label corrections, and replacement of imprecise terminology (e.g., "paradoxical", "analytically") make the revised manuscript considerably clearer.

      Remaining Concerns:

      Afferent-efferent disconnect. The human study measures urgency (an afferent sensory endpoint), while the model's primary output is contraction duration (an efferent motor endpoint). The authors have added discussion of this mismatch, but should state more explicitly that the two lines of evidence are complementary rather than directly comparable, and that the mechanistic link between them remains a hypothesis.

      Clinical contextualization of effect size. The excitatory effect of 1 Hz TTNS is modest. A brief reference to what a minimally clinically important difference might look like in UAB or urodynamics research would help readers gauge the translational significance of the finding.

      Overall Appraisal:

      The authors have achieved their stated aims: providing proof-of-concept human evidence for frequency-dependent TTNS effects and a plausible neural circuit explanation. The manuscript is now appropriately cautious in its claims. The open-source computational model is a useful community resource. This work is best understood as a well-scoped proof-of-concept study that credibly motivates further investigation.

    2. Reviewer #2 (Public review):

      Strengths:

      The main strength of the work is to call attention to a new possibility of inverting the effect of TNS in humans by manipulating stimulation frequency, opening new indications for the therapy. This is highly relevant because of the recent popularity of TNS and its non-invasiveness, which lends itself to rapid testing and evaluation for new conditions and high willingness to adopt. The authors convincingly demonstrate a modest excitatory effect on bladder sensation with low-frequency TNS, which clearly warrants further investigation.

      The high-level design of the hypotheses, concepts, and experiments is clearly articulated in both the methods and in particularly clear diagrams, letting the reader focus their attention on the most important findings.

      It is rare to develop a new computational model of the lower urinary tract at a systems level, and even more so for it to incorporate circuits in the spinal cord and brainstem centers, and this work undoubtedly advances the field's ability to engineer such systems. Further, because the model is comprised of linked conductance-based point-neurons, it is an excellent tool to investigate how an arguably plausible wiring diagram for neural control of the LUT could result in stimulation frequency dependent effects on pelvic efferents. It is a proof of concept demonstrating how their mechanistic hypothesis of TNS could be implemented neurophysiologically by the nervous system. Further, the model is shared openly, which conforms to good modeling practices.

      Weaknesses:

      The main drawback of the work is the overinterpretation of the results. The human study and computational model are both proof-of-principle. The human study effect size is small and the sample size is modest; the computational model is poorly validated and does not generate physiologically typical urodynamic responses when simulating even healthy nominal LUT conditions. Thus, both the existence of a TNS 1Hz inhibitory effect (human study) and the mechanistic interpretation of its origin (simulations) remain provisional. For example, despite some caveats later in the work, the abstract stating there is a "frequency-dependent effect of TNS via the ability to alter urge perception and down-regulate bladder activity, corroborating model predictions," could easily be misleading, since a) the reduction in time of first urge with 1Hz stimulation was quite small relative to overall void time, b) reported intensity was essentially not impacted, and c) the model does not directly make predictions about these experiment outcome measures. Similar overreaching statements appear in the second to last paragraph of the introduction, the first paragraph of the discussion, and so on throughout the paper. Many of the analyses are bespoke to the idiosyncrasies of the dataset rather than field standards, making spurious results also more likely and the effects provisional. One example is the use of robust linear regression to identify significance in the experiment between the 1Hz and control groups AND removing outliers before the analysis, since the typical approach is to use robust regression when the outliers are left in the data. Taken together, the potential excitatory effect and mechanism are interesting, and perhaps worth further investigation, but are considerably more tentative than stated.

      It remains ambiguous whether the TNS excitatory effect size shown (even if it ends up being repeatable) is clinically meaningful. The ROPE analysis is a reasonable start, but no attempt was made to connect the parameters chosen (e.g. 60 s) to clinical outcomes. This is especially true given the washout results and lack of effect on perceived urgency.
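
      The sensitivity of a ROPE conclusion to the chosen bounds can be sketched as follows. This is illustrative only: the posterior below is simulated with made-up mean and spread, the helper `rope_summary` is hypothetical, and treating 60 s as a symmetric ROPE half-width is my reading of the analysis rather than something confirmed by the manuscript.

```python
import random


def rope_summary(posterior_samples, rope_low, rope_high):
    """Fractions of posterior samples below, inside, and above the ROPE."""
    n = len(posterior_samples)
    inside = sum(rope_low <= s <= rope_high for s in posterior_samples)
    below = sum(s < rope_low for s in posterior_samples)
    return {
        "below": below / n,
        "inside": inside / n,
        "above": (n - inside - below) / n,
    }


random.seed(0)
# Hypothetical posterior for the 1 Hz minus control difference in
# time to first urge, in seconds (values invented for illustration).
samples = [random.gauss(-45.0, 30.0) for _ in range(20000)]

# The same posterior leads to different conclusions under different ROPE widths.
for half_width in (30.0, 60.0, 120.0):
    summary = rope_summary(samples, -half_width, half_width)
    print(f"ROPE = +/-{half_width:.0f} s -> {summary}")
```

      Because the share of the posterior inside the ROPE depends directly on its width, the bounds need a clinical justification for the conclusion to be interpretable.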

      There remain several reasons to treat the model results as questionable. First, as the authors now note, the model under normal conditions does not generate normal function; a voiding efficiency of 15% is severely underactive. Second, the 1 Hz stimulation simulation appears to create normal voiding, suggesting that the implementation of the neural control circuits may not produce results that would generalize to other experiments. Third, analysis focuses on the model outcome of "time to void", but this outcome is not reported for the experiment, so direct comparison is not possible.

    1. Reviewer #1 (Public review):

      [Editors' note: Given the minor nature of this revision, the editors have not sent this back to the original reviewers. The original reviews have been included.]

      In this study, the authors set out to determine how two classes of kinase inhibitors, which stabilise a disease-relevant enzyme in either an active (Type I) or inactive state (Type II), influence its organisation and interactions with microtubule filaments in cells. Using state-of-the-art in-cell structural imaging approaches, they examine how these compounds affect the formation of protein filaments and their association with microtubules, and succeed in defining the underlying structural basis for these differences.

      A major strength of the work is the application of in-cell cryo-electron tomography combined with correlative imaging, which enables direct visualisation of protein organisation in a near-native cellular context. The data convincingly demonstrate that the Type I inhibitor compound stabilising the active state promotes extensive LRRK2 filament formation and microtubule bundling, whereas compounds stabilising the inactive state markedly reduce these interactions. The structural analysis further provides insight into how conformational states relate to filament organisation, including modelling of previously unresolved regions of the protein.

      These findings are internally consistent and align well with prior biochemical and structural studies, many of which were performed by the same team.

      There are, however, some limitations that should be noted. The experiments rely on overexpression of the I2020T mutant form of the LRRK2 protein, which is a rare variant, in a single cell type (293T cells), which may not fully reflect endogenous behaviour or wild-type LRRK2 in a physiological context. In addition, while the imaging data are compelling, the functional consequences of the observed filament formation and microtubule association remain unclear.

      The study therefore provides strong descriptive and structural insight, but more limited evidence linking these observations to cellular or disease-relevant outcomes.

      Overall, the authors largely achieve their aims, and the results support their central conclusion that different classes of kinase inhibitors have distinct effects on protein organisation in cells. The work represents an important advance in understanding how small molecules can reshape protein architecture in a cellular environment, with potential implications for therapeutic strategies. The methodological approach will also be of broad interest to the field, as it highlights the power of in-cell structural biology to study dynamic protein assemblies that are difficult to capture using traditional approaches.

    2. Reviewer #2 (Public review):

      Summary:

      Mutations in Leucine-Rich Repeat Kinase 2 (LRRK2) are a major cause of Parkinson's disease. LRRK2 PD-related mutations all result in increased kinase activity. Therefore, LRRK2 has been the focus of the development of kinase inhibitors. So far, two classes of kinase inhibitors have been identified: type 1 LRRK2-specific inhibitors that stabilize LRRK2 in a closed active-like conformation and broad-range type 2 inhibitors that stabilize LRRK2 in an open inactive-like conformation. Basiashvili et al. used in-cell structural biology to study the effect of both type 1 and type 2 inhibitors on the localization and structural conformation of LRRK2-I2020T.

      Strengths:

      They showed that Type 1 and not Type 2 inhibitors induce LRRK2 filament formation on microtubules. Furthermore, they were able to build a structural map of full-length LRRK2 I2020T bound to a Type 1 inhibitor in a closed kinase conformation. Together, this work thus confirms the data of previous studies that showed that LRRK2 Type 1 and 2 inhibitors differently affect filament formation.

      Previous Weaknesses:

      All conclusions are fully supported by the provided data. However, as the authors indicated themselves, the physiological relevance of LRRK2 microtubule binding is questionable. Furthermore, although the authors used a full-length LRRK2 protein, like in previously published structures, the resolution of the N-terminal domains is rather poor. Therefore, it also remains unclear what we learn from this structure compared to the previously published structures.

    3. Reviewer #3 (Public review):

      Summary:

      This paper describes new insights into the effects of type-I and type-II LRRK2 inhibitors on HEK293T cells that over-express GFP-labeled LRRK2-I2020T. Using correlative light microscopy and cryo-electron tomography, a type-I inhibitor leads to the extensive decoration of microtubules with LRRK2, which is not seen for a type-II inhibitor. Subtomogram averaging reveals that LRRK2 binds to the microtubules in a closed-kinase conformation, with density for the N-terminal arms.

      Strengths:

      The paper is well written; the CLEM and cryo-ET appear to be done to a high standard. Consequently, I have only minor comments.

      Weaknesses:

      The resolution of the subtomogram averages is somewhat limited, but the authors have adequately limited the number of degrees of freedom in the fitting of their atomic models by only allowing rigid-body transformations of separate parts of LRRK2.

      The authors should include FSC curves between the rigid-body fitted atomic models and the various sub-tomogram average maps.

      Comment on the current version from the Reviewing Editor:

      I do note that Ext Data Fig 8 does not yet contain the requested model-vs-map FSC curves. I guess this is an oversight and trust that the authors will remedy this during the production process. They might also want to explain what the black, red, green and blue FSC curves are in the current figure (or only show the black (solvent-corrected FSC) curve, together with the requested model-vs-map curve).

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors aim to identify the neural circuit mechanisms underlying dystonic crisis, a severe and life-threatening manifestation of dystonia, and to explore potential therapeutic targets. The authors combine retrospective clinical data from pediatric patients with mechanistic experiments in a genetic mouse model of dystonia. They focus on inhibitory cerebellar nuclei neurons (iCNNs), testing whether these neurons can trigger dystonic crisis and whether their modulation can alleviate symptoms. Using optogenetics, anatomical tracing, and deep brain stimulation (DBS), the authors propose that iCNNs drive dystonic crisis via projections to the centrolateral (CL) thalamus and that this pathway can be therapeutically targeted.

      Strengths:

      A major strength of the study is its integrative approach, bridging human clinical observations and mechanistic animal experiments. The clinical analysis provides suggestive evidence linking cerebellar abnormalities and inhibitory signaling to dystonic crisis, which motivates the subsequent experimental work. In the mouse model, the authors use cell-type-targeted optogenetic manipulation to show that activation of iCNN pathways induces dystonic crisis-like episodes, while inhibition alleviates spontaneous crises. These bidirectional manipulations provide strong support for a causal role of iCNN activity in modulating disease severity. The identification of a monosynaptic projection from iCNNs to the CL thalamus, combined with DBS experiments showing therapeutic effects, further strengthens the proposed circuit mechanism and highlights translational relevance.

      The behavioral effects reported are robust and reproducible across animals, and the use of both activation and inhibition paradigms is a notable strength. The DBS experiments are particularly compelling in demonstrating that modulation of a downstream node can mitigate symptoms induced by upstream circuit activation, supporting the functional relevance of the identified pathway.

      Weaknesses:

      However, several limitations temper the strength of the conclusions.

      First, the specificity of the genetic and optogenetic manipulations is not absolute. The Ptf1a-based strategy targets iCNNs but also labels other neuronal populations and projections, raising the possibility that off-target effects contribute to the observed phenotypes. Although the authors argue that light spread and anatomical considerations make this unlikely, more discussion on evidence of circuit specificity would strengthen the claims.

      Second, the behavioral definition and quantification of "dystonic crisis" in mice, while carefully described, remain somewhat subjective and may not fully capture the complexity of the human condition. Additional quantitative or automated behavioral analyses could increase confidence in the interpretation of these episodes and facilitate comparison across conditions. If difficult to add, please at least discuss this aspect.

      Third, while the anatomical tracing suggests a projection from iCNNs to the CL thalamus, the functional contribution of this specific synaptic connection is inferred rather than directly demonstrated. The DBS experiments support involvement of the CL but do not establish whether the iCNN→CL pathway is necessary or sufficient for the observed effects. More direct circuit-level manipulations would be required to fully validate this mechanism. If difficult to perform these experiments, please at least discuss the importance of such future studies.

      Finally, the translational relevance, while promising, remains somewhat speculative. The clinical data are retrospective and correlative, and the therapeutic implications of targeting this pathway in humans will require further validation.

      Overall, the authors have achieved their primary aim of identifying a cerebellar inhibitory circuit that can drive and modulate dystonic crisis in a mouse model. The results support their central conclusions, although some mechanistic aspects remain incompletely resolved. The study provides a valuable contribution to the field by highlighting a previously underappreciated role of inhibitory cerebellar output neurons and suggesting a new circuit-based framework for understanding and treating severe dystonia.

    2. Reviewer #2 (Public review):

      Summary:

      The role of the cerebellum in producing and modifying dystonic motor phenotypes has been of increasing recent interest to understand the pathophysiology of movement disorders, as well as to develop novel pharmacological and surgical interventions to treat these disorders. Previous rodent and human imaging studies have shown that in genetic, drug-induced, and injury-acquired dystonia, cerebellar dysfunction and output from the deep cerebellar nuclei have correlated with the development of dystonia symptoms. In some genetic dystonia patients, the strength of connections between the cerebellum, thalamus, and cortex could explain reduced penetrance or severity of symptoms in these genetically defined dystonia patients. Altogether, these studies have pointed to abnormal output from the cerebellum as a driver of abnormal motor output. Some studies have even gone as far as to suggest that no cerebellum is better than a cerebellum with abnormal output (see PMID 8491286). This indicates a critical need to understand the neural circuits underlying dystonia development, how the cerebellum drives symptom onset or severity, and if the cerebellum could be therapeutically targeted for the benefit of patients with dystonia.

      Hipolito et al. use rigorous mouse genetics-based approaches to understand how a specific cell type, inhibitory projection neurons from the cerebellar nuclei, can drive dystonic phenotypes, especially severe dystonic phenotypes. The authors demonstrate a number of novel findings that further support a critical role for disturbed cerebellar output in driving dystonic phenotypes, and that disrupting this disturbed output may provide a novel therapeutic approach for dystonia. Specifically, the authors define a novel role for inhibitory neurons of the cerebellar nuclei in driving disease, and these neurons have not previously been observed to have monosynaptic connections into a specific nucleus of the thalamus. Disruption of these connections via deep-brain stimulation alleviated severe dystonic crisis with quick onset, and repeated stimulation sessions possibly had a long-term disease-modifying effect. Overall, these findings present novel insight into the circuits and mechanisms by which inhibitory neurons of the cerebellar nuclei influence dystonic states, and how these may be a viable therapeutic target for severe dystonia. My specific comments are below:

      Strengths:

      The manuscript uses rigorous mouse genetics techniques to provide fundamental insight into the role of inhibitory projection neurons of the cerebellar nuclei in influencing dystonic states. Solid experimental evidence is used to step-by-step illustrate circuit-level consequences of inhibitory projections of the cerebellar nuclei, and whether these can be manipulated for therapeutic benefit.

      Weaknesses:

      There are mild weaknesses in the approach around proving the specificity of the vGlut2 knockout, the long-term effects of silencing inhibitory projections, as well as the degree to which activation specifically drives dystonic crisis. These are addressed in my specific comments below.

    1. Reviewer #1 (Public review):

      [Editor's note: This version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed all concerns raised by the reviewers; no further changes are required at this point.]

      Summary:

      The manuscript by Yang et al. investigates the relationship between multi-unit activity in the putatively noradrenergic locus coeruleus (LC), hippocampal (HP) sharp-wave ripples (SWR), and spindles using multi-site electrophysiology in freely behaving male rats. The study focuses on SWR during quiet wake and non-REM sleep, and their relation to cortical states (identified using EEG recordings in frontal areas) and LC units.

      The manuscript highlights differential modulation of LC units as a function of HP-cortical communication during wake and sleep. They establish that ripples and LC units are inversely correlated to levels of arousal: wake, i.e. higher arousal correlates with higher LC unit activity and lower ripple rates. The authors show that LC neuron activity is strongly inhibited just before SWR detected during wake. During non-REM sleep, they distinguish "isolated" ripples from SWR coupled to spindles and show that inhibition of LC neuron activity is absent before spindle-coupled ripples but not before isolated ripples, suggesting a mechanism where noradrenaline (NA) tone is modulated by HP-cortical coupling. This result has interesting implications for the roles of noradrenaline in the modulation of sleep-dependent memory consolidation, as ripple-spindle coupling is a mechanism favoring consolidation. The authors further show that NA neuronal activity is downregulated before spindles.

      Strengths:

      In continuity with previous work from the laboratory, this work expands our understanding of the activity of neuromodulatory systems in relation to vigilance states and brain oscillations, an area of research that is timely and impactful. The manuscript presents strong results suggesting that NA tone varies differentially depending on coupling of HP SWR with cortical spindles. The authors place their findings back in the context of identified roles of HP ripples and coupling to cortical oscillations for memory formation in a very interesting discussion. The distinction of LC neuron activity between awake, ripple-spindle coupled events and isolated ripples is an exciting result and its relation to arousal and memory opens fascinating lines of research.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, the authors studied the synchrony between ripple events in the hippocampus, cortical spindles, and locus coeruleus spiking. The results of this study, together with the established literature on the relationship of hippocampal ripples with widespread thalamic and cortical waves, guided the authors to propose a role for locus coeruleus spiking patterns in memory consolidation. The findings provided here, i.e. correlations between LC spiking activity and hippocampal ripples, could provide a basis for future studies probing the directional flow or the necessity of these correlations in the memory consolidation process. Hence, the paper provides enough scientific advance to highlight the elusive yet important role of norepinephrine circuitry in memory processes.

      Strengths:

      The authors were able to demonstrate correlations of locus coeruleus spikes with hippocampal ripples as well as with cortical spindles. A specific strength of the paper is the demonstration that spindles that co-occur with ripples differ in their correlations with the locus coeruleus from those that do not.

    3. Reviewer #3 (Public review):

      This manuscript examines how locus coeruleus (LC) activity relates to hippocampal ripple events across behavioral states in freely moving rats. Using multi-site electrophysiological recordings, the authors report that LC activity is suppressed prior to ripple events, with the magnitude of suppression depending on ripple subtype. Suppression is stronger during wakefulness than during NREM sleep and least pronounced for ripples coupled to spindles.

      The study is technically sound and addresses a timely and important question regarding how LC activity interacts with hippocampal and thalamocortical network events across vigilance states. While the findings are interesting, they remain observational in nature.

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript examines whether scene meaning guides overt attention in rhesus macaques. Two monkeys freely viewed naturalistic indoor scenes, including laboratory or housing scenes described as familiar and other indoor scenes described as unfamiliar. The authors compare fixation locations with matched non-fixated control locations using predictors derived from center proximity, image salience, and a DeepMeaning model intended to capture the spatial distribution of semantic informativeness. They report that meaning predicts fixation selection beyond salience and center bias, that meaning and salience interact, that familiar scenes produce broader exploration of low-meaning regions, and that the influence of meaning increases with attentional engagement.

      Strengths:

      A major strength of the study is its use of natural free-viewing behavior in macaques. The experimental approach takes advantage of intrinsic gaze allocation rather than relying on a more artificial task, which makes the work a useful bridge between human scene-viewing studies and future neurophysiological studies in nonhuman primates.

      The statistical analyses are extensive. The authors modeled fixated and matched non-fixated samples with Bayesian generalized linear mixed models, included center proximity and salience as important controls, examined interactions among predictors, and reported diagnostics for multicollinearity and model convergence. These analyses support the basic observation that the human-derived meaning maps are associated with macaque fixation allocation beyond the particular center and salience terms included in the model.

      The question is interesting and timely. If meaning-like scene structure can be operationalized for macaque viewing, this would provide a useful behavioral foundation for future work on the neural mechanisms that link scene analysis, gaze allocation, and natural behavior.

      Weaknesses:

      The main weakness is interpretive. The manuscript often treats the DeepMeaning map as though it measures scene meaning for the monkey, but the map is ultimately human-derived. Some of the examples make this issue especially salient: regions such as clocks, phones, dining tables, or other human artifacts may be meaningful to human observers, but it is not clear that they have semantic meaning for macaques. If meaning-based guidance is argued to emerge through experience, then unfamiliar human indoor scenes that the monkeys have never encountered cannot straightforwardly be meaningful to them in the same sense that they are meaningful to humans. Predictive success for these scenes may therefore indicate sensitivity to visual or object-level structure correlated with human-rated meaning, rather than macaque semantic understanding.

      A related concern is that the DeepMeaning predictor may capture forms of visual salience, objectness, or high-level image structure not captured by the particular low-level salience model. For example, a clock or phone may attract gaze because of shape, contrast, face-like configuration, object boundaries, or other mid-level features rather than because it carries semantic meaning for a macaque. The present analyses show that this model is predictive, but they do not by themselves establish that the predictive variable is semantic meaning rather than visual structure beyond Itti-Koch-style salience.

      The manuscript relies heavily on fitted model parameters and derived maps, with relatively little return to the raw behavioral data. The main claims would be easier to evaluate if the authors showed more direct fixation-density maps, scene-by-scene examples, and aggregate raw relationships between fixation behavior and map values. At present, much of the argument rests on interpreting fitted coefficients, without enough behavioral visualization to show what the monkeys actually did across the stimulus set.

      It is also unclear whether model performance was evaluated on held-out data. The comparison to repeated viewing of the same images is useful as a behavioral benchmark, but a second viewing may itself be affected by familiarity or memory for the image. This makes it a potentially imperfect estimate of a noise ceiling for first-pass fixation predictability. Cross-validation or held-out prediction, ideally across held-out images as well as trials, would make the predictive claims more convincing.
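
      As a sketch of what held-out evaluation across images could look like, the splitter below keeps every sample from a given scene in the same fold, so test scenes are never seen during fitting. The function name `grouped_k_fold`, the scene identifiers and the sample counts are hypothetical; the model fitting and scoring steps are deliberately left out because they depend on the authors' pipeline.

```python
from collections import defaultdict


def grouped_k_fold(image_ids, k):
    """Yield (train_idx, test_idx) with all samples of an image in one fold."""
    by_image = defaultdict(list)
    for idx, image in enumerate(image_ids):
        by_image[image].append(idx)
    images = sorted(by_image)
    for fold in range(k):
        # Every k-th image (offset by the fold number) is held out.
        test_images = set(images[fold::k])
        test_idx = [i for img in images if img in test_images for i in by_image[img]]
        train_idx = [i for img in images if img not in test_images for i in by_image[img]]
        yield train_idx, test_idx


# Hypothetical layout: 10 scenes with 4 fixation/control samples each.
image_ids = [f"scene_{i // 4:03d}" for i in range(40)]

for fold, (train_idx, test_idx) in enumerate(grouped_k_fold(image_ids, k=5)):
    held_out = sorted({image_ids[i] for i in test_idx})
    print(f"fold {fold}: {len(train_idx)} train samples, held-out scenes {held_out}")
```

      Splitting by scene rather than by individual sample matters here because fixated and control locations from the same image share its content, so a sample-level split would leak image-specific structure into the test set.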

      Although the authors describe multicollinearity as negligible, Figure S2B-C appears to show some nontrivial correlations among predictors. These correlations may matter for interpretation even if variance inflation factors fall below conventional thresholds, especially when the signs of fitted effects point in directions that may be expected from the input correlations, such as relationships involving meaning and familiarity. The manuscript would benefit from reporting these correlations quantitatively and relating them to the fitted effects.
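
      A short worked example shows why a low variance inflation factor does not rule out a meaningful correlation between predictors. It uses the standard identity for the two-predictor case, VIF = 1 / (1 - r^2), where r is the Pearson correlation between the two predictors. The correlation values are illustrative and are not taken from Figure S2.

```python
def vif_two_predictors(r):
    """VIF of either predictor in a two-predictor model with correlation r."""
    return 1.0 / (1.0 - r * r)


for r in (0.3, 0.5, 0.7):
    print(f"r = {r:.1f} -> VIF = {vif_two_predictors(r):.2f}")
# r = 0.3 -> VIF = 1.10
# r = 0.5 -> VIF = 1.33
# r = 0.7 -> VIF = 1.96
```

      Even a correlation of 0.7 yields a VIF below 2, well under the commonly used rule-of-thumb cutoffs of 5 or 10, yet such a correlation can still shape the sign and size of fitted effects.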

      The familiarity analysis is interesting but would benefit from further control. Familiar scenes are photographs of the monkeys' housing and laboratory environments, whereas unfamiliar scenes are other indoor environments. These categories may differ not only in familiarity but also in clutter, spatial layout, object density, color distribution, luminance, contrast, edge density, texture statistics, or the distributions of salience and meaning values. Without additional characterization of the image sets, the conclusion that familiarity itself broadens exploration should be treated cautiously.
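
      One simple way to characterize the two image sets would be to compare the distributions of per-scene summary statistics between categories. The sketch below computes a two-sample Kolmogorov-Smirnov statistic in plain Python. The per-scene values are invented for illustration, `ks_statistic` is a hypothetical helper, and the choice of this statistic is only one option among several; it is not something the manuscript reports.

```python
import bisect


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    # Both empirical CDFs are step functions, so checking the data points suffices.
    for v in a + b:
        fa = bisect.bisect_right(a, v) / len(a)
        fb = bisect.bisect_right(b, v) / len(b)
        d = max(d, abs(fa - fb))
    return d


# Hypothetical per-scene mean meaning-map values for each category.
familiar = [0.31, 0.35, 0.29, 0.40, 0.33, 0.37]
unfamiliar = [0.45, 0.52, 0.38, 0.49, 0.55, 0.47]

print(f"KS statistic (meaning): {ks_statistic(familiar, unfamiliar):.2f}")
```

      The same comparison could be repeated for salience, luminance, edge density or clutter measures; large differences on any of these would argue for including them as covariates before attributing effects to familiarity.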

      The engagement effects also appear less consistent across the two monkeys than some of the summary language suggests. The monkey-specific results should be emphasized, and claims about engagement strengthening meaning-based guidance should be stated in proportion to the cross-animal evidence.

      Finally, the manuscript sometimes uses language that sounds more mechanistic than the behavioral data can support. The negative interaction between meaning and salience is an interesting result, but terms such as competitive integration in a shared priority map go beyond what can be concluded from overt fixation selection alone. The study lacks a causal or perturbational manipulation, such as image inversion or another transformation that preserves local features while altering semantic organization. The result would be clearer if described first as a model-based association or subadditive interaction in gaze allocation, with the priority-map interpretation presented as a plausible account rather than a direct conclusion.

    2. Reviewer #2 (Public review):

      Summary:

      In prior work, the authors developed an ML algorithm that computes spatial maps of "meaning": image regions that are likely to be given semantic labels by human observers. They also previously showed that "meaning" predicts fixations in humans and human infants. Here, these observations were extended to macaque monkeys, testing the hypothesis that meaning is a phylogenetically preserved driver of overt attention across primates.

      Strengths:

      The paper reports that fixated locations had higher values of meaning compared to nearby, non-fixated locations. Specifically, it shows that meaning values - as inferred from a neural network model - are useful in differentiating these two classes of locations, beyond the established effects of image salience and centrality on gaze. The reported results were consistent in both monkeys.

      Weaknesses:

      It is difficult to understand what, precisely, is meant by meaning from this paper, although the prior work from this group may offer some insight. Given that, it is not clear if "high-meaning" image locations tend to be objects, for example, or faces, or other such behaviorally relevant image features. Indeed, the utility of the meaning maps was not evaluated against other algorithms that consider more complex natural scene information. This is a particular concern as the paper does not demonstrate that meaning predicts where the viewer will look within the image; instead, it shows that meaning is one of the variables that differentiates fixated locations from nearby non-fixated locations. Because this is not a causal study by necessity, caution is also needed in interpreting the results. In our view, the most parsimonious interpretation may not be that meaning guides gaze in monkeys, but instead that people tend to name things that primate brains evolved to fixate on at the expense of neighboring locations.

    3. Reviewer #3 (Public review):

      Summary:

      This novel study asks whether meaning-based guidance of overt attention, well-established in humans through the "meaning map" framework, extends to non-human primates. The authors recorded eye movements from two rhesus macaques freely viewing naturalistic indoor scenes and modeled fixation selection using DeepMeaning maps, Itti-Koch salience maps, and center proximity. They report that scene meaning robustly predicts fixation selection after controlling for salience and center bias, that meaning and salience interact competitively rather than additively, and that the influence of meaning is modulated by scene familiarity and attentional engagement. The cross-species extension of the meaning map approach is a valuable contribution, and the Bayesian GLMM framework with variance partitioning is well-suited to the question.

      Strengths:

      (1) The cross-species extension itself is novel and well-motivated. Nobody has applied the meaning map framework to NHP gaze behavior before. Even with the interpretive caveats I raise below, creating this methodological bridge between human scene perception research and NHP circuit neuroscience is a valuable contribution.

      (2) The statistical framework is strong. The Bayesian GLMM with posterior distributions, HDIs, and probability of direction is more informative than frequentist alternatives. The variance partitioning with ΔR² is the right approach for disentangling predictor contributions. Random intercepts for scene are appropriate. The convergence diagnostics (R-hat = 1.00, ESS > 8000 across all models) are exemplary.

      (3) Transparent individual-subject reporting. With N = 2, reporting each monkey separately rather than pooling or averaging is the correct choice, and the authors do this consistently. The individual differences are visible because the reporting is honest.

      (4) The experimental design is excellent. 200 scenes is a substantial stimulus set by NHP standards. The inclusion of both familiar and unfamiliar environments, the repeated-viewing design for reliability estimation, and the 5-second free viewing window that yields ~15 fixations per trial all reflect thoughtful design.

      (5) The familiarity and engagement analyses go beyond the basic demonstration. Even with the limitations we identified, asking how behavioral context modulates the meaning-gaze relationship is more ambitious than simply showing that the correlation exists. These analyses generate testable predictions for future work.

      (6) Data and code sharing commitment. The authors plan to release raw data, preprocessing, and analysis code on OSF and GitHub.

      Weaknesses:

      (1) The authors' central claim is that meaning-based attentional guidance is an "evolutionarily conserved component of primate vision." This claim rests on the finding that macaque fixation patterns correlate with DeepMeaning maps. However, DeepMeaning is trained on human ratings of local scene meaning using a vision-language transformer (CoCa) pretrained on billions of human image-text pairs. What the model captures, then, is the spatial distribution of visual structure that humans judge to be semantically informative. The authors acknowledge that DeepMeaning represents "structured visual representations of scene regions containing identifiable objects and informative relationships" (lines 261-262), but this acknowledgment actually highlights the problem: regions containing identifiable objects and informative spatial relationships would plausibly attract fixations in any visual system with object-selective neurons and a bias toward structured content, regardless of whether the observer is processing "meaning" in any semantic sense. That is, the correlation between macaque gaze and DeepMeaning maps is consistent with shared object-level visual processing, but doesn't uniquely implicate shared semantic processing. The critical adversarial test from Hayes & Henderson (2022a)-where meaning maps detected the removal of semantic content via diffeomorphic scrambling while deep saliency models did not-has not been applied to macaque viewing behavior. Importantly, such a test would require new data collection (showing monkeys scrambled scenes), which may not be feasible. A more tractable approach with the existing data would be to compare DeepMeaning against some other model that captures mid-level visual structure without semantic supervision, though this would be a weaker test. 
      Given these constraints, I would ask the authors to (a) acknowledge this limitation explicitly and temper the evolutionary conservation claim accordingly (for example, by framing the result as evidence that macaques and humans share attentional biases toward visually structured scene regions, with the semantic interpretation remaining an open question) and (b) note the diffeomorphic scrambling experiment as an important future direction for establishing whether macaque attention is guided by semantic content per se.

      (2) The familiar/unfamiliar scene comparison confounds long-term familiarity with systematic differences in scene content. Familiar scenes are photographs of the vivarium and laboratory; unfamiliar scenes are restaurants, bedrooms, kitchens, and offices. These two categories almost certainly differ in visual complexity, object density, spatial layout, clutter, and the types of objects present. The familiar environments (vivarium caging, lab equipment) are likely more spatially repetitive and lower in object diversity than, say, a restaurant or residential kitchen. Any difference attributed to "familiarity" could therefore reflect these systematic content differences. The negative interaction between meaning and familiarity (Monkey V: β = −0.19; Monkey I: β = −0.19), which the authors interpret as familiarity broadening exploration, could instead reflect the fact that vivarium/lab scenes have a different distribution of meaning values or a different relationship between meaning and salience than human domestic environments. The authors should address this confound directly. At minimum, comparing the distributions of meaning and salience values across the two scene categories would help the reader evaluate whether the familiarity effect can be separated from content effects. Ideally, the authors would include a subset analysis using only scenes matched on feature distributions or include scene-level summary statistics of the meaning and salience maps as covariates in the familiarity model.

    1. Reviewer #1 (Public review):

      Summary:

      The authors investigate the relationship between feedback responses and trial-to-trial learning. In their paradigm, participants were constrained to a channel trial, and a cursor was visually perturbed. Using a channel-perturbation-channel structure, the authors obtain feedback responses to the perturbation and the learning response that ensues. In Experiment 1, the authors demonstrate that temporal dynamics of the learning response (LR) are poorly linked to temporal dynamics of the feedback response (FBR). The LR responses are yoked to the start of the movement, even in cases where the FBR is very delayed. Then, in Experiments 2 and 3, the authors dissect FBR and LR responses into two components: (1) a phasic component that has a peak point mid-movement and then declines, and (2) a tonic component that grows over the movement time course and remains stable during the holding period. The authors provide evidence that LR responses are better predicted from the tonic component of the FBR than the phasic component. The idea that tonic FBR components drive learning over phasic components departs from prior models of error-based learning and provides a new theory to understand sensorimotor adaptation.

      Strengths:

      (1) The paper is well-written, and the contribution is important and timely. The authors provide clear experiments that change the way we conceptualize how trial-to-trial learning is driven by feedback responses to error.

      (2) The paper provides solid evidence to demonstrate that feedback (FBR) and learning (LR) responses are not linked by a fixed delay, in contrast to prior models.

      (3) The paper also introduces the concept that both tonic and phasic components of the FBR differentially influence the learning response. The paper provides solid evidence that the tonic forces maintained during holding still have an impact on the learning that proceeds on the next trial. This has implications for models of sensorimotor adaptation and our understanding of the physiology of learning.

      Weaknesses:

      While some conclusions are strong, I feel that the conclusions regarding FBR and LR relationships need additional analysis. All these concerns are elaborated below. Broadly speaking, there is a concern that some conclusions reached by the authors are linked to the particular phasic/tonic model they use to parse FBR and LR responses. Other models are not considered and could lead to differing results. Furthermore, it is assumed that LRs are scaled FBRs. This assumption excludes the possibility that LRs could be driven by FBRs and other mechanisms, which would alter the way the regression analyses are constructed. As described below, model-free analyses are warranted to corroborate the main findings. Further, the role that phasic-FBR plays in the adaptation process is understated in the Discussion despite evidence to the contrary in Figure 8. Much of the analysis is done on trial-averaged and participant-averaged responses, inflating R2 values. More analysis should be done at the trial level to better examine model performance and accuracy. And while valuable, the authors' experimental approach differs from standard force-field experiments that were initially used to test feedback error learning hypotheses. The paper could benefit from a Limitations section to discuss associated limitations.

      Main Concern 1:

      The decomposition of FBR and LR into phasic/tonic components is based on a specific model (i.e., Equation (1)). The notion that tonic FBR predicts phasic/tonic LR is based on responses estimated from the model. Thus, it is unclear whether critical findings (e.g., LR responses are predicted by tonic FBR) are true of the "data" or true when the "data are analyzed in the context of their model". In other words, had the authors proposed a different model to decompose the LR/FBR into tonic/phasic components, would they obtain different results?

      There are many possible alternatives:

      (A) In Equation (1), the phasic and tonic components are assumed to add linearly at all times to obtain the force profile. But the phasic and tonic components could be applied at separate times. The tonic component could be invoked during holding, and the phasic component could be invoked during moving. This type of model will differ from the current version, especially in how the peak force during the moving period is assigned to the phasic/tonic components.

      (B) Another possibility is that the tonic and phasic components do indeed operate at the same time (like in Equation (1)), but they are separate, independent controllers. In the authors' model, the tonic component is dependent on the phasic component.

      (C) Another possibility is that the tonic and phasic components are linked, but not by an integral.

      (D) Another possibility is that the phasic component is not a Gaussian function of time.

      Concern 1-1:

      While it is not possible to explore the entire model space described above, the authors should consider whether other phasic/tonic model classes could lead to qualitatively different results. The authors could also consider other phasic/tonic models if appropriate, and demonstrate that Equation (1) is superior based on an information criterion like AIC or BIC.

      Concern 1-2:

      I recommend that the authors pursue model-free, empirical analyses to support their findings. This would decrease the reliance on the "correctness" of a particular model. One logical choice would seem to be empirically estimating the phasic component as the peak force during the moving period and the tonic component as the average force during the holding period. In this model-free estimation of phasic and tonic commands, is it still the case that tonic FBR alone predicts LR components?
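
      The model-free estimate suggested in Concern 1-2 could be sketched as follows, assuming a force trace sampled at known times. The function name, the 600-1000 ms holding window, and the use of the signed peak are illustrative assumptions rather than the authors' analysis; the [0, 600] ms movement window follows the shading used in the FBR plots.

```python
def model_free_components(t_ms, force, move_window=(0, 600), hold_window=(600, 1000)):
    """Return (phasic, tonic) estimates without fitting Equation (1).

    phasic: signed force sample with the largest magnitude in the movement window.
    tonic:  mean force in the holding window (window bounds are assumptions).
    """
    move = [f for t, f in zip(t_ms, force) if move_window[0] <= t < move_window[1]]
    hold = [f for t, f in zip(t_ms, force) if hold_window[0] <= t < hold_window[1]]
    if not move or not hold:
        raise ValueError("trace does not cover both windows")
    phasic = max(move, key=abs)      # peak force during the moving period
    tonic = sum(hold) / len(hold)    # average force during the holding period
    return phasic, tonic
```

      Applying the same function to FBR and LR traces would give four model-free amplitudes per condition, which could then be entered into the same regressions as the model-based estimates.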

      Concern 1-3:

      Building on Concern 1-2, the concern about relying on a model alone to estimate phasic and tonic components is clearest in the across-subject variability analysis in Figure 7. Here, LR and FBR are compared to one another only in the context of the tonic-phasic model in Experiment 1. The result is that only the tonic FBR predicts the tonic LR. But investigating Figures 7b and 7c, it would appear that the peak force applied during the FBR during the moving period (which should reflect the phasic component in large part as in Figure 4a) would predict the peak (or average) force applied during the LR. Thus, the conclusion that tonic FBR only predicts tonic LR may be driven by how the model estimates tonic/phasic FBR/LR rather than a true property of the data. A model-free analysis, as suggested in Concern 1-2, would be helpful in addressing this concern.

      Main Concern 2:

      Analyses in Figures 4g, 4h, 6c, and 6d are based on relating LR and FBR components with no intercept: y = ax; the LR component is a scaled FBR component. It is unclear if the authors' conclusion would vary had a different model been used. For example, suppose that LR on trial n is partly determined by the FBR and also the sensory error (e) on trial n-1 (where c1 and c2 are constants):
      LR(n) = c1 FBR(n-1) + c2 e(n-1)

      Another model could suppose that the LR on trial n is due to the FBR on trial n-1, and also a non-specific adaptive component that is independent of both FBR and the sensory error:
      LR(n) = c1 FBR(n-1) + c2

      Concern 2-1:

      For these alternate models, y=ax (i.e., zero intercept) is not an appropriate relationship between LR and FBR components. Had the authors allowed a non-zero intercept in Figs. 4g, 4h, 6c, and 6d, will they still observe that only tonic FBR predicts LR components? In other words, would R2 improve for phasic FBR relationships with a non-zero intercept?
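
      The check requested in Concern 2-1 could be sketched as below: fit y = ax and y = ax + b to the same points and compare R2. All function names are hypothetical. As an assumption of this sketch, R2 is computed against the mean-centered total sum of squares for both fits so the two values are directly comparable (some software reports an uncentered R2 for through-origin fits, which is not comparable).

```python
def fit_through_origin(x, y):
    """Least-squares slope for y = a*x; the intercept is fixed at zero."""
    a = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    return a, 0.0

def fit_with_intercept(x, y):
    """Ordinary least squares for y = a*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx

def r_squared(x, y, a, b):
    """1 - SS_res / SS_tot, with SS_tot taken about the mean of y."""
    my = sum(y) / len(y)
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot
```

      With this definition, a through-origin fit can yield a negative R2 when the data have a substantial offset, which is one way a forced zero intercept could understate a relationship.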

      Concern 2-2:

      Why was a non-zero intercept allowed for the between-subject analyses in Figure 7, but not for similar analyses in Figures 4 and 6?

      Main Concern 3:

      The main results in Figures 4g, 4h, 6c, and 6d are based on an R2 value that is calculated on a linear fit to the mean response averaged across participants and trials. This raises the concern that the R2 value is being inflated, and it also misses the rich trial-to-trial variation and subject-to-subject variation that could be used to examine the model's accuracy. A couple of concerns here:

      Concern 3-1:

      As can be seen from the horizontal and vertical error bars in Figures 4g and 4h, there is considerable variability across participants. While not shown, it is almost certainly the case that there is considerable variability across trials within a participant (as alluded to in the Fig. 8 analyses). The authors should evaluate their model performance and report goodness-of-fit (or error) at the single-trial level. For example, the model could be fit to individual trial data, and the R2 values from the trial fits could be used for comparing the various relationships in Figures 4 and 6. Another idea would be to keep the alpha, beta, T and sigma estimates obtained from the average data, and then apply these parameters to individual trial responses and report the model error. Do phasic FBR commands similarly predict LR components at the trial level, or do trial-level analyses corroborate the current conclusions on tonic FBR superiority?

      Concern 3-2:

      The authors report on Line 200 that the R2 values of 0.635 and 0.698 have modest predictive power. It would be helpful for the authors to statistically compare the R2 values between Figures 4g and 4h. One idea would be to obtain an R2 value for each individual participant. Then the distribution of R2 values across participants could be compared between the different relationships in Figure 4g/4h (e.g., via a t-test). This would help to better support the idea that Figure 4h shows better model fits than Figure 4g. These analyses could also be conducted for the relevant parts of Figure 6 (Experiment 3). The authors should consider allowing a y-intercept in this process as they do in Figure 7.
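
      A minimal sketch of the per-participant comparison proposed in Concern 3-2, assuming one R2 value per participant for each of the two relationships. The function name is hypothetical, and only the paired t statistic and its degrees of freedom are returned; converting to a p-value requires a t distribution (for example from a statistics package). Because R2 is bounded, a nonparametric paired test may be a reasonable alternative.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(r2_a, r2_b):
    """Paired t statistic for per-participant R2 values from two relationships.

    r2_a[i] and r2_b[i] must come from the same participant.
    Returns (t, degrees_of_freedom).
    """
    if len(r2_a) != len(r2_b) or len(r2_a) < 2:
        raise ValueError("need at least two matched participants")
    diffs = [a - b for a, b in zip(r2_a, r2_b)]
    se = stdev(diffs) / sqrt(len(diffs))   # standard error of the mean difference
    return mean(diffs) / se, len(diffs) - 1
```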

      Main Concern 4:

      The authors compare tonic and phasic FBR predictive power in Figure 4. There are other places where the analyses in Figures 4g and 4h should be repeated:

      Concern 4-1:

      Tonic and phasic FBR responses appear to vary in Experiment 1 (Figure 2c), but the authors do not test whether they predict the LR component magnitudes in Figure 2d. Analyses in Figures 4e, 4f, 4g, and 4h should be added to the Experiment 1 analysis.

      Concern 4-2:

      While I understand the rationale behind computing differences in Figure 6 to isolate the second-shift effect on FBR/LR, the authors should still perform the primary investigation in Figures 4e, 4f, 4g, and 4h on the FBR and LR responses in Figures 5b-g (without subtracting the "Maintained" component). In other words, before analyzing the contributions of the second shift in Figure 6, the authors should repeat their analysis in Figure 4 applied to the FBR and LR responses in Figure 5 (without subtracting off the maintained response). How well do Equation (1) and y=ax capture the FBR and LR responses in Figures 5b-g?

      Main Concern 5:

      Given current practices in human sensorimotor adaptation, the current n=10 (or n=12) group sizes appear small, raising concerns about statistical power.

      Concern 5-1:

      The authors should consider a power analysis or provide some other justification to support their chosen sample sizes.

      Concern 5-2:

      It is unclear why cross-correlation analyses in Figures 2e, 3d, and 5h have error bars, but no other FBR or LR time courses have error bars. Error bars should be provided in Figures 2b, 2c, 2d, 3b, 3c, 5b, 5c, 5d, 5e, 5f, 5g, 6a, and 6b.

      Concern 5-3:

      The subject counts are reported as n=10 for Experiment 1, n=12 for Experiment 2, and n=12 for Experiment 3, but the subject-to-subject analysis in Figure 7 says n=33.

      Main Concern 6:

      I agree that the authors' model suggests that LR responses are most strongly predicted by the tonic FBR component. But I feel the narrative and Discussion surrounding this point are too strong. They paint the picture that only tonic FBR is important in learning. To do this, the role that phasic FBR plays is discounted, and mixed results concerning tonic FBR are overlooked. I feel that the Discussion should be broadened to acknowledge that the authors find evidence that both tonic and phasic FBR appear to influence the learning response, with tonic FBR making the stronger contribution in this task. Here are key areas that require attention:

      Concern 6-1:

      Importantly, the authors downplay their result in Fig. 8h, that the phasic FBR predicts phasic LR in their Results on Line 350. This argues against the idea that only tonic FBR influences LR parameters. On Line 485, the authors state that "trial-by-trial variability in LR amplitude was explained by the tonic component of the FBR, but not by the phasic component (Fig. 8)." This is not correct. Both the tonic and phasic components of the FBR altered LR components in Figure 8.

      Concern 6-2:

      Again, it is stated on Line 502 that the phasic FBR component "had only a modest effect on the LR". This again seems to underplay the result. The authors should amend their Results and Discussion to better acknowledge that their data support a role for both tonic and phasic FBR contributions to LR, but the tonic component appears to make a larger contribution in their model.

      Concern 6-3:

      While the role of phasic FBR in determining LR amplitude appears to be understated, the role of tonic FBR is, on occasion, overstated. The Discussion should mention that there is mixed evidence for the role of tonic FBR in LR parameters. For example, in their between-subjects analysis in Figure 7f, the authors do not find that phasic LR can be predicted by tonic FBR. Thus, across subjects, no component of the FBR appears to predict phasic LR.

      Concern 6-4:

      To better investigate the role that both phasic FBR and tonic FBR may play in adaptation, it would be advisable for the authors to consider this hypothesis. As it stands, tonic LR or phasic LR is regressed only onto tonic FBR or phasic FBR individually. In Figures 1 (Experiment 1), 3 (Experiment 2), and 5 (Experiment 3), the authors could regress tonic LR and phasic LR onto both phasic FBR and tonic FBR simultaneously. Models where LR = c1 phasic-FBR + c2 tonic-FBR could be considered and compared against univariate models, LR = c phasic-FBR and LR = c tonic-FBR using AIC or BIC to determine whether a mixed model that predicts LR with both phasic and tonic FBR is warranted.

      Irrespective of the result, the authors should be careful (Concerns 6-1 and 6-2) to state that when levels of tonic-FBR were controlled in Figure 8 (which is likely the cleanest way to look at the role phasic FBR plays in learning), phasic-FBR showed a clear influence on LR.
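
      The model comparison proposed in Concern 6-4 could be sketched as follows. This is an illustration under stated assumptions: zero-intercept regressions (matching y = ax), Gaussian residuals, and AIC computed up to an additive constant as n ln(SS_res / n) + 2k, with k counting the regression coefficients plus the residual variance. Function names are hypothetical, and the AIC is undefined for a perfect fit (SS_res = 0).

```python
import math

def ss_res_one(x, y):
    """Residual sum of squares for LR = c * x (single FBR component)."""
    c = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    return sum((yi - c * xi) ** 2 for xi, yi in zip(x, y))

def ss_res_two(x1, x2, y):
    """Residual sum of squares for LR = c1 * x1 + c2 * x2 (both components)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(a * a for a in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * b for a, b in zip(x1, y))
    s2y = sum(a * b for a, b in zip(x2, y))
    det = s11 * s22 - s12 ** 2            # zero if x1 and x2 are collinear
    c1 = (s1y * s22 - s12 * s2y) / det    # 2x2 normal equations via Cramer's rule
    c2 = (s11 * s2y - s12 * s1y) / det
    return sum((yi - c1 * a - c2 * b) ** 2 for a, b, yi in zip(x1, x2, y))

def aic(ss_res, n, n_coef):
    """Gaussian AIC up to a constant; lower values indicate the preferred model."""
    return n * math.log(ss_res / n) + 2 * (n_coef + 1)
```

      A lower AIC for the two-component model than for either single-component model would support including both phasic and tonic FBR as predictors.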

      Main Concern 7:

      On Line 577, it states the "hand was automatically returned to the starting position". Does this mean that the robot moved the hand back to the start location? If so, was the hand ever released from a force channel in between the perturbation trial and the following channel trial? A concern is that the holding forces from the perturbation trial could "bleed over" into the forces applied during the subsequent channel trial if the subject always remains in a channel trial in between the trials. Suppose we label the 3-trial structure as Channel 1 (C1) - Perturbation (P) - Channel 2 (C2). The authors should confirm that the holding forces on P are not correlated with baseline force (i.e., the channel force prior to movement onset) in C2. I do not expect there to be a strong correlation given that the learning responses in Figs. 2d, 3c, and 5e-g appear near-zero at t=-400ms, but this should still be verified.

      Main Concern 8:

      In Supplementary Figure 1, there appears to be an error in the "Amplitude of phasic LR (N)". In Supplementary Figure 1f, the phasic LR magnitudes appear in line with Supplementary Figure 1d, but there is a mismatch in the magnitudes for the phasic LR in Supplementary Figures 1e & 1d (the phasic LR magnitudes appear to be too low in Supplementary Figure 1e, peaking at around 0.1N when they should peak at around 0.15N).

      Main Concern 9:

      The authors should provide a Limitations section, highlighting unanswered concerns listed above, mixed results, and differences from prior work. These are touched upon in the Discussion section (particularly in Perspectives for future studies) but should be expanded further. At a minimum, the authors should consider including a discussion of the following points:

      Differences from prior work:

      9-1: There are methodological differences between this work and past studies highlighted by the authors. It could be that there are multiple error-based learning mechanisms that drive the FBR. Here, the authors find that visually-driven FBR responses do not drive LRs at a "common temporal shift". Instead, LRs are broadly expressed at the start of the movement (regardless of when the FBR was timed). However, tasks that have other components (e.g., a proprioceptive error) might invoke different learning mechanisms. For example, proprioceptive-driven FBRs might invoke LRs that have different temporal properties than visually-driven FBRs.

      9-2: As noted by the authors, Reference [10] studied FBR-driven learning in muscle commands, as opposed to forces. Muscle responses may have differing temporal and/or magnitude (for phasic/tonic) components that qualitatively differ from the force-based conclusions made here. Thus, the learning mechanisms at the muscle level may differ from those observed at the force level.

      9-3: While the tonic FBR is a strong predictor of the learning response in this experiment, most of the experimental conditions are done where the cursor remains deviated from the target throughout the trajectory and into the holding period. This differs from past work on feedback error learning, where feedback was veridical, and the cursor (and hand) ended on the target. This persistent displacement from the target during the prolonged holding period may influence the learning process and could enhance the tonic-FBR contribution to learning.

      9-4: The authors state in the present study that subjects were told not to use "explicit strategies" and move as straight as possible to the target. For past work, participants were able to use explicit strategies during feedback and learning responses. It could be that the lack of (or reduction in) explicit responses alters single-trial learning mechanisms relative to past work.

      Alternate models:

      9-5: No alternate models are considered here for the tonic-phasic relationship. Other models could relate these two processes differently, which could lead to different conclusions.

      9-6: It is assumed that both the tonic and phasic controllers are active at the same moment in time and sum linearly to generate the overall force output. Other models could have applied each "controller" to different phases of the reach in a differential manner (e.g., two separate controllers, a moving controller and a holding controller operating at different moments in time).

      9-7: It is assumed here that the LR should be a scaled FBR: y = ax. Conclusions made here could change if the LR is due to multiple processes, FBR-driven learning only being one of them. Other models where the LR is driven by both FBR and the sensory error were not considered here.

      Mixed results:

      9-8: While tonic FBR was a good predictor of phasic LR at the group-level (e.g., 4g), it did not predict phasic LR between subjects (Fig. 7f) and in fact tended toward a negative relationship.

      9-9: Phasic FBR predicts Phasic LR at the trial-level (Figure 8h) but not as well at the subject-level (Figure 7d).

      9-10: Overall, with the exception of Figure 8, most analyses look at the relationship between LR and tonic FBR or phasic FBR separately. In Figures 4c, 4d, 6c, 6d, and 7d-g, the authors look at the marginal effect of tonic or phasic FBR on learning, but do not control for variations in the other FBR component (e.g., they look at phasic FBR on tonic LR, but do not control for tonic FBR). The only analysis that controls for the other component is in Figure 8, suggesting that both tonic and phasic FBR contribute to LR.

      Minor concerns

      (10) I'm not sure I follow the cross-correlation analysis in Figure 3. Overall, to me, both the FBR in Figure 3b and the LR in Figure 3c look quite similar in their temporal profiles, irrespective of the shift magnitude. The authors state on Line 158 that their cross-correlation analysis "...revealed that the overall shape of the cross-correlation function changed systematically with error magnitude". However, to me, in Figure 3d, the shape of the many curves looks similar.

      What is confusing to me here is including a phasic movement period and a tonic holding period inside the cross-correlation. The tonic "static" component during the holding period will likely greatly influence how well the cross-correlation is able to match the phasic peaks during the LR/FBR moving periods. In other words, the reach consists of a "movement" and a "holding" period. But the cross-correlation is blending the two together, and thus, I am not sure how reliable this measure will be for truly estimating the temporal shift between conditions. For example, if you look at the shaded gray area in Figure 3b, the "Movement period" looks almost identical in temporal properties. The "peaks" and "troughs" happen at nearly the same moment in time across all conditions. The onset of the FBR at approximately 200 ms is also identical across shift magnitudes. Thus, to me, the temporal properties of the FBR seem very similar during the moving period (where the FBR is responding to the error). But including the holding force (the tonic force after the 600ms period) seems to be causing the cross-correlation function to estimate differences at very high lags. If these differences are being driven solely by the holding forces, I am not sure this is meaningful.

      It seems that the authors might want to repeat this analysis, excluding the holding force period from the calculation of the cross-correlation coefficients.

      (11) It would appear that the authors have a significant main effect of their ANOVA (p=0.028) in Fig. 3f, but no post-hoc tests are reported to indicate which group means differ.

      (12) When plotting FBR, a [0,600]ms period is shaded as the movement period. On Line 580, it says that feedback was provided on peak movement speed. Was any feedback provided as to the movement duration? If not, did participants complete the movement within the 600 ms window labeled as the movement period? Were movements during perturbation trials longer than non-perturbed trials?

      (13) Over what time period is Equation (1) fit to the data? Is it the [-200,700]ms window shown in Figure 4a? A concern is that including too much of the "holding period" in the model fit will cause the model to be biased toward fitting the holding period well and not the moving period. This, in turn, might lead to better estimates for the beta parameter than the alpha parameter. In addition to clarifying the fitting process, the authors should also include R2 values for the moving and holding periods separately.

      (14) The procedure is clear from Figure 1e, but it would be helpful on Line 91 to explain that "collapsing" FBR and LR across rightward and leftward means that the FBR and LR were negated for one of the directions (prior to collapsing).

      (15) Are the "Amplitude of tonic LR (N)" values supposed to be negative in Figures 6c and 6d?

      (16) Overall, the parameter distributions in Figures 4e and 4f are similar to those in Supplementary Figures 1c and 1d. The FBR amplitudes look nearly identical. Only the Phasic LR amplitudes in Supplementary Figure 1d appear to be larger than the Phasic LR amplitudes in Figure 4f. Can the authors provide an intuition for why the phasic LR contributions increase when T and sigma parameters are allowed to vary between participants?

      (17) There are two points where the authors should consider softening their language:

      17-1: The authors state at multiple points (e.g., Line 154) that "...the waveforms of LRs remained largely similar across conditions, while their amplitudes showed only modest modulation with cursor shift magnitude". However, in Figure 3c, the LR amplitude for the 0.4 cm shift is approximately 0.2 N, and the LR amplitude for the 3 cm shift is approximately 0.3 N - a 50% increase. The authors should consider softening the language here to appreciate the variations in LR amplitude.

      17-2: On Line 258, it is stated that the FBR during holding "diverged only slightly" for the 16 cm condition in Fig. 5b. This seems too strong a statement. The "Maintained" FBR holding force is about 0.2 N, and the reverse is about 0.1 N. Thus, the "Maintained" condition is doubled. While I agree that the LR diverges more than the FBR (i.e., 5b vs. 5e), I think the language choice here should be more careful.

    2. Reviewer #2 (Public review):

      Summary:

      The authors find a strong trial-level relationship between tonic feedback responses and tonic learned responses.

      Strengths:

      The authors have performed several well-conducted experiments and thoughtful analyses to test the relationship between feedback responses and subsequent learned responses. The strength of the paper is the experimental control to probe this relationship and, eventually, oppugn the feedback error learning hypothesis.

      Weaknesses:

      In general, the processes studied in this manuscript and the past work have not explained the underlying mechanisms for the observed phenomena. Without knowing the mechanisms, the results are largely observational/correlational when linking feedback responses to learned responses, and there are no strong alternative hypotheses to explain the results. Most of the larger comments below stem from this theme, including:
      (i) what causes the phasic and tonic portions of the feedback response,
      (ii) justifying the phasic learned response,
      (iii) what are some alternative hypotheses that can explain the current results and past literature?

      Suggestions to improve the paper are below.

      (1) As mentioned above, it appears that there is limited mechanistic understanding of the underlying processes. For the feedback response, there is clearly a phasic and tonic component. It is not until one gets to the discussion that a potential mechanism is proposed, where presumably the phasic response may be velocity dependent, and the tonic response may be position dependent. On a somewhat related tangent, these responses somewhat mirror muscle spindles, which are known to have velocity and position-dependent responses, leading to the phasic and tonic firing during muscle stretch experiments.
      a) Can the authors provide more discussion on the work that they currently cite, which studied position and velocity dependent responses?
      b) Relatedly, did the authors put any thought into developing a model, using error inputs from the experimental trials, that can capture the feedback responses? For example, dF/dt * tau = a*pe + b*ve - cF + e, where F = force response, tau is a time constant to generate the force, a is a gain on position error (pe), b is a gain on velocity error (ve), c relates to the leak, and e is Gaussian noise. The leak would be needed to explain the equilibrium / steady state at the end of the trial. It could be very insightful if this, or some other similar flavour of model, could explain the phasic and tonic components of the feedback response. The advantage of a model in this form is that there are experimental inputs and the process evolves over time, rather than fitting static curves to the data.

      (2) Aligned with past literature, the authors have characterized the early and late phases of both the feedback responses and learned responses as phasic and tonic. It is clear from the data that the feedback response data are composed of a phasic and tonic phase. However, it is less clear from the data in many of the figures that there is an actual phasic response in the learned response. Further, from a modelling perspective, it is conceivable that the fitting algorithm would partition the variance between the two components of equation 1, even though there may only be one true underlying process. This may also explain why there was no correlation between tonic feedback responses and phasic learned responses in Figure 7F.
      a) Can the authors provide more rationale on why the learned response would also have a phasic response? Is the assumption here that since the feedback response had a phasic response, the learned response should as well?
      b) Can the authors fit the learned response with only the tonic portion of the equation? Then, perform model comparison between the phasic+tonic learned response model and the tonic only learned response model using AIC/BIC, to justify whether or not a phasic portion of the model is needed to explain the data.
      c) Can the authors comment on the possibility that the learned response may just rise and then decay over time, without being the outcome of two distinct processes?

      (3) The nicely controlled experiments provide evidence against the feedback error learning hypothesis, which alone is a valuable contribution to the literature. However, the authors do not provide a strong alternative hypothesis. Alternative hypotheses are proposed, for example on lines 494-498 (state estimation), but the authors then state that state estimation could not explain all the results in the preceding paragraph. It would be beneficial to further bolster the possible explanations, perhaps with further discussion of what the mechanisms are for the feedback responses (e.g., position or velocity dependence) and which states (position error, velocity error, motor commands, etc.) transfer into the learned response. Are they stored? Are they the outcome of a continuous process? This may be difficult given the current state of understanding in the literature, but it could substantially improve the paper.
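
      To make the model suggested in point (1b) concrete, a minimal sketch is given below. It integrates tau * dF/dt = a*pe + b*ve - c*F with a forward-Euler step; the Gaussian noise term e is omitted for clarity, and the function name, step size, and parameter values used with it are illustrative assumptions, not fitted quantities.

```python
def simulate_force(pos_err, vel_err, dt, tau, a, b, c, f0=0.0):
    """Forward-Euler integration of tau * dF/dt = a*pe + b*ve - c*F.

    pos_err, vel_err: equal-length sequences of position and velocity error samples.
    dt: sample interval in the same time unit as tau.
    Returns the force trace, including the initial value f0.
    """
    force = [f0]
    for pe, ve in zip(pos_err, vel_err):
        d_force = (a * pe + b * ve - c * force[-1]) / tau
        force.append(force[-1] + dt * d_force)
    return force
```

      In this form, a sustained position error with zero velocity error drives the force toward a*pe/c, which is the steady state that the leak term was proposed to explain, while a transient velocity error adds a response that decays once the input ends.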

    3. Reviewer #3 (Public review):

      I believe that the paper is excellent and very well executed. I have several reservations about the meaning of the tonic component of the feedback responses and about the more general interpretation from a computational standpoint. These aspects may not require extensive adjustments, but some key points could be discussed or better justified:

      (1) It is true that most papers view adaptation as a trial-by-trial update and that several models summarise motor errors by a scalar quantity for a model fit. The importance of feedback control in visuomotor control has also been overlooked, as several studies explicitly instructed participants not to correct. I also agree that the temporal aspects of sensory encoding and control are often neglected in motor adaptation studies. However, there have been some developments in adaptive control in the context of force field learning that express the error signal and learning rule in terms of continuously evolving state variables, as formulated in online control models (Crevecoeur et al., 2020, eNeuro 7(1); Kalidindi and Crevecoeur, 2023, Curr Opin Neurobiol, 83, 102810). Could the authors consider discussing whether or not this framework could be consistent with the current dataset?

      (2) The choice of a cursor jump may require more in-depth justification. From an experimental standpoint, it is clear from the authors' data that a cursor jump does evoke an aftereffect and hence the developments are clearly validated empirically. The nature of the adaptive response is less clear: indeed, cursor jumps can be represented as an external perturbation to a variable that may be independent of the hand (e.g. Kasuga et al., 2022, J Neurophysiol, 127 (2), 354-372). In contrast, a visuomotor rotation requires a change in state space representation parameters (it is not clear which ones) that is more closely related to the update of an internal model. Could the authors explain why they believe that a learning response to a cursor jump is consistent with adaptation in general?

      (3) The relationship between the tonic component of the feedback response and the learning response is, again, very clear from an experimental perspective. However, I would suggest being very cautious about the interpretation of this effect. My concern is that it is not clear that this tonic response is irrelevant from a behavioural standpoint, and I am left wondering what the correlation with the learning response truly means. Indeed, in real-life conditions, there should be no net force produced in the end during a static phase, as the force during stabilisation is by definition zero; only the net force produced against constant external loads is required. There can be co-contraction but no net resultant force, unless external forces are applied. So if the tonic response vanishes in real conditions, should there be no learning response? This aspect is also relevant if one attempts to generalise the findings to force field learning: since velocity-dependent force fields vanish during stabilisation, how can there be a tonic component?

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the major comments raised in the previous round of reviews, yet some inherent issues necessarily remain unresolved.]

      The manuscript shows that different traits of adults and larvae correlate with Red List status. The authors argue that this shows a big gap in the conservation of amphibians and that the traits of all life stages should be taken into account in amphibian conservation. Specifically, amphibian conservation should do more for the habitats where the larvae live.

      The manuscript is well written and easy to understand. The methods are sound.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, the authors tried to examine whether there are differences in the association between functional traits and extinction risk in adult and tadpole stages in Chinese anurans.

      Strengths:

      Overall, I think the basic idea of the study is interesting and important. It can be applied to other taxa with complex life cycles throughout the animal kingdom.

      Original weaknesses:

      I do not think the authors achieve their aims, as the results only partially support their conclusions. The study has several drawbacks that need to be clarified or revised, including the unclear threat categories for tadpoles, model selection and model averaging, the potential problem of AIC, and the omission of other important species traits.

    1. Reviewer #1 (Public review):

      Summary:

      The authors have leveraged publicly available single-cell RNA sequencing datasets from isolated islets, downloaded from the PANC-DB resource, to study the transcriptional profile of insulin-producing beta and glucagon-producing alpha cells from pancreas donors with, or at risk of (islet autoantibody positive), Type 1 diabetes and from donors without diabetes. Their rationale is that any remaining beta cells in these donors with T1D have resisted the autoimmune attack and can therefore provide insights into the transcriptional pathways that mediate this protection. They have developed robust bioinformatic pipelines to address this hypothesis. Their analyses identify beta (and alpha) cells clustered by their differential transcriptional profiles and gene regulatory networks (GRNs), which are present in varying proportions in individuals with and without T1D. The differentially expressed genes (DEGs) identified align with previously reported datasets. The SCENIC tool, a pipeline for GRN inference from transcriptomic data, scores transcription factor (TF) activity with a rank-based approach, which is considered robust to technical artefacts and adds a novel perspective to this study. Through GRN analysis and regulon score generation, the authors identify a specific cluster of beta cells, cluster 3 (C3), that is enriched in individuals with T1D. This cluster was also slightly enriched in individuals without diabetes (ND) who were > 35 years of age. Their data align with, support, and extend many earlier studies identifying key protective genes, e.g. CD274 (PD-L1) and HLA-E. Together, this provides insights into the transcriptional profile of beta cells that have resisted immune-mediated destruction, which could help with the design of stem cell-derived islet therapies and guide targeted immunotherapy drug trials in the future.

      Strengths:

      This largely agrees with and extends previous studies from a range of groups using different tissue repositories. This strengthens the validity of the conclusions. The identification of key GRNs associated with preserved beta cells could also aid in the future design of cell and immunomodulatory-based therapies.

      Weaknesses:

      The regulon scores are hypothesis-generating, not proof of the mechanism by which beta cells are protected. The observation that C3 is enriched in ND >35y could indicate that it is a regulon associated with beta-cell senescence, for example. In the context of T1D, this regulon could reflect beta-cell senescence or stress, which incidentally co-occurs with survival and, as such, is not necessarily a true reflection of survival characteristics. The authors could perhaps expand upon this possibility in a revision.

      The authors have leveraged valuable datasets to generate a detailed profile of residual beta cells in Type 1 diabetes and have successfully achieved their study aims. The findings are largely consistent with and extend the existing literature, highlighting key regulatory networks, some of which are supported at both the RNA and protein level (e.g., IRF1). However, a key interpretative consideration is that GRN-derived regulon activity does not distinguish between causal and reflective biological states. In particular, it remains unclear whether these networks represent mechanisms of immune protection or instead reflect underlying beta-cell states such as stress adaptation or senescence. Clarifying this distinction will be important for understanding the functional significance of these regulatory programs and their potential therapeutic relevance.

    2. Reviewer #2 (Public review):

      Summary:

      This work identifies a novel beta cell population primarily present in the islets from individuals with Type 1 Diabetes (T1D). This population is defined by increased expression of previously described transcription factors, including IRF1, BCL6, JUNB, and CEBPD. The authors postulate that the activation of these genes in beta cells during immune infiltration could be protective against beta cell destruction. This hypothesis aligns with experiments in NOD mice identifying a protected beta cell population. Overall, this work provides a hypothesis for how some beta cell populations survive immune infiltration in T1D.

      Strengths:

      This work uses a clever analysis approach, defining regulons using SCENIC and using these to recluster the data. This approach identified a novel beta cell population enriched in islets from individuals with Type 1 Diabetes that was very stable to different clustering resolutions. The authors also took many potentially confounding technical factors into account, removing ambient RNA and doublets, and often controlling for batch effects using pseudobulk approaches.

      In addition to identifying a novel cluster in one published single-cell dataset, the authors also downloaded additional single-cell datasets that included cytokine treatment of human beta cells to validate the presence of this population in other datasets. In these datasets, the authors were able to identify a similar population of cells, labeled by similar transcription factors.

      Weaknesses:

      While the authors use a sophisticated approach to identify a novel beta cell subpopulation, more analysis needs to be done to ensure this cluster is biologically meaningful. First, the authors did not take the duration of diabetes into account in this analysis. The duration of diabetes is important because there are different levels of immune infiltration at different stages of diabetes. It would also be important to consider age at diagnosis, as the progression of disease is very different in early vs late onset populations.

      Additionally, more exploration of potential confounding factors should be done when looking at the novel population vs other populations in the dataset. This would be further strengthened by adding analysis from datasets that more directly measure transcription factor activity, like single-nucleus ATAC-seq from the different disease states.

      Finally, these data can't distinguish the response to the environment (i.e., cytokines) and protective programs. Especially given the similar program in alpha cells, the response to the environment seems likely. More analysis should be done, looking for a similar signature in other populations in the data.

    3. Reviewer #3 (Public review):

      Summary:

      The authors used a gene regulatory network inference-based clustering approach with existing scRNAseq data sets from cadaveric donors with T1D, auto-antibody positive donors, and non-diabetic donors and found a regulatory network associated with β-cell survival that is associated with increased expression of genes controlled by interferon regulatory factor 1.

      Strengths:

      Using established data sets of RNAseq previously performed, the authors identify an interesting population of surviving β-cells in T1D that express a key antiviral transcription factor (IRF1), antiviral genes such as GBPs and IFITs, and decreased expression of a limited number of genes that have been associated with the identity of β-cells.

      Selective expression in T1D and not observed in islets from control or auto-antibody positive donors.

      Expression changes, TFs identified are also identified in human islets treated with cytokines.

      The lack of changes in genes associated with ER stress or the response of endocrine cells to ER stress.

      Weaknesses:

      The authors do an excellent job of identifying characteristics of the donors/islets in the methods; however, this needs to be addressed in the Figure Legends and Results. Specifically, the length of exposure to cytokines is critical in evaluating the comparisons made in this study.

      Is it possible to evaluate sex as a variable in this analysis, and if yes, does one still observe similar changes in identity gene expression and IRF1-dependent gene expression?

      Is there a relationship between length of disease and evidence for the C3 population? Does one observe the C3 population in alpha cells of islets with long-standing disease or in the samples that had too few β-cells to perform the analysis? Temporally, 24 h was used for ATACseq and 48 h for cytokine treatment. These are very late exposures, suggesting that secondary and tertiary effects are being compared.

      Activation of stress response genes has been correlated with impaired cytokine signaling in islets (human and rodents), limiting the number of endocrine cells that are cytokine responsive. Was this observed in the authors' analysis?

      Recent studies have identified induction of antiviral and antibacterial genes in islets in response to short exposures to IL-1, TNF, and IFNs that are consistent with the C3 expression profile observed by the authors. While this work has mostly been performed in rodent islets, it has also been observed in human islets, and may be useful in comparing additional transcripts that may contribute to the observed profiles.

    1. Reviewer #2 (Public review):

      I have completed a thorough review of this paper, which seeks to use the large datasets of species occurrences available through GBIF to estimate variation in how large numbers of plant and animal species are associated with urbanization throughout the world, describing what they call the "species urbanness distribution" or SUD. They explore how these SUDs differ between regions and different taxonomic levels. They then calculate a measure of urban tolerance and seek to explore whether organism size predicts variation in tolerance among species and across regions.

      The study is impressive in many respects. Over the course of several papers, Callaghan and coauthors have been leaders in using "big [biodiversity] data" to create metrics of how species' occurrence data are associated with urban environments, and in describing variation in urban tolerance among taxa and regions. This work has been creative, novel, and it has pushed the boundaries of understanding how urbanization affects a wide diversity of taxa. The current paper takes this to a new level by performing analyses on over 94000 observations from >30,000 species of plants and animals, across more than 370 plant and animal taxonomic families. All of these analyses were focused on answering two main questions:

      (1) What is the shape of species' urban tolerance distributions within regional communities?

      (2) Does body size consistently correlate with species' urban tolerance across taxonomic groups and biogeographic contexts?

      Overall, I think the questions are interesting and important, the size and scope of the data and analyses are impressive, and this paper has a potentially large contribution to make in pushing forward urban macroecology specifically and urban ecology and evolution more generally.

      Despite my enthusiasm for this paper and its potential impact, there are aspects that could be improved, and I believe the paper requires major revision.

      Some of these revisions ideally involve being clearer about the methodology or arguments being made. In other cases, I think their metrics of urban tolerance are flawed and need to be rethought and recalculated, and some of the conclusions are inaccurate. I hope the authors will address these comments carefully and thoroughly. I recognize that there is no obligation for authors to make revisions. However, revising the paper along the lines of the comments made below would increase the impact of the paper and its clarity to a broad readership.

      Major Comments:

      (1) Subrealms

      Where does the concept of "subrealms" come from? No citation is given, and it could be said that this sounds like an idea straight out of Middle Earth. How do subrealms relate to known bioclimatic designations like Köppen climate classifications, which would arguably be more appropriate? Or are subrealms more socio-ecologically oriented? From what I can tell, each subrealm lumps together climatically diverse areas. It might be better and more tractable to break things down by continent, as the rationale for subrealms is unclear, and it makes the analyses and results more confusing. The authors rationalized the use of subrealms to account for potential intraspecific differences in species' response to urbanization, but that is never a core part of the questions or interpretation in the paper, and averaging across subrealms also accounts for intraspecific variation. Another issue with using the subrealm approach is that the authors only included a species if it had 100 observations in a given subrealm, leading to a focus on only the most common species, which may be biased in their SUD distribution. How many more species would be included if they did their analysis at the continental or global scale, and would this change the shape of SUDs?

      (2) Methods - urban score

      The authors describe their "urban score" as being calculated as "the mean of the distribution of VIIRS values as a relative species-specific measure of a response to urban land cover."

      I don't understand how this is a "relative species-specific measure". What is it relative to? Figures S4 and S5 show the mean distribution of VIIRS for various taxa, and this mean looks to be an absolute measure. Mean VIIRS for a given species would be fine and appropriate as an "urban score", but the authors then state in the next sentence: "this urban score represents the relative ranking of that species to other species in response to urban land cover".

      That doesn't follow from the description of how this is calculated. Something is missing here. Please clarify and add an explicit equation for how the urban score is calculated because the text is unclear and confusing.

      (3) Methods - urban tolerance

      How the authors are defining and calculating tolerance is unclear, confusing, and flawed in my opinion.

      Tolerance is a common concept in ecology, evolution, and physiology, typically defined as the ability of an organism to maintain some measure of performance (e.g., fitness, growth, physiological homeostasis) in the presence versus absence of some stressor. As one example, in the herbivory literature, tolerance is often measured as the absolute or relative difference in fitness of plants that are damaged versus undamaged (e.g., https://academic.oup.com/evolut/article/62/9/2429/6853425?login=true).

      On line 309, after describing the calculation of urban scores across subrealms, they write: "Therefore, a species could be represented across multiple subrealms with differing measures of urban tolerance (Fig. S4). Importantly, this continuous metric of urban tolerance is a relative measure of a species' preference, or affinity, to urban areas: it should be interpreted only within each subrealm".

      This is problematic on several fronts. First, the authors never define what they mean by the term "tolerance". Second, they refer to urban tolerance throughout the paper, but don't describe the calculation until lines 315-319, where they write (text in [ ] is from the reviewer):

      "Within each subrealm, we further accounted for the potential of different levels of urbanization by scaling each species' urban score by subtracting the mean VIIRS of all observations in the subrealm (this value is hereafter referred to as urban tolerance). This 'urban tolerance' (Fig. S5) value can be negative - when species under-occupy urban areas [relative to the average across all species] suggesting they actively avoid them - or positive - when species over-occupy urban areas [relative to the average across all species] suggesting they prefer them (i.e., ranging from urban avoiders to urban exploiters, respectively)."

      They are taking a relativized urban score and then subtracting the mean VIIRS of all observations across species in a subrealm. How exactly one interprets the magnitude isn't clear, and they admit this metric is "not interpretative across subrealms".

      This is not a true measure of tolerance, at least not in the conventional sense of how tolerance is typically defined. The problem is that a species distribution isn't being compared to some metric of urbanness, but instead it is relative to other species' urban scores, where species may, on average, be highly urban or highly nonurban in their distribution, and this may vary from subrealm to subrealm. A measure of urban tolerance should be independent of how other species are responding, and should be interpretable across subrealms, continents, and the globe.

      I propose the authors use one of two metrics of urban tolerance:

      (i) Absolute Urban Tolerance = Mean VIIRS of species_i - Mean VIIRS of city centers

      Here, the mean VIIRS of city centers could be taken from the centers of multiple cities throughout a subrealm, across a continent, or across the world. The units are the original VIIRS units, where 0 would correspond to species being centered on the most extreme urban habitats, and the most extreme negative values would correspond to species that occupy the most non-urban habitats (i.e., no artificial light at night). In essence, this measure of tolerance would quantify how far a species' distribution is shifted relative to the most highly urbanized habitat available.

      (ii) % Urban Tolerance = (Mean VIIRS of species_i - Mean VIIRS of city centers) / Mean VIIRS of city centers * 100%

      This metric provides a % change in species mean VIIRS distribution relative to the most urban habitats. This value could theoretically be negative or positive, but will typically be negative, with -100% being completely non-urban, and 0% being completely urban tolerant.

      Both of these metrics can be compared across the world, as they provide either absolute (equation i) or relative (equation ii) measures of urban tolerance that are comparable and easily interpretable in any region.
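
      As a minimal sketch of how the two proposed metrics could be computed, assuming each species is represented by a list of VIIRS values at its occurrence records and that a reference list of VIIRS values sampled at city centers is available (the function and argument names below are hypothetical, not from the manuscript):

```python
from statistics import mean


def absolute_urban_tolerance(species_viirs, city_center_viirs):
    """Metric (i): mean species VIIRS minus mean city-center VIIRS.

    The result is in the original VIIRS units; 0 means the species is
    centered on the most urban habitats, negative means less urban.
    """
    return mean(species_viirs) - mean(city_center_viirs)


def percent_urban_tolerance(species_viirs, city_center_viirs):
    """Metric (ii): metric (i) as a percentage of mean city-center VIIRS.

    -100 corresponds to a completely non-urban distribution (mean VIIRS
    of 0) and 0 to a distribution centered on city centers.
    """
    center = mean(city_center_viirs)
    if center == 0:
        raise ValueError("mean city-center VIIRS must be non-zero")
    return (mean(species_viirs) - center) / center * 100.0
```

      Because the reference is the city-center mean rather than the mean across co-occurring species, the same species would receive the same score regardless of which other species happen to be sampled in its region.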

      In summary, the definition of tolerance should be clear, the metric should be a true measure of tolerance that is comparable across regions, and an equation should be given.

      (4) Figure 1: The figure does not stand alone. For example, what is the hypothesis for thermophily or the temperature-size rule? The authors should expand the legend slightly to make the hypotheses being illustrated clearer.

      (5) SUDs: I don't agree with the conclusion given on line 83 ("pattern was consistent across subrealms and several taxonomic levels") or in the legend of Figure 2 ("there were consistent patterns for kingdoms, classes, and orders, as shown by generally similar density histograms shapes for each of these").

      The shapes of the curves are quite different, especially for the two Kingdoms and the different classes. I agree they are relatively consistent for the different taxonomic Orders of insects.

      Comments on revised version:

      I believe their response is thorough and thoughtful. I still disagree with them on some fundamental points of their methodology. However, I would prefer to let my review and their response stand as is. This will allow engaged readers to see both sides of the arguments and judge for themselves whether they believe the revisions are sufficient and if my concerns are valid.

    1. Reviewer #1 (Public review):

      Summary:

      This study demonstrates, through a series of EEG and MEG experiments, that the human brain automatically categorizes words from alphabetic and non-alphabetic languages, and it unpacks the neural mechanisms of this process from multiple angles. The work examines not only univariate repetition-suppression (RS) effects, but also how repeating or alternating languages influences the representational similarity of words within and across language categories.

      Strengths:

      The univariate RS effects across multiple experiments lend support to some of the main conclusions.

      Comments on revised version:

      The authors have made appropriate revisions and supplements in response to the issues I raised, which has largely resolved my concerns.

    2. Reviewer #2 (Public review):

      Summary:

      This study investigates how the human brain categorizes visual words from distinct writing systems (alphabetic vs. non-alphabetic). Using a repetition suppression paradigm combined with electroencephalography and magnetoencephalography, the authors conducted nine experiments with independent participants to identify the neural network underlying language-based categorization, characterize its temporal dynamics, and test whether this process operates independently of linguistic properties such as semantic meaning and pronunciation.

      Strengths:

      The study employs a well-validated design with clear control conditions and systematically manipulates key variables including writing system, language familiarity, and native language background. The use of nine experiments with independent participant samples strengthens the reliability and replicability of the results. The work combines EEG and MEG, cross-validating findings across imaging modalities to support the reported neural effects. A combination of univariate, multivariate, and connectivity analyses is used to characterize neural responses and network interactions. Results are consistent across multiple language groups and for both familiar and unfamiliar languages, supporting the generalizability of the identified neural mechanism beyond specific languages or prior experience.

      Comments on revised version:

      Earlier versions of the manuscript framed these findings as more directly reflecting the social-categorization function of language. In the revised manuscript, the authors now more carefully distinguish language-based word categorization from broader claims regarding social categorization and explicitly acknowledge that the current experiments do not directly test social evaluation or intergroup processes. These revisions improve the conceptual precision of the work and address my major concern from the previous review.

      The additional methodological clarifications and supplementary analyses also strengthen the manuscript. Overall, I believe the revised version provides solid evidence for rapid language-based categorization of visual words across different writing systems.

  2. Jun 2026
    1. Reviewer #1 (Public review):

      Summary:

      This manuscript addresses an important question in auditory neuroscience and neuroprosthetics: whether cortical responses to cochlear implant stimulation resemble those evoked by natural acoustic stimulation, or whether electrical stimulation engages a distinct cortical representation. The authors use high-density intracranial EEG recordings in rats to compare responses to pure tones in normal-hearing animals with responses to single-channel cochlear implant stimulation in deafened animals. They combine analyses of event-related potentials, high-gamma activity, trial-by-trial variability, PCA/TCA-based dimensionality reduction, and decoder-based measures of stimulus information.

      Strengths:

      A major strength of the study is the question it addresses. Understanding how electrical cochlear stimulation is represented centrally is highly relevant for cochlear implant design, fitting strategies, and rehabilitation. The comparison between acoustic and electrical stimulation, including within-animal comparisons in a subset of cases, is valuable because it directly addresses whether implant-evoked activity can be interpreted within the framework of normal acoustic tonotopy.

      The methodological approach is also a strength. Dense cortical surface recordings provide simultaneous access to spatial and temporal features of auditory cortical responses. The combination of PCA, TCA, and decoder analyses gives complementary views of the data, and the information-transfer analysis provides an interesting way to ask whether representations learned from acoustic stimulation generalize to electrical stimulation.

      Weaknesses:

      The main weakness is that the evidence for spatial organization remains difficult to interpret. In Figure 2, the authors argue that both tone-evoked and cochlear implant-evoked responses are spatially organized, but the slope analyses are not significant for the cochlear implant condition. The revised vector-strength analysis supports the presence of non-random spatial structure, but this is not the same as demonstrating a clear graded cochleotopic organization. The manuscript would be strongest if it consistently distinguished between non-random spatial structure, coarse topography, and true graded tonotopy or cochleotopy.

      A related issue is that some figure titles and interpretive statements still appear stronger than the data justify. For example, the TCA results in Figure 7 are described as revealing topographically organized latent spatial factors, but the statistical support appears strongest for normal-hearing high-gamma responses, with weaker or non-significant results in other conditions. These data remain interesting, but they would be better framed as evidence for weak or coarse spatial structure rather than robust topographic organization across all modalities.

      The decoder analyses are improved, especially with the added tone-to-tone control. This control supports the conclusion that poor acoustic-to-CI transfer is not simply a failure of the TCA/LDA pipeline. However, the analysis remains model-dependent, and the absolute information transfer values are low. It would be helpful either to include an analogous analysis using raw ERP/high-gamma features or to explain more explicitly why the TCA-based approach is the appropriate primary test. The data support poor generalization between acoustic and implant-evoked cortical responses, but claims about perceptual qualities should remain speculative because perception is not directly measured in these experiments.

      Finally, although methodological reporting is much improved, some verification remains indirect. The authors provide useful implantation criteria and cite prior validation of their deafening approach, but the manuscript would be clearer if it explicitly distinguished between validation performed in the present animals and validation based on previous cohorts. This distinction is important because surgical variability, implantation efficacy, and deafening completeness can influence the interpretation of cochlear implant experiments.

      Comments on revised version:

      The revised manuscript is considerably improved. The authors have clarified several methodological details, added a statistical framework that better accommodates both paired and unpaired animals, provided a clearer account of animal cohorts, added peripheral ECAP/forward-masking data to support the cochlear specificity of implant stimulation, and included a useful positive control for the cross-modal decoder analysis. These additions make the manuscript stronger and help readers interpret the main findings more confidently.

      The results support the conclusion that acoustic and cochlear implant stimulation evoke cortical responses with different properties. In particular, acoustic responses support better single-trial stimulus decoding than cochlear implant responses, and decoders trained on acoustic responses transfer poorly to implant-evoked responses. The evidence for spatial organization is more nuanced. The cochlear implant condition shows evidence of non-random spatial structure, but not a clear graded cochleotopic map. The normal-hearing condition is also less visually clear than might be expected from prior tonotopy studies, although the added analyses and comparisons to previous work help contextualize this result. Overall, the study makes a valuable contribution, provided that the claims about spatial organization and perceptual interpretation remain appropriately cautious.

      The revision addresses several important concerns from the original version. The use of mixed-effects models better matches the partially paired experimental design. The expanded Methods improve reproducibility. The new cohort schematic helps clarify which animals contributed to behavioral and neural datasets. The ECAP forward-masking measurements add useful peripheral validation, and the within-modality decoder control strengthens the interpretation of the poor cross-modal transfer result. Together, these changes substantially improve the manuscript.

      The work is likely to be of interest to auditory neuroscientists, cochlear implant researchers, and neuroengineers. Even where some conclusions require cautious wording, the dataset and analytical framework may be useful for future studies aiming to relate cortical responses to implant programming, perceptual learning, or closed-loop neuroprosthetic approaches.

      Overall, the revised manuscript is stronger and addresses an important problem with useful methods and analyses. The results most convincingly show that acoustic responses support better single-trial decoding than acute cochlear implant responses, and that acoustic-trained decoders generalize poorly to implant-evoked activity. The evidence for robust spatial organization, especially in the cochlear implant condition, is more limited and should be presented with appropriate caution.

    2. Reviewer #2 (Public review):

      Summary:

      This article reports measurements of iEEG signals from the rat auditory cortex during cochlear implant or sound stimulation in separate groups of rats. The observations indicate some spatial organization of responses to cochlear implant stimuli, but this organization is very different from that evoked by sound.

      Strengths:

      The study includes interesting analyses of the sound and cochlear implant representation structure based on decoders.

      Weaknesses:

      The observation that responses to cochlear implant stimulation are spatially organized is not new (e.g., Adenis et al. 2024).

      The claim that spatial and temporal dimensions contribute information about the sound is also not new; there is a large literature on this topic.

      The analyses supporting the claim that there is a mismatch between cochlear implant and sound representation are still unclear, particularly in Fig. 8.

    3. Reviewer #3 (Public review):

      Summary:

      Through micro-electroencephalography, Hight and colleagues studied how the auditory cortex as a whole responds to cochlear implant stimulation compared with classic pure tones. Taking advantage of a doubly implanted rat model (micro-ECoG and cochlear implant), they tracked and analyzed changes in the temporal and spatial aspects of cortical evoked responses in both normal-hearing and cochlear-implanted animals. After establishing that single-trial responses were sufficient to encode stimulus properties, the authors explored several decoder architectures to study whether the cortex encodes each stimulus modality in a similar or different manner. They conclude that a) intracranial EEG evoked responses can be accurately recorded and did not differ between normal-hearing and cochlear-implanted rats; b) although coarsely spatially organized, CI-evoked responses had higher trial-by-trial variability than pure-tone responses; c) stimulus identity is independently represented by temporal and spatial aspects of cortical representations and can be accurately decoded by various means from single trials; and d) a decoder trained on pure tones cannot accurately decode CI-stimulus identity.

      Strength:

      The model combining micro-ECoG and cochlear implantation is very well designed, and the methodology to extract both the event-related potentials (ERPs) and high-gamma (HG) responses is appropriately applied. Likewise, PCA-LDA and TCA-LDA are powerful tools that take full advantage of the information provided by the cortical ensembles.

      The overall structure of the paper, with a well-paced and exhaustive progression through each step in the evolution of the decoder, is much appreciated and easy to follow. The exploration of single-trial encoding and stimulus identity through the temporal and spatial domains provides new avenues to characterize cortical responses to CI stimulation and their central representation. The fact that single trials suffice to decode stimulus identity regardless of modality is noteworthy and of great interest. Although the authors confirm that iEEG remains difficult to translate to the clinic, the insights provided by the study confirm the potential benefit of using central decoders in clinical settings.

      Weakness:

      The conclusion of the paper, especially the concept of distinct cortical encoding for each modality, is unfortunately only partially supported by the results, as the authors ignored fundamental limitations of CI-related stimulation.

      First, the authors stimulated in monopolar mode which, although clinically relevant, notoriously generates high current spread in rodent models. In the averaged BF maps for iEEG (Fig-2A, C), BFs ranged from 4 to 16 kHz with a predominance of 4 kHz BFs. The lack of BFs at higher frequencies might reveal a location mismatch between the frequency range sampled at the level of the cortex (low to medium frequencies) and the frequency range covered by the CI, which was inserted mostly in the first turn-and-a-half of the cochlea (high to medium frequencies). In Fig-2F (and to some extent 2A), most CI electrodes elicited responses around the 4 kHz regions, and averaged maps show a predominance of CI-3-4 across cortex (Fig-2C, H and Sup Fig. 3), from areas with 4 kHz BF to areas with 16 kHz BF. It is doubtful that CI-3-4 are located near the 4 kHz region, based on Müller's work (1991) on the frequency representation in the rat cochlea. Moreover, Supplemental Figure 3 shows that only a couple of CI electrodes are predominantly represented at the level of the cortex. Thus, it seems possible that current spread ended up stimulating higher turns of the cochlea, or even the modiolus, in a non-specific manner, greatly reducing (or smearing) the place-coding/frequency resolution of each electrode, which in turn could explain the coarse topographic (or coarsely tonotopic, according to the manuscript) organization of the cortical responses.

      Second, although the authors acknowledge that post-lingual CI users always have an adaptation period, their conclusion is based on measurements that are relatively "early" in the CI-use timeline, since iEEG signals were collected a) acutely, right after monaural implantation and stimulation, b) under anesthesia, and c) using an unmodulated pulse train fixed at 900 pps regardless of the electrode used, and thus lacking any temporal information shifts in relation to cochleotopic electrode placement. In other words, all CI electrodes had the same rate, whereas one would expect basal CI electrodes to be amplitude modulated at higher frequencies than apical electrodes.

      As much as the reviewer likes the overall approach using PCA-LDA and TCA, and agrees that information transfer seems nonexistent at the time of measurement, the authors should be more careful in their strong conclusion that two distinct encodings exist. The lack of overlap between sound and electric stimulation representations might exist only transiently, and this should be acknowledged more clearly in the discussion. Without repeating the iEEG measurements at a later period, after chronic use of the CI, it is not possible to definitively claim that two distinct, non-overlapping codes coexist at all times.

      Nevertheless, the reviewer wants to reiterate that the study proposed by Hight et al. is well constructed and relevant to the field, and that the overall proposal of improving patient performance and helping adaptation in the first months of CI use by studying central responses should be pursued, as it might help establish new guidelines or create new clinical tools.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates the impact of Pink1 loss on glial function and neuronal health in a Drosophila model, highlighting the role of mitochondria-organelle contacts and key genes such as Ccz1, Vps13, Mon1, and Rab7. The work provides insights into cellular processes underlying neurodegenerative diseases, with a focus on glia-neuron interactions.

      Comments on revised version:

      I have reviewed the revised manuscript and the authors' responses to previous comments. The authors have addressed the key concerns raised by the reviewers, including validation of the Mz-GAL4 line and additional control experiments. The remaining issues caused by experimental constraints are understandable in this study.

      However, several concerns remain. Notably, some key results were removed due to the use of inadequately characterized fly lines, and the lack of follow-up experiments to address these issues raises concerns regarding the validity and reliability of the findings. Furthermore, the absence of experiments examining Rab7-mediated membrane trafficking or the interactions between mitochondria and lysosomes in the Pink1 mutant presents a limitation. These missing elements reduce the clarity and interpretability of Figure 5 for readers.

      On a positive note, the data showing that reducing Vps35/Vps13 enhances neuronal function and rescues Pink1 mutant phenotypes in ensheathing glia contributes meaningfully to the overall narrative.

      Despite these limitations, this research addresses an important question in neuroscience using the Drosophila model. It provides a novel perspective on Parkinson's disease and neurodegeneration by exploring mechanisms underlying Pink1 loss and suggesting a role for mitochondria-organelle interactions in ensheathing glia, potentially regulated via Vps35/Vps13-mediated pathways.

      Overall, the current version presents a clear and meaningful contribution to the field.

    2. Reviewer #2 (Public review):

      Summary:

      This study proposes a novel role for ensheathing glia (EG) in a Pink1 model of Parkinson's disease and shows that this cell population exhibits the highest number of DEGs at a pre-symptomatic stage. In the olfactory system, there seem to be morphological changes in this cell type that resemble an 'activated' state, and the authors further show that the neuronal loss of Pink1 is responsible for this defect. The authors go on to show that manipulation of Pink1 in EG also leads to some defects in the visual system and in the dopaminergic neurons (DAN) that innervate the mushroom body (MB), and they perform a screen based on the 'on-transient' defect of the ERG to identify potential genes that may modulate the function of EG in synaptic regulation. They focus on several genes related to vesicle trafficking, including Vps13 and Vps35, and perform some additional experiments in the visual system and MB to propose that vesicle/lipid trafficking in EG is an important factor in PD pathogenesis.

      Strengths:

      The study proposes functional and mechanistic connections between several genes that have been linked to PD (PINK1, VPS35 and VPS13A/C). I feel that the experiments presented in Figure 1-Figure 3C are performed with rigor and that the data are convincing and novel. The selection of Drosophila to study these questions is also a strength, and the lab has extensive experience in this field and with this model organism.

      Weaknesses:

      In this revised manuscript, a number of concerns raised by this and the other reviewer were addressed. The authors now admit that some of the genetic reagents used in their screen and follow-up assays were inappropriately utilized, and have changed the latter half of the paper (Fig 3D-Fig 4) quite significantly (e.g., only 1 gene is now considered a hit in Fig 3D, and the analysis of several genes in Fig 4 has been removed and replaced by experiments performed on Vps35). The transition between Figure 3D and Figure 4 is quite abrupt, and the authors do not follow up on CG17660 (the single hit from their screen, which is not further validated, so it is not clear whether this genetic reagent is clean) or on the effect of Vps35 RNAi on the synaptic phenotype. Therefore, there is still a weakness in Figure 3D-Figure 4, which weakens the paper, especially since the new model diagram provided in Figure 5 is not really investigated at the molecular level.

    1. Reviewer #1 (Public review):

      Summary:

      In this study entitled "Linking Germline Telomere Removal to Global Programmed DNA Elimination in Tetrahymena Genome Differentiation", Nagao and colleagues examine the fate of germline chromosome ends during somatic genome differentiation in the ciliate Tetrahymena thermophila. During sexual reproduction, a new somatic genome is created from a zygotic, germline-derived genome by extensive programmed DNA elimination events. It has been known for some time that the termini of the germline chromosomes are eliminated, but the exact process and kinetics of the elimination events have not been thoroughly investigated. The authors first use germline-specific telomere probes to show that the loss of these chromosome ends occurs with similar timing to other DNA elimination events. By comparative analysis of the assembled germline and somatic genomes, the authors find that the ends of each of the germline chromosomes are composed of a few hundred kilobases of micronuclear limited sequences (MLS) that are removed starting around 14 hours after the start of conjugation, which initiates sexual development. They then develop an in-situ hybridization assay to track the fate of one end of chromosome 4 while simultaneously following the adjacent macronuclear destined sequence (MDS) retained in the new somatic genome. This allows the authors to show more clearly that these adjacent chromosomal segments are initially amplified in the developing genome before the terminal MLS is eliminated. Finally, they mutate the chromosome breakage sequence (CBS) that normally separates the MLS terminus from the adjacent MDS region and show that strains that develop with only one mutant chromosome can produce viable sexual progeny, but it appears that both the MLS and the MDS from the mutant chromosome are lost. If both chromosome copies have the CBS mutation, the cells arrest during development, do not eliminate many germline limited sequences, and fail to produce viable progeny.

      Overall, this study provides many new insights into the fate of germline chromosome ends during somatic genome remodeling and suggests extensive coordination of different DNA elimination events in Tetrahymena.

      Strengths:

      Overall, the experiments were well executed with appropriate controls. The findings are generally robust. Importantly, the study provides several novel findings. First, the authors provide a fairly comprehensive characterization of the size of the MLS at the end of each germline chromosome. They also report on the highly repetitive composition of these chromosome termini. Second, the authors develop a novel method to study the fate of chromosome termini during development and use it to conclusively track the elimination of these termini. Third, the authors show that the elimination of these termini appears to occur concurrently with most other DNA elimination events during somatic genome differentiation. And fourth, the authors show that failure to separate these eliminated sequences from the normally retained chromosome alters the fate of the adjacent MDS and causes loss of the cells' ability to produce viable progeny. The authors initially hypothesized that DNA elimination may be blocked due to inappropriate silencing of genes in the MDS region when the CBS is mutant, but gene expression analysis showed that this is not the case.

      Weaknesses:

      After revision of the manuscript in response to the initial reviewers' critique, most weaknesses have been addressed. One remaining weakness is that, since the authors only mutated the end of one germline chromosome, it is not clear whether the elimination of the MDS adjacent to the terminal MLS on chromosome 4 when the CBS is mutated is a general phenomenon, i.e. would happen at all chromosome ends, or is unique to the situation at chromosome 4R. Knowing whether it is a general phenomenon or not would provide important insight into the authors' findings. The authors did attempt to look at other chromosome ends, but technical limitations currently stymie this effort.

      The other weakness is that it remains unclear how failure to carry out DNA elimination appears to induce a checkpoint during development, but this open question is not unique to this study.

      Comments on revised version:

      The authors have significantly improved the study. The addition of the RNA-seq analysis allowed these researchers to show that their initial hypothesis - that loss of a CBS leads to inappropriate gene silencing in the neighboring MDS region - appears not to be the case. I do not have further suggestions for the authors.

    2. Reviewer #2 (Public review):

      Summary:

      Mochizuki and colleagues investigated how the germline (MIC) telomere is removed during programmed genome rearrangement in the developing somatic nucleus (MAC). Using an optimized oligo-FISH procedure, the authors demonstrated that MIC telomeres were co-eliminated with a large region of MIC-limited sequences (MLS) demarcated on the opposite side by a sub-telomeric chromosome breakage site (CBS). This conclusion was corroborated by the latest assembly of the Tetrahymena MIC genome. They further employed CRISPR-Cas9 mutagenesis to disrupt a specific sub-telomeric CBS (4R-CBS). In the uniparental progeny (mutant X WT), DNA elimination of the sub-telomeric MLS was not affected, but the adjacent MAC-destined sequence (MDS) may be co-eliminated. However, in the biparental progeny (mutant X mutant), global DNA elimination was arrested, revealing previously unrecognized connections between chromosome breakage and DNA elimination. The study also paves the way for future work on the underlying molecular mechanisms. The work is rigorous, well-controlled, and offers important insights into how eukaryotic genomes demarcate genic regions (retained DNA) and regions derived from transposable elements (TEs; eliminated DNA) during differentiation. The identification of chromosome breakage sequences as a critical architectural element of the genome separating TE-derived regions from functional genes is a key conceptual contribution.

      Strengths:

      New method development: Oligo-FISH in Tetrahymena. This allows high-resolution visualization of critical genome rearrangement events during MIC-to-MAC differentiation. This method will be a very powerful tool in this area of study.

      The conclusion is strongly supported by integrated analyses of PCR-based assays, as well as cytological, genomic, and transcriptomic data.

      Rigorous genetic analysis of the role played by 4R-CBS in separating the fate of sub-telomeric MLS (elimination) and MDS (retention).

    3. Reviewer #3 (Public review):

      Programmed DNA elimination (PDE) is a process that removes a substantial amount of genomic DNA during development. While it contradicts the genome constancy rule, an increasing number of organisms have been found to undergo PDE, indicating its potential biological function. Single-cell ciliates have been used as a prominent model system for studying PDE, providing important mechanistic insights into this process. Many of those studies have focused on the excision of internally eliminated sequences (IES) and the subsequent repair using non-homologous end joining (NHEJ). These studies have led to the identification of small RNAs that mark retained or eliminated regions and the transposons that generate double-strand breaks.

      In this manuscript, Nagao and Mochizuki examined the other type of breaks in ciliates that are healed with telomere addition. They specifically focused on the sequences at the ends of the germline (MIC) chromosomes, which have received relatively less attention due to the technical challenges associated with the highly repetitive nature of the sequences. The authors used the Tetrahymena model and developed a set of new tools. They used a novel FISH strategy that enables the distinction between germline and somatic telomeres, as well as the retained and eliminated DNA near the chromosome ends. This allows them to track these sequences at the cellular level throughout the development process, where PDE occurs. They also analyzed the more comprehensive germline and somatic genomes and determined at the sequence level the loss of subtelomeric and telomere sequences at all chromosome ends. Their result is reminiscent of the PDE observed in nematodes, where all germline chromosome ends are removed and remodeled. Thus, the finding connects two independent PDE systems, a protozoan and a metazoan, and suggests the convergent evolution of chromosome end removal and remodeling in PDE.

      The majority of sites (8/10) at the junctions of retained and eliminated DNA at the chromosome ends contain a chromosome breakage sequence (CBS). The authors created a set of mutants that modify the CBS at the end of chromosome 4R. CBS regions are challenging for CRISPR due to their AT-rich sequences, making the creation of the 4R-CBS mutants a significant breakthrough. They used the FISH assay to determine whether PDE still occurs in these mutant strains with compromised CBS. Surprisingly, they found that PDE was not blocked; instead, the adjacent retained DNA was also eliminated, suggesting a co-elimination event when the breakage is impaired. Furthermore, in biparental mutant crosses, no PDE occurred and no viable progeny were produced, indicating that the removal of chromosome ends is crucial for proper PDE and sexual progeny development. Overall, the work demonstrates a critical role for 4R-CBS in separating retained and eliminated DNA.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      In this study, the authors propose that HSV-1 infection degrades the class I histone deacetylases HDAC1 and HDAC2. The MDM2 E3 ubiquitin ligase from the DNA damage response pathway is responsible for ubiquitinating these HDACs that are subsequently degraded via proteasomes. The authors hypothesize that HDAC degradation will cause hyperacetylation of viral chromatin and enable viral gene transcription.

      Strengths:

      The ubiquitination of HDAC1 and HDAC2 by MDM2 and the mapping studies are clear.

    2. Reviewer #2 (Public review):

      Summary:

      The authors discovered that HDAC1/2 are degraded in HSV-1 and PRV infections. They attempted to establish a new mechanism by which HDAC1/2 are translocated to the cytoplasm to be degraded in HSV-1 infection, and the degradation causes changes in histone acetylation to affect the DDR pathway.

      Strengths:

      (1) The finding that HDAC1/2 are degraded during HSV-1 and PRV infection is interesting and may have an impact beyond the virology field.

      (2) Significant work to identify the ubiquitin site in HDAC1/2 and K63 linkage.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without an additional round of formal review from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      In the manuscript "Pathogen-Phage Geomapping to Overcome Resistance," Do et al. present an impressive demonstration of using geographical sampling and metagenomics to guide sample choice for enrichment in human-associated microbes and the pathogen of interest to increase the chances of success for isolating phages active against highly resistant bacterial strains. The authors document many notable successes (17!) with highly resistant bacterial isolates and share a thoughtfully structured phage discovery effort, potentially opening the door to similar geomapping efforts across the field. While the work is methodologically strong and valuable for the community, there are a few areas where additional clarification and analysis could better align the claims with the data presented.

      Strengths:

      (1) The manuscript describes a well-executed and transparent example of overcoming a major obstacle in therapeutic virus identification, providing a practical success story that will resonate with researchers in microbiology and medicine.

      (2) Many phage researchers have anecdotally experienced a similar phenomenon, that a particular wastewater treatment plant always seems to have the pathogens you need. Quantifying this with metagenomics modernizes and adds evidence to this phenomenon in a way that could help researchers reproduce this success in a methodical way.

      (3) The methodology of combining environmental sampling, viral screening, and host-range analysis is clearly articulated and reproducible, offering a valuable blueprint for others in the field.

      (4) The data are presented with appropriate analytical rigor, and the results include robust sequencing and metagenomic profiling that deepen understanding of local viral communities.

      (5) The 17 successes yielding 35 phages have a lot of phylogenetic novelty beyond what the Tailor labs have typically found with previous methods.

      (6) The work highlights a practical and innovative solution to an increasingly important clinical problem, supporting the development of personalized antiviral strategies.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript by Do and colleagues aims to develop a workflow for isolating and identifying bacteriophages with potential applications in phage therapy against antibiotic-resistant pathogens. The workflow integrates geΦmapping as a strategy to identify potential phage sources, ΦHD as a device for phage concentration, and RΦ as a phage library constructed from the initial sampling, resulting in the discovery of 36 new phages. The paper is overall interesting, and the proposed method appears robust and effective.

      Strengths:

      The proposed methods combine state-of-the-art strategies to address the ever-increasing problem of antibiotic resistance. The methods are robust, and the controls are appropriate. The integration of environmental sampling, concentration strategies, and downstream genomic characterization is a clear strength and provides a potentially scalable framework for identifying candidate therapeutic phages. The manuscript is clearly written overall, and the results support the main conclusions.

      Comments on revised version:

      The manuscript has been adequately improved and adjusted according to the comments. There are some minor points: for example, Table S10 is labelled at the top of the page as Table S11. Also, it is a little unconventional to cite result figures and tables in the introduction.

      Regarding question 10, on why some of the most abundant vOTUs in the 5L sample were not detected in the concentrate: the answer is not satisfying, as it focuses on why very low-abundance vOTUs would not be detected, whereas the question is why some of the most abundant vOTUs were not detected. This does not affect the results or the interpretation made.

    1. Reviewer #1 (Public review):

      Summary:

      Gurnani et al. explore how dynamical properties of neural networks influence capacity for and mechanisms of learning. Specifically, they focus on Brain Computer Interface (BCI) learning, in which manipulations are applied to a decoder that maps neural activity onto computer cursors. This paradigm was introduced by Sadtler et al. 2014, and has become an influential part of the neuroscience motor learning literature. A particularly fascinating outcome of that body of work is the observation that "within-manifold" perturbations (WMPs), which preserve covariance structure in the neural population, are easier to learn than "outside-manifold" perturbations (OMPs), which break this. Since deep network parameter access is challenging (to say the least) in monkey experiments, the intuition for this split in learnability is ripe for modeling and theory work. Indeed, the authors here introduce a feedback-driven recurrent neural network model whose output drives a simulation of a neural decoder commonly used in BCI studies like the Sadtler paper. While there have now been several modeling studies exploring how neural networks could solve this task, the feedback control perspective gives the authors' new model an interesting niche. Overall, this is a thoroughly done and well-written modeling study, and a solid contribution to the literature on within- and outside-manifold perturbations.

      Strengths:

      Reframing the OMP and WMP learning from a feedback-driven dynamical systems perspective, not just a geometric one, is an interesting take. The controllability analysis (along with the clear difference in input-driven and recurrence-driven learning) is quite a cool result that helps better frame what might be happening in the primate brain during similar tasks.

      Weaknesses:

      Some of the more interesting aspects, especially the controllability analysis and the differences between input-driven and recurrence-driven learning, could be further developed, either by showing more analyses or by running more comparisons. A few sections could benefit from additional clarity on the strength and significance of the results.

    2. Reviewer #2 (Public review):

      Summary:

      The constraints on learning in the brain remain elusive. Using BCIs, Sadtler et al. demonstrated that the brain can rapidly learn new decoders that lie within the intrinsic neural manifold (short-term adaptation), while showing substantial difficulty learning decoders that lie outside the manifold. This finding suggests that neural manifolds impose constraints on learning. However, even among within-manifold decoders, there was considerable variability in learning rates that could not be explained solely by geometric factors.

      Here, Gurnani et al. propose that, in addition to manifold structure, neural dynamics (i.e., the flow field across states) impose critical constraints on learning. To test this idea, the authors trained RNNs that received real-time feedback (e.g., position error signals) during a BCI task in which the network controlled a cursor. The authors showed that short-term adaptation to a new decoder is facilitated by plasticity in sensory inputs, and that pre-existing dynamics influence the speed of adaptation across different decoders. These findings may explain previously unresolved constraints observed in BCI learning and suggest an important role for neural dynamics in constraining sensorimotor learning in the brain.

      Strengths:

      Overall, the work is highly impactful and is likely to motivate a new generation of BCI and learning experiments combining large-scale neural recordings with latent dynamical systems analyses. The paper is clearly written, and I only have minor comments, primarily for clarification.

      Weaknesses:

      There are no major weaknesses. Please see below for minor comments.

      (1) If I understand correctly, most analyses do not distinguish between the preparatory phase and the movement phase. Given that the preparatory phase is largely controlled by feedforward input, I suspect that most of the dynamical constraints underlying learning variability arise during the movement phase. Is this correct? If so, could the authors clarify or directly test this distinction?

      (2) P4: Position vs. velocity decoders: It would be helpful to describe whether and how the choice of velocity versus position decoders influences whether perturbations are learnable, and whether input-driven constraints arising in this task are similar.

      (3) The variance criteria used to screen decoder perturbations may themselves covary with learning rate, behavioral asymmetry, and overlap with controllable subspaces. A quantification of this relationship would help contextualize the findings and inform the design of future BCI experiments.

      (4) To support the comparison between Figures 3 and 7, and the conclusion that Figure 3 better matches the experimental data, which is an important point of the manuscript, could the authors provide quantitative values from the experimental data (e.g., how large is the change in variance within oPCs, etc.)?

      (5) Figure 8h: Is the variability in learning rates in models with different controller networks explained by the same dynamical constraints described in Figure 6? Demonstrating consistent dynamical constraints across model architectures would strengthen the paper's central conclusion.

      (6) Figure 8f: Why does feedforward controllability differ between conditions? This is mentioned in the text, but no explanation is provided.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates the role of the medial prefrontal cortex (mPFC) in generating goal-directed actions under threat, using a progressive behavioral paradigm, neural recordings, and optogenetic inhibition in mice. The authors demonstrate that while mPFC GABAergic neurons strongly encode cues, actions, and errors, particularly under high cognitive demand, this neural activity is not causally required for executing avoidance behaviors. By rigorously controlling for movement and arousal, the researchers found that much of the observed mPFC signaling actually reflects baseline behavioral states rather than the generation of the actions themselves. This dissociation between encoding and causality challenges traditional views of mPFC as an executive controller of action and provides a nuanced understanding of its role in evaluative and contextual processing.

      Strengths:

      The behavioral paradigm employed in this study is one of its greatest strengths, offering a rigorous, progressive, and well-controlled framework to dissect the neural mechanisms underlying avoidance under threat. This three-phase task design is particularly well-suited to tease apart the contributions of learning, discrimination, and cognitive load to both behavior and neural activity.

      By tracking movement (speed, rotations) and including it as a covariate in statistical models, the authors also underscore the need to control for movement and baseline activity when interpreting cortical signals, which is relevant for all studies of brain-behavior relationships, ensuring that behavioral changes are not due to general arousal or motor activity.

      Finally, the study combines multiple advanced techniques (fiber photometry, single-cell calcium imaging with miniscopes, and two distinct optogenetic inhibition methods) to provide a comprehensive look at both neural encoding and causal necessity.

      Comments on revised version.

      The authors adequately addressed all of the reviewers' comments and made great improvements to the manuscript, particularly enhancing the methods and figures to significantly improve clarity and readability.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript by Sajid et al. describes a comprehensive behavioral, imaging and optogenetic dataset investigating the role of the mPFC in avoidance and escape behaviors. Although many movement- and task-related variables are encoded by mPFC GABAergic neurons, the main conclusion is that they are unlikely to control behavioral output.

      Strengths:

      The manuscript is generally well executed and plausible in its conclusions. It provides an alternative viewpoint to many articles describing the involvement of the mPFC in behavior, based on a complex multi-stage behavioral paradigm acquired and analyzed in an unbiased way.

      Weaknesses:

      This reviewer sees two weaknesses.

      (1) In some cases, the explained variance, marginal and conditional, is low, suggesting the models only modestly capture the complexity in the data.

      (2) The manuscript is challenging to read due to the comprehensive and unbiased presentation style.

      Comments on revised version.

      The authors did a good job of addressing the reviewers' comments. One minor additional suggestion is to add references for the statement about the mPFC lesion studies in the last paragraph of the discussion.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      Morgan et al. studied how paternal dietary alteration influenced testicular phenotype, placental and fetal growth using a mouse model of paternal low protein diet (LPD) or Western Diet (WD) feeding, with or without supplementation of methyl-donors and carriers (MD). They found diet- and sex-specific effects of paternal diet alteration. All experimental diets decreased paternal body weight and the number of spermatogonial stem cells, while fertility was unaffected. WD males (irrespective of MD) showed signs of adiposity and metabolic dysfunction, abnormal seminiferous tubules and dysregulation of testicular genes related to chromatin homeostasis. Conversely, LPD induced abnormalities in the early placental cone, fetal growth restriction and placental insufficiency, which was partly ameliorated by MD. The paternal diets changed placental transcriptome in a sex-specific manner and led to a loss of sexual dimorphism in the placental transcriptome. These data provide a novel insight on how paternal health can affect the outcome of pregnancies, which is often overlooked in prenatal care.

      Strengths:

      The authors have performed a well-designed study using commonly used mouse models of paternal underfeeding (low protein) and overfeeding (Western diet). They performed comprehensive phenotyping at multiple timepoints including of the fathers, the early placenta and late gestation feto-placental unit. The inclusion of both testicular and placental morphological and transcriptomic analysis is a powerful non-biased tool for such exploratory observational studies. The authors describe changes in testicular gene expression revolving around histone (methylation) pathways that are linked to altered offspring development (H3.3 and H3K4), which is in line with hypothesised paternal contributions to offspring health. The authors report sex differences in control placentas that mimic those in humans, providing potential for translatability of the findings. The exploration of sexual dimorphism (often overlooked) and its absence in response to dietary modification is novel and contributes to the evidence-base for the inclusion of both sexes in developmental studies.

      Comments on revised version:

      The authors have done a great job addressing my concerns. The description of the data analysis and the figures are now much clearer. The inclusion of the potential links between the microbiome and male reproductive fitness is informative and improves the flow of the discussion.

    2. Reviewer #2 (Public review):

      Summary:

      The authors investigated the effects of a low-protein diet (LPD) and a high sugar- and fat-rich diet (Western diet, WD) on paternal metabolic and reproductive parameters and feto-placental development and gene expression. They did not observe significant effects on fertility; however, they reported gut microbiota dysbiosis, alterations in testicular morphology, and severe detrimental effects on spermatogenesis. In addition, they examined whether the adverse effects of these diets could be prevented by supplementation with methyl donors. Although LPD and WD showed limited negative effects on paternal reproductive health (with no impairment of reproductive success), the consequences on fetal and placental development were evident and, as reported in many previous studies, were sex-dependent.

      Strengths:

      This study is of high quality and addresses a research question of great global relevance, particularly in light of the growing concern regarding the exponential increase in metabolic disorders, such as obesity and diabetes, worldwide. The work highlights the importance of a balanced paternal diet in regulating the expression of metabolic genes in the offspring at both fetal and placental levels. The identification of genes involved in metabolic pathways that may influence offspring health after birth is highly valuable, strengthening the manuscript and emphasizing the need to further investigate long-term outcomes in adult offspring.

      The histological analyses performed on paternal testes clearly demonstrate diet-induced damage. Moreover, although placental morphometric analyses and detailed histological assessments of the different placental zones did not reveal significant differences between groups, their inclusion is important. These results indicate that even in the absence of overt placental phenotypic changes, placental function may still be altered, with potential consequences for fetal programming.

      Comments on revised version:

      The authors have adequately addressed all my previous comments.

    1. Joint Public Review:

      [Editor's Note: The reviewers and I felt that the previous reviewers' comments have been addressed and that the changes have improved the work.]

      In this study, the authors suggest that DuoHexaBody-CD37, a biparatopic CD37-targeting antibody, can induce direct cytotoxicity in diffuse large B-cell lymphoma (DLBCL) cells through antibody clustering and SHP-1 activation, independent of complement. They further propose that DuoHexaBody-CD37 inhibits cytokine-mediated pro-survival signalling, suggesting a broader role for CD37-directed therapy in disrupting tumour supportive signalling networks.

      A strength of the study is the systematic in vitro characterisation of signalling responses to DuoHexaBody-CD37 across both malignant and normal B-cells. The inclusion of phosphoproteomic profiling and mutant constructs provides mechanistic detail, and the findings may be of interest to researchers working on antibody therapeutics in lymphoma.

      However, the evidence supporting key mechanistic processes - particularly the specific subtype requirement for Fc receptor crosslinking - is incomplete and would benefit from further functional validation. While CD37 has been explored previously as a therapeutic target, this study does add mechanistic insight into direct cytotoxicity and cytokine modulation. Nevertheless, the exclusive reliance on in vitro systems makes the translational relevance unclear.

      Overall, the study provides valuable insight into CD37-mediated signalling in lymphoma cells, but the evidence remains incomplete to support broader conclusions about therapeutic impact. The additional mechanistic data included during revision are informative, but the precise basis of the observed cytotoxic effects remains incompletely defined.

    1. Reviewer #1 (Public review):

      The manuscript "Heterozygote advantage cannot explain MHC diversity, but MHC diversity can explain heterozygote advantage" explores two topics. First, it is claimed that the conclusion recently published by Mattias Siljestam and Claus Rueffler (referred to below as [SR] for brevity), namely that heterozygote advantage explains MHC diversity, does not withstand even a very slight change in ecological parameters. Second, a modified model that allows an expansion of the MHC gene family shows that homozygotes outperform heterozygotes. This is an important topic and could be of potential interest to the readership of eLife if the conclusions are valid and non-trivial.

      The resubmitted manuscript addresses several questions from my previous review. In particular, there is a more detailed description of how the code of Siljestam and Rueffler ([SR]) was used for the simulations and for the calculation of the factor 2.7 x 10^43 that is the key to the alleged breakdown of the numerical reasoning presented in [SR].

      Yet I think that important aspects of my critique of the first statement of the manuscript about the flaws of the [SR] model remain unanswered. I guess the discussion becomes rather general about the universality and robustness of various types of models to parameter changes. My point is that none of the models is totally universal. The model in [SR] is not phenomenological, as none of the parameters or functional forms were derived empirically. Instead, it is a proof-of-principle demonstration that inevitably grossly simplifies the actual immune response. The choice of constants and functions used in Eqs. (1-5) is dictated by mathematical convenience and works in a limited range of parameter values. It is shown in [SR] that for 3 pathogens and reasonable "virulence" \nu, the alleles branch. These conclusions are supported by the analytically derived Adaptive Dynamics branching criteria (7), which, contrary to the statement in the cover letter ("It is clear from Fig. 4 of Siljestam and Rueffler that the branching condition is far from sufficient for high MHC diversity."), are perfectly confirmed by the simulation data shown in Fig. 4.

      The mathematical simplicity of the [SR] model generates various artifacts, such as the reduction of the "condition" by an enormous factor of 2.7 x 10^43, mentioned by the author, and the resulting decrease in "survival" induced by the addition of a new pathogen. This occurs at the very large value of \nu=20, whose effect is enormous due to the Gaussian form of (1), which, once again, was chosen for mathematical convenience. In reality, a new pathogen cannot reduce "survival" by such a factor, as it would wipe out any resident population. So, to compensate for such an artifact, the additional factor c_max was introduced to buffer the excess. There is no reason to fix c_max once for an arbitrary number of pathogens, because varying c_max basically reflects the observation that a well-adapted individual must have a reasonable survival probability. At the same time, there are many ways in which the numerical simulation may break down when the survival rates become of the order of 10^(-43) instead of one, so it comes as no surprise that the diversification predicted by the adaptive dynamics does not readily occur in the scenario with an addition or removal of the 8th pathogen with a very high virulence \nu=20.

      I have doubts that the reported breakdown of the [SR] model with fixed c_max remains observable with less extreme values of m and \nu (say, for \nu=7 and m=3 plus or minus 1 used in Fig. 3 in the manuscript).

      So I still find that the claim that "the phenomenon that leads to high diversity in the simulations of Siljestam and Rueffler depends on finely tuned parameter values" is not well substantiated.

    2. Reviewer #2 (Public review):

      Summary:

      This study addresses the population genetic underpinnings of the extraordinary diversity of genes in the MHC, which is widespread among jawed vertebrates. This topic has been widely discussed and studied, and several hypotheses have been suggested to explain this diversity. One of them is based on the idea that heterozygous genotypes have an advantage over homozygotes. While this hypothesis lost support early on, a recent study claimed that there is good support for this idea. The current study highlights an important aspect that allows us to see the results presented in the earlier published paper in a different light, strongly changing the conclusions of the earlier study, i.e., there is no support for a heterozygote advantage. This is a very important contribution to the field. Furthermore, this new study presents an alternative hypothesis to explain the maintenance of MHC diversity, which is based on the idea that gene duplications can create diversity without heterozygosity being important. This is an interesting idea, but not entirely new.

      Strength:

      (1) A careful re-evaluation of a published model, questioning a major assumption made by a previous study.

      (2) A convincing reanalysis of a model that, in the light of the reanalysis, loses all support.

      (3) A convincing suggestion for an alternative hypothesis.

      Weakness:

      (1) The title of the study is catchy, but it is explained only at the very end of the paper.

    1. Reviewer #1 (Public review):

      In this study, the authors set out to determine how two classes of kinase inhibitors, which stabilise a disease-relevant enzyme in either an active (Type I) or inactive (Type II) state, influence its organisation and interactions with microtubule filaments in cells. Using state-of-the-art in-cell structural imaging approaches, they examine how these compounds affect the formation of protein filaments and their association with microtubules, and succeed in defining the underlying structural basis for these differences.

      A major strength of the work is the application of in-cell cryo-electron tomography combined with correlative imaging, which enables direct visualisation of protein organisation in a near-native cellular context. The data convincingly demonstrate that the Type I inhibitor compound stabilising the active state promotes extensive LRRK2 filament formation and microtubule bundling, whereas compounds stabilising the inactive state markedly reduce these interactions. The structural analysis further provides insight into how conformational states relate to filament organisation, including modelling of previously unresolved regions of the protein.

      These findings are internally consistent and align well with prior biochemical and structural studies, many of which were performed by the same team.

      There are, however, some limitations that should be noted. The experiments rely on overexpression of the I2020T mutant form of the LRRK2 protein, which is a rare variant, in a single cell type (293T cells), which may not fully reflect endogenous behaviour or wild-type LRRK2 in a physiological context. In addition, while the imaging data are compelling, the functional consequences of the observed filament formation and microtubule association remain unclear.

      The study therefore provides strong descriptive and structural insight, but more limited evidence linking these observations to cellular or disease-relevant outcomes.

      Overall, the authors largely achieve their aims, and the results support their central conclusion that different classes of kinase inhibitors have distinct effects on protein organisation in cells. The work represents an important advance in understanding how small molecules can reshape protein architecture in a cellular environment, with potential implications for therapeutic strategies. The methodological approach will also be of broad interest to the field, as it highlights the power of in-cell structural biology to study dynamic protein assemblies that are difficult to capture using traditional approaches.

    2. Reviewer #2 (Public review):

      Summary:

      Mutations in Leucine-Rich Repeat Kinase 2 (LRRK2) are a major cause of Parkinson's disease. LRRK2 PD-related mutations all result in increased kinase activity. Therefore, LRRK2 has been the focus of the development of kinase inhibitors. So far, two classes of kinase inhibitors have been identified: type 1 LRRK2-specific inhibitors that stabilize LRRK2 in a closed active-like conformation and broad-range type 2 inhibitors that stabilize LRRK2 in an open inactive-like conformation. Here, Basiashvili et al. used in-cell structural biology to study the effect of both type 1 and type 2 inhibitors on the localization and structural conformation of LRRK2-I2020T.

      Strengths:

      They showed that Type 1, but not Type 2, inhibitors induce LRRK2 filament formation on microtubules. Furthermore, they were able to build a structural map of full-length LRRK2 I2020T bound to a Type 1 inhibitor in a closed kinase conformation. Together, this work thus confirms the data of previous studies that showed that LRRK2 Type 1 and 2 inhibitors differently affect filament formation.

      Weaknesses:

      All conclusions are fully supported by the provided data. However, as the authors indicated themselves, the physiological relevance of LRRK2 microtubule binding is questionable. Furthermore, although the authors used a full-length LRRK2 protein, like in previously published structures, the resolution of the N-terminal domains is rather poor. Therefore, it also remains unclear what we learn from this structure compared to the previously published structures.

    3. Reviewer #3 (Public review):

      Summary:

      This paper describes new insights into the effects of type-I and type-II LRRK2 inhibitors on HEK293T cells that over-express GFP-labeled LRRK2-I2020T. Using correlative light microscopy and cryo-electron tomography, a type-I inhibitor leads to the extensive decoration of microtubules with LRRK2, which is not seen for a type-II inhibitor. Subtomogram averaging reveals that LRRK2 binds to the microtubules in a closed-kinase conformation, with density for the N-terminal arms.

      Strengths:

      The paper is well written; the CLEM and cryo-ET appear to be done to a high standard. Consequently, I have only minor comments.

      Weaknesses:

      The resolution of the subtomogram averages is somewhat limited, but the authors have adequately limited the number of degrees of freedom in the fitting of their atomic models by only allowing rigid-body transformations of separate parts of LRRK2.

      The authors should include FSC curves between the rigid-body fitted atomic models and the various sub-tomogram average maps.

    1. Reviewer #1 (Public review):

      The authors of this study developed a method to quantify calvarial bone marrow from MRI head scans, enabling the study of its composition in large datasets of adults, usually collected to study the brain. Bone marrow intensity can be semi-quantitatively measured in T1-weighted MRI scans due to the greater signal intensity of fat than watery red marrow. This is an ingenious use of the MRI-produced information for other important phenotypes, such as bone structure and marrow content. Different head types were tested for complying with the model, which is notable.

      The model was also successfully validated using several publicly available MRI resources - real data - in (1) a dataset consisting of 30 individuals that were scanned 10 times each at 3-day intervals, and (2) the monozygotic (MZ) twin data from the Human Connectome Project cohort. Then the authors applied this validated method to head-MRI scans from the UK Biobank (n=33,042) to extract information on the spatial distribution of bone marrow adiposity (BMA) in the calvaria, allowing a GWAS to identify associated genes.

      The authors revealed high heritability and identified 41 genetic loci significantly associated with the BMA trait, including six sex-specific loci. Of note, statistics estimate that 99% of BMA trait-influencing variants are shared with BMD (497 of 500 variants), which may mean these results demonstrate biological relevance to bone health. Some of the BMA genes were found to be related to the Wnt pathway, including WNT16, WNT4, NXN; this is a "positive control", since the Wnt/β-catenin signaling pathway was suggested as an important determinant of BMA. Also, associations with genes (BMP4, DLX5, LGR4, LRP4, SFRP4) that are known to specifically influence adiposity are encouraging. Integrating mapped genes with bone marrow single-cell RNA-seq data revealed patterns of adipogenic lineage differentiation and lipid loading.

      The study also investigated the genetic overlap between BMA and twelve (or 13) "brain and body" traits and identified significant genetic correlations with BMI, cognitive ability, and Parkinson's disease.

      In sum, since MRI head scans present a hitherto unexplored opportunity to address unresolved aspects of bone marrow biology, this study is both timely and innovative.

      There are, however, some assumptions, findings, and their interpretation, which require more critical focus.

      Sex-specificity is well described and studied here. Men have higher BMA than women, but post-menopausal women catch up in BMA values. The authors believe that calvarial marrow has a number of features that make it particularly well-suited to the study of the BMA process, which is clinically important at other bone sites. It has a simple "sandwiched" structure that they are able to model. This is true only to some extent: a condition of unknown etiology called "Hyperostosis frontalis interna" (described by Smith & Hemphill in 1956) is characterized by irregular, symmetric (bilateral) overgrowth of the inner table of the frontal bone. Although it is typically benign and not of clinical significance, studies report a prevalence of 12%. However, it is most common in postmenopausal women, in whom prevalences of up to 49% (in women over the age of 65) have been reported. Thus, sexual dimorphism is obvious, and the effect of estrogen is likely shared with any age-related bone and marrow pathology. So, for women not using HRT, this new layer of bone might interfere with the calvarial BMA readings and, in turn, affect the BMA-related analyses. The authors suspect that the effect of BMA on BMD may be biased in women; they should comment on those "with low BMD and high BMA", given that hyperostosis frontalis might be an issue. A strong effect of SNPs in the ESR1 chromosomal region might be related to the above concern.

      Then, there is a perfect overlap of the BMA SNPs that are shared with BMD (497 of 500 variants), which may prove a "face validity" of the MRI-derived BMA. However, the BMD in the study was heel-derived eBMD - which is a good proxy for osteoporosis and is mostly driven by trabecular bone. Thus, there might be a concern that the BMA metrics capture some trabecular BMD.

      Next, integrating mapped genes with existing bone marrow single-cell RNA-sequencing data revealed patterns of adipogenic lineage differentiation and lipid loading. The problem here is that the scRNAseq studies of the bone marrow niche are overwhelmingly from mouse. The authors might wish to justify why they are relevant to humans (in the absence of human-specific scRNAseq data).

      For genetic correlation analysis, the authors selected 7 body and 6 brain traits. The latter traits reflect cognition (general cognitive ability and educational attainment) and brain-related disorders. This selection might seem arbitrary. The interpretation of genetic correlation with cognitive ability, education, and Parkinson's disease was attributed to the recently discovered vascular channels that link calvarial bone marrow to the meninges. This is a fascinating hypothesis, which requires functional proof. However, there might be simpler explanations. Thus, the diploe and the inner table of the calvarium are drained by the same veins as the dura. From the anatomy textbook, we know that diploic veins connect the pericranial and endocranial venous system through the skull.

    2. Reviewer #2 (Public review):

      Summary:

      This study develops a new artificial intelligence method for high-throughput analysis of skull bone marrow from MRI data, which may be useful for large-scale biological analyses. Using this method, the authors then attempt to estimate skull bone marrow adiposity (BMA) using T1-weighted signal intensity from MRI scans of ~33,000 people, followed by genome-wide association analysis; however, the approach is inadequate because T1-weighted signal intensity is not validated for measurement of bone marrow adiposity. If it could be validated, the study would be an important advance in understanding of bone marrow adiposity and skeletal biology.

      Strengths:

      This paper is well-written, and the figures are nicely presented. The neural network method used for analysing skull bone marrow is innovative, and the authors validate this through several approaches. Therefore, the authors have achieved the aim of developing a method for large-scale analysis of skull bone marrow from MRI data.

      The GWAS is reasonably well-powered and addresses potential ethnicity differences, with one GWAS done across white males and females, and a separate GWAS in non-white participants. The methodology also conforms to common GWAS standards, including for mapping genetic variants to candidate genes. Moreover, the study further investigates the biological roles of these genes by analysing their expression in single-cell RNA sequencing data.

      Weaknesses:

      The fundamental weakness is that T1-weighted MRI signal intensity (T1W) is used as an estimate of BMA, but it has never been validated for this. The authors show that this T1W parameter measures something that is heritable and can be compared between subjects, but they don't show that it actually measures (or even estimates) calvarial BMA. There is an attempt to do so by comparing the T1W parameter with data from quantitative T1 images: the authors show a reasonable correlation with some of the quantitative T1 image data. However, this still does not show that the parameter is measuring BMA; it could be measuring some other biological characteristic, but this remains unclear. So, there is a need to validate the T1W parameter against an established measure of BMA, such as the bone marrow fat-fraction or proton density fat fraction measured from multi-echo MRI analysis.

      Without validating this BMA measurement method, it is not possible to interpret the GWAS or other findings reported in the study.

      A less critical weakness is that the GWAS has been done only on a single cohort, without replicating the findings in a follow-up cohort. For example, the authors could repeat their analysis on the remaining ~50,000 UK Biobank imaging participants for whom MRI data is now available. However, this would be pointless without knowing what biological characteristic(s) the T1W parameter is actually reflecting.

      [UPDATE, June 2026: since writing this review in September 2024, the reviewer has changed their opinion and now has confidence in the reliability of the T1W method used to estimate BMA. The reviewer would like to explain that their original critiques were based largely on previous discussions with a colleague with expertise in magnetic resonance and medical physics, who was extremely negative about use of T1W signal intensity to estimate BMA; this colleague’s criticisms may not have been objective, and clouded the reviewer’s overall impression of the present study. The reviewer and others have since completed BMA analysis using dual-echo MRI data in the UK Biobank; the findings of these studies, both for genetic and pathophysiological associations, are largely consistent with the findings of the present study, underscoring the reliability of the T1W-based BMA estimates.]

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript, "Estimating bone marrow adiposity from head MRI and identifying its genetic architecture", brings together the groups of Drs. Kaufmann and Hughes in a tour de force to develop an artificial neural network that localizes calvarial bone marrow in T1-weighted MRI head scans, with the goal of studying its composition in several large MRI datasets and modeling sex-dimorphic age trajectories, including the effect of menopause.

      Strengths:

      Bone marrow adipose tissue is a very active tissue with more far-reaching implications for tissue crosstalk and human health than initially recognized. Although MRI has been used to measure BM, studies such as this one, in which very large datasets are analyzed using advanced machine-learning tools coupled with genetic studies and a specific pathology, are still lacking. The groups had to develop new methods and new machine-learning tools for the imaging analyses.

      Weaknesses:

      The authors could clarify the following aspects of the work.

      (1) Imaging Limitations: The authors provide an excellent overview and references supporting the use of MRI as a method for assessing marrow fat, particularly with some specific modifications. However, MRI images can be affected by various factors, including the presence of other tissues as well as specific MRI settings, which are much harder to precisely control when using different datasets.

      (2) The specific density of cranial bones as it relates to the types of bone marrow: Cranial bones are extremely dense structures, which naturally interfere with MRI imaging. While it is thought that cranial bones have mostly "red bone marrow", this is only true for a short time in humans. How sensitive is their system in differentiating between red and yellow BM?

      (3) Both items above are further complicated by aging, but aging is not a linear process, as we have learned. There are specific bursts of aging in humans around age 45 and in the early 60s. How do the system and model predict or incorporate these peaks of aging? From the data shown, aging appears to be reflected more as a linear phenomenon. Is this because additional aging datasets are needed?

      (4) The authors describe in rich detail their machine-learning approach and how it extracted the data from the datasets. The authors also show some important correlations with specific genes and SNPs. What is not clear is how conditions such as anemia, for example, were accounted for. An expected finding would be that patients with chronic anemia have lower bone marrow (BM) signal intensity on MRI scans than healthy people, because the signal intensity of BM depends on the fat-to-cell ratio in the tissue. Furthermore, patients with a host of musculoskeletal disorders, ranging from osteopenia to osteoporosis, sarcopenia, and osteosarcopenia, will also have altered MRI scans. When using such large datasets, how did the authors control for or exclude these pathological conditions, or were all these conditions likely present?

      (5) Some of the genes and SNPs, although significant, showed very small correlations. What is their likely physiological significance?

      (6) The authors could use this excellent manuscript to expand their discussion to include the need for studies like theirs to be also complemented by multi-OMICS studies that will include proteomics and lipidomics of BM, bones, and muscles.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors investigate ubiquitylation of RPS27A/eS31 by the E3 ligase RNF25 in response to translational stress. Previous studies have identified RPS27A/eS31 ubiquitylation at Lys113 under conditions where translation factors are trapped in the ribosomal A-site. Here, the authors extend this work by testing whether additional translational stress conditions, including amino acid deprivation, induce RPS27A/eS31 ubiquitylation. They further show that GCN1 is required and explore a possible competition between RNF25 and GCN2 for GCN1.

      Strengths:

      This study expands on the range of stress conditions leading to RPS27A/eS31 ubiquitylation, reporting that it occurs in a variety of conditions associated with ribosome stalling, including amino acid deprivation. These observations are useful because they suggest that the RNF25 pathway may not require translation factors trapped in the ribosomal A-site, but may instead respond more broadly to translational perturbations associated with ribosome collisions.

      Weaknesses:

      The evidence supporting several of the major claims is incomplete, and additional controls and orthogonal approaches would greatly strengthen the evidence presented. In particular:

      (1) It is unclear whether the different conditions used to induce translational stress lead to ribosome stalling or collisions. The model presented by the authors seems to rely on ribosomal collisions, but this is not shown. In addition, further investigating amino acid deprivation beyond the removal of Arg or Lys would strengthen the paper.

      (2) Ubiquitylation of RPS27A/eS31 by RNF25 is used throughout the paper as a readout of RNF25 activity and is assumed to be on Lys113 based on previous work, but is not formally shown here.

      (3) Rescue experiments of the different mutants used in this study with wild-type and different domain deletions (i.e., ΔRWD for RNF25, ΔRWD-binding for GCN1) would help confirm specificity and strengthen the mechanistic claims.

      (4) The conclusion that RPS27A/eS31 ubiquitylation supports translation (Figure 4) is based entirely on polysome/monosome ratios, which are difficult to interpret without additional assays of translation output, elongation, or collision.

      (5) The idea that RNF25 competes with GCN2 for GCN1 binding is interesting, and related models have recently been proposed in RNA damage. The effect of GCN2 KO on RNF25-dependent ubiquitylation appears modest, and the data would be strengthened by rescue experiments with wild-type GCN2 and GCN2 mutants defective in GCN1 binding. The authors propose: "that the RNF25 pathway acts as a first line of defence to resolve ribosome collisions, outcompeted by GCN2 binding to GCN1 under acute stress." This model would suggest a further increase in RPS27A/eS31 ubiquitylation upon Arg/Lys deprivation in GCN2 KO cells, since this is the condition in which GCN2 is expected to be activated and engaged with GCN1 (i.e., when it would be competing with RNF25), but no further increase in RPS27A ubiquitylation is observed. It is therefore not clear that these data support the proposed model. Contributing to this may be the fact that many of these assays are performed in a USP16 KO background, which may make it difficult to assess changes in RPS27A/eS31 ubiquitylation.

      (6) Given that several RWD domain proteins can interact with GCN1, and that DRG2 KO appears to affect RPS27A/eS31 ubiquitylation (Figure S5), the data do not support the GCN2-specific title. The results are more consistent with a broader, incompletely characterized network of GCN1-associated RWD domain-containing proteins that seems to affect RNF25-dependent ubiquitylation than with a demonstrated RNF25-GCN2 competition mechanism. Further characterization of GCN2-dependent ISR activation (p-eIF2a and ATF4 WB) in the absence of RNF25 under Arg/Lys starvation would help shed light on the RNF25-GCN2 competition. The authors use K113R, but this is not shown to prevent RNF25 engagement with GCN1, so an RNF25 KO should be used.

      Overall, the study contains useful observations, but the mechanistic claims are not yet fully supported.

    2. Reviewer #2 (Public review):

      Summary:

      The authors show that deprivation of arginine and lysine induces a ~50% increase in the ratio of ubi-RPS27A to RPS27A, and that this induction requires the E3 ubiquitin ligase RNF25. The authors show that ZAKalpha and EDF1 are not required for steady-state or ribosome stalling-induced ubi-RPS27A, while GCN1 is required. The ratio of polysomes to monosomes is increased in RNF25 knockdown cells or when translation is activated by ISRIB in an RPS27A K113R mutant cell line. GCN2 KO cells show elevated levels of ubi-RPS27A, and overexpression of the GCN2 RWD domain reduces levels of ubi-RPS27A.

      Strengths:

      (1) The authors identified a novel pathway to sense amino acid deprivation, indicated by ubi-RPS27A, previously implicated in ribosome stalling.

      (2) The authors find antagonism between two proteins known to act downstream of GCN1, giving insight into how signaling occurs from an upstream sensor of ribosome stalling to multiple downstream pathways.

      Weaknesses:

      (1) The authors suggest that, based on increased Polysome/Monosome ratios, there is more disome stalling in RNF25 KD cells and RPS27A K113R cells treated with ISRIB, but this readout is very indirect and could be driven by other changes in the cell other than ribosome stalling.

      (2) While the authors propose that GCN2 and RNF25 compete for binding to GCN1, no evidence was shown that RNF25 binds to GCN1 in cells, nor that the interaction increases when GCN2 is absent.

      (3) The use of USP16 KO to enhance the detection of ubi-RPS27A in many experiments raises the question of whether USP16 KO may alter the protein levels of any known regulators of ribosome collisions (e.g., ZNF598, GCN1, EDF1, ZAKalpha). If USP16 KO causes changes in other important regulators of collisions, the authors could be identifying genetic interactions with USP16 in their experiments throughout the paper.

      (4) In Figure 5E, the expression level of the GCN2 3K RWD domain looks to be lower than the WT RWD domain; perhaps this could be what is driving the smaller decrease of ubi-RPS27A seen with GCN2 3K vs WT.

    3. Reviewer #3 (Public review):

      Summary:

      This study examines the role of RNF25 in translational quality control. Previous work indicated that RNF25 is activated by ribosomes stalled with defective elongation or termination factors bound in the A-site. Here, the authors provide evidence that RNF25 is activated by other treatments that evoke ribosome stalling, including amino acid starvation, where the A-site may be empty, leading to ubiquitination of RPS27A in a manner requiring the ISR collision sensor Gcn1, but not EDF1 and ZAKα, which are involved in the RQC and RSR surveillance pathways. They present some evidence from polysome profiling that RNF25 and its ubiquitination of RPS27A help resolve ribosome collisions and support translation elongation in basal conditions. They further show that KO of Gcn2 increases RPS27A ubiquitination in basal conditions, but not in amino acid-starved cells, and that RPS27A ubiquitination was reduced on overexpressing the WT RWD domain of Gcn2 but not a variant harboring substitutions of residues predicted to bind Gcn1. Based on these findings, they propose a model in which, in response to ribosome stalling induced by various stresses, Gcn1 recruits RNF25 via the latter's RWD domain to ubiquitinate RPS27A and thereby resolve ribosome stalling and promote continued elongation. If collisions increase even further, GCN1 recruits GCN2 instead of RNF25 to elicit the ISR.

      Strengths:

      The data is convincing that a variety of triggers leading to diverse stalled ribosomal states, including amino acid limitation, can activate RNF25, suggesting that activation of this pathway does not require the presence of trapped protein factors in the ribosomal A-site but is a more general response to ribosome collisions. It is also convincing that Gcn1 is required for RNF25 activation under all of these conditions, which is consistent with previous findings that Gcn1 is required for RNF25 function in the presence of trapped elongation or termination factors. The finding that EDF1 and ZAK are not needed for RNF25 activation in amino acid starvation conditions is of interest for EDF1, given the recent claim that it is required for full ISR activation.

      Weaknesses:

      The evidence presented from polysome profiling that RNF25 helps resolve naturally occurring ribosome collisions in basal conditions is not compelling, as eliminating RNF25 could be increasing the rate of initiation rather than increasing stalled ribosomes as the means of increasing the P/M ratio. The Rps27A-K113R mutation could have the same effect of increasing initiation, which could have been obscured by inhibiting the ISR with ISRIB.

      The evidence that RNF25 competes with Gcn2 for Gcn1 binding is also not compelling. While it's convincing that Rps27A-Ubi is elevated in basal conditions on eliminating Gcn2, loss of GCN2 would be expected to increase ribosome loading on mRNAs, potentially elevating the frequency of collisions and thereby stimulating RNF25 activity indirectly.

      It's also quite puzzling and left unexplained why they observed no further increase in Rps27A-Ubi on -Arg/-Lys starvation in the cells lacking Gcn2. Why wouldn't -Arg/-Lys starvation lead to further stalling and RNF25 activation in the absence of Gcn2? (Since Gcn2 KO increases Rps27A-Ubi in +Arg/+Lys conditions, it can't be that Gcn2 is required for RNF25 function.) The same puzzling and unresolved observation was made in the cells lacking DRG2. One possible explanation for this conundrum is that low RNF25 abundance limits further activation.

      The quantitative effects of overexpressing the Gcn2 RWD domain on Rps27A-Ubi, constituting their other evidence presented to support the competition model, are quite small in magnitude.

    1. Reviewer #1 (Public review):

      Summary:

      This study identifies mutations in alpha-tubulin that suppress Tau-induced neurodegeneration using the C. elegans model of Tauopathy, suggesting a potentially interesting role for microtubule properties in modulating Tau toxicity. These missense mutations cluster in the C-terminal Tau-interacting helix 12 region of alpha-tubulin genes (tba-1, tba-2, and mec-12). Further analysis, particularly using the strongest suppressor tba-2, shows that it rescues Tau-induced behavioral deficits and neuronal loss without significantly altering bulk tau-phosphorylation, aggregation, or binding to soluble tubulin. The authors suggest that altered microtubule properties underlie the neuroprotective effects, and manipulating microtubule properties may have therapeutic potential.

      Strengths:

      The study is conceptually interesting as it shows that Tau-induced neurotoxicity can, in this model, be partially uncoupled from canonical pathological hallmarks such as Tau-hyperphosphorylation and aggregation. The identification of multiple independent mutations in the same structural region of three alpha-tubulin genes provides support for the functional relevance of helix 12 in modulating Tau-induced toxicity. The authors demonstrate significant rescue of behavioral deficits (using motility and manual thrashing assays) and neuronal loss in both WT-tau and FTLD-associated TauV337M in combination with mutant alpha-tubulins, suggesting a general mechanism for tubulin-regulated modulation of Tau-toxicity. Moreover, the correlation between mutant tubulin expression levels and the extent of rescue supports a causal relationship.

      Weaknesses:

      One of the major claims of this manuscript is that altered microtubule properties suppress Tau toxicity. The only supporting evidence in this context provided by the authors is reduced taxol-stabilized microtubule mass, which does not fully explain neuronal loss or the rescue of behavioral deficits. What remains unclear is whether these mutations alter microtubule dynamics, catastrophe, lattice stability, or axonal transport.

      The authors show that mutant tba-2 reduces total tau levels by ~45%. This level of reduction is likely significant but underexplored in the manuscript. Why are the Tau levels reduced? How is Tau being cleared: is autophagy enhanced, or is the ubiquitin-proteasome pathway upregulated, in tba-2 + Tau animals? Or are one or more of the Tau species not detectable by the antibodies used in this study? The observation that the mec-12 mutant rescues Tau-induced phenotypes without altering Tau levels suggests that suppression can occur through Tau-independent mechanisms. This raises an important unresolved question regarding the extent to which suppression is Tau-dependent vs Tau-independent across different mutant alpha-tubulin genes, complicating the interpretation of the rescue phenotypes.

      Given that Tau primarily associates with the microtubule lattice in vivo, measuring interactions with soluble tubulin may not fully capture biologically relevant binding dynamics and therefore does not exclude the possibility that these mutations alter tau-microtubule interactions at the lattice level or may affect the binding of other MAPs/regulators, thereby altering stability or trafficking.

      A large body of conclusions is drawn from behavioral rescue and biochemical assays. This limits the understanding of how molecular changes in tubulin might affect cellular mechanisms of neuroprotection. Are there changes in the neuronal microtubule organization, Tau localization, or its redistribution in the mutant alpha-tubulin background? Are there differences in soluble vs oligomeric vs insoluble Tau in mutant tba-2 and mec-12 animals?

      The suppression of behavior in the co-pathology model is interesting but mechanistically insufficient, mainly because the underlying basis of suppression is not examined in these models. Moreover, it remains unclear whether tubulin-Tau genetically interacts with Aβ or TDP-43, and what cellular mechanisms account for the partial rescue observed in these co-pathology models.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript by Benbow et al. identifies, through a genetic screen, key tubulin mutants that, with high confidence, rescue tau-mediated ND phenotypes. This manuscript is well written, and the experimental results strongly support the authors' claims that these tubulin mutants can rescue ND-linked phenotypes in C. elegans while having little to no direct effect on Tau aggregation.

      Strengths:

      Benbow et al. use a relatively unbiased forward genetic screen to identify mutations associated with phenotypes that suppress tauopathy-related defects. The authors then logically focus on the various α-tubulin missense mutations identified in H12, which are known to localize to the external face of microtubules. The authors also carefully compare their established tauopathy-associated phenotypes in the WT TauH model, with and without specific α-tubulin mutations, using appropriate controls throughout. Lastly, the authors provide partial mechanistic insight into the α-tubulin mutant-mediated rescue, showing that these effects are independent of tau aggregation and tau phosphorylation, and instead suggest that the α-tubulin mutations may confer altered microtubule assembly properties based on the sedimentation assays.

      Weaknesses:

      While the claims are largely supported by the experimental outcomes, the authors at times do not provide enough detail in the text for readers to interpret the data sets independently. In addition, some claims appear to be slightly overstated relative to the data or the degree of error associated with those data.

    1. Reviewer #1 (Public review):

      Summary:

      The protein DELE1 is a critical component to signal mitochondrial stress to the cytosol: under stress conditions, a truncated form of DELE1, termed DELE1(CTD) accumulates in the cytosol as an oligomer, binds the HRI kinase, which triggers the integrated stress response.

      Leveraging the structural knowledge of the DELE1(CTD) oligomer, this study attempts to interfere with the oligomerization process, using an AI-designed protein that binds to the DELE1(CTD) oligomerization interface. The starting hypothesis is that such a binder should selectively inhibit the DELE1-signalled mitochondrial stress response. The authors use established AI pipelines (RFdiffusion) to make a series of such binders, and characterize them with biochemical methods and a crystal structure of the binder in its free state. When over-expressing the binders in HEK293T cells, the authors report that drug-induced mitochondrial stress indeed does not trigger the stress response, confirming their starting hypothesis.

      The work is an elegant demonstration of how AI-designed proteins can specifically interfere with cellular mechanisms.

      The conclusions of the work are mostly well supported by data; there are some mechanistic gaps, however, about the interaction mechanisms.

      Strengths:

      The study is a nice combination of (i) a clear structure-derived hypothesis on how to interfere with a signalling mechanism, (ii) state-of-the-art protein design tools, (iii) a mostly robust biochemical characterization, and (iv) cellular experiments to demonstrate the effects of the binders.

      Weaknesses:

      The crystal structure of binder5, while confirming its AlphaFold model, does not provide direct evidence of the binding mode to DELE1. Direct structure determination of the complex using crystallography (which may require cleaving the MBP domain) would make their mechanistic arguments stronger.

      The demonstration that the binders do not inhibit the DELE1-HRI interaction is interesting; however, the underlying mechanism, in particular where the DELE1-HRI binding occurs, is not explored.

      While this study opens perspectives on how to interfere with DELE1-signalling, it is unlikely that these binders are actually useful for medical applications (compared to small-molecule drugs), as acknowledged in the manuscript.

    2. Reviewer #2 (Public review):

      Summary:

      Previous structural analyses of DELE1 by the authors revealed that the first α-helix within the TPR repeat domain provides the oligomeric interface of DELE1, and that DELE1 octamer formation is required for maximal ISR activation. Based on these findings, the authors designed peptides intended to bind this oligomeric interface and showed that these peptides interfere with DELE1 oligomerization in vitro and attenuate ISR activation in cultured cells.

      Strengths:

      The series of in-vitro data sets showing direct binding of the designed peptides to DELE1 and inhibitory effects on its oligomerization are convincing.

      Weaknesses:

      The physiological (or experimental) significance of inhibiting the DELE1-HRI-ISR pathway using these peptides has not been clearly demonstrated, particularly given that only very limited cell biological outcomes are tested in the current manuscript.

    3. Reviewer #3 (Public review):

      Significance of the findings and the strength of evidence:

      The article presented by Yang et al. describes the development of protein binders targeting the C-terminal domain of the protein DELE1, which is involved in the mitochondrial integrated stress response (mitoISR) pathway. It was shown earlier that DELE1 is imported into the mitochondria and cleaved by the inner mitochondrial membrane protease OMA1, resulting in an N-terminal and a C-terminal domain, the latter being transported back into the cytosol, where it interacts with and activates the kinase HRI. HRI, in turn, phosphorylates eIF2α, resulting in selective translation of mRNAs encoding proteins involved in stress signalling, such as the transcription factor ATF4. ATF4 activates expression of genes involved in amino acid balance, redox homeostasis and proteostasis. The C-terminal domain of DELE1 (DELE1CTD) was structurally and functionally characterized earlier by cryo-EM by Jie Yang and co-workers. These studies suggest that it forms an octamer with D4 symmetry consisting of two tetramers in a tail-to-tail arrangement. In this octamer, two interfaces were identified, one between the monomers within the tetramers and one connecting the tetramers to form the octamer. In this earlier work, it was also shown by mutational studies that disrupting the first interface has an impact on the OMA1-DELE1-HRI-eIF2α-ATF4 pathway upon mitochondrial stress in human cells. On this basis, the authors reasoned in the current manuscript that it would be interesting, also therapeutically, to develop a protein binder that binds DELE1 and disrupts oligomer formation. The authors set up a de novo protein design approach using RFdiffusion to design a protein scaffold and ProteinMPNN to design the side chains, creating protein binders that target the α-helix α1 in DELE1CTD, which is directly involved in the first interface forming the tetramer. As I am not an expert in protein design, I cannot judge the quality of these data.

      The candidates were evaluated with AlphaFold3 to confirm complex formation between the designs and DELE1CTD. In the end, 12 designed protein binders were selected for further analyses. These proteins were recombinantly produced in E. coli and purified. Full-length DELE1 (DELE1fl) and DELE1CTD were produced as MBP-fusion proteins to improve solubility and stability. Co-expression studies with MBP-DELE1CTD revealed that 11 out of the 12 binders co-eluted with MBP-DELE1CTD from a size-exclusion chromatography column, indicating complex formation. In the absence of the binders, MBP-DELE1CTD elutes as a higher oligomer, suggesting that the binders interfere with oligomerisation. Further analyses examined the impact of selected binders on stress-induced ISR. The authors found that different binders had slightly different effects on the outcome upon treatment with stressors, and they also compared two different stressors. This was assessed by measuring ATF4 protein levels by immunoblotting. The interaction of selected binders with DELE1CTD was subsequently confirmed by co-immunoprecipitation experiments. To evaluate whether the impact of the binders is restricted to mitochondrial stress, the authors also elicited endoplasmic reticulum stress, where the binders had no effect on ATF4 levels. The presence of the binders furthermore impaired recovery of tubulated mitochondria following mitochondrial stress induction, resulting in more fragmented mitochondria. The authors determined a crystal structure of one binder at a resolution of 2.6 Å and performed AlphaFold3 predictions to model the complex between binders and DELE1CTD. The interface is characterized by many hydrophobic residues. From these data, they derived several interface mutants and tested their impact on the interaction. Indeed, mutation of these hydrophobic side chains to charged residues interfered with complex formation.

      Finally, the authors show that binder binding to DELE1CTD does not interfere with the binding of HRI kinase. Overall, the methodology applied is state-of-the-art, and the manuscript is well-written. The design of protein binders targeting DELE1, which is involved in mitochondrial stress signalling, is interesting for basic science to study stress signalling, but also therapeutically. However, as the ISR can have a positive or a negative impact on disease development and ageing, depending on the degree of ISR activation, any therapeutic use would need to be precisely applied. The study has some weaknesses, and the structural data in particular seem to have severe issues.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates epigenetic and three-dimensional chromatin alterations associated with primary trastuzumab resistance in HER2-positive breast cancer using integrated CUT&Tag, RNA-seq, and Micro-C analyses in JIMT1 (resistant) and SKBR3 (sensitive) cell models. The authors identify widespread remodeling of histone modification landscapes, chromatin compartment organization, and promoter-enhancer looping, highlighting SGK1 as a candidate epigenetically activated mediator associated with intrinsic resistance. The manuscript provides a technically solid and extensive multi-omic resource for the study of HER2-positive breast cancer resistance states.

      Strengths:

      The study integrates multiple state-of-the-art epigenomic and chromatin conformation approaches, including CUT&Tag, RNA-seq, and Micro-C, generating a comprehensive dataset that will likely be valuable to the field. The analyses are generally technically rigorous and well executed, and the manuscript is overall clearly written. The integration of chromatin architecture, enhancer activity, transcriptional regulation, and histone modification profiling provides an informative overview of large-scale epigenomic remodeling associated with resistant versus sensitive HER2-positive breast cancer states. The identification of SGK1-associated chromatin activation and enhancer rewiring is particularly interesting and supported by multiple orthogonal datasets.

      The inclusion of both intrinsic and acquired trastuzumab resistance models also strengthens the study conceptually, even if the biological interpretation remains somewhat complex.

      Weaknesses:

      The major limitation of the study is that many of the central mechanistic conclusions remain largely correlative. Although coordinated changes in chromatin architecture, histone modifications, enhancer activity, and SGK1 expression are observed, direct evidence demonstrating that these epigenetic alterations causally drive SGK1 activation or trastuzumab resistance is currently lacking.

      In addition, the interpretation of SGK1 as a broader trastuzumab-resistance driver is somewhat weakened by the analyses in the acquired resistant SKBR3_HR model, where SGK1-associated chromatin and transcriptional changes appear largely absent. This raises the possibility that SGK1 dependency may reflect a lineage- or model-specific vulnerability intrinsic to JIMT1 cells rather than a generalizable resistance mechanism.

      The study also remains descriptive in several sections. Numerous chromatin interactions and compartment changes are cataloged without sufficient biological contextualization or mechanistic integration. As a result, parts of the manuscript currently read more as a comprehensive epigenomic profiling resource than a fully mechanistic study of resistance biology.

      Finally, the translational impact is limited by the lack of patient-level validation linking SGK1 activation to trastuzumab response or clinical outcome in HER2-positive breast cancer cohorts.

    2. Reviewer #2 (Public review):

      Summary:

      Duan, Hua et al. used CUT&Tag and Micro-C to show that, in primary trastuzumab-resistant HER2+ breast cancer cells, promoter H3K4me3 rather than H3K27me3 is strongly correlated with transcriptional activity. Resistant cells also exhibited more abundant promoter-enhancer loops and enriched cohesin at loop anchors, accompanied by shifts in A/B compartment status. Through multi-omics integration, the authors identified SGK1 as a key gene showing elevated promoter H3K4me3 levels, enhancer activation, strengthened chromatin loops, and upregulated transcription in resistant cells, and validated SGK1 as a potential therapeutic target. These findings reveal the coordinated interplay between three-dimensional chromatin architecture and epigenetic modifications, offering important insights into trastuzumab resistance in HER2+ breast cancer.

      Strengths:

      Previous investigations into trastuzumab resistance have largely focused on genetic mutations or individual epigenetic modifications. In contrast, this study moves beyond genetic or single epigenetic views by integrating histone modifications and 3D chromatin architecture into a unified framework, proposing a synergistic model of promoter H3K4me3, enhancer activation, and chromatin looping that underlies non-genetic resistance. It provides a new conceptual basis for understanding non-genetic resistance mechanisms. Secondly, using high-resolution epigenomic and conformational mapping together with bidirectional in vitro and in vivo functional validation, it establishes a solid link between epigenetic changes and phenotypes, and demonstrates that SGK1 inhibition suppresses tumor growth in a xenograft model, revealing clear translational potential.

      Weaknesses:

      (1) All findings are based on a single pair of cell lines, JIMT1 and SKBR3, which does not allow exclusion of cell line‑specific effects. The authors did not examine SGK1 expression levels, promoter H3K4me3 status, or relevant chromatin loops in tumor tissues from patients with clinical trastuzumab resistance. Consequently, whether the conclusions can be extrapolated to actual patient populations remains unclear, which limits the clinical relevance of the findings. It is recommended that the authors directly validate the key findings using tumor samples from patients with clinical trastuzumab resistance or analyze the correlation between SGK1 expression levels and disease-free survival or pathological complete response using data from public databases for HER2+ breast cancer patients, which would help address the current limitation of lacking clinical sample validation and the uncertainty regarding the association of SGK1 with patient prognosis and treatment response.

      (2) In the Discussion, the authors propose that SGK1 may assume the role of AKT to sustain mTOR activation, thereby bypassing the dependence on HER2 signaling following trastuzumab inhibition. Although this hypothesis is supported by published literature, the present study provides no direct signaling evidence, such as examining phosphorylation changes of SGK1, AKT, mTOR, or their downstream effectors.

    1. Reviewer #1 (Public review):

      The study by Epp et al. has attracted a lot of attention. As so often in the fMRI literature, some voices blew the results out of proportion, as if they suggested that we cannot trust fMRI, even though informed researchers are aware of the capabilities and challenges of BOLD as a measure of neural activity. The paper was discussed and criticized from various angles, e.g., with respect to the unestablished models used to estimate CMRO2, the overestimation of the 40% figure caused by the mask definition, and the expected neuronal and vascular effects underlying the discordance.

      The first publications arising from these discussions are now appearing, e.g., Chen et al. (https://doi.org/10.1038/s41593-026-02288-y). The manuscript at hand augments this discussion. Specifically, the manuscript provides a direct statistical refutation of the recently proposed widespread physiological sign reversal between BOLD and CMRO2.

      By reanalyzing a high-profile dataset, the authors demonstrate that the previously reported 40% discordance rate is an artifact of statistical uncertainty rather than a genuine physiological phenomenon. This critical re-evaluation restores some confidence in the canonical interpretation of BOLD signals that was recently challenged. It highlights the necessity of rigorous statistical validation in quantitative fMRI.

      The following points should be addressed:

      (1) Absence of evidence is taken as evidence of absence

      The group-level significance analysis, summarized in the horizontal bar chart and cortical surface maps, labels non-significant voxels as 'CMRO2 not reliable', and the discussion concludes that positive BOLD responses are predominantly concordant with metabolism.

      The paper treats voxels with non-significant CMRO2 effects as 'statistically uncertain' rather than as potentially reflecting genuine null metabolic changes, conflating absence of evidence with evidence of absence. Because the 77.2% of voxels shown as light orange could reflect either real null metabolism or insufficient power, the paper cannot distinguish between these. This ambiguity matters because a genuine null metabolic response to positive BOLD would itself be physiologically interesting and would not straightforwardly support 'predominant concordance'.

      (2) Contextualization in other current literature

      I feel that the introduction could also embed the work in the current literature on biophysical processes in negatively responding areas.

      The negative responses have partly been discussed in the literature on quantitative physiology: e.g., Bohraus et al. were able to pinpoint the source of negative CMRO2 in positively activated voxels to large veins (https://doi.org/10.1016/j.celrep.2023.113341). Huber et al. found that the neurovascular coupling (arterial-venous weighting) differs between positively and negatively activated brain areas, which makes it hard to interpret derived physiological parameters.

      (3) Stylistic comments.

      In places, the tone of the language could be revised to ensure that it is perceived as making a constructive contribution to the discussion.
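
      One hedged way to make point (1) concrete is an equivalence test, which separates voxels whose ΔCMRO2 is demonstrably close to zero from voxels that are merely underpowered. The sketch below is not the authors' pipeline: it assumes a normal approximation for the group-mean ΔCMRO2, and the function names, the equivalence bound `delta`, and the example numbers are hypothetical choices made for illustration.

```python
from statistics import NormalDist


def tost_normal(mean, se, delta):
    """Two one-sided tests (TOST) for equivalence, normal approximation.

    Null hypothesis: |true effect| >= delta.
    Returns the TOST p-value, i.e. the larger of the two one-sided p-values.
    `delta` is an assumed smallest effect size of interest.
    """
    if se <= 0 or delta <= 0:
        raise ValueError("se and delta must be positive")
    z = NormalDist()
    p_lower = 1.0 - z.cdf((mean + delta) / se)  # tests H0: effect <= -delta
    p_upper = z.cdf((mean - delta) / se)        # tests H0: effect >= +delta
    return max(p_lower, p_upper)


def classify_voxel(mean, se, delta, alpha=0.05):
    """Three-way label instead of 'significant' vs 'not significant'."""
    z = NormalDist()
    p_diff = 2.0 * (1.0 - z.cdf(abs(mean) / se))  # two-sided test against 0
    if p_diff < alpha:
        return "significant change"
    if tost_normal(mean, se, delta) < alpha:
        return "equivalent to zero"
    return "inconclusive"


# Hypothetical voxels: (group mean, standard error), with delta = 0.5
print(classify_voxel(0.0, 0.1, 0.5))  # precise estimate near zero
print(classify_voxel(0.0, 1.0, 0.5))  # same mean, too noisy to decide
print(classify_voxel(1.0, 0.2, 0.5))  # clear non-zero effect
```

      Under such a three-way split, the proportion of "inconclusive" voxels, rather than the proportion of non-significant ones, could indicate how much of the 77.2% reflects limited power.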

    2. Reviewer #2 (Public review):

      Summary:

      The rebuttal aims to provide a statistical re-evaluation of Epp et al. to investigate the effects of CMRO2 uncertainty on concordance/discordance analysis between BOLD signal responses and CMRO2 change estimates based on an R2 framework. The authors observe markedly higher variance in CMRO2 compared to BOLD, which raises concerns about sign classification purely based on group means/medians.

      Strengths:

      The study is well motivated, and the analytical pipeline is rigorous and has been provided. Overall, the manuscript provides several thoughtful and rigorous analyses that contribute meaningfully to the ongoing discussion surrounding neurovascular coupling and CMRO₂ estimation.

      Weaknesses:

      Some aspects of the analytical framework could be improved, as well as the discussion of the caveats of the methods of this and the original paper.

      (1) The binomial framework discussed on line 110 and described on line 321 reduces continuous ΔBOLD and ΔCMRO2 measurements to binary concordant/discordant labels, which may overemphasize unstable sign flips near zero effect sizes while discarding potentially meaningful magnitude information. The authors acknowledge that this overly strict approach yields very few meaningful voxels. A better justification or explanation of what we are meant to take away from this, other than the variability in the measurement, which is also explored elsewhere, would be helpful to the reader.

      (2) In the methods, in the section entitled: Voxel Selection: BOLD Activation Mask, the authors describe their more traditional univariate statistical method as compared to the PLS approach used in the Epp paper. While I appreciate why the authors chose this approach, which simplifies interpretation, is it possible that this led to a lower number of discordant voxels? If yes, then I would suggest this be also added in the discussion of how the original Epp paper's methodological choices led to the very large percentage of discordant voxels.

      (3) In the original paper, it looks to me like the discordant voxels have low CBF change and low rOEF. The gadolinium-based CBV measurement used to calculate OEF is a measure of total blood volume, while the blood volume that contributes to BOLD resides predominantly in veins and capillaries. Given the long PLD of the ASL acquisition and the total blood volume measurement, it seems to me that it is possible that discordant voxels may have high arterial blood volume, leading to overly large CBV measurement and an underestimation of CBF at this PLD (especially given their young age, for which I would expect ATT to be closer to 1-1.5s based on recent literature). While this is not currently discussed in this paper, it might be relevant to discuss how acquisition choices could bias some voxels towards erroneous CMRO2 estimates, which in turn would lead to these voxels being identified as discordant.

      (4) In the methods, on line 267, the authors describe how they calculated ΔCMRO2 and how it differs from the original paper. A short discussion of how this choice is likely to affect the variance estimates would be warranted, given that the original paper seems to have chosen their method for the explicit purpose of decreasing error propagation. Especially, I wonder if this difference could account for the observation that "77.2% of voxels showed no statistically significant group-level ΔCMRO₂ effect".
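
      To illustrate the concern in point (1) about unstable sign flips near zero, the following sketch computes how often a noisy estimate would take the wrong sign purely because of measurement error. It is an idealised calculation, not a reanalysis: it assumes Gaussian error on the ΔCMRO2 estimate and a BOLD sign that is known without error, and the function name and example values are hypothetical.

```python
import math


def sign_flip_probability(true_effect, standard_error):
    """Probability that a Gaussian estimate has the opposite sign to the
    true effect, i.e. Phi(-|effect| / SE).

    For true_effect == 0 the result is 0.5, the chance of either sign.
    """
    if standard_error <= 0:
        raise ValueError("standard_error must be positive")
    z = abs(true_effect) / standard_error
    # Phi(-z) written with the complementary error function
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Apparent discordance expected from noise alone, for a few
# hypothetical effect-to-noise ratios
for ratio in (0.0, 0.25, 1.0, 2.0):
    print(ratio, round(sign_flip_probability(ratio, 1.0), 3))
```

      As a purely arithmetic illustration, an effect-to-noise ratio of 0.25 already gives a flip probability of about 0.40. This is not an estimate derived from the data, but it shows why binary concordant/discordant labels can be dominated by variance when effects are small relative to their uncertainty.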

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors perform longitudinal mesoscale calcium imaging of visual and other cortical areas following binocular enucleation (blinding through the removal of the eyes) in adult mice. The study is observational and exploratory, and analyzes changes in the frequency distribution of calcium signals during locomotion and quiescence as a function of time after enucleation. They also analyze correlations between calcium signals in different brain regions to ask how apparent connectivity between regions changes over time. The main conclusions are (1) that there are multiple timescales of plasticity; (2) that the coupling between locomotion and activity in visual areas flips sign after enucleation, and (3) that correlations between brain areas are modulated by this long-lasting plasticity. Overall, the data are likely to be useful to researchers studying the impact of injury and catastrophic loss of sensory inputs on brain reorganization, but it is hard to draw firm conclusions from the observations provided beyond the very general conclusions listed above.

      Strengths:

      (1) The longitudinal imaging of multiple brain areas simultaneously allows the investigators to follow plastic changes in the same animals over time, to address questions about how apparent connectivity and brain state modulation unfold after injury.

      (2) The data suggesting a flip in sign of the coupling between movement and "activity" in visual areas is interesting and potentially novel.

      Weaknesses:

      (1) The mesoscale imaging has limitations. In particular, the authors use words/phrases such as "activity" and "functional connectivity" without ever discussing what the measures they provide with this approach (frequency distribution of summed calcium fluctuations, and the correlation between this measure across brain areas) actually mean, or how they approximate spike-based measures or cellular-resolution Ca signals. The manuscript would benefit from an in-depth discussion of these limitations.

      (2) In general, the figures are difficult to follow. In many cases, what is being plotted is hard to extract without a lot of work, and metrics are not well-justified. For example, they calculate the R value between movement power and spectral power of the Ca signal to quantify changes across time in the coupling between movement and activity (Figure 2). But from the example given, this does not look like a continuous relationship, and though the R values are significant, it's not clear that this correlation is a good way of quantifying the change in sign they attempt to document. Figure 7 is impossible to read, and the areas quantified are not indicated. The reader should not have to work this hard to figure out what is being plotted.

      (3) It would be reassuring to rule out an effect of repeated imaging on the metrics they describe here. Longitudinal imaging of the same duration without enucleation would be the best control. Alternatively, they do have multiple baseline measurements that they collapse into one value in most of their plots.

      (4) The discussion is very long. They spend a lot of time trying to relate their findings to the larger literature on visual deprivation, but because of differences in paradigms (enucleation, laser ablation, visual deprivation, binocular vs monocular) and differences in measures (see point 1), it's hard to draw conclusions. In my view, the manuscript would benefit from less speculation about plasticity mechanisms and more discussion of the strengths and weaknesses of their approach.

    2. Reviewer #2 (Public review):

      Summary:

      This study uses cortex-wide mesoscopic calcium imaging to investigate how adult vision loss induced by bilateral enucleation alters spontaneous cortical activity across behavioral states, including quiescence, locomotion, and anesthesia. The authors perform longitudinal imaging over two time scales, spanning days to weeks and weeks to months after enucleation, enabling them to track cortical reorganization over time.

      The main findings are that oscillatory activity in V1 undergoes a strong reversal in its relationship to behavioral state. Before enucleation, V1 activity is positively correlated with locomotion and negatively correlated with quiescence, whereas after vision loss, this pattern reverses. State-transition dynamics are similarly altered: after enucleation, locomotion onset shows reduced V1 activation, whereas cessation of locomotion is associated with increased activity, although it caused suppression at baseline. In addition, the authors report an increase in slow-wave (0.1-4 Hz) activity in V1 after enucleation, starting in the first week and lasting over many weeks. Although these effects show partial recovery over time, many abnormalities persist for weeks to months.

      At the network level, the study reveals altered large-scale cortical organization, including reduced functional connectivity involving V1 that appears to remain impaired.

      Strengths:

      Overall, the work provides a thorough characterization of how adult vision loss reshapes cortical dynamics, particularly with respect to behavioral-state modulation.

      Weaknesses:

      The way the data are presented leads to a lack of clarity. Moreover, the study remains largely descriptive, as it does not address the mechanisms underlying these changes or their functional significance, making it difficult to interpret the broader implications of the observed cortical reorganization.

    3. Reviewer #3 (Public review):

      Summary:

      The authors track cortical activity across the dorsal cortex of head-fixed mice for up to ten weeks following bilateral eye removal, asking how the cortex reorganizes over an extended period after vision loss. They report a rapid and long-lasting reversal of the normal relationship between movement and visual cortex activity, together with a delayed, weeks-long window of enhanced slow-wave activity during rest and a persistent reorganization of large-scale cortical correlations.

      Strengths:

      The longitudinal scope is the work's strength. Tracking the same animals over a ten-week window after sensory loss is technically demanding and rarely done, and it yields a temporal picture that short studies cannot provide. The observation that the movement-related activation of the visual cortex inverts within a day and only partially recovers over weeks is striking and has not been documented at this timescale. The analysis is internally consistent across two protocols (short- and long-term) and frames the changes by behavioral state, focusing on rest versus movement. This is a useful analysis that the field has not systematically applied to studies of deprivation.

      Weaknesses:

      The manipulation is unusually severe: removing both eyes eliminates patterned vision, non-image-forming light input, and all residual retinal signals abruptly and irreversibly, in contrast to the milder and often reversible manipulations the discussion draws on. Without a sham-surgery control, the early effects cannot be cleanly separated from the surgery itself.

      The language of "plasticity" runs ahead of what the data actually measure, since the study quantifies spontaneous activity and pairwise correlations but does not assess receptive fields, evoked responses, synaptic changes, or the causal manipulation of any candidate circuit. The discussion nevertheless attributes findings to specific interneuron circuits, molecular pathways, and thalamocortical reorganization, none of which are tested in this study.

      The imaging method also constrains what can be claimed: widefield calcium signals are dominated by superficial-layer and excitatory output and cannot resolve the cell-type-specific mechanisms invoked in the discussion. Because the key findings lie in the low-frequency band where vascular contamination is greatest, the hemodynamic correction, particularly in the deprived state, where vascular tone itself may be altered, deserves more validation than it currently receives.

      Finally, the presentation relies heavily on group-level heatmaps in the main figures, with raw traces, spectrograms, and per-animal trajectories at the key inflection points (day 1, week 1, week 10) largely absent. This makes it difficult to judge whether the reported patterns are coherent across animals.

    1. Reviewer #1 (Public review):

      Summary:

      This study uses an encoding model approach to compare a range of different deep learning models in predicting functional MRI data, collected while participants played the game "Super Mario Bros" inside the scanner. The fMRI data is rich, within-subject data, with around 15 hours of gameplay for each of five participants who took part in the study. A range of models are compared, including deep RL models (PPO), behaviour cloning (imitation learning), supervised visual models (ResNet), and untrained but structurally equivalent models. The main metric of model comparison is brain prediction (i.e., cross-validated R^2, and within-subject generalisation to out-of-distribution gameplay), rather than focussing on which model features are being encoded.

      The core results are:

      (1) The deep RL and imitation learning models show a modest improvement in prediction accuracy relative to the untrained and visual models (around a 1-2% increase in R^2). Notably, this is against a background in which the untrained model - essentially random projections of the gameplay pixels - can explain around 6 or 7% of the variance in fMRI data (Figure 2). So, the improvement in model fit is a small (but significant) one, and a major driver of prediction scores appears to be low-level visual stimulation as opposed to gameplay prediction.

      (2) There is little variation across layers in prediction accuracy in the trained models. In the untrained model, prediction accuracy drops across layers. This suggests that the prediction accuracy in this untrained model results from its (early-layer) representations being closer to what is presented on screen - as the random weights move the untrained model's representation away from sensory features, it becomes less predictive of the brain. In a trained model, meaningful representations are maintained in deeper layers - and interestingly, there is no clear correspondence between layers of the model and layers of the visual pathway.

      (3) There is a noticeable improvement in brain prediction by both the deep RL and imitation models with model training. In other words, the 1-2% increase in R^2 mentioned in point (1) is a result of the training, rather than any other factor.

      (4) None of the models, including the untrained model, perform well in generalising to out-of-distribution data held out from the training/evaluation. This leads to the claim that the brain's encoding representations are 'brittle'.

      Strengths:

      (1) A major strength of the dataset is that it contains rich, extended naturalistic gameplay data within individual subjects. This mirrors some of the advantages seen in other naturalistic datasets (e.g., natural scenes dataset, storybook listening, video watching) - but there are very few examples of such data where the subject is controlling or generating the behaviour in the naturalistic task. This allows potentially new questions to be asked about how these representations are learned across time, within individual participants.

      (2) A further strength of the manuscript is the clarity with which the aims and hypotheses are articulated in the introduction, and evaluated/discussed throughout the paper. This provides a clear set of objective criteria against which to evaluate the performance of the resulting models; the paper is also written in a very clear and honest way, in that some of the a priori hypotheses are not supported - this makes for a more transparent report than one written in an a posteriori manner.

      (3) Finally, although the results in comparing different models are perhaps not as impressive as one might have hoped, the authors have been quite careful in making the models comparable in terms of their architecture and number of parameters, etc. This means that any variation in prediction is likely attributable to the different objective functions used to train the models, rather than other features of the model architecture.

      Weaknesses:

      (1) The work is currently framed as "training neural networks from scratch...leads to brittle brain encoding" - but I'm not sure that the results fully support this. First, the brittleness is still present in the untrained network (i.e., random projections of pixels), as shown in Figure 5b. This implies that the brittleness may not be a consequence of the network training, but of overfitting to the encoding (ridge regression) model of the fMRI data (as the authors acknowledge when presenting these results). I would instead encourage the authors to shift the emphasis slightly towards the (modest) improvement in prediction using the RL/imitation objectives, and/or the (similarly modest) improvement in prediction with training, rather than foregrounding the brittleness of the encoding.

      (2) While the analyses of how model prediction improves with training are nice, it is a shame that there is no consideration of how prediction improves (or otherwise) across the training of the participants. Do participants improve across the 15 hours of gameplay - or do they, for instance, become more predictable by the imitation learning model? Is this more true in the naïve participants than those with extensive past experience of Mario? And does this in any way lead to better alignment with model predictions across sessions? These all seemed like natural questions that could benefit from the unique longitudinal nature of this dataset, and it seemed a shame that they were not touched upon at all.

      (3) While there is little variation between the models in terms of predictive performance, it is currently a little unclear whether this is simply due to fitting a set of highly parameterised models to the data, or because the models are themselves fundamentally similar in their representations. One way to address the latter point might be to perform some kind of RSA or CKA (Kornblith et al, arXiv 2019; Williams et al, bioRxiv 2024) across the layer representations within-model, and between-models, to ask how similar (or different) the learned representations are between the different models used for fMRI prediction.
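
      As a minimal sketch of the comparison suggested in point (3), linear CKA between two activation matrices (rows are stimuli or frames, columns are units) can be computed from centred Gram matrices. This is an illustrative implementation written for this review, not code from the manuscript or from the cited papers; the function names and toy matrices are hypothetical, and for real layer activations an array library would be far more efficient than plain lists.

```python
import math


def center_columns(X):
    """Subtract each column's mean so every feature has zero mean."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in X]


def gram(X):
    """Gram matrix of inner products between all pairs of rows."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in X] for ri in X]


def frobenius_inner(K, L):
    """Sum of elementwise products of two equally sized matrices."""
    return sum(k * l for rk, rl in zip(K, L) for k, l in zip(rk, rl))


def linear_cka(X, Y):
    """Linear CKA between two representations of the same n samples.

    With column-centred features the Gram matrices are already centred,
    so HSIC reduces (up to a constant) to their Frobenius inner product.
    """
    if len(X) != len(Y):
        raise ValueError("X and Y must have the same number of rows")
    K = gram(center_columns(X))
    L = gram(center_columns(Y))
    denom = math.sqrt(frobenius_inner(K, K) * frobenius_inner(L, L))
    if denom == 0.0:
        raise ValueError("a representation has zero variance")
    return frobenius_inner(K, L) / denom


# Toy check: Y is X rotated by 90 degrees and scaled by 3, so the two
# representations differ only by transformations CKA is invariant to.
X = [[1.0, 2.0], [3.0, 1.0], [0.0, 5.0], [2.0, 2.0]]
Y = [[-3.0 * y, 3.0 * x] for x, y in X]
print(linear_cka(X, Y))
```

      Applied layer by layer within and between models, a matrix of such scores would show whether the RL, imitation, ResNet, and untrained networks converge on similar representations, which is one possible reading of their near-identical encoding performance.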

    2. Reviewer #2 (Public review):

      Summary:

      This paper aims to test whether training models to play video games from visual inputs through reinforcement learning leads to better matches to human visual encoding during gameplay, compared to models with the same architecture and training images but with different training objectives. The authors find a slight advantage for the RL model, but encoding performance and generalization overall are weak and variable.

      Strengths:

      This was a reasonable hypothesis to test, and the model comparisons adequately represent other possibilities for training a model of the given architecture. The ResNet proxy is a particularly interesting way to benefit from a larger model's pre-training while still using the same constrained architecture and training set.

      Weaknesses:

      I always prefer to see learning curves for models on the tasks they were trained on, just to contextualize their performance on the brain encoding results, but they are not shown here.

      The paper misses some of the relevant literature that has performed similar comparisons across learning objectives for visual encoding models, such as https://arxiv.org/abs/2112.02027 and https://pmc.ncbi.nlm.nih.gov/articles/PMC10569538/

      The authors end up advocating for the idea that large-scale pre-training is needed in order to build good visual encoders for matching human data. In many ways, this was already known (given that brain encoding scores scale with ImageNet performance, which requires at least a moderate amount of general-purpose image training to achieve). However, they also note that "the brain encoding performance of the ResNet model was not significantly different from that of the Untrained model." I would assume that an ImageNet-trained ResNet would be in the direction of the type of large-scale pre-trained model the authors advocate for (even when not trained for action generation), yet their results don't support this direction being the solution. Are their results about ResNet not surpassing an untrained model consistent with prior work, and if not, why not? How do they view this in light of their argument for the use of larger models?

    3. Reviewer #3 (Public review):

      Summary:

      In this paper, the authors have 5 human subjects learn to play Super Mario Bros while undergoing fMRI for 15 hrs each. They compare a reinforcement learning (RL) model (PPO), an imitation learning (IL) model, and a vision model (ResNet) in their ability to play the game, match human behavior, and, critically, explain human brain activity.

      The key findings can be summarized as follows:

      (1) RL, IL, and vision models explain similar amounts of variance in the BOLD signal (Figure 2a), with a significant but small trend of RL > IL > ResNet (Table 1).

      (2) Untrained models with the same architecture explain a smaller but very similar amount of variance (Figure 2a, Table 1).

      (3) The brain maps across all models (and layers) are strikingly similar, with the strongest effects in visual, parietal, and motor regions (Figures 2b, 2d; Supplementary Material II).

      (4) Behavioral and neural performance are correlated across model checkpoints (but not levels), such that later checkpoints in training have better behavioral and neural encoding performance (Figures 3 & 4), although the neural effect plateaus pretty quickly.

      (5) Out-of-distribution performance is quite poor, both behaviorally (Figure 5a) and neurally (Figure 5b).

      I believe this work will be of interest to neuroscientists, cognitive scientists, and AI researchers alike. There has been a growing trend in neuroscience to adopt AI models as cognitive models of complex perception and action, while at the same time, AI researchers are increasingly looking at the brain for inspiration. The key finding of this paper -- that these models fail to generalize to out-of-distribution levels -- questions the core assumptions of this whole enterprise.

      Strengths:

      Unlike previous studies applying machine learning to naturalistic game-play, the authors take great care to make sure their models are evaluated on an equal footing, using equivalent or similar architectures/number of parameters and training data.

      While the number of subjects (5) is relatively small, the amount of data per subject (15 hours) is impressive, which is important for fitting the imitation learning and ResNet models and for obtaining reliable encoding performance for each individual subject. The authors employed a train/val/test split and held-out sets, the gold standard in the literature.

      Overall, the paper was well-written and easy to follow. The figures clearly illustrate the main findings.

      Weaknesses:

      (1) Missing statistical tests

      I think the main weakness of the paper is that many of the claims are qualitative in nature and lack appropriate statistical tests, for example:

      - "The conv3 layer has the highest brain encoding score";
      - "Robust association between task performance and brain encoding";
      - "Level patterns strongly predict brain encoding";
      - "Brain encoding performance was severely degraded";
      - "Effect of training on brain encoding was apparent".

      While these effects are indeed qualitatively visible in the figures, it is unclear which of these differences are significant (with the notable exception of Table 1). I believe the paper would benefit substantially if these effects were quantified and every claim were supported by the appropriate statistical tests. As an example, with the exception of Table 1 and the corresponding paragraph, I could not find any p-values in the results section.

      (2) Missing model performance and human-likeness

      Also absent from the results is an assessment of model performance on the task and similarity to human performance/behavior. From Figures 3 and 4, we can see that the game score of PPO is around 500-1000 - how does that compare to the humans? We can also see that the imitation scores for IL are around 0.4-0.7, but what does that mean? Such results would be crucial to assess if the models have indeed learned to play the games and/or imitate the humans, and therefore, whether they would be good candidates as cognitive models (before even looking at brain activity). At minimum, plotting the human versus model game scores (see e.g. Tomov et al. 2023 Neuron, Figure 2) would be helpful; or, if you'd like to dig deeper, showing that human actions are more valuable or more likely under those models (see e.g. Cross et al. 2022 Neuron, Figure 2). It might also be helpful to look at imitation scores for the RL model and game performance of the imitation model -- I suspect they will both be bad, but they can at least serve as informative baselines for their counterparts.

      (3) Possible undertraining

      Relatedly, one possible explanation for why the Untrained model does so well is that all the models may be effectively undertrained. For example, while there are no training curves in the paper, it seems from the spacing of the checkpoint game scores (x-axis on Figure 3c) that the RL model may not have converged yet (it would be helpful if those were somehow colored by training epoch). Showing training curves would be helpful (i.e., something similar to Figure 3a, except with performance on the y-axis).

      Additionally, it would be great to provide more details regarding the PPO training protocol. How many episodes? How many steps per episode? How many steps for all of the training? Similarly, for the imitation learning model: batch size, number of epochs, optimizer, scheduler, etc.

      (4) Mysterious poor encoding performance of Untrained and ResNet models on the held-out set

      Critically, and related to that, I'm a little confused about the Untrained model results on the held-out set (Figure 5b, top row on the right). Why should those be any different from the test set results with the Untrained model (Figure 2a, right, fourth row from the top)? It makes sense why the other models are worse on the held-out set -- they have never been trained on any frames from those levels. However, the untrained model has not been trained on *any* frames from *any* levels, including the test set and the held-out set.

      The same is true for the ResNet model, which is pre-trained on a completely separate data set and yet similarly shows worse performance on the held-out set compared to the test set.

      This cannot be explained by the ridge regression, which has no parameters or hyperparameters fitted on either the test set or the held-out set.

      The big discrepancy in the Untrained model and ResNet results between the test set and the held-out set makes me think that there is something substantially different about the levels in that held-out set, i.e., that they are truly out of distribution compared to the other 20 levels (e.g., maybe they're the last 2 hardest levels and look completely different? The ResNet proxy in Figure 5c shows worse performance than the mean, which is indicative of an anti-correlation). Alternatively, it may be some issue with the analysis pipeline. The poor generalization results are central to the claims of the paper, so I believe this should be clarified.

      (5) Brittleness conclusion rationale

      I'm not quite on board with the authors' rationale that "[poor model performance on the out-of-distribution levels] demonstrates that the models we tested are limited in scope and may not provide a valid inference of brain-like processing, as human behavior remains robust and generalizable across levels".

      For one, unlike the models, humans were actually trained on those levels, so it would not be surprising if they perform just as well on them as on the other levels (but do they? Again, it would be great to see some behavioral data from the humans and the models).

      Second, as the authors themselves show, task performance and human-likeness do not really correlate with neural encoding across levels (Fig 4a & b, respectively), so even if model performance remained "robust and generalizable" on the held-out levels, that will not necessarily translate to good neural encoding.

      Thirdly, and perhaps most importantly, unless the test set and held-out set were sampled exclusively from the practice phase when the subjects have mastered all the levels (that doesn't seem to be the case, but the authors should clarify), then the humans are continuously learning, which means that their own internal representations of the game are evolving. That's not the case for the models, which I assume are in "inference mode" when their representations are extracted for neural encoding. That is, their weights are frozen. So there's a fundamental mismatch between the mode in which humans are operating (continuously learning and executing) and the mode in which the models are operating (just executing). While this is true for all the levels, it may partially account for the discrepancy in the held-out set specifically.

    1. Reviewer #1 (Public review):

      This study adds important data identifying how ocular motor neurons are transcriptionally specified and identifies additional genes important in ocular motor neuron function. The evidence supporting the claims is convincing, with bulk and single-cell RNA sequencing as well as functional testing of the vestibulo-ocular reflex. This work will be of interest to developmental biologists and eye movement specialists.

      Gershowitz, Hamling, et al. investigate genes that specify specific cell populations within cranial motor nuclei III and IV, which control eye movements, by bulk and single-cell RNA sequencing, confirmatory in situ hybridization, and functional studies of the vestibulo-ocular reflex in knock-out animals. They take advantage of the timing difference in the generation of dorsal versus ventral cells to selectively mark early-born (dorsal) vs late-born (ventral) cells using the Kaede photolabile protein. They used bulk RNASeq to identify differentially expressed genes between the two populations (which innervate different extraocular muscles). They next used single-cell RNASeq to further identify specific subpopulations of motor neurons and identify 3 main clusters, which broadly map to dorsal CNIII, CNIV, and ventral CNIII. They show, via a series of in situ hybridizations across ages, that the differentially expressed genes identify subpopulations of neurons rather than reflecting temporal changes related to cell age. Finally, they show that knock-out of Sim1a, which is upregulated in dorsal nIII neurons, leads to a decreased vestibulo-ocular reflex, despite a normal number of neurons in nIII. They tested the knock-out of two other differentially expressed genes, nav2a and onecut1, but found both normal cell number and normal vestibulo-ocular reflex.

      The conclusions of this paper are well supported by the data. As the authors acknowledge, additional experiments would add to the interpretation. Since the Sim1a mutants have normal cell numbers, the authors hypothesize that axon guidance may be disrupted, leading to the phenotype. This could be relatively easily assessed using the Isl1-GFP transgenic line and examining innervation patterns in the extraocular muscles. Additionally, testing horizontal eye movements and eye movements in response to visual, rather than vestibular, inputs would further refine the phenotypes and perhaps identify eye movement abnormalities in the mutant fish with normal VOR.

      More information on why these specific genes were prioritized for functional testing would be helpful, as it is unclear why these three genes were the top candidates.

      The authors should also include a discussion of other subtypes of oculomotor neurons, beyond which muscle they innervate. For example, there are oculomotor neurons that form single neuromuscular junctions on fast, singly-innervated fibers, and there is a separate pool of motor neurons that innervate the slow, multiply-innervated fibers. It would be interesting to note if there were any gene expression differences within the clusters that might represent this subdivision of neurons.

      This data is likely to be of great use to the field in further studies of cranial motor neuron biology.

    2. Reviewer #2 (Public review):

      Summary:

      The goal of the work is to identify genes that are uniquely expressed in subsets of eye muscle-innervating motor neurons, as a way to identify candidate genes for strabismus, a congenital vision disorder in humans. The authors' previous work identified birth-order differences that correlate with the positions of neurons in the oculomotor (cranial nerve III) motor nucleus. Here, they use Kaede photoconversion to distinguish early- from late-born neurons and identify transcriptional differences between them by bulk RNA sequencing of FACS-sorted cells. Separately, they used single-cell RNA-Seq to sequence the transcriptomes of 89 extraocular motor neurons. They find signatures of early-born mIII, late-born mIII, and mIV neurons. While there is some overlap in gene expression, some of the differentially expressed genes are confirmed by HCR as being unique to one of these three populations of extraocular motor neurons.

      The authors test the functions of three differentially expressed genes in the vestibulo-ocular reflex by measuring the speed of rotation of the eye in response to the larval fish being tilted 15° from horizontal. One mutant, in the sim1a transcription factor, has markedly slowed responses. Although this is a global knock-out, the authors argue that this defect in the vestibulo-ocular reflex is due to a loss of sim1a function specifically in dorsal mIII neurons because sim1a is not expressed in the two upstream neurons in the vestibulo-ocular reflex circuit.

      Strengths:

      (1) This is the first time that transcriptional differences between and within extraocular muscle-innervating neurons have been described during development. In identifying differentially expressed genes that correspond with anatomical, functional, and temporal subdivisions of these neurons, they support the idea that gene expression programs established early in development underlie the functional differences amongst these neurons.

      (2) The combination of bulk RNA-Seq and single-cell RNA-Seq strengthens the identification of the sim1a-expressing early-born mIII neuron subtype.

      (3) The work identifies candidate genes for strabismus.

      Weaknesses:

      (1) The authors show that sim1a is only expressed in mIII neurons and no other cells in the vestibulo-ocular reflex, as evidence that the phenotype in sim1a mutants is due to loss of its expression specifically in mIII neurons. However, as the authors note in the discussion, sim1a has other functions in zebrafish, including global calcium homeostasis via specification of the corpuscles of Stannius. The loss of this, or of some other sim1a function, could be indirectly responsible for the slow vestibulo-ocular response in sim1a mutants.

      (2) The authors perform the vestibulo-ocular response test in sim1a mutants at 7 dpf, which is within a day of when the mutants die, raising the concern that the slowed response is due to a dire systemic condition. The argument that nav2 mutants also die at 7 dpf but have a normal response is weak, since death does not always take a single course.

      (3) The evaluation of the sim1a mutant phenotype is limited to the vestibulo-ocular reflex. The authors do not explore whether the oculomotor neuron innervation of target extraocular muscles is affected in sim1a mutants.

    1. Reviewer #1 (Public review):

      Summary:

      In this article, the authors couple a 3d vertex model to the extracellular matrix and include activity through contractile springs at the edge. They study, sequentially, the distribution of shear stresses in liquid and solid spheroids, the correlation between stress and cell shape, and the spatial distribution of stresses. The authors find that stresses are higher in solid spheroids (somewhat unsurprisingly), but that the stress distributions are wider in the fluid spheroids. Moreover, stress and shape are not correlated with each other in solids (that seems to be due to vertex model peculiarities), but they are for liquids. In contrast, for solids, the stresses are concentrated at the interface.

      The authors attribute a lot of the phenomenology to strain-stiffening properties of vertex models as being akin to a network model (correctly in my opinion). Then they strain individual cells and confirm this link, though I missed any explanation of how they did this. Would it have to be within a medium for computational consistency?

      Finally, they generate an extended vertex model, where they replace the single face linking cells with a double face and mechanoresponsive springs. This allows for stronger coupling of individual cell motion to eventual movement out of the spheroid.

      Strengths:

      Coupling a three-dimensional vertex model to the extracellular matrix, modelled as a crosslinked fiber model, is a computational tour-de-force. Adding activity through fluctuations at the interface is also of the correct symmetry (stresses), instead of the self-propulsion which has been used by other authors, and which is not compatible with Newton's 3rd law. This also allows for accurate back-and-forth mechanical coupling between the cells and the ECM.

      I would like to highlight that deriving vertex model stress tensors in full three dimensions is an open problem due to the complex topology. Any progress is valuable, and decomposing things into tetrahedra like here will allow for connections with, in particular, finite element approaches. Therefore, adding some of these results (eq. 13) to the main text would strengthen the paper in my opinion.

      Adding the nonlinear springs to the VM in the 3rd act is a good idea, and a first step to mechanical feedback. One might argue that at this point, removing the vertex model part would even be an option.

      Weaknesses:

      The paper is written in a very qualitative manner, with all of the model equations and analysis hidden in the supplementary information. I do not understand this choice, as it makes things fuzzy and hard to read. The conclusion is also very long and simply reiterates the previous points.

      At the same time, this paper is rather thin on new results and reads more like a handful of new simulations carried out using the method established in [10] (from largely the same authors). Moving some of the actual results to the main text would help, in particular, the 3d stress formulation and the definitions of different measures.

      Vertex models also have a very clear limitation: They cannot model the transition from a confluent to a non-confluent tissue, and individual cells or groups of cells leaving the spheroid. Even having a surface and having significant deformations of the surface are numerically dicey, so the current model is at the edge of what is feasible. The model as written can only do "invasion" by a single cell moving outward, and then another following it a bit (or not).

      I strongly suspect that further progress on 3d cell models will need particle-based models or models where cells are fully meshed surfaces (some of which are in development currently).

      However, none of these problems is mentioned anywhere in the text. The authors also do not review the increasingly broad zoology of other models.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript concerns the mechanisms by which cells in a spheroid embedded in the extracellular matrix can escape, either as single or multiple cells.

      Strengths:

      Overall, the manuscript is well written and easy to follow. The claims are mostly justified by the data. Some data can be better analyzed and presented to strengthen the conclusion.

      Weaknesses:

      (1) The description around Figure 2c is not exactly well supported by their results. While values close to 0 for sigma3 dot g3 for solid-like spheroids indicate little correlation between the direction of maximum stress and maximum elongation, this analysis alone does not imply that highly stressed cells are necessarily less globular. The dot product combines the magnitudes of the two vectors and the angle between them. For the distribution graph, it would be useful to have the cumulative frequency equal 1.

      (2) One of the central claims of the paper is that morphology alone is not a reliable indicator of mechanical state. Since the authors compute cellular stresses and cellular shape in their simulation (i.e., Figure 3a and b), can the authors directly plot these two quantities for individual cells in solid-like and fluid-like spheroids?

      (3) There is experimental evidence showing that the solid stress inside a spheroid is higher than at the periphery (e.g., https://www.nature.com/articles/ncomms14056). How does the cellular stress computed here relate to these experimental measurements, given that the two appear to be opposite (i.e., the authors find that the max shear stress is lowest in the center and increases towards the boundary)?

      (4) It's worth pointing out that stress fibers aren't really prominent in cells in 3D spheroids. Nonetheless, cells moving on collagen fibers would have stress fibers and utilize contractile actomyosin bundles to generate traction forces.

      (5) Section 2D states that the kcc associated with the boundary cell is decreased 10-fold for every 5 percent strain decrease in the fiber target spring length. Can this result be shown? I have a hard time seeing where this came from.

      (6) The results of single-cell vs. two-cell breakouts shown in Figure 5 b and c are very qualitative and should be accompanied by some quantitative comparison.
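
      The concern in point (1) about the dot product can be made concrete with a small sketch. It assumes that sigma3 and g3 are unnormalized vectors whose lengths encode stress magnitude and elongation; the two "cells" and all numbers are hypothetical. Reporting the magnitudes and the cosine of the angle separately distinguishes cases that the raw dot product cannot.

```python
import numpy as np

def decompose_dot(sigma, g):
    """Split sigma . g into the two magnitudes and the cosine of the angle."""
    sigma = np.asarray(sigma, dtype=float)
    g = np.asarray(g, dtype=float)
    norm_s = float(np.linalg.norm(sigma))
    norm_g = float(np.linalg.norm(g))
    dot = float(sigma @ g)
    cos_theta = dot / (norm_s * norm_g)
    return dot, norm_s, norm_g, cos_theta

# Hypothetical cell A: large stress and strong elongation, but nearly perpendicular axes.
dot_a, s_a, g_a, cos_a = decompose_dot([10.0, 0.0, 0.0], [0.05, 5.0, 0.0])
# Hypothetical cell B: perfectly aligned axes, but small stress and weak elongation.
dot_b, s_b, g_b, cos_b = decompose_dot([0.5, 0.0, 0.0], [1.0, 0.0, 0.0])

print(f"A: dot={dot_a:.2f} |sigma|={s_a:.2f} |g|={g_a:.2f} cos={cos_a:.3f}")
print(f"B: dot={dot_b:.2f} |sigma|={s_b:.2f} |g|={g_b:.2f} cos={cos_b:.3f}")
```

      Both cells give the same dot product even though one is highly stressed, elongated, and misaligned while the other is weakly stressed and aligned, which is why a value near 0 does not by itself say anything about how globular a highly stressed cell is.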

    3. Reviewer #3 (Public review):

      Summary:

      The authors describe a mathematical and computational approach used to compute stresses and cellular deformations in a multicellular spheroid embedded in a fiber network. This approach is then used to predict stress and cellular anisotropy distributions in "solid-like" and "fluid-like" spheroids. Simulations show that shear stresses in solid-like spheroids are large and concentrated at the boundary of the spheroid, yet cells do not align with the direction of the largest shear. Conversely, shear stresses in fluid-like spheroids are smaller and uniformly distributed in the spheroid. In this case, cellular elongation is more likely to be aligned with the direction of the largest shear stress. The model and simulations also predict a nonlinear stress-strain relationship that is indicative of strain stiffening. This strain-stiffening is more pronounced in fluid-like spheroids. In an extension of the preliminary polyhedral vertex model, in which cellular interfaces are shared, the authors incorporate mechanical cell-cell interactions via adhesion springs between neighboring vertices. Using this extension, they show that cell breakout is more likely to occur in fluid-like spheroids, where cells are more likely to elongate and stiffen, allowing for larger forces to be exerted on the surrounding fiber network. Furthermore, the authors state that anisotropic cell-cell adhesion is required for multicell streaming during breakout.

      Strengths:

      The modeling and computational approach used in this research is this work's biggest strength. Treating the embedded spheroid as a set of polyhedra, where each polyhedron represents a single cell, is a mechanically robust, yet still tractable way to model multicellular spheroids in three dimensions. Starting with expressions for constraining cell volume and surface area as well as a surface energy term, the authors derive an expression for an averaged stress tensor for each polyhedron. This allows the authors to approximate the stress in each polyhedral cell that is caused by cellular deformations during mechanical interactions with the extracellular fiber matrix. This is a clever and robust approach that is based on fundamental mechanical principles that allow one to make reasonable predictions about the mechanical state of the spheroid under a variety of conditions.

      Weaknesses:

      The weakness of the manuscript is the exposition. There are significant pieces of critical information missing from the manuscript that would make the presented work significantly more understandable and better support the authors' claims. Most importantly, many necessary details of the model are missing. I was able to get a better understanding of some of these details by reading the authors' earlier work (ref [10] in the submitted manuscript), and for this reason, I do feel that this work has value. However, several descriptions must be added for the paper to be more readily understandable. These include (1) a better explanation of what drives motion, in particular in the case where no external fiber network is present. (2) What physically distinguishes fluid-like spheroids from solid-like spheroids? Simply stating the value of the parameters s0 with no explanation is not sufficient. (3) An explanation of how histograms in Figure 2 are calculated is necessary. Are these histograms based on one simulation or several simulations? (4) The experimental results are briefly mentioned, but significantly more connection between these results and the numerical results of the cell breakout model is needed. (5) The description of the model that incorporates variable cell-cell attachments and cell breakout is very terse and needs more detail. Moreover, while the description of the results of this model is strong, the figure that illustrates cell breakout (Figure 5) is difficult to interpret. Addressing these and other issues will make the current manuscript, which presents an interesting model and result, much stronger and easier to read.

    1. Reviewer #1 (Public review):

      Summary:

      In this paper, the authors conduct both experiments and modeling of human cytomegalovirus (HCMV) infection in vitro to study how the infectivity of the virus (measured by cell infection) scales with the viral concentration in the inoculum. A naïve thought would be that this is linear in the sense that doubling the virus concentration (and thus the total virus) in the inoculum would lead to double the fraction of infected cells. However, the authors show convincingly that this is not the case for HCMV, using multiple strains, two different target cells, and repeated experiments. In fact, they find that for some regimes (inoculum concentrations) infected cells increase faster than the concentration of the inoculum, which they term "apparent cooperativity". The authors then provide possible explanations for this phenomenon and construct mathematical models and simulations to implement these explanations. They show that these ideas do help explain the cooperativity, but cannot be conclusive as to what is the correct explanation. In any case, this advances our knowledge of the system and it is very important when quantitative experiments involving MOI are performed.
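
      As a point of reference for the linear expectation, here is a minimal sketch of the textbook Poisson single-hit model, in which a cell is infected if it receives at least one infectious unit. This is a standard baseline and not the authors' implementation; the dose values are arbitrary. Under this model the local slope of log(fraction infected) against log(dose) approaches 1 at low dose and never exceeds 1.

```python
import numpy as np

def single_hit_fraction(m):
    # Poisson single-hit model: fraction of cells receiving >= 1 infectious unit,
    # where m is the mean number of infectious units per cell.
    return 1.0 - np.exp(-np.asarray(m, dtype=float))

def loglog_slope(dose, fraction):
    # Local slope of log(fraction) versus log(dose) between consecutive doses.
    return np.diff(np.log(fraction)) / np.diff(np.log(dose))

doses = np.array([0.001, 0.002, 0.004, 0.5, 1.0, 2.0])   # arbitrary example doses
slopes = loglog_slope(doses, single_hit_fraction(doses))
print(np.round(slopes, 3))
```

      A measured log-log slope above 1 over some dose range is therefore incompatible with independent single-hit infection, which is the sense in which a faster-than-proportional increase reads as apparent cooperativity.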

      Strengths:

      Careful experiments using state-of-the-art methodologies and advancing multiple competing models to explain the data.

      Weaknesses:

      Minor weaknesses in explaining the implementation of the model. However, some specific assumptions, which to this reviewer were unclear, could have substantial impact on the results. For example, whether cell infection is independent or not. This is expanded below.

      In the revised version, the authors address almost all of these minor weaknesses, strengthening the paper and its reproducibility.

      Suggestions to clarify the study:

      In the revised version, the authors carefully consider these suggestions and provide further details, clarifications and even some new results. Regarding the question of how infection of a cell with one virus could lead to lower probability for a secondary infection, I think that it is possible that infected cells activate antiviral programs that lead, for example, to lower expression of surface receptors. This has been considered at least in hepatitis C virus infection. However, this is a minor point.

      Overall, I think the revised version provides a sound study with relevant conclusions, and I thank the authors for their thoughtful consideration of my previous comments.

    2. Reviewer #2 (Public review):

      In their article, Peterson et al. wanted to show to what extent the classical "single hit" model of virion infection, in which the same number of virions is always required to infect a cell, fails to match empirical observations from a human cytomegalovirus in vitro infection model, and how this would have practical impacts on experimental protocols.

      Strengths:

      - The use of a very simple and robust experimental assay, where they infected cells with serially diluted virions and measured the proportion of infected cells with flow cytometry. This convincingly showed how the proportion of infected cells differed from a "single hit" model which they simulated using a simple mathematical model ("power-law model"), and better fitted a model where virions need to cooperate to infect cells.

      - The use of different cell types and virus strains, which allows some generalizations to be drawn.

      - The exploration of the mechanisms that could explain this apparent cooperation, using biologically plausible simulations.

      - The practical consequences that this phenomenon has for lab virologists as well as modelers.

      Weaknesses:

      - The impossibility to discriminate between biological mechanisms is an important limitation of this study and calls for developing experimental designs able to further understand this question.

      - The outcome of the virion clumping remains highly sensitive to the choice of the clump size distribution, which is itself very complicated to estimate, especially at high dilution.

      - The impossibility to directly fit the mathematical models to the data limits them to a qualitative discussion.

      Overall, this work is very valuable as it raises the general question of how the estimate of infectivity can be biased if extrapolated from a single virus titer assay. The observation that HCMV virions often cooperate and that this cooperation varies between contexts seems robust. The putative biological explanations would require further exploration.

      This topic is very well known in the case of segmented viruses and the semi-infectious particles, leading to the idea of studying "sociovirology", but to my knowledge this is the first time that it was explored for a non-segmented virus, and in the context of MOI estimation.

    1. Reviewer #1 (Public Review):

      This study by Charendoff et al. provides interesting observations related to global histone hypermethylation in host cells during Chlamydia trachomatis infections. The core observation they report is that the host histones are highly hypermethylated during infection, and this appears to be an amplifying effect due to continuous inhibition of demethylases, in part due to a metabolic shift in the host where succinate amounts (which inhibit demethylases) increase. The authors claim this is specifically due to the bacteria, since antibiotic treatment prevents histone hypermethylation (though this leaves one wondering about cause versus consequence).

      The core observation of hyper methylation is very interesting, and well documented. There are a number of points to consider though in order to fully substantiate the findings, and close out loose ends. My comments are broad - and built around the interpretations (vs the data presented).

      (1) Related to observations coming from Fig 1C etc., and connecting to Fig 3 - the hyper methylation appears to be across different protein arg/lys residues - and is not histone specific. So, is it just a consequence of high SAM pools and flux in infected cells? i.e. the bacterial infection increases SAM pools in cells, and provides an increase in substrate pools for the methyltransferases, leading to protein hyper methylation. The approach used here only measures steady-state SAM amounts (and not SAM flux or utilisation). For example, reduced SAM amounts in nuclei could be due to increased utilisation of SAM. The experiments done with the demethylase do not actually answer this question - if you decrease demethylase activity, you will get an increase in net methylation. The authors see an increase in net methylation in the infected cells - this would suggest that in addition (or perhaps primarily) to reduced demethylase activity, there could be much higher SAM utilisation/flux. Again, the overexpression of JMJ proteins does not resolve this problem.

      (2) Adding to this - what happens to SAM pools in the cells treated with the inhibitors? This actually may not look like the slightly reduced SAM pool observed in infected cell nuclei. Also, what is the SAM/SAH ratio (a very useful indicator of methylation activity)?

      (3) There is a correlation/implication issue here in Fig 2 - cells with C. trachomatis infection show hyper methylation. But these are the only cells with high C. trachomatis. So it is a bit disingenuous to say that histone hyper methylation correlates with bacterial proliferation. The cells without bacteria don't have hyper methylation - and that does not have anything to do with the bacterial proliferation.

      (4) The claim that demethylase activity is down in infected cells again comes primarily from the increased (2-fold) succinate amounts in infected nuclei, which are then correlated with experiments where succinate or (permeable) a-KG is supplemented in excess. While I personally like the hypothesis that the hypermethylation might be a result of an imbalance in cofactors (succinate vs a-KG) in infected cells, the data presented are too preliminary to support that conclusion. Again, steady state measurements of only succinate cannot provide a clear answer to that question. For example, is there a clear allocation/flux difference (between a-KG leading out to glutamate/glutamine vs flux through the TCA cycle and increased succinate accumulation)? Is there a bottleneck/build-up of succinate in cells that might lead to the increase in nuclei? This also opens another direction of possible regulation - increased histone succinylation. When you see a large increase in succinate in the nucleus, an obvious question, before looking at demethylase activity, is whether succinate itself increases histone succinylation (through HATs).

      (5) What might the authors hypothesise about why this hyper methylation happens? It appears in some ways that hyper methylation happens - potentially due to a metabolic bottleneck that the bacteria trigger (and there is a build-up of SAM and/or succinate, and altered flux out of a-KG). The methylation is just a visible outcome - but may not be central to pathogenesis or viability.

    2. Reviewer #2 (Public Review):

      Strengths:

      (1) Because the study compares genuinely infected cells with uninfected cells within the same infected cell population, it enables a clearer and more rigorous comparison.

      (2) By using multiple Chlamydia species and cells from multiple host species (human and mouse), and obtaining consistent findings across these systems, the study demonstrates the generality of bacterium-induced epigenomic alterations.

      (3) The study shows that the epigenomic changes are caused by reduced activity of JMJC domain-containing lysine demethylases, demonstrating through multiple complementary approaches-including the use of a demethylase inhibitor, overexpression of target-specific demethylases, and analysis from the perspective of cofactors required for JMJC domain-containing demethylases-that decreased lysine demethylase activity constitutes the molecular mechanism underlying the increased H3 methylation levels induced by Chlamydia infection.

      (4) By performing ChIP-seq analyses of H3K4me3 and H3K9me3, the study clearly delineates, on a genome-wide scale, how infection leads to increased levels of these epigenomic marks.

      Weaknesses:

      (1) Reduction of cofactors such as Fe2+ or a-KG decreases the activity of JMJC-domain-containing lysine demethylases (thereby directly affecting histone H3 lysine methylation). However, these cofactors are also involved in the activities of other epigenetic regulators, such as TET enzymes that contribute to DNA demethylation and SIRT family proteins that mediate histone deacetylation. Therefore, it cannot be excluded that modulation of these factors indirectly leads to the changes in H3 lysine methylation dynamics targeted in this study.

      (2) Related to point 1, although overexpression of JMJC-type demethylases has been shown to reduce the Chlamydia infection-induced increase in H3 lysine methylation, it is well known that overproduction of these enzymes, while target-specific, also leads to a genome-wide reduction of lysine methylation. Thus, a decrease in lysine methylation upon expression of these demethylases does not necessarily demonstrate that the infection-induced increase in H3 lysine methylation is caused by impaired JMJC-type demethylase activity.

    3. Reviewer #3 (Public Review):

      In this manuscript, the authors explore a molecular basis for hypermethylation of histones in epithelial cells infected with the obligate intracellular bacterial pathogen Chlamydia trachomatis. This is of particular interest given that Chlamydia is known to drastically alter host cell gene transcription, and histone hypermethylation would suggest a new way by which Chlamydia interferes with gene expression of its host. Histone methylation was previously implicated in the introduction of dsDNA breaks in infected cells, and the chlamydial effector NUE was reported to methylate histones, but the role of this modification in dictating host cell gene transcription has been unexplored. The authors use a suite of tools to approach this question, including various -omics techniques, genetic approaches, and biochemical assays. Overall, the manuscript provides many interesting pieces of data, though some of them are difficult to reconcile, which may reflect methodological hurdles that are not fully addressed in the current version of the manuscript. My major concerns regard the rationale/interpretation for various mechanistic experiments and that the heterogeneity of the histone hypermethylation phenotype is not addressed which I believe may explain some apparent inconsistencies in the results.

      Using an immunofluorescent approach, the authors show that a subpopulation of the nuclei in Chlamydia-infected cells (~10-20%) exhibit high amounts of methylated histone species. This occurs during the late stages of infection, near the time when Chlamydia would lyse the host cell and positively correlates with bacterial burden. Accordingly, halting chlamydial growth blocks the onset of histone hypermethylation. Exogenously supplying cofactors for histone demethylases, the low activity of which is implicated in the histone hypermethylation phenotype, reduces histone hypermethylation. In general, these data are compelling and raise interesting questions about the role of histone methylation in governing chlamydial egress from infected cells. Interestingly, these behaviors seem to arise independently of NUE, the secreted chlamydial histone methyltransferase, supporting the notion that a metabolic reprogramming may underlie the hypermethylation phenomenon.

      As noted above, the authors propose that hypermethylation arises due to decreased demethylase activity in infected cells. However, the data do not conclusively support this interpretation. For example, the approaches used to probe demethylase activity rely on (i) a direct biochemical measure of demethylase activity, (ii) pharmacological inhibition of demethylase, and (iii) heterologous expression of a specific demethylase. With the exception of (i), these approaches would be expected to alter histone methylation regardless of the source. That is, inhibition of demethylases should increase histone methylation regardless of whether the source of methylation is increased methylase or decreased demethylase activity. Similarly, overexpression of a demethylase would be expected to reduce cognate histone methylation arising either from increased methylase or decreased demethylase activity.

      Moreover, the authors report that the effect of the demethylase inhibitor on histone hypermethylation is significantly potentiated by infection, suggesting that infected cells have greater methylase activity than uninfected cells, because the latter barely respond to the presence of demethylase inhibitor. In other words, a dramatic increase in histone methylation in the presence of demethylase inhibitor is most parsimoniously explained by increased methylation (no longer being removed by demethylase), not decreased demethylation (which would be analogous to treatment with demethylase inhibitor). The authors do not directly assay methylase activity. These concerns extend to the rationale used to justify experiments with infected mice, which the authors treat with the demethylase inhibitor.

      The authors perform experiments to characterize the consequence of hypermethylation genome-wide. Because the authors do not enrich for those cells which exhibit histone hypermethylation, the results reflect the mixed population, and therefore presumably dilute out important signal related to the phenomena under investigation. For example, the proteomic analysis of post-translational modifications identifies only one methylated histone species, whereas the immunofluorescent approach shows consistent effects across five different methylated histone species. Moreover, the chromatin immunoprecipitation analysis indicates that there is unexpectedly a lower density of methylated histones at regions which are also enriched in uninfected cells. The authors argue that this suggests increased methylation is happening "outside" of these histone-dense regions, but direct evidence in support of this claim is lacking.

      In sum, this paper provides compelling evidence in support of the notion that histones are hypermethylated at various residues late in chlamydial infection, that this process is modulated by known cofactors of demethylases, and is the result of high levels of bacterial replication in the cell. That histone hypermethylation governs host gene transcription during chlamydial infection suggests a relatively novel mechanism by which Chlamydia subverts the host cell to establish a replicative niche or egress to infect a new cell. The information obtained regarding the methylation status of host proteins and host gene transcription controlled by a metabolic cofactor during infection will be a useful resource for other researchers. However, in the current version of the manuscript, the mechanistic basis for these behaviors is relatively unclear.

    1. Reviewer #1 (Public review):

      Summary:

      This study addresses an important question in reinforcement learning and metacognition by distinguishing value confidence from decision confidence and testing how each is computationally represented. The findings are significant because they suggest that value confidence is well captured by Bayesian uncertainty, whereas decision confidence reflects a hybrid computation combining probability correct with broader value certainty. The evidence is promising, supported by multiple datasets and model comparisons.

      Strengths:

      (1) A major strength of the study is that the authors test their hypotheses across multiple datasets, including previously published datasets and newly collected data. This broad empirical approach increases the generality of the findings.

      (2) The Bayesian model of value confidence has a clear theoretical basis. The proposed hybrid model of decision confidence is also intuitive. It appears to capture important aspects of the decision confidence data.

      (3) The paper provides a useful framework for linking how certainty about value estimates guides the subsequent choice and the corresponding decision confidence.

      Weaknesses:

      (1) The conceptual link between value confidence and decision confidence is not yet fully established. The manuscript argues that overall value certainty contributes to decision confidence, but this conclusion is based largely on the latent variable that the model infers from the decision confidence experiment alone. A more direct test would require measuring value confidence and decision confidence within the same participants and task, and analysing how these two types of confidence interact.

      (2) The individual-difference analyses in Figure 5 are methodologically challenging. The predictors used in these analyses are derived from model fits to the behavioural data and are then correlated with behaviour in the same task. This creates a risk that the correlations arise by construction, so they do not guarantee that the relationships are cognitively meaningful.

      (3) The model recovery results suggest that some candidate models are not clearly distinguishable.

      (4) The manuscript would benefit from clearer explanations of why specific models capture particular behavioural patterns.

      (5) The claim that value confidence modulates the exploration-exploitation trade-off should be interpreted carefully, because the model uses global uncertainty across both options, not option-specific value confidence.

    2. Reviewer #2 (Public review):

      Summary:

      In this work, the authors propose a common value-estimation framework based on Bayesian inference and show that it can account for both participants' confidence in their value estimates ("value confidence") and for their confidence in their final choices ("decision confidence").

      Strengths:

      The study extends several established findings in the confidence and reinforcement-learning literature. In particular, the authors not only examine decision confidence but also directly model value confidence, and they replicate the idea that decision confidence reflects a combination of multiple computations, previously described for categorical decisions (Navajas et al., 2017), in the context of continuous value-based decisions. I therefore consider the work a useful contribution to the field.

      Weaknesses:

      However, I believe that the scope of the conclusions is overstated relative to the results that are actually presented.

      (1) Interaction between value confidence and decision confidence

      The abstract and introduction frame the study as addressing a major gap in the literature, namely, the lack of direct investigation of the interaction between value confidence and decision confidence. Yet the manuscript never directly tests the interaction between these two quantities. Instead, the authors show that the reported decision confidence depends not only on the probability of being correct, but also on the precision of the decision variable DV, which is related to the precision of the value estimates underlying value confidence. While this is related to the proposed research question, it is not a direct analysis of the interaction between value confidence and decision confidence themselves.

      (2) Unified computational framework

      Similarly, the claim that the study provides a "unified computational framework" appears somewhat overstated. The proposed models build on standard and well-established Bayesian frameworks and extend them specifically to account for decision confidence. While this demonstrates that both forms of confidence can be expressed within a common Bayesian formalism, the manuscript does not establish a direct computational interaction or shared mechanism between them beyond their dependence on the same underlying uncertainty estimates.

      (3) "Phenotypes" interpretation

      The interpretation of the observed individual differences as distinct "behavioural phenotypes" also appears overstated. The reported analyses primarily show continuous variability across participants in the relative weighting of different components contributing to confidence reports, rather than evidence for qualitatively distinct categories or computational subtypes of decision-makers.

      (4) Decision confidence terminology

      I also found some conceptual ambiguity in the terminology used throughout the manuscript. Early in the paper, decision confidence is defined normatively as the subjective probability of having made the correct choice, corresponding to P(DV>0). Later, however, the authors show that participants' confidence reports are better explained by a combination of this probability and the precision of the decision-variable distribution. Despite this distinction, the manuscript continues referring to the reported quantity simply as "decision confidence." Clarifying the distinction between the theoretical construct and the empirical reports (for example, by referring to "reported decision confidence") would improve conceptual clarity.
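
      To make the distinction concrete, the following is a minimal sketch and not the authors' implementation. It assumes independent Gaussian beliefs about the two option values; the function names, the weight `w`, and the bounded precision term are hypothetical choices made only for illustration. It contrasts the normative construct P(DV>0) with a "reported" quantity that also depends on the precision of the decision variable.

```python
from statistics import NormalDist


def prob_correct(mu_chosen, sd_chosen, mu_other, sd_other):
    """Normative decision confidence: P(DV > 0), with DV = chosen - other.

    Assumes independent Gaussian beliefs about the two option values.
    """
    dv_mean = mu_chosen - mu_other
    dv_sd = (sd_chosen ** 2 + sd_other ** 2) ** 0.5
    return 1.0 - NormalDist(dv_mean, dv_sd).cdf(0.0)


def reported_confidence(mu_chosen, sd_chosen, mu_other, sd_other, w=0.7):
    """Hypothetical 'reported' confidence: mix of P(DV > 0) and DV precision.

    The weight w and the squashing 1 / (1 + variance) are illustrative
    assumptions, not the authors' fitted model.
    """
    p = prob_correct(mu_chosen, sd_chosen, mu_other, sd_other)
    dv_var = sd_chosen ** 2 + sd_other ** 2
    precision_term = 1.0 / (1.0 + dv_var)  # in (0, 1], larger when DV is precise
    return w * p + (1.0 - w) * precision_term


# Two cases with identical P(DV > 0) = 0.5 but different DV precision.
precise = reported_confidence(1.0, 0.5, 1.0, 0.5)
vague = reported_confidence(1.0, 2.0, 1.0, 2.0)
print(precise > vague)
```

      Under these assumptions the two cases share the same theoretical decision confidence but yield different reported values, which is why separate labels for the construct and the report would help.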

    3. Reviewer #3 (Public review):

      Summary:

      Comay, Solovey, and Barttfeld aim to provide a unified computational account of confidence in reinforcement learning by distinguishing value confidence (the certainty associated with latent value estimates) from decision confidence (the confidence that a particular choice is correct). Across new experiments and reanalyses of previously published datasets, they argue that value confidence is best described by Bayesian posterior precision, that this form of confidence adaptively reduces decision noise as learning progresses, and that decision confidence is better captured by a hybrid model combining Bayesian probability correct with a more global estimate of value certainty. They further propose that individual differences in the relative weighting of these components define "confidence phenotypes" that predict task performance, exploration-exploitation behavior, and metacognitive accuracy.

      Strengths:

      A major strength of the study is that it addresses an important conceptual distinction that is often blurred in the confidence literature. The paper usefully separates uncertainty about latent environmental states from confidence in an action derived from those latent beliefs. This distinction is especially important in reinforcement learning, where uncertainty is not merely a retrospective judgment about accuracy but can directly shape future sampling, learning, and action selection. The manuscript is therefore well positioned to bridge work on Bayesian confidence in perceptual decision-making with work on uncertainty-guided learning and exploration.

      A second strength is the authors' use of multiple datasets and model comparisons. The claim that value confidence tracks Bayesian uncertainty is supported across tasks in which participants explicitly report confidence in value estimates, including datasets where reward variance is manipulated. The latter manipulation is particularly useful because it helps distinguish a Bayesian uncertainty account from simpler models based only on the number of observations. The finding that value confidence modulates the softmax slope and thereby promotes more exploitative choices as uncertainty decreases is also theoretically coherent and supported across several datasets, including a preregistered replication.
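
      The idea that value confidence modulates the softmax slope can be sketched as follows. This is an illustration under stated assumptions rather than the authors' model: the linear dependence of the slope on posterior precision, the `gain` parameter, and all numerical values are hypothetical.

```python
import math


def softmax_choice_prob(value_a, value_b, slope):
    """Probability of choosing option A under a two-option softmax."""
    return 1.0 / (1.0 + math.exp(-slope * (value_a - value_b)))


def confidence_scaled_slope(base_slope, posterior_sd, gain=1.0):
    """Hypothetical rule: the slope grows linearly with posterior precision."""
    return base_slope + gain / posterior_sd ** 2


# Same value estimates, but uncertainty shrinks from early to late learning.
early = softmax_choice_prob(1.0, 0.5, confidence_scaled_slope(1.0, posterior_sd=2.0))
late = softmax_choice_prob(1.0, 0.5, confidence_scaled_slope(1.0, posterior_sd=0.5))
print(f"early P(choose better) = {early:.3f}, late = {late:.3f}")
```

      With identical value estimates, the lower-uncertainty case selects the higher-valued option more often, which is the sense in which choices become more exploitative as uncertainty decreases.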

      The manuscript's most interesting and potentially impactful contribution is the hybrid model of decision confidence. The authors show that a model based only on Bayesian probability correct captures confidence on correct trials better than on incorrect trials, whereas adding an "overall value confidence" term improves the fit. This is a useful result because it suggests that confidence reports in reinforcement learning may not be a pure readout of decision-level discriminability, but instead may combine decision-specific evidence with more global latent-state uncertainty. This could help explain why human confidence often deviates from ideal Bayesian predictions, especially on error trials.

      Weaknesses:

      However, the interpretation of the hybrid model remains the main weakness of the paper. The second term, overall value confidence, is not equivalent to the precision of the decision variable. It can dissociate from decision difficulty: two options can be far apart but individually uncertain, or nearly identical but individually well estimated. The authors appear to recognize this issue and have reframed the term as "overall value confidence" rather than decision-variable precision. This is a useful clarification, but the conceptual role of the term still requires sharper treatment. In its current form, it is sometimes described as part of a unified confidence computation, but it may be more accurately understood as a biasing or contextual signal that modulates reported confidence without necessarily improving decision calibration.
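
      The dissociation described above can be illustrated numerically. The sketch below assumes independent Gaussian beliefs and uses mean posterior precision as a stand-in for "overall value confidence"; that proxy, the function names, and the numbers are assumptions for illustration, not the authors' definitions.

```python
from statistics import NormalDist


def prob_correct(mu_a, sd_a, mu_b, sd_b):
    """P(value_a > value_b) for independent Gaussian beliefs."""
    dv_sd = (sd_a ** 2 + sd_b ** 2) ** 0.5
    return 1.0 - NormalDist(mu_a - mu_b, dv_sd).cdf(0.0)


def overall_value_confidence(sd_a, sd_b):
    """Assumed proxy: mean posterior precision across both options."""
    return 0.5 * (1.0 / sd_a ** 2 + 1.0 / sd_b ** 2)


# Options far apart but individually uncertain.
far_uncertain = (prob_correct(10.0, 4.0, 2.0, 4.0),
                 overall_value_confidence(4.0, 4.0))
# Options nearly identical but individually well estimated.
close_certain = (prob_correct(5.0, 0.2, 4.9, 0.2),
                 overall_value_confidence(0.2, 0.2))

for label, (p, conf) in [("far/uncertain", far_uncertain),
                         ("close/certain", close_certain)]:
    print(f"{label}: P(correct) = {p:.3f}, overall value confidence = {conf:.3f}")
```

      In this toy example the first case has the higher probability correct but the lower overall value confidence, so the two terms of a hybrid model can pull reported confidence in opposite directions.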

      A related concern is model identifiability. In many reinforcement-learning tasks, probability correct and overall value confidence both change systematically over the course of learning. As a result, the hybrid model may gain predictive power partly because it captures generic time-on-task or learning-progress effects, rather than because participants explicitly combine two separable uncertainty signals. The manuscript would be stronger if it more clearly demonstrated that the two latent variables are distinguishable in the behavioral data, for example, through model recovery, parameter recovery, cross-validated prediction, and analyses of the correlation between latent regressors across task conditions and individuals.
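
      The collinearity concern can be illustrated with a deliberately simplified, noise-free learning trajectory. The sketch assumes a flat prior, equal sampling of both options, known reward noise, and posterior means equal to the true values; these simplifications and all constants are assumptions, and the code does not reproduce any analysis from the manuscript.

```python
from statistics import NormalDist

NOISE_SD = 2.0    # assumed reward noise, identical for both options
TRUE_DIFF = 0.5   # assumed true value difference between the options
N_TRIALS = 40


def pearson(xs, ys):
    """Pearson correlation computed from first principles."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


p_correct, overall_conf = [], []
for n_obs in range(1, N_TRIALS + 1):
    post_sd = NOISE_SD / n_obs ** 0.5        # posterior SD after n_obs samples per option
    dv_sd = (2.0 * post_sd ** 2) ** 0.5      # SD of the difference of two beliefs
    p_correct.append(1.0 - NormalDist(TRUE_DIFF, dv_sd).cdf(0.0))
    overall_conf.append(1.0 / post_sd ** 2)  # posterior precision per option

r = pearson(p_correct, overall_conf)
print(f"correlation between the two latent regressors: r = {r:.2f}")
```

      In this idealized setting both regressors rise monotonically with the number of observations and are strongly positively correlated, which is why recovery analyses and reports of regressor correlations in the real task would be informative.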

      The link between the decision rule and confidence model also deserves more scrutiny. The authors use value confidence to modulate decision noise in the choice model, and then use a related global value-confidence term in the confidence-report model. This creates an appealing unified architecture, but it also raises the possibility that the same latent variable is doing multiple kinds of explanatory work. The paper would benefit from a clearer separation between uncertainty as a driver of choices, uncertainty as a determinant of confidence reports, and uncertainty as an inferred latent variable extracted from the same behavioral data.

      From a computational neuroscience perspective, the manuscript would also benefit from a more explicit discussion of how these confidence quantities might be represented neurally. The current model treats value confidence, probability correct, and overall value confidence as scalar latent variables available to the observer. Yet uncertainty-related computations may be represented nonlinearly in neural population activity rather than as explicit scalar readouts. Work on nonlinear neural decoding and population codes has shown that task-relevant variables can be carried by nonlinear statistics of neural activity, especially when nuisance variables obscure mean tuning, and that behavioral choices can reveal whether such nonlinear information is efficiently decoded. This literature provides a useful framework for connecting the present behavioral model to possible neural implementations of value and decision confidence.

      Overall, the authors largely achieve their goal of demonstrating that value confidence and decision confidence are computationally dissociable in reinforcement learning. The evidence for Bayesian value confidence is strong, and the evidence that confidence-guided exploitation improves the account of choice behavior is convincing. The evidence for the hybrid account of decision confidence is promising but would be strengthened by additional analyses clarifying model identifiability, the interpretation of the overall value-confidence term, and the conditions under which the model makes distinct predictions from simpler time-, value-, or evidence-based alternatives. The paper is likely to be useful for researchers interested in computational models of confidence, metacognition, and adaptive behavior under uncertainty.

    1. Reviewer #1 (Public review):

      This study provides evidence that the apicoplast-localized isoform of acyl-carrier protein (ACP) has acquired important non-enzymatic functions in the malaria parasite. Previous studies have shown that the apicoplast-located FASII-dependent pathway of fatty acid synthesis is not essential in Plasmodium blood stages. In contrast, genome-wide knockout studies suggested that ACP, a key protein in this pathway, is essential in these stages, indicating that it may have additional non-canonical functions. In this study, the authors confirm that ACP is essential in Pf blood stages (using both apicoplast IPP rescue and conditional knockdown); show that this essential function requires modification with 4-phosphopantetheine; and use proximity biotinylation and complementary immunoprecipitation pull-down approaches to provide compelling evidence that ACP binds to and stabilizes the apicoplast-located isoform of pyruvate kinase II. Notably, these interactions appear to differ from those associated with the binding of mitochondrial isoforms of ACP to proteins involved in Fe-S biosynthesis. Loss of ACP was shown to lead to a decrease in PKII levels and apicoplast DNA/RNA synthesis, consistent with loss of NTP synthesis in this organelle. The data are clear and very well described, and the findings represent a significant advance in our understanding of metabolic regulatory mechanisms in apicomplexan apicoplast studies.

      Strengths:

      The study uses a variety of complementary genetic approaches to demonstrate the essentiality of ACP and the enzyme involved in its activation with 4-PP in Pf blood stages, demonstrating that the ascribed non-enzymatic function is mediated by holo-ACP. Similarly, a number of complementary biochemical approaches, including proximity biotinylation, immunoprecipitation, and co-expression of PfACP and PK-II in a heterologous bacterial expression system, are used to confirm the physiological significance of the PfACP and PK-II interaction. The study also reports additional findings, such as the independence of P. falciparum blood stages from exogenous (media) fatty acids, indicating that intracellular stages can salvage all of their requirements from the red blood cell.

      Weaknesses:

      Overall, this is a very strong study. While questions remain around the function of other apicoplast ACP-interacting proteins detected in this study, I don't have any suggestions for significant improvements.

    2. Reviewer #2 (Public review):

      This study focuses on revealing the essential divergent function of the Acyl Carrier protein (ACP) in the deadliest human malaria parasite, Plasmodium falciparum. More precisely, using inducible KO, cellular and biochemical approaches, the authors determined that instead of a canonical role for ACP allowing the de novo synthesis of fatty acids in the apicoplast (essential relict plastid) of the parasite, the enzyme couples with pyruvate kinase II to generate nucleoside triphosphates to maintain parasite survival during blood stages. The study is novel and well designed, and provides interesting new data on Plasmodium and apicomplexan biology. The results convincingly support the major claim of the study. However, the evidence is currently insufficient to support some claims about the essentiality of some apicoplast pathways.

      In this study, Geher et al. focused on deciphering the role of the Acyl Carrier Protein (ACP) present in the relict non-photosynthetic plastid, i.e. the apicoplast of the most lethal human malaria parasite, Plasmodium falciparum. More particularly, they determined an essential function of ACP independent of its usual/typical function as the central protein for the normal function of the apicoplast Type II fatty acid synthesis (FASII) pathway. Rather, the protein seems to associate with the apicoplast Pyruvate Kinase II, together generating an essential nucleoside triphosphate (NTPs) source to fuel the apicoplast and parasite survival instead.

      By generating a TetR-DOZY-based inducible KD line for ACP, they confirmed that the protein is indeed essential to maintain apicoplast integrity and parasite survival during asexual blood stages, as previously predicted and experimentally shown. They showed that ACP requires a biochemical modification, typically activating the protein for its function in the FASII pathway, i.e. binding of the 4-PP group by holoACP synthase. Then, they showed that the other enzymes of the FASII pathway are likely dispensable during the blood stage, as they were able to generate a KO line of the first enzyme of the pathway, FabD (which was predicted to be essential in P. falciparum). Based on a cell culture approach in a controlled culture medium, they further claimed that, unlike current evidence-based hypotheses, the FASII pathway (and thus a potentially FASII-linked ACP) has no role/activity during blood stages. Using a proximity biotinylation approach, they determined that ACP associates with the apicoplast pyruvate Kinase II (PKII), previously shown to generate NTPs in the apicoplast for energy and DNA/RNA maintenance (Xia et al. 2019), and not to fuel the FASII pathway as its main function in blood stages. Finally, they showed that the disruption of ACP induces the reduction of the presence/content in PKII in the parasite, as well as the drastic reduction of the apicoplast DNA and RNA content. Together, they concluded that the main function of ACP is indeed the NTP formation via its association with PKII, rather than its canonical role for the generation of fatty acids in the apicoplast.

      This study is novel and focuses on a topic of particular interest in malaria biology, but also for most of the apicomplexa-related diseases and, beyond that, for plastid-bearing organisms, given this unusual role for ACP. The study is well thought out with proper biochemical approaches that convincingly point to this association of ACP with PKII for NTP synthesis as a major function during P. falciparum blood stages. However, there are currently some important experimental issues/flaws and missing experiments that lead to incorrect interpretations and thus leave some important claims of the study unsupported, notably regarding the role of FASII and the interaction between ACP and PKII.

      Therefore, at this point, the study is only partial and would require major additions and/or important text edits/revisions before being considered for acceptance.

      Major points:

      From the graph of P. falciparum growth, we can see that in the lipid-rich condition, where both FabH KO and ACP KO can survive, the addition of mevalonate was essential for the growth of ACP KO. Along with the other evidence (PKII association, DNA levels...), we therefore agree that PfACP is involved in the mevalonate pathway. The authors claim that the FASII pathway is inactive/not essential in the P. falciparum blood stage. However, the authors have not shown any evidence on whether or not ACP is involved in the FASII pathway during the asexual blood stage. As currently designed, the experiments presented cannot settle that point for several reasons. Indeed, it was previously shown (i) that the proteins of the FASII pathway are all expressed in blood stages and are significantly upregulated in patients under "nutrient starvation" (Daily et al. Nature 2007), (ii) that growing parasites under similar low lipid conditions in vitro induces an activation/upregulation of FASII, which can be measured by stable isotope precursor labelling and lipidomics (Botté et al. 2013), and (iii) that growing the PfFabI KO line under deprived lipid conditions leads to parasite death (Amiar et al. 2020), indicating that the FASII pathway can become critical, if not essential, depending on the host nutritional content, together correlating with patients' data and with metabolic adaptation for the same reasons in the related parasite Toxoplasma gondii (Amiar et al. 2020, Krishnan et al. 2020, Liang et al. 2020, Primo et al. 2021, Charital et al. 2024, Dass et al. 2024, Bitew et al. 2025).

      Here, the authors are expecting to show that FabH (and thus the FASII pathway) is not essential in an experiment that is not designed to be in low lipid conditions but rather in lipid-rich conditions. The high lipid culture conditions in this study are provided by daily feedings with a high fatty acid supplement (30-90 µM palmitic acid and 30-60 µM oleic acid). These fatty acid concentrations were used previously by Mitamura et al. (2005) and Mi-ichi et al. (2007) to replace undefined supplements such as serum or Albumax and to support similar growth in a completely controlled culture medium.

      This means the concentrations above do not represent limited fatty acid concentrations, especially not with daily feeding (representing an excess supplied amount of lipids, unlike regular 48 h feedings), which allowed the authors to easily reach very high, non-physiological parasitaemia of more than 20%. Amiar et al. previously showed essentiality of FabI in P. falciparum in limited fatty acid culture at a lower concentration (<30 µM 16:0, <45 µM 18:1) than the Mi-ichi et al. controlled medium, with regular 48 h culture feeding. Therefore, with the current experimental settings, the FabH KO is placed in high lipid conditions, thus preventing any conclusion on its essentiality under low lipid conditions.

      Furthermore, it is too uncertain to conclude that ACP is only essential for the mevalonate pathway. This would be a similar discussion to that of the Yeh et al. 2011 and Swift et al. studies, where induced apicoplast knockout caused parasites to require IPP to survive, but there were always remnant apicoplast vesicles and thus the putative presence of an active FASII in the parasite, where de novo fatty acid synthesis could be maintained. Amiar et al. (2020) and Krishnan et al. (2020) showed that disruption of FASII and absence of de novo FA synthesis in T. gondii could be compensated by the exogenous supplementation of myristic acid, C14:0. Here, high fatty acid supplementation using commercially available fatty acids may include unexpected fatty acid species such as myristic acid in palmitic acid or oleic acid, since commercially available fatty acids are guaranteed only to >99% purity, not 100%. If P. falciparum requires only a very low amount of myristic acid to survive, a possible contamination on the order of 1 nM may be sufficient to maintain their survival. Thus, ACP and FabH might be very important to generate de novo fatty acids within parasites, but this was not shown by the authors.

      Therefore, the manuscript currently contains incorrect conclusions on the potential essentiality/use of FASII, against current experimental evidence.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Duss et al. use several complementary and state-of-the-art strategies to characterize the effects of norepinephrine release from LC axons on post-synaptic cell types in the hippocampus. While a large body of research supports an important role for NE signaling in hippocampal function, the precise mechanism by which NE promotes these effects remains poorly elucidated, in large part because adrenergic receptor subtypes can be expressed in a variety of cell types and promote a variety of responses. To address this, the authors first establish an optogenetic strategy in which the delivered stimuli mimic endogenous activation of LC in 'moderate' and 'high' acute stress events, using NE sensors to titrate stimulation patterns to similar levels of NE release. They then conduct a series of 2P imaging experiments in mice and compare response properties of various cell types in the hippocampus (excitatory and inhibitory neurons, and astrocytes) when the animal is 'naturally' or optogenetically aroused (via activation of the LC). The results are surprising. Whereas natural arousal causes activation of astrocytes, pyramidal cells, and interneurons, optogenetic activation of the LC does almost the opposite, with only astrocytes responding positively. Another important finding from the study is that astrocytes seem to be the most responsive cell type in the hippocampus to NE release, suggesting they could be key components for downstream functional effects of NE release in this brain region.

      Strengths:

      (1) The study was methodically done with respect to the characterization of how optogenetic parameters related to levels of NE release. Also, the analysis of their calcium imaging of various cell types in the hippocampus was very comprehensive.

      (2) Related, their discovery that cell types in the hippocampus respond differently to NE release, while not a completely unexpected finding, is something that has not been addressed experimentally in such a direct way before (to my knowledge).

      (3) Their finding that optogenetic stimulation of the LC produces opposing results to when these cells are naturally activated has wide implications for the LC field and potentially beyond.

      Weaknesses:

      I was surprised that no efforts were made to further assess what might be causing this discrepancy in hippocampal responses to optogenetic vs. natural activation of the LC. Some experiments that I felt were missing:

      (1) The authors go to great lengths to measure NE release in a variety of arousing conditions (tail lift, foot shock, 5Hz LC opto, 20Hz LC opto), but then in their 2P imaging, they're comparing the opto results to a 'natural' arousal state defined as when the mice were in motion. Maybe I missed it, but I wasn't sure that they ever checked the level of hippocampal NE release in this running state, similar to what they did in the other arousal conditions. Thus, it wasn't clear to me how comparable this state was to the optogenetic stimulation.

      (2) The authors do a nice experiment to show that increases in the hippocampal NE sensors are dependent on LC activity via optogenetic inhibition of the LC (Figure 1, Supplement 3). It seems like a missed opportunity to include a similar strategy in their 2P testing, to assess whether the differing responses of pyramidal cells, interneurons, and astrocytes are truly due to NE release. I could imagine it might be difficult to precisely time LC inhibition with periods of movement, but I imagine that mice would still run even if the LC is inhibited.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript aims to determine the extent to which LC-mediated NA release in the CA1 region of the hippocampus (at both population and cellular levels) contributes to physiological arousal responses associated with innate behaviors (stress, locomotion). The manuscript is divided into two parts in which the authors compare time-locked responses in astrocytes, interneurons (pan-targeting), and pyramidal (CaMKIIa-driven targeting) cells.

      In the first part of the manuscript, the authors perform bulk recordings of either NA release or calcium activity locked onto either 'natural arousal' events (tail lift, foot shock, forced swim) or direct optogenetic activation of LC somas. A first aim is to identify an optogenetic stimulation frequency that would mimic NE release in the target area by low- and high-intensity stressors. In the second aim, they compared evoked responses across cell types and concluded that stressors and direct LC activation trigger similar responses in astrocytes but not in interneurons or pyramidal cells.

      In the second - and most extended - part of the manuscript, the authors performed 2-photon cellular recordings of these different cell populations and compared responses evoked by the onset of locomotion vs. direct activation of the LC. Doing so, they observed a great degree of heterogeneity across these two conditions and across cell types. They conclude that NA effects on the hippocampus are primarily mediated by astrocytes and that LC-NA neuromodulation alone does not recapitulate the full breadth of 'natural arousal' modulations. They conclude that other neuromodulators likely contribute to how the hippocampus responds to high arousal levels.

      Strengths:

      Overall, the manuscript is well written and the figures are particularly clear.

      Optogenetics is a very successful technique in contemporary neuroscience, yet one important identified limitation is that it operates largely in a non-physiological regime, driving spike rates in regions rarely visited under normal physiological operations. This has raised valid concerns about the physiological relevance of findings obtained from studies using this technique. Here, the authors aimed at calibrating optogenetic manipulations of the LC so as to match the physiological release of NA observed in specific behavioral contexts. This is a valuable endeavor that could bring the field towards more reproducible and broadly valid findings.

      Another important open question is how different cell types coordinate to support global network activity and adaptive behavior. By recording distinct cell populations from the same region (CA1) and in response to the same category of endogenous versus exogenous events (locomotion or LC activation), it becomes possible to unravel important and specific operation modes, here also linked to a specific category of neuromodulation signaling.

      Weaknesses:

      This manuscript was difficult to review. There is clearly a lot of work and effort that went into it, and the multiple techniques seem well implemented, often with appropriate controls. Yet the general framing and the links between experiments and interpretations unfortunately look questionable, in my opinion. Below, I unpack what I think are the 4 main weaknesses.

      (1) Incomplete calibration of optogenetic manipulations to physiological regimes

      While mapping optogenetic stimulation protocols to physiological variations is valuable, the proposed approach suffers from major limitations. First, the only parameter that is calibrated is the peak of NE release (as estimated from GRAB-NE fluorescence). Thus, it excludes other important aspects of the response, including trial-to-trial variability and the temporal dynamics of the response. Furthermore, stressor and LC activation conditions are simply non-comparable in terms of the duration of the stimulation (e.g., 3 min swim test versus 10s optogenetic stimulation), likely involving neuromodulation at different timescales (phasic vs. tonic). Albeit not explicitly mentioned, the number of trials and inter-trial interval between successive stimulations are also likely unmatched. On another note, the identification of the best stimulation frequency seems based on a grid of predefined values, while a more precise, continuous assessment could have easily been used. Finally, even though phasic NE release is known to depend on baseline tonic NE levels (especially with a sensor that reports a sublinear function of NE concentration), this dimension is ignored.

      (2) Weak links between imposed stressors and spontaneous locomotion

      The general approach is surprising: the authors calibrated the optogenetic stimulation protocol on a range of stress-related behaviors and applied this to locomotion behavior. Indeed, while the first part of the manuscript uses different stressors in freely moving contexts to 'naturally' elevate arousal, the second part uses spontaneous locomotion bouts in a head-fixed situation as proxies for heightened 'natural' arousal. These two parts are very difficult to relate, and it is entirely unclear how NE regimes observed in the first context generalize to the second. Yet, on several occasions, the authors directly relate the first (fiber photometry, Fig. 1) and second (2-photon, Figs. 2-6) parts of the manuscript. For instance, they conclude in favor of a "weak alignment between astrocytic responses to arousal and to LC stimulation on a cellular basis, despite the similarity of the bulk response." It remains unclear why more closely matched preparations weren't used in the two parts, such as measuring time-locked changes in GRAB-NE2m fluorescence at locomotion onset or in a fear conditioning assay, both using fiber photometry in a head-fixed setting.

      (3) LC optogenetics and spontaneous locomotion differ by more than the origin of the arousal drive

      By directly comparing spontaneous locomotion and LC activation, the authors imply that the only difference between these two conditions is the origin of arousal: endogenous vs. exogenous, respectively. Furthermore, they interpret LC activation as triggering a pure NA effect while locomotion would reflect the conglomerate modulation from multiple neuromodulatory systems. On the one hand, LC activation likely results in the recruitment of other arousal centers (the raphe serotonin system, for instance, see 10.1101/2025.03.26.644382). On the other hand, differences between these conditions span well beyond specific arousal centers (see the massive motor-related activity in cortical dynamics: 10.1038/s41593-019-0502-4). Another, more methodological concern is the larger instability of the field of view during locomotion by comparison to optogenetic activation. While I am sure the authors corrected for movement-related translation in x and y directions, there might still be residual motion artefacts in the z direction that could account for some of the differences between the two conditions.

      (4) Loose equivalence between locomotion and natural arousal

      On many occasions, the authors draw a direct equivalence between spontaneous locomotion and 'natural arousal'. Arousal is a multifaceted concept that relates to far more behavioral readouts and network states than just locomotion. For instance, imagine a mouse freezing in response to a threat: locomotion would be absent, but the animal would still be quite aroused. It is fine to leave aside a particular readout and focus on others (especially in the case of arousal, which has many aspects). However, in that case, a single readout cannot be equated with 'natural arousal' as a whole. Instead, terms like 'locomotion' or 'locomotion-linked arousal' should be preferred. Indeed, in the particular case of locomotion, what is being read out is the upper part of the arousal continuum, whereas pupil size or whisker pad movements can provide a more complete readout, including the lower and intermediate parts of that same continuum. While it is not necessary to include other arousal readouts (once claims are appropriately modified), the motivation for leaving out available readouts (lines 187-201) feels like a post-hoc rationalization.

      In sum, these 4 points call in my opinion for a profound change in how results are presented and interpreted. If agreed, a solution could be to leave aside the first part of the manuscript, to provide a more accurate picture of the differences between optogenetic activation and spontaneous locomotion, and to better flag the limitations of the approach (a part that I believe is entirely missing in the current version).

    3. Reviewer #3 (Public review):

      Summary:

      In this study, the authors focused on the CA1 region of the hippocampus to compare Ca2+ dynamics in astrocytes, pyramidal neurons, and interneurons in response to noradrenaline (NA) release triggered either by optogenetic stimulation of the locus coeruleus (LC) or by movement (natural arousal). The most striking finding is that all studied cell types responded differently to LC stimulation compared to natural arousal. The description of these findings is important as a resource for further mechanistic studies on how multiple neuromodulator systems may interact or for predicting the consequences of the selective impairment of the noradrenergic system.

      Strengths:

      The technical design and conduct of the experiments, analysis including statistics, as well as the presentation of the results, are timely and very solid.

      Weaknesses:

      The identity and localization of NA receptors responsible for effects on neurons are less clear, and therefore, the difference between LC stimulation and natural arousal is less surprising. However, the presented data are consistent with the established finding that astrocytes directly sense NA mainly through α1 adrenergic receptors, yet in this study, astrocytes that responded most strongly to LC stimulation did not respond most strongly to natural arousal, and vice versa for other astrocytes.

      The authors seem to favor diversity of astrocyte responsiveness as an explanation, but also mention differences in LC activation pattern and distance of individual astrocytes to NAergic nerve terminals. Therefore, this warrants a careful consideration of a critical aspect of the experimental design. The authors delivered Ca2+/NA sensors as well as the optogenetic tools via AAV. While Figure 1 Supplement 3 suggests that most LC neurons were transduced, AAV transduction will almost certainly lead to a diversity in copy numbers per cell. On the receptor side, this can lead to an artificial diversity in Ca2+ response detection sensitivity among individual cells, but more importantly, for the LC, this could account for a different pattern of activation by optogenetic stimulation compared to activation by natural arousal. Such a problem would remain unnoticed with the currently presented matching of optogenetic and natural arousal stimulations of LC using population NA sensor signals (Figure 1, fiber photometry).

      Major suggestion:

      A critical experiment to test for this caveat would be to ideally express the NA sensor in astrocytes (due to their space-filling process arborizations and direct response to NA; but expression in neurons, as present, would work as well) and study the spatial pattern of NA release using two-photon microscopy, comparing multiple days and LC stimulation by optogenetics versus natural arousal. In case these experiments revealed nonuniform NA signal patterns, stable over days, but different when caused by optogenetic stimulation versus natural arousal, it would possibly shift the interpretation of the astrocyte response patterns towards depending mainly on NA release rather than diversity in NA responsiveness. Such a finding would be consistent with studies that compared arousal-mediated Ca2+ dynamics in NAergic terminals and Bergmann glia in the cerebellum (PMID: 36790089). On the other hand, in case these added experiments revealed similar NA release patterns in response to optogenetic stimulation versus natural arousal, then the presented findings would convincingly represent a biological phenomenon.

      Minor suggestion:

      Using "movement" as a proxy for arousal is very appropriate. To avoid the misunderstanding that different phenomena have been studied, it may be useful to acknowledge that early studies of noradrenergic signaling to astrocytes have found that speed of locomotion does not correlate well with astrocyte Ca2+ responses, and electromyographic signals have been used as a "proxy for movement" (PMID: 24945771).

    1. Reviewer #2 (Public review):

      The study by Chen, Deng et al. aims to develop an efficient viral transneuronal tracing method that enables retrograde tracing in larval zebrafish. The authors utilize pseudotyped rabies virus that can be targeted to specific cell types using the EnvA-TvA system.

      Pseudotyped rabies virus has been used extensively in rodent models and, in recent years, has begun to be developed for use in adult zebrafish. However, compared to rodents, the efficiency of spread in adult zebrafish is very low (~one upstream neuron labeled per starter cell). Additionally, there is limited evidence of retrograde tracing with pseudotyped rabies in the larval stage, which is when most functional neural imaging studies are conducted in the field. In this study, the authors systematically optimized several parameters for rabies tracing, including rabies virus strains, glycoprotein types, temperatures, expression construct designs, and the elimination of glial labeling. The optimal configurations developed by the authors yield tracing efficiencies up to 5-10-fold higher than those of more commonly used configurations.

      The results are compelling and support the conclusions.

    1. Reviewer #1 (Public review):

      Summary

      The authors apply dynamic representational similarity analysis (dRSA), a method introduced in de Vries and Wurm 2023, to source-reconstructed MEG data from 40 participants who viewed ballet dancing sequences under three conditions: normal viewing, up-down inversion, and temporal piecewise scrambling. In normal viewing, they replicate their previous finding of a hierarchical pattern of leading-edge neural representations, with view-invariant body motion represented earliest in time (around 500 ms before the corresponding stimulus state), followed by view-dependent body motion (around 200 ms) and pixelwise motion (around 150 ms). Inversion selectively attenuates the leading-edge representation of view-invariant body motion while enhancing view-dependent body motion. Scrambling abolishes all leading-edge motion representations and instead increases post-stimulus representations of body posture. The authors interpret these findings as evidence that biological motion perception relies on a hierarchy of priors operating within a predictive-processing framework, with inversion specifically disrupting holistic priors and scrambling disrupting kinematics priors.

      Strengths

      The empirical work is careful and technically ambitious. The dRSA framework introduced in the 2023 paper is a useful methodological contribution to the study of dynamic neural representations, and the present manuscript extends it in well-motivated directions. The dataset is substantial: 40 participants, source-reconstructed MEG, three within-subject conditions. The replication of the 2023 normal-condition findings in an independent 40-subject sample is solid, which is increasingly rare and welcome in the field. The inversion and scrambling manipulations are well-motivated, and the conditions are matched on stimulus identity. Principal component regression is used appropriately to handle the genuine challenge of correlated and autocorrelated stimulus features, and the authors validate this choice through simulations. Eye position is included as a covariate and successfully regressed out, addressing a common confound in MEG decoding work. Behavioral catch trials demonstrate that participants attended to the stimuli across conditions. Both frequentist and Bayesian statistics are reported with appropriate corrections for multiple comparisons. The inversion result, in particular, is striking, and the asymmetry between view-invariant and view-dependent representations is informative.

      Weaknesses

      The central interpretive step in the manuscript treats a negative-lag dRSA peak as direct evidence for active hierarchical predictive inference. The data are equally consistent with at least three other accounts that the manuscript does not engage with, and the conclusion is therefore stronger than the data support.

      First, the leading-edge dRSA signature is a natural consequence of nonlinear temporal integration of autocorrelated stimulus features. A long line of work from the Winawer and Grill-Spector labs (Zhou et al. 2018, Zhou et al. 2019, Stigliani et al. 2017, Kim et al. 2024) has established that the human visual cortex implements compressive temporal summation with delayed divisive normalization and that temporal integration windows progressively increase from early to higher visual areas. A nonlinear-summation response to an autocorrelated feature encodes deviations from the recent baseline. For smooth trajectories, this is essentially a local derivative, and the derivative inherits the trajectory's leading edge as a free consequence - no predictive machinery required. The integration-window hierarchy that Kim et al. (2024) recovered from voxelwise spatiotemporal pRFs maps onto the 150 / 200 / 500 ms hierarchy reported here almost one-for-one. That alignment is unlikely to be coincidental and deserves explicit treatment.
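
      The argument in this paragraph can be illustrated with a toy calculation. The sketch below is a deliberately simplified, linear stand-in for the nonlinear-summation models cited above (it is not the compressive-summation or delayed-normalization model itself): a unit responds to the deviation of a smooth sinusoidal feature from its trailing mean. All names and parameter values (`PERIOD`, `WINDOW`, the sinusoidal stimulus) are illustrative assumptions, not taken from the manuscript. A positive `lead` means the response at time t resembles the stimulus at a later time, which corresponds to a negative lag in the dRSA convention.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Assumed toy parameters (not from the manuscript).
PERIOD = 100   # samples per cycle of the smooth stimulus feature
WINDOW = 10    # length of the trailing baseline window, in samples
N = 1000

stimulus = [math.sin(2 * math.pi * t / PERIOD) for t in range(N)]

def response(t):
    """Deviation of the current sample from the mean of the preceding WINDOW samples."""
    baseline = sum(stimulus[t - WINDOW:t]) / WINDOW
    return stimulus[t] - baseline

# Analyse a whole number of stimulus cycles so the correlations are not biased
# by partial periods.
T0, T1 = 100, 900
resp = [response(t) for t in range(T0, T1)]

def corr_at_lead(lead):
    """Correlation between the response at time t and the stimulus at time t + lead."""
    return pearson(resp, stimulus[T0 + lead:T1 + lead])

leads = range(-40, 41)
best_lead = max(leads, key=corr_at_lead)
print("response best matches the stimulus", best_lead, "samples in the future")
```

      In this toy case the baseline-subtracted response is phase-advanced relative to the stimulus, so its best match is with a future stimulus state even though the computation uses only past samples. Whether the same mechanism accounts for the reported MEG latencies is an empirical question that the sketch does not address.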

      Second, the experimental design places participants firmly in the regime where Dayan's successor representation (SR) predicts that the brain holds a precompiled associative cache of trajectory structure. Each unique sequence is presented approximately 47 times across the experiment. An SR in Dayan's original formulation is a precompiled lookup table, not an online inference engine - querying it during familiar trajectories produces leading-edge representations through passive associative retrieval, mechanistically distinct from active prediction despite producing similar signatures. The senior author's own lab has demonstrated SR-like representations in V1 (Ekman, Kusch, de Lange 2023 eLife), but this paper is not cited or engaged with in the present manuscript despite its direct relevance.
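
      The retrieval account in this paragraph can be made concrete with a toy successor representation. The sketch below learns an SR table by temporal-difference updates over repeated presentations of one deterministic sequence, then reads out the cached row for the current state. Only the repetition count (about 47) comes from the review; the number of states, the discount factor `GAMMA`, and the learning rate `ALPHA` are arbitrary assumptions, and the code is not a model of the MEG data.

```python
# Assumed toy parameters; only N_REPEATS echoes the figure quoted in the review.
N_STATES = 6     # frames of one deterministic sequence
GAMMA = 0.8      # discount factor
ALPHA = 0.3      # learning rate
N_REPEATS = 47   # presentations of the sequence

# M[s][j]: cached, discounted expected future occupancy of state j given state s.
M = [[0.0] * N_STATES for _ in range(N_STATES)]

def one_hot(i):
    return [1.0 if j == i else 0.0 for j in range(N_STATES)]

for _ in range(N_REPEATS):
    for s in range(N_STATES):
        # After the final frame there is no successor, so its row contributes zeros.
        nxt = M[s + 1] if s + 1 < N_STATES else [0.0] * N_STATES
        target = [o + GAMMA * v for o, v in zip(one_hot(s), nxt)]
        M[s] = [m + ALPHA * (t - m) for m, t in zip(M[s], target)]

def retrieve_future(s):
    """Most strongly cached state other than s itself (a pure table lookup)."""
    row = M[s]
    return max((j for j in range(N_STATES) if j != s), key=lambda j: row[j])

print("row for state 2:", [round(v, 3) for v in M[2]])
print("strongest cached successor of state 2:", retrieve_future(2))
```

      Once the table has been learned, reading out the row for the current state places weight on upcoming states without any online inference, which is the distinction the paragraph draws. Under these assumed parameters the table is close to its fixed point after 47 passes; a different learning rate would change how quickly that happens.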

      Third, the canonical computational model of biological motion perception (Giese and Poggio 2003 Nat Rev Neurosci) is a fully feedforward template-matching architecture that predates the predictive-coding framing of biological motion. It accommodates the inversion effect (templates tuned to upright statistics), the hierarchy of timescales (graded leaky integrator time constants), and the scrambling effect (broken sequence-neuron activation) without invoking generative models or prediction errors. The manuscript cites Giese-tradition work for the inversion-effect literature but does not engage with the model itself, even though it is the field standard.

      The inversion result, while empirically striking, has a simpler interpretation than the one offered. Inversion makes viewpoint-invariant body computation fail because the underlying machinery is tuned to upright body statistics. A weaker representation produces a weaker dRSA signature at every lag, including the leading edge - no appeal to priors in the active-inference sense is required. The view-dependent enhancement under inversion fits this reading naturally: when viewpoint abstraction fails, processing falls back to viewpoint-specific representations that remain extractable. The manuscript implicitly acknowledges this when it states that "predictions were channeled to the level at which prediction was still possible," but does not notice that this concession softens the strong predictive-coding inference.

      The scrambling result is internally awkward on the predictive-coding framing. The paper acknowledges that pixelwise motion prediction should, in principle, survive 200-500 ms scrambled segments (typical latency around 150 ms) but reports that it does not. The proposed save - that segments are "too short to start up prediction" - undercuts the framework, since by the same logic, most of normal viewing would also be pre-prediction. A cleaner reading is that scrambling destroys the temporal autocorrelation of stimulus features, which is the prerequisite both for nonlinear-summation neural responses to produce leading-edge representations and for SR-style associative retrieval to operate.

      A further concern is that the experimental design and analysis pipeline are structurally biased toward producing the cleanest possible predictive signature. The 14 stimuli are repeated extensively, and trials are averaged across repetitions before dRSA is computed, filtering out exactly the variability that would distinguish online prediction from amortized retrieval. The 2023 paper reports a control comparing the first and last thirds of the experiment, but this test is in the post-saturation regime for any plausible associative-learning rate and does not actually adjudicate the question. A first-exposure or first-run analysis would be diagnostic. Finally, the behavioral task changed between the 2023 paper and the present manuscript. The earlier paradigm asked participants to recognize the current motion ("arms moving up?"), while the present paradigm asks participants to judge whether an occluded video continues correctly. The latter explicitly demands prediction. This change transforms the experimental context from naturalistic viewing into one that actively incentivizes predictive engagement, potentially inflating the very signatures the paper interprets as spontaneous prediction.

      The 2023 Nature Communications paper actually navigated these interpretive questions more carefully than the present manuscript does, explicitly stating that the approach "does not provide conclusive evidence for predictive processing/coding theory but leaves the door open for related theories such as adaptive resonance or Bayesian inference without predictive coding." The current manuscript would benefit from restoring that epistemic discipline. The data and methods are valuable; the interpretive frame is overstated relative to what the evidence supports.

      Impact and utility

      The dataset and dRSA framework are useful contributions to the study of neural representation of dynamic stimuli, and the inversion and scrambling conditions open productive lines of inquiry. The interpretive over-commitment to predictive processing risks limiting the paper's reach into adjacent literatures - temporal integration, successor representations, template-matching biological motion models, encoding-model approaches - where the findings could land productively. With a more pluralistic interpretive frame, this work would speak to a substantially broader audience and connect more naturally with existing mechanistic accounts of dynamic visual processing.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, de Vries and colleagues apply successful probabilistic inference and predictive coding frameworks to the question of biological motion perception. In contrast to most studies of predictive processing in humans, which rely on the presentation of discrete events, they instead aimed to track continuous predictions in the context of more naturalistic inputs such as biological motion. In these settings, the authors have previously demonstrated an inverted temporal hierarchy of prediction whereby high-level movement features (e.g., view-invariant body motion) are predicted earlier than lower-level ones (e.g., pixelwise motion). The specific question they set out to address in this manuscript is whether these predictions derive from prior beliefs about the biological and physical organization of biological movements versus the local extrapolation of motion from past observations.

      The authors used anatomical MRI-driven source reconstruction of MEG activity recorded from human participants watching either normal, vertically-mirrored, or temporally scrambled movies. They then aimed to correlate activity in preselected ROIs with summary representations of these movies based on different visual features at 3 different hierarchical levels using RSA. Doing so, they could confirm that predictive processes could be identified prior to the change in the stimulus and organized anatomically along the visual cortical hierarchy. Critically, they report that mirrored movies selectively disrupted the highest processing level while the lowest level remained largely unaffected. Interestingly, the predictions at the intermediate level were boosted in mirrored movies, suggesting a possible channeling of predictions at this level when highest-level predictions are unavailable. Finally, disrupting all predictive aspects with the scrambled movies entirely abolished predictions at all levels, with signals mainly reflecting reactive bottom-up processing of inputs.

      In sum, biological motion perception relies on a tight coordination of multi-level predictions based on both motion-related holistic and kinematics priors.

      Strengths:

      Overall, this is a very strong manuscript, with the text being clearly written. I liked the fact that the authors not only compared responses to normal videos against the same videos flipped upside-down, but also to temporal piecewise scrambling of that same video, allowing them to identify the respective roles of holistic motion priors vs. temporal predictions. Of course, more work is needed to tease apart what key quantities are represented in these holistic priors. For now, the authors argue that they likely combine prior beliefs about the biological organization of bodies, such as the likely angle of joint movements, and about the physics of reality, such as gravity. Further work teasing apart these aspects would be interesting to read!

      All analyses seem well executed and, while some aspects of the presentation of results could be slightly improved (see below), the manuscript is very clear and the conclusions are supported by the data. Finally, I liked the words of caution the authors added to the discussion. For instance, while they largely used negative vs. positive latency as a proxy for top-down vs. bottom-up processing respectively throughout the manuscript, they also accurately acknowledge that predictive computations could also modulate processes at positive lags, through, for instance, latency modulation.

      Weaknesses:

      The main aspect of the work I was left to struggle with is this idea that priors can be read out directly from large patterns of activity rates as measured with MEG. While some past experimental work does support this view, theoretical proposals also suggest that one benefit of predictive coding lies in its computational and energy-efficient properties, whereby only novel, unpredicted aspects are encoded in the rate of neural activity. Some other research lines, for instance, focusing on silent working memory, also report the brain's ability to store important computations in ways that are not reflected in costly increases in overall activity. The authors do not really unpack why they expect to see predictions to be encoded in such a way in the first place. They also do not discuss what that implies in terms of neural organization and whether other aspects of neural activity (e.g., oscillations, synaptic weights) could subtend predictive processing in this context. At the end of the day, this activity change is clearly there in the data, so that's totally fine to interpret that; it just would be helpful to unpack what such an implementation of prior beliefs would imply in terms of neural organization.

      The other weakness I see is the limited consideration of behavior throughout the paper. Behavior is indeed mostly treated as a negative control, ensuring that differences between conditions at the neural level do not follow from different behavioral strategies or other peripheral factors. Critically, the task design nicely incorporates two types of tasks: one that is related to motion (occlusion of movement) and one that's independent of it (color change of fixation cross). Yet, these conditions are not directly compared at the neural level. It would be useful to see whether the neural signatures of prediction are largely independent of the ongoing task or whether behavior gates the types of priors and prediction processes that are applied to incoming sensory inputs. Moreover, the text says that "neither in accuracy nor in reaction time was there a significant difference between conditions", yet significance stars in Figure 1d seem to suggest there is a difference in the fixation cross task. What am I missing? If there is indeed a difference in overall performance, can the results (esp. the reduced dRSA correlation strength in normal < inverted < scrambled movie) be interpreted in terms of a multi-tasking cognitive cost?

      I also have some other minor questions and comments:

      (1) In this task situation, prediction does not only come in the continuous domain but also relies on a mental simulation model, in particular in the occlusion task. However, corresponding literature, notably the work by Shepard & Metzler (1971) on mental rotation (as well as follow-ups), is not mentioned here, I believe. Could the authors perhaps mention this if they think that's relevant (if not, feel free to ignore).

      (2) I'm concerned that the novelty of dynamic RSA as explained at lines 56-64 might appear slightly exaggerated. After all, isn't it just a generalization of matrix correlation in model and brain time domains? (Again, feel free to ignore if I misunderstood.)

      (3) How do the authors explain that high-level motion prediction is still significantly larger than zero (correct?) in the inverted movie condition? Shouldn't it be entirely abolished?
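
      To make point (2) concrete, the core of a dynamic RSA can be sketched as a time-by-time correlation between neural and model dissimilarity structures, averaged along lag diagonals. This is a minimal toy version under stated assumptions: it uses simulated white-noise features, a one-dimensional feature per stimulus, and plain Pearson correlation, and it omits the principal component regression step described in the reviews. All names (`rdm`, `mean_at_lag`, `LEAD`) are hypothetical. Lag is defined here as neural time minus model time, so a negative peak lag means the neural signal represents a stimulus state before it occurs.

```python
import math
import random
from itertools import combinations

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def rdm(values):
    """Condensed dissimilarity vector: |x_i - x_j| for every pair of stimuli."""
    return [abs(values[i] - values[j])
            for i, j in combinations(range(len(values)), 2)]

# Assumed toy dimensions; LEAD is the simulated anticipation in samples.
rng = random.Random(0)
N_STIM, N_TIME, LEAD = 8, 40, 5

# model[s][t]: one stimulus feature per stimulus and time point.
model = [[rng.gauss(0, 1) for _ in range(N_TIME + LEAD)] for _ in range(N_STIM)]
# Simulated "neural" signal at time t carries the stimulus feature of time t + LEAD.
neural = [[model[s][t + LEAD] for t in range(N_TIME)] for s in range(N_STIM)]

model_rdms = [rdm([model[s][t] for s in range(N_STIM)]) for t in range(N_TIME)]
neural_rdms = [rdm([neural[s][t] for s in range(N_STIM)]) for t in range(N_TIME)]

# Time-by-time similarity matrix: rows are neural time, columns are model time.
drsa = [[pearson(n, m) for m in model_rdms] for n in neural_rdms]

def mean_at_lag(lag):
    """Average the matrix along one diagonal, with lag = neural time - model time."""
    vals = [drsa[tn][tn - lag] for tn in range(N_TIME) if 0 <= tn - lag < N_TIME]
    return sum(vals) / len(vals)

peak_lag = max(range(-15, 16), key=mean_at_lag)
print("peak lag (samples):", peak_lag)
```

      Seen this way, the method is indeed a correlation of dissimilarity matrices extended over two time axes; what the toy version leaves out is the handling of correlated and autocorrelated features, which is where the regression step matters.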

    3. Reviewer #3 (Public review):

      Summary:

      The authors investigate whether the brain's predictive representation of observed biological motion depends on holistic priors about body structure or on kinematic priors about motion continuity. The manuscript applies dynamic representational similarity analysis to MEG data from a large number of participants viewing ballet sequences under three conditions: normal, upside-down inverted, and temporally scrambled into short epochs.

      Strengths:

      The study reports that inversion selectively attenuates predictions of view-invariant body motion and enhances predictions of view-dependent body motion, while leaving low-level pixel-wise motion prediction unaffected. Further, scrambling eliminates predictive motion representations at every level and instead produces stronger post-stimulus representations of body posture, with view-invariant posture also delayed. The pattern across the two manipulations is internally consistent, holds across both peak magnitude and peak latency measures, and is also supported by a neural-to-neural dynamic representational similarity analysis (dRSA) analysis between normal and inverted conditions. The principal component regression pipeline is validated through simulations showing that it recovers the model of interest while suppressing covarying models. In particular, the inversion result provides strong evidence that high-level predictions of biological motion depend on holistic priors while predictions at lower levels do not, and the finding that disruption at the top of the hierarchy does not propagate down is informative for predictive processing accounts that assume a more cascading architecture.

      Weaknesses:

      The interpretation of the scrambling result is the main caveat of the manuscript. The claim that low-level motion prediction depends on kinematic continuity rests on the absence of pixelwise motion prediction in the scrambled condition, but the 200 to 500-ms segments may not be sufficient for prediction to develop, as the authors also point out. Without a parametric manipulation of segment length, it is difficult to distinguish a genuine dependence on kinematic priors from a floor effect. The interpretation of increased post-stimulus posture representations as prediction errors is also somewhat indirect, since a positive latency does not rule out potential top-down modulation.

    1. Reviewer #1 (Public review):

      Summary:

      Dhillon and Lewis present an optical approach to record single CRAC channel activity, overcoming the long-standing barrier imposed by the channel's extremely small unitary conductance. By fusing HaloTag to Orai1, labeling with JF646-BAPTA, and combining TIRF microscopy with whole-cell voltage clamp (Patch-TIRF), the authors achieve genuine single-channel resolution. A central contribution is the recognition that JF646-BAPTA undergoes reversible photophysical blinking that can be readily mistaken for gating events. The authors exploit the multi-dye labeling of hexameric Orai1, combined with voltage-clamped definition of open and closed fluorescence levels, to distinguish true gating transitions from blinks. The result is the first kinetic characterization of single CRAC channel openings activated by STIM1, reporting multiple open and closed states with durations from about 0.1 s to tens of seconds, predominantly high open probabilities (≥ 0.7), and an unexpected population of "silent" channels that co-localize with STIM1 but show no detectable activity over the observation window.

      Strengths:

      The work is technically rigorous, and the controls are appropriate. The integration of patch-clamp voltage control with TIRF imaging is a thoughtful methodological choice that defines the open- and closed-channel fluorescence reference levels with precision, providing a quantitative framework that the field has lacked. The use of the non-conducting Orai1-E106A mutant as a specificity control (Figure 4C) is exactly the right experiment, and the demonstration that JF646-BAPTA signals require Ca²⁺ flux through Orai1 itself anchors the entire approach. The identification and characterization of JF646-BAPTA blinking (Figures 2 and 3) is a significant contribution in its own right. The authors show clearly that the dye exhibits long-lived dark states and that transitions to zero fluorescence, rather than to a finite calcium-free baseline, are diagnostic of blinking rather than channel closure. This caveat has immediate implications for the interpretation of recent work using the same dye on other calcium-permeable channels, and will recalibrate the broader field of HaloTag-based single-channel optical recording. The kinetic analysis itself reveals something that was previously inaccessible: seconds-long open times, multi-state gating behavior, and a population of channels that co-localize with STIM1 yet remain electrically silent. These findings are physiologically meaningful and would not have been detectable by macroscopic electrophysiology. Overall, an outstanding study.

      Weaknesses:

      The manuscript would benefit from a small number of additional analyses of the existing data and modest refinements to the presentation. The discrete-channel interpretation of the intensity histogram in Figure 6C, the open probability distribution in Figure 8C, and the assignment of the "silent" channel population are all interesting and likely correct, but each rests on assumptions that the authors are well positioned to test directly using data already in hand. Brief additional discussion of the dynamic range of JF646-BAPTA in situ and of how the temporal resolution of the recordings shapes the inferred kinetic model would also help readers calibrate the findings.

      None of these points challenges the central claims of the paper, and none requires new experiments.

    2. Reviewer #2 (Public review):

      Summary:

      Dhillon and Lewis use the enhanced brightness of the new calcium indicator dye JF646-BAPTA attached to Orai1-bound HaloTag to identify single CRAC channel events detected as [Ca2+]i fluctuations rather than currents. This enables them to detect Orai1 single-channel kinetics of permeation, overcoming the currently unmeasurable single-channel CRAC conductances (~20-40 fS). TIRF microscopy narrows the z-section and improves calcium event localization.

      JF646-BAPTA reversibly blinks between fluorescent and non-fluorescent states, complicating single-channel detection. Blinking occurs both in permeabilized cells with saturating Ca2+ and in intact cells at physiological [Ca2+]i. Using voltage clamp and TIRF imaging, CRAC gating events were distinguished from blinking by analyzing fluorescence responses to voltage changes.

      Hyperpolarization (-100 mV) increases fluorescence, indicating channel opening. Responses blocked by La3+ confirm specificity for Orai1, while minimum fluorescence at +30 mV corresponds to closed channels. Dynamic range and response kinetics help differentiate genuine gating from blinking artifacts. Long channel openings (seconds to tens of seconds) are observed, with most open times around 1.2 seconds. Longer openings (tens of seconds) are present but difficult to sample. Silent channels constitute 11% of puncta.

      The paper carefully examines a new method to sample CRAC kinetics, which should enable further mechanistic studies of STIM control of ORAI and modulation by other signaling components such as calcineurin. Development of bright nonblinking dyes or dyes whose blink rates are directly correlated with a calcium-binding site will enhance this route of investigation.

      Comments:

      This is an excellent methodological study, rigorous and thorough. I wondered whether La3+ alone could alter JF646-BAPTA blinking, but the authors show that JF646-BAPTA exhibits reversible transitions to a non-fluorescent state (blinking) under both Ca2+-saturated and physiological conditions, independent of channel activity or the presence of La3+.

      Strengths:

      A novel method providing additional tools to study store depletion-induced Ca2+ currents mediated by STIM-Orai family members.

      Weaknesses:

      Limited by blinking dyes, the only ones currently sensitive enough to measure the calcium fluxes through single channels.

    3. Reviewer #3 (Public review):

      Summary:

      Previous work from the Cahalan lab used fluorescent Genetically Encoded Ca2+ Indicators (GECI), like GCaMP6f, tethered to the N- or C- terminus of Orai1 to monitor CRAC channel optical signals (Dynes et al., PNAS 2016 PMID: 26712003; J Gen Physiol 2020 PMID: 32589186; PNAS 2023 PMID: 37729200). In this study from the Lewis lab, the HaloTag system enables C-terminal labeling of Orai1 with a reactive JF646-BAPTA loaded into cells. The article raises two key issues with the Ca2+ indicator probe that may limit potential applications: probe loading conditions and blinking.

      Making Sense of Probe Probe-lems:

      This is a three-component system: the hexameric Orai1 channel, the Halo tag, and the Ca2+ indicator (four components if you count the GFP- or mCherry-tagged STIM1 in the endoplasmic reticulum membrane that activates the plasma membrane Orai1 channel). The Orai1 channel, tagged with the Halo protein, appears to function normally, judging from the characteristic inwardly rectifying Ca2+ current first observed in T lymphocytes (Lewis and Cahalan, Cell Regulation 1989 PMID: 2519622). One problem is to find a condition for indicator dye loading that results in complete and uniform labeling with the covalently linked JF646 indicator. JF646-BAPTA is a far-red fluorescent indicator related to BAPTA, with a Kd of ~150 nM. The esterified form can be loaded into cells, as is routinely done for Ca2+ indicators like fura-2 or fluo-4. Ideally, to monitor local Ca2+ in the cytosolic nanodomain of the Orai1 channel, the indicator should react with each and every Halo tag of the hexameric channel. The authors assessed published methods by varying the exposure time to the JF646-BAPTA-esterified probe. The authors then used green JF552 labeling following red JF646-BAPTA loading to assess the completeness of labeling. Even overnight incubation of Halo-tagged cells was not sufficient. The addition of Pluronic treatment for 1 hr improved labeling, and a standard condition was adopted. Under this condition, no additional labeling with the green JF552 was seen, implying complete labeling with JF646-BAPTA. However, even with complete labeling, several additional effects might reduce the effective signal-to-noise, which is lower in these studies than expected from in vitro measurements - for example, if the JF646-BAPTA molecules are incompletely de-esterified, or if there is quenching between the closely spaced probes attached to the channel hexamer.

      A second, more serious problem analyzed by this article is that the JF646-BAPTA probe blinks on and off spontaneously, making it problematic to monitor true single-channel events in which the channel open state is assessed by the fluorescent probe. The authors distinguish blinking from channel-gating events by carefully noting the residual level of fluorescence in the absence of Ca2+ influx. Blinking events occur in bursts that reduce fluorescence transiently to zero, whereas the closed channel labeled with JF646-BAPTA retains a low level of fluorescence (~20%). To circumvent the blinking issue, the authors use whole-cell patch recording, in conjunction with optical recording (Patch-TIRF). This allows channel-gating events to be identified by step-wise changes in fluorescence due to Ca2+ entry upon hyperpolarization to -100 mV, above a baseline level of fluorescence at +30 mV, which the authors presume represents the closed channel level of fluorescence. Irreversible photobleaching is an additional issue, limiting the recording times to less than 1 minute.

      Visualizing Orai1 Single-Channels:

      With the blinking problem circumvented, at least in part, the authors uncovered a wide variety of single-channel events. Cells with low expression levels of Orai1 revealed 0-3 active Orai1 channels per STIM1 punctum. The range of gating behavior at the single-channel level is one of the revelations in this study. A substantial fraction (11%) of puncta contained "silent" channels that did not open (detected by the non-zero level of baseline fluorescence for closed channels). At the other extreme, some channels remained open for tens of seconds. On average, channels that opened and closed stochastically exhibited a bi-exponential distribution of bright states (open channels), with a major component of fast events (92 ms) and a minor component of slower ones (1190 ms), as well as a single-exponential distribution of dark states (closed channels), and open probabilities >0.7. Channel open/closed times and the high open probability of active Orai1 channels seen here reinforce previous work based on analysis of CRAC current fluctuations in whole-cell recording, and optical single-channel recording using a different genetically encoded Ca2+ indicator, G-GECO1, tethered to Orai1 (Prakriya and Lewis, J Gen Physiol 2006 PMID: 16940559; Dynes et al., PNAS 2016 PMID: 26712003).

      Expression levels for single-channel optical recording must be low; accordingly, puncta contained only 0-3 active channels. However, under conditions of high STIM1 and Orai1 expression, conventionally used to investigate channel function, as in Figure 1, cells with large currents express many thousands of active channels. The number of active channels per cell can be estimated from the whole-cell conductance (G): dividing the peak current (~-100 pA) by the voltage (-100 mV) gives a G of ~1 nS (conductance is measured in siemens). The single-channel conductance (gamma, too low to detect electrically) is estimated by noise analysis to be 20-40 fS. Thus, the number of active channels is given by G / gamma, corresponding to approximately 25,000-50,000 open channels per cell. Under similar conditions of high STIM1/Orai1 co-expression in HEK cells, individual Orai1 channels were visualized at high density in puncta by freeze-fracture electron microscopy (Perni et al., PNAS 2015 PMID: 26351694), revealing puncta packed with Orai1 particles corresponding to hundreds to >1000 channels per punctum. Measuring the center-to-center distances between particles in puncta revealed two peaks in a distribution of inter-particle lengths: 9 nm (consistent with the approximate width of the Orai1 channel hexamer) and 15 nm (possibly due to two adjacent Orai1 channels held together by intervening STIM1 dimers).
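
      The arithmetic in the preceding paragraph can be reproduced with a short sketch. It uses the round numbers quoted there (a peak current of about -100 pA at -100 mV and a unitary conductance of 20-40 fS) and, like the text, treats the holding voltage as the driving force, ignoring the reversal potential and open probability. The function name is hypothetical.

```python
def channel_count(current_pA, voltage_mV, gamma_fS):
    """Estimate the number of conducting channels as N = G / gamma."""
    # Whole-cell chord conductance G = I / V; pA divided by mV gives nS.
    g_nS = current_pA / voltage_mV
    # 1 nS = 1e6 fS, so convert before dividing by the unitary conductance.
    return g_nS * 1e6 / gamma_fS


for gamma_fS in (20.0, 40.0):
    n = channel_count(-100.0, -100.0, gamma_fS)
    print(f"gamma = {gamma_fS:.0f} fS -> about {n:,.0f} channels")
```

      Because N scales inversely with gamma, the twofold uncertainty in the noise-analysis estimate carries over directly to the channel count.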

      Strengths:

      The authors do an excellent job of analyzing and discussing probe artifacts that can confound measurements at the single-channel level. On the technical side, we thank the authors for including a photon 'budget' for their imaging experiments by including: the conversion factor from camera intensity units (c.u.) to photoelectrons, cell background fluorescence levels, and nominally Ca2+ free single channel fluorescence levels. One parameter missing from the list is the size of the region of interest used for channel recording. We expect the intensity measurements provided in the channel traces to correspond to mean ROI intensity levels. Upon knowing the ROI size in pixels, the magnitude of fluorescent signals could then be calculated in photons. Taken together, these values will aid comparisons to previous work and help guide subsequent researchers doing their own optical recording.
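
      The conversion described above, from mean ROI intensity to a photon count, could look roughly like the sketch below. It assumes the trace values are mean ROI intensities in camera units that still include a camera offset, and that a detector quantum efficiency is known. All numeric values are placeholders for illustration, not values from the manuscript, and the function names are hypothetical.

```python
def roi_photoelectrons(mean_cu, offset_cu, electrons_per_cu, roi_pixels):
    """Total photoelectrons collected in an ROI during one frame."""
    # Subtract the camera offset, scale to photoelectrons per pixel,
    # then sum over the ROI (mean per pixel times pixel count).
    return (mean_cu - offset_cu) * electrons_per_cu * roi_pixels


def photons_at_sensor(photoelectrons, quantum_efficiency):
    """Photons arriving at the sensor, given the detector quantum efficiency."""
    return photoelectrons / quantum_efficiency


# Placeholder values for illustration only.
pe = roi_photoelectrons(mean_cu=150.0, offset_cu=100.0,
                        electrons_per_cu=0.5, roi_pixels=9)
photons = photons_at_sensor(pe, quantum_efficiency=0.9)
print(f"{pe:.0f} photoelectrons, about {photons:.0f} photons at the sensor")
```

      The ROI size enters as a simple multiplier, which is why the conversion cannot be completed without it.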

      The most important finding of this study is the ability to analyze single-channel properties of active Orai1 channels using the HaloTag approach. By direct measurement, the authors confirm previous work that there are at least two open states and that the CRAC channel open probability is greater than 0.7.

      Like any good study, this work suggests opportunities for further work. At the chemistry level, one focus should be the development of new probes that don't blink and have lower affinity for Ca2+ to circumvent unwanted responses to global Ca2+ signaling. Far-red probes like JF646-BAPTA have the advantage of reduced scattering for in vivo imaging applications. At the level of channel molecular function, the results pave the way for unraveling mechanisms of channel gating, such as the requirement for STIM1 binding to activate sub-states of Orai1, and how the channel undergoes Ca2+-dependent inactivation. At the cellular physiology level, localized Ca2+ probes should help to clarify mechanisms that couple to changes in gene expression and reveal Ca2+ signaling in subcellular structures, including dendritic spines. As a nice proof of principle, Halo-tagging enabled Ca2+ signals to be measured in primary cilia (Deo et al., J Am Chem Soc 2019 PMID: 31430138). Future users of HaloTag and GECI Ca2+ indicators will need to confront the issues (probe-lems) at the single-channel level that are carefully raised and analyzed in this article.

      Weaknesses:

      The major confounding issue identified here is probe blinking. The authors find a way to circumvent the issue, but not to prevent it. Is it triggered by high laser light intensity? Do the six JF646-BAPTA molecules tagging a single Orai1 channel exhibit quenching or correlated blinking?

      Which type of probe is better for understanding more about the CRAC channel function? It is difficult to evaluate the pros and cons of the HaloTag and GECI approaches without a side-by-side comparison under identical conditions (except for the probe, obviously). With respect to Ca2+ affinities, higher Kd values (lower affinity) are probably better. JF646-BAPTA has a relatively low Kd value (150 nM) compared to Orai1-GCaMP6f (620 nM in situ), which may account for the saturation of optical signals at potentials more negative than -75 mV in this study. In contrast, saturation did not occur at negative potentials with Orai1-GCaMP6f in the study by Dynes et al., 2020. Lower affinity also makes the probe more resistant to unwanted signals from global increases in Ca2+. With respect to response kinetics, the finding that JF646-BAPTA has faster Ca2+ binding and unbinding kinetics than GECIs in Deo et al., 2019, was reported before publication of the jGCaMP8 series indicators in Y. Zhang et al., Nature 2023. Kinetic measurement of Orai1-jGCaMP8f fusions was reported in Dynes et al., PNAS 2023, and these measurements were performed using the same patch-TIRF approach as the present manuscript. While photoinactivation of jGCaMP8f fused to Orai1 interfered with kinetic measurements, Orai1-jGCaMP8f V203Y (a mutant with greatly reduced photoinactivation) exhibited a tau_on of 10 ms and tau_off of 15 ms, roughly twice as fast as the values reported for Orai1-HaloTag-JF646-BAPTA in the present manuscript. The manuscript text comparing HaloTag kinetics with GECI should be revised accordingly.
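
      To illustrate the affinity argument above, the sketch below compares equilibrium occupancy for the two Kd values quoted in the text. It assumes simple 1:1 binding (Hill coefficient of 1), which is a simplification because GECIs such as GCaMP6f bind Ca2+ cooperatively. The local [Ca2+] values are hypothetical.

```python
def fraction_bound(ca_nM, kd_nM):
    """Equilibrium occupancy of a 1:1 Ca2+ indicator: [Ca] / ([Ca] + Kd)."""
    return ca_nM / (ca_nM + kd_nM)


# Kd values as quoted in the review text.
KD_NM = {"JF646-BAPTA": 150.0, "Orai1-GCaMP6f (in situ)": 620.0}

for ca_nM in (100.0, 1000.0, 10000.0):  # hypothetical local [Ca2+] values
    row = ", ".join(
        f"{name}: {fraction_bound(ca_nM, kd):.2f}" for name, kd in KD_NM.items()
    )
    print(f"[Ca2+] = {ca_nM:>7.0f} nM -> {row}")
```

      Under this simplified model the lower-Kd probe approaches full occupancy at lower [Ca2+], leaving less headroom for further increases in Ca2+ entry to register as fluorescence.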

      The authors suggest that single-channel events reported previously for Piezo1 channels (Bertaccini et al., Nat Comm 2025 PMID: 40593468) may be due to probe blinking. However, that study included two critical controls that demonstrate that signals reflect bona fide channel activity rather than blinking artifacts. Notably: (1) treatment with channel activator Yoda1 increased bright-state occupancy (Figure 3C - 3G), and (2) increasing channel open probability by administering a mechanical stimulus increased bright-state occupancy (Supplementary Figure 13).

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript deals with the ability to identify material hardness from the vibrations induced by single light taps on that surface. Psychophysical tests of human perception under varying conditions of modified fingertip compliance and/or externally imposed vibrations demonstrated that total spectral energy was the main determinant of perceived hardness and that perception of increased hardness can be induced by adding external vibration at the time of contact.

      Strengths:

      The experiments are well-reported and the data potentially useful, but much narrower than is implied by the (provisional) title and abstract. Their potential application to tactile perception in virtual reality seems promising, but the largely unexplored need for synchronization with physical contact and modulation with velocity and force of that contact seems likely to complicate proposed applications to prosthetics and telerobots.

      Weaknesses:

      (1) The authors have confused discriminability with perception. The sense of touch is derived from several different types of mechanoreceptors and processed into several dimensions of haptic perception. The fact that subjects can rank surface material hardness correctly when requested to focus on that alone does not mean that they rely on total spectral energy normally or that total spectral energy is normally perceived as surface material hardness, as opposed to other aspects of materials, such as their surface texture. They have not considered the effects of more complex features of most surfaces, such as curvature or lamination, or of exploratory movement strategies other than light taps.

      (2) Discussion section. Lines 262-264 are overstated. Dynamic spectral energy can be used to modify perceived hardness when exploratory movements are limited to taps that are unlikely to generate any other useful cues, such as skin deformation or proprioception. The authors have not explored what happens if there actually are conflicting cues in non-vibratory modalities. There are many different examples from sensory psychophysics of percepts that arise from taking the mean of conflicting cues (e.g. stereophonic sound localization) and others that arise from a dominant modality (e.g. self-motion perception from visual flow fields, vestibular signals and proprioception).

      The authors have ignored the substantial literature on artificial tactile sensors and their ability to identify texture, hardness and other haptic properties of materials. These have emphasized the importance of the many types and parameters of exploratory movements, which were loosely specified and not quantified in these studies.

      See:

      Li, Q., Kroemer, O., Su, Z., Veiga, F. F., Kaboli, M., & Ritter, H. J. (2020). A Review of Tactile Information: Perception and Action Through Touch. IEEE Transactions on Robotics, 36(6), 1619-1634. doi:10.1109/tro.2020.3003230.

      Fishel, J. A., & Loeb, G. E. (2012). Bayesian exploration for intelligent identification of textures. Frontiers in Neurorobotics, 6(4). doi:10.3389/fnbot.2012.00004

      Fishel, J. A., & Loeb, G. E. (2012). Sensing Tactile Microvibrations with the BioTac - Comparison with Human Sensitivity. Paper presented at the IEEE/RAS-EMBS International Conference on Biomedical Robotics and Biomechatronics, Rome.

      (3) Introduction (lines 23-31) and Discussion (lines 296-298). The notion that tactile receptors are "frequency tuned" is something of a straw man. Different receptor types are preferentially sensitive to different broad spectral bands, but it has long been known that they can be driven by larger stimuli outside those bands and that humans have very limited ability to discriminate actual frequency of tactile vibration (as opposed to auditory pitch), particularly for frequencies greater than the maximal one-to-one firing rate of neurons (~200-300 Hz). Conversely, fine onset timing of spikes in tactile afferents appears to be available from brief contact taps to identify features other than hardness; see:

      Johansson, R. S., & Flanagan, J. R. (2009). Coding and use of tactile signals from the fingertips in object manipulation tasks. Nature Reviews Neuroscience, 10, 345-359.

      Pruszynski, J. A., Flanagan, J. R., & Johansson, R. S. (2018). Fast and accurate edge orientation processing during object manipulation. eLife, 7, e31200.

      (4) Methods section. The Lofelt L5 actuator used to apply vibrations to the fingernail is rather large for use on multiple fingers of a haptic display. Do the authors know of any more compact technology with the requisite power and frequency response? One of the most useful contributions of this paper is to suggest that those details matter relatively little, which opens up more compact technologies such as piezoelectric actuators.

      (5) Methods section. It is good that headphones were used to block and mask audible tapping sounds, which are known to be capable of generating tactile illusions (Jousmäki, Veikko, and Riitta Hari. "Parchment-skin illusion: sound-biased touch." Current biology 8.6 (1998): R190-R191). But that suggests that hardness might be signalled by precisely timed acoustic stimuli, which would be much easier to deliver than fingertip vibration.

    2. Reviewer #2 (Public review):

      This paper aimed to demonstrate that total spectral energy alone is sufficient to drive hardness perception and material identification. Through five user studies, they tested materials ranging in stiffness and with covered fingers to support their claim. Using a spectral energy compensation framework, they concluded that total spectral energy alone, regardless of frequency content, was sufficient to support material hardness percepts. However, it should be noted that all experiments used a tapping procedure, which is not the standard exploratory procedure when judging material hardness. A tapping method also selectively enhances vibratory feedback while limiting others. This fundamentally limits the scope of their work, and assessing their claim on generalizability would require further experimentation.

      Some additional clarification and extension on the experiments are also suggested:

      (1) According to Lederman and Klatzky (1987), pressure, and not tapping, is the exploratory procedure humans use to judge hardness. And during tapping instead (as used in all experiments), it is expected that the dominant cue available to the user comes from vibrations, as other mechanical cues, such as skin stretch, are limited. These vibrations could serve as a proxy for hardness, as claimed by the authors, but it is unclear if the participants are basing their evaluations on perceived hardness or vibration intensity. A more fundamental question that needs to be answered to support the paper's claim is whether a single tap is sufficient for conveying a material's hardness. To better support their claim, I recommend that the authors include an experiment using participants' bare fingers with materials of the same modulus but different damping coefficients. These materials would produce different vibration signals when tapped, but are equivalent in hardness.

      (2) The setup text for experiment 4 does not match the results. Results suggest that a finger covered with a bubble and touching a soft material was used (i.e. dual compliance), but the setup describes otherwise. The authors should clarify this and confirm that this is different from experiment 2.

      (3) As silicone, foam, and rubber can have very similar or different hardness depending on the specific material used, please report the hardness of each material tested (Shore or Young's modulus) to better understand the range of stiffness tested.

      (4) In the "materials grouping and selection" section, it states that a pilot study suggested hard materials tended to be perceptually similar while softer materials were easily distinguishable. However, this contradicts the results in experiment 1. The authors should expand on the details of the pilot study and address the inconsistency between its findings and experiment 1.

      (5) The methods section suggests that individual recordings for each material were performed before the experiment. Please clarify if this is correct, or if a single signal for each texture was used across all participants. Additionally, was the participants' tap pressure controlled during either the recordings or the experiments? If not, how do the authors account for the difference in intensity that would be generated due to different tapping pressures across participants and trials?

    1. Reviewer #1 (Public review):

      Summary:

      This study develops a novel theory to account for various aspects of dopamine signals, particularly dopamine ramps. The authors propose that dopamine reward prediction error (RPE) signals are generated by a dual-process learning system in which values inferred by a model-based system enter the RPE asymmetrically, into the update target but not the prediction (equation 6). The work offers specific, mechanistic explanations of Krausz et al. (2023), Guru et al. (2020), and Kim et al. (2020) while maintaining an RPE interpretation, and presents an alternative to the state-uncertainty account in Mikhael et al. (2022) that doesn't require the asymmetric uncertainty assumption Mikhael needs, using Campbell et al. (2025) in a thoughtful way. The asymmetric-RPE idea is clean and well presented. Overall, this study makes an important contribution to the field.

      Strengths:

      The theory is relatively simple and intuitive. It addresses a long-standing controversy or mystery in the field of dopamine.

      Weaknesses:

      (1) The biggest outstanding question is what V_TD does - letting V_MB drive everything would seem to produce much of the same outcomes in the settings discussed here. The discussion suggests that in situations where there is little contribution of the model-based system, the backpropagating bump is a feature (e.g. Amo et al.). It would be interesting to see if this is a true outcome of the model, potentially by varying the arbitration parameter k. This is an interesting alternative account from eligibility trace explanations of the lack of backpropagating bump in some experimental settings.

      (2) The model-based accounts are quite simplistic, and this should probably be acknowledged - it does help delineate their contribution, but in the model, only the goal-reward value is updated; everything else is a known computation. Perhaps engage more deeply with Sagiv et al?

      (3) The application of Campbell et al. (2025) to push back on Mikhael (lines 253-259) is interesting: if striatum to VTA implements TD via synaptic delays such that V(s_t) is a delayed copy of V(s_{t+1}), then state uncertainty is necessarily shared between the two terms in the RPE, defeating Mikhael's required asymmetry.

      But the same circuit logic creates tension for the dual-process model. It seems they are proposing that the frontal cortex projects V_MB into VTA dopamine neurons (as proposed in 3.1 and the Discussion) and adds to the prediction error derived from the biphasic filtering of value. But the biphasic idea (and data of Campbell et al.) implies that the V(t+1) and -V(t) come from the same source and are proportional. Adding the V_MB term is akin to adding a positive bias, breaking the optimality of the TD error for predicting value and predicting over-learning of cached value. It is worth considering whether V_MB passes through a similar filter - I am not sure if it is fatal if V_MB contributes somewhat to the negative term of the update error.

      (4) In a few places, the predicate of the conclusion needs more care. The "normative" framing throughout 3.2 and the Discussion is normative only conditional on the architecture already including a separate cached system that needs to converge to the true value function, and on the model-based system learning much faster - see the comments on the learning-rate parameter below.

      (5) Kim et al. is cited heavily as a data source for Figure 4, but is never engaged with as a theoretical alternative, even though Kim et al. explicitly argued that an appropriate state representation makes standard TD compatible with ramps and the teleport responses. That is, Kim et al. is already a TD account of these phenomena, and doesn't require a second learning system. The introduction and Mikhael discussion treat the field as if the choice were between "dopamine = value" (Hamid, Howe, Mohebi) and "dopamine = RPE-with-special-conditions" (Mikhael, Kato-Morita), but Kim et al.'s framework is also "dopamine = RPE". Two specific places this matters: (i) Figure 4 currently demonstrates that the dual-process model reproduces the Kim teleport results, but Kim et al.'s framework also reproduces them - the figure doesn't distinguish the two, and I am not sure the figure gives this message cleanly. (ii) Kim et al. report that ramps develop with training over days; the manuscript should address whether the dual-process model has an alternative explanation for this, especially given the contrast with the Guru result (ramps diminishing with training over a longer timescale).

      (6) The arbitration parameter k is fixed at 0.5 throughout, and the paper acknowledges this is for simplicity, but a supplementary panel sweeping k ∈ {0, 0.2, 0.5, 0.8, 1.0} on the key figures (Figure 1B convergence, Figure 2D ramp dynamics, Figure 3D Krausz updating) would be informative. At k = 0, the model reduces to standard TD; at k = 1, it's effectively V_MB-driven. I think these would be easy to add and help clarify the work this assumption is doing.
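
      One way to set up the requested sweep is sketched below. The RPE form is an assumption based on the description of equation 6 in the summary (V_MB enters the update target but not the prediction); the linear track, the discount factor, and all function and variable names are hypothetical. Only the learning rates (α_MB = 0.50, α_TD = 0.01) and the k grid are taken from this review.

```python
import numpy as np


def simulate(k, n_states=10, n_trials=200, gamma=0.9,
             alpha_td=0.01, alpha_mb=0.5, reward=1.0):
    """Return the per-state RPEs on the final trial of a linear track."""
    v_td = np.zeros(n_states + 1)   # cached values; last entry is the terminal state
    r_goal = 0.0                    # model-based estimate of the goal reward
    rpe = np.zeros(n_states)
    steps_to_goal = np.arange(n_states - 1, -1, -1)
    for _ in range(n_trials):
        # Model-based values follow from the known track and the learned goal reward.
        v_mb = np.append(r_goal * gamma ** steps_to_goal, 0.0)
        for s in range(n_states):
            r = reward if s == n_states - 1 else 0.0
            # Assumed asymmetric RPE: V_MB enters the target only, not the prediction.
            target = r + gamma * ((1 - k) * v_td[s + 1] + k * v_mb[s + 1])
            rpe[s] = target - v_td[s]
            v_td[s] += alpha_td * rpe[s]
        # Delta-rule update of the goal reward after the rewarded trial.
        r_goal += alpha_mb * (reward - r_goal)
    return rpe


for k in (0.0, 0.2, 0.5, 0.8, 1.0):
    print(f"k = {k:.1f}:", np.round(simulate(k), 3))
```

      In this sketch, k = 0 gives ordinary TD(0) and k = 1 makes the target depend only on V_MB, matching the two limits described in the comment.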

      (7) Learning-rate asymmetry needs justification. The story relies on α_MB >> α_TD throughout (α_MB = 0.50, α_TD = 0.01 - a 50× ratio). With α_MB = 0.5, a single rewarded trial moves R[goal] halfway to the new value, which would predict strong dependence of dopamine ramp amplitude on the previous trial's outcome. This is testable in existing data (Krausz et al. should have enough trials to fit the exponential decay constant for trial-history dependence; Guru's swap-session data likewise), and the paper would be strengthened by explicitly deriving and checking that prediction.
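
      The trial-history prediction in this comment follows from the delta rule itself: an estimate updated as R <- R + α (r - R) weights the outcome i trials back by α (1 - α)^i. The sketch below is a minimal illustration of that identity; the function names and the example outcome sequence are hypothetical.

```python
def history_weights(alpha, n_back):
    """Weight of the outcome i trials back (i = 0 is the most recent trial)."""
    return [alpha * (1 - alpha) ** i for i in range(n_back)]


def delta_rule(outcomes, alpha, r0=0.0):
    """Run R <- R + alpha * (r - R) over a sequence of outcomes."""
    r = r0
    for outcome in outcomes:
        r += alpha * (outcome - r)
    return r


for alpha in (0.5, 0.1):
    print(f"alpha = {alpha}:", [round(w, 4) for w in history_weights(alpha, 4)])

# The weighted sum of past outcomes matches the iterated update (starting from 0).
outcomes = [1.0, 0.0, 1.0, 1.0]
weighted = sum(w * o for w, o in zip(history_weights(0.5, 4), reversed(outcomes)))
print(weighted, delta_rule(outcomes, 0.5))
```

      With α = 0.5 the most recent outcome alone carries half of the total weight, which is the strong previous-trial dependence that could be checked against existing data.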

      (8) α_MB is dropped to 0.10 specifically for the Krausz simulation without justification in the text - why? Either the value should be the same as elsewhere, or the paper should explain why Krausz's task requires slower MB learning. It would be good to check the robustness of the Krausz simulation - the test phase is a single set of three trials (t-2 = omission, t-1 = reward, then t = 50% rewarded) after training on a single set of 500 simulated trials (I believe only one random seed is used - given the high alpha, varying this set of simulated trials seems important). Also, do they get the other result in Krausz (t-2 = reward, t-1 = omission, t = 50% rewarded)?

      (9) It might be possible to fit the alpha to the Guru and Krausz simulations - this might be informative to show the range over which it varies.

      (10) The Kato and Morita account is cited in the introduction but never really discussed again - it would be good to engage with this a bit more in the discussion. The rejection of the value-based accounts seems to rely primarily on Kim et al., where the value and TDRPE accounts differ, but this could be directly acknowledged, rather than absorbing credit for this into their model.

    2. Reviewer #2 (Public review):

      Summary:

      This paper offers a novel theoretical account of dopamine ramps. The key idea is that the reward prediction error (putatively signaled by dopamine) uses a partially model-based estimate for future value (the prediction target). Because the model-based value estimate emerges more rapidly than the model-free estimate, it inflates the RPE, and this inflation increases with reward proximity - hence ramps. The authors show that this account can explain many aspects of existing data on dopamine ramps across several different studies.

      Strengths:

      Overall, I liked this paper. The idea is interesting and plausible. The paper is well-written and clearly argued. The modeling has been done rigorously.

      Weaknesses:

      My major comments are: (1) it's not always clear which phenomena are uniquely well-explained by this new account vs. earlier accounts; and (2) the limitations of the account are not entirely transparent.

      (1) The paper models some of the studies reported by Kim et al. (2020). As was already shown in that paper, a standard TD error could explain the results (although a major limitation of that treatment was that it did not model the recursive effect of RPEs on learning, as discussed in the Mikhael paper). It's not clear if there's additional explanatory value provided by this new account, though, of course, it's good to know that those results are captured by the new account. Likewise, Mikhael et al. (2022) already offered an account of their data (somewhat more complex than the standard TD model). Again, it's not clear if there's additional explanatory value provided by the new account (and again, it's nice to see that the model can capture these results). Finally, I found myself wondering whether the Guru et al. (2020) result couldn't be explained by a more standard TD model (assuming the value function is sufficiently convex). I don't think it's essential that the new account provides additional explanatory value in every case, but I think it's important to convey to readers what's new and what's not, as well as what aspects of the data require particular kinds of mechanisms to explain. It would be really helpful to see the predictions of alternative TD models in order to make this clearer.

      (2) The Mikhael model was motivated by the puzzle that ramping is observed in navigation tasks (with sensory cues) but typically not in classical conditioning tasks lacking sensory cues. The correction term, derived from normative considerations, explained this discrepancy. It's not clear to me if/how the new account can explain the discrepancy.

    3. Reviewer #3 (Public review):

      Summary:

      This work presents a new hypothesis for why dopamine signals have sometimes been observed to "ramp up" in spatial tasks as rodents approach a location associated with reward. In essence, the hypothesis is that value estimates (i.e., predictions about future rewards) from a model-based system, which may be able to form such estimates more quickly via an inference-like process, can be used to speed up the (relatively slow) learning of such estimates by a model-free system. This is suggested to occur by including the model-based estimate as part of the target towards which model-free estimates are updated in the course of temporal-difference (TD) learning. The early discrepancy between these estimates can be expected to produce systematic TD errors - putatively represented in dopaminergic activity - that appear as dopamine ramps, which are expected to diminish over time as the estimates of both systems converge. The authors show that a model that implements this idea makes predictions about dopamine activity that are a good qualitative match to data from a number of recent experimental studies.
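
      The mechanism summarized above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the linear track, the mixing weight w, the discount factor, and the function name simulate_ramp are all assumptions made here. Only the learning rates (α_MB = 0.50, α_TD = 0.01) are taken from the values quoted in Reviewer #1's comments.

```python
import numpy as np

def simulate_ramp(n_states=10, n_trials=2000, gamma=0.9,
                  alpha_td=0.01, alpha_mb=0.5, w=0.5, reward=1.0):
    """TD errors on a linear track when the target mixes MB and MF values.

    Returns an array of shape (n_trials, n_states) of TD errors.
    The mixing weight `w` and the track layout are illustrative assumptions.
    """
    v_mf = np.zeros(n_states + 1)   # model-free values; last entry is terminal
    r_goal = 0.0                    # model-based estimate of the goal reward
    deltas = np.zeros((n_trials, n_states))
    for trial in range(n_trials):
        # Model-based value: goal estimate discounted by distance to the goal.
        dist = np.arange(n_states - 1, -1, -1)
        v_mb = np.append(r_goal * gamma ** dist, 0.0)
        for s in range(n_states):
            r = reward if s == n_states - 1 else 0.0
            # Prediction target blends the MB and MF estimates of the next state.
            target = r + gamma * (w * v_mb[s + 1] + (1.0 - w) * v_mf[s + 1])
            delta = target - v_mf[s]
            v_mf[s] += alpha_td * delta
            deltas[trial, s] = delta
        # Fast model-based update of the goal reward after the trial.
        r_goal += alpha_mb * (reward - r_goal)
    return deltas

deltas = simulate_ramp()
print("early trial:", np.round(deltas[1], 3))
print("late trial: ", np.round(deltas[-1], 3))
```

      In this sketch the early-trial TD errors grow with proximity to the goal because the model-based estimate is already informative while the model-free values are still near zero, and the errors shrink once both estimates agree.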

      Strengths:

      The work suggests a normative account for a phenomenon that has persistently troubled the canonical theory of dopamine function. The account is appealing in its elegance and simplicity, and the authors present compelling evidence that it can capture the empirical observations of key recent papers. Another strength of the account is that it readily suggests avenues for future theory development and experimental test, including what the 'best' target estimate should be at any given time, how rapidly one might expect ramps to develop or diminish, and the neural implementation of the proposed algorithm. This is likely to stimulate further theoretical and experimental work in the field.

      Weaknesses:

      One aspect of dopamine "ramps" that was troubling from a theoretical standpoint was their apparent persistence over time. Given the authors' prediction that these would disappear over time in a stable environment and the supporting evidence they cite (from Guru et al., 2020), the reader might be left confused about the state of the evidence on whether dopamine ramps persist or not. Perhaps relatedly, the issue of how the activity of dopamine cells and dopamine release are related is not discussed, which may be relevant given that early studies (e.g., Howe et al., 2013) used voltammetry to measure extracellular dopamine concentrations.

    1. Reviewer #1 (Public review):

      Summary:

      The authors develop alignment methods for layer-specific widefield calcium imaging in the mouse cortex. Under the assumption that the majority of the widefield signal originates at the level of the cell bodies, different cortical layers will appear at different locations in a top-down view as a function of the curvature of the mouse cortex. The authors develop software tools to correct for this, as well as depth-dependent source blurring. Finally, they apply these tools to investigate functional connectivity differences of different neuron types and find only subtle differences.

      Strengths:

      The work is technically strong, the experiments well executed, and the presentation clear.

      Weaknesses:

      One concern I have is that the central assumption underlying the rationale for the depth correction, namely that the source of the majority of the widefield signal is the cell body, may be incorrect. Layer 5 neurons have a dense axo-dendritic plexus very close to the surface of the cortex. Given the attenuation length of visible light in tissue, as well as our own measurements (https://elifesciences.org/articles/71476#fig6s1), I suspect that the majority of the widefield calcium signal originates in the superficial axo-dendritic plexus. The authors acknowledge this possibility, but there are a few simple measurements they could make to address this more directly. If indeed, as I suspect, the majority of the calcium signal originates in the first 50 µm of tissue (even when imaging layer 5 neurons), the curvature correction is counterproductive, of course. The authors could test the effect of adding brain slices of varying thicknesses on top of e.g., a layer 2/3 widefield recording. If the authors are correct, and most of the signal is from cell bodies, this should, at most, attenuate the layer 2/3 recording to the level of a layer 5 recording. Anecdotally, while doing the measurements for the figure referenced above, we have done this experiment with a 100 µm thick slice, and no quantifiable calcium responses remained.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript by Lorenzo and colleagues presents wide-field cortical imaging data obtained from experiments conducted with three triple-transgenic mouse lines that specifically express the calcium sensor GCaMP6f in neurons of layers 2/3, 5, and 6 of the neocortex, respectively.

      It first includes a methodological contribution aimed at optimizing the analysis of the acquired signals, taking into account both the geometry of the neocortex and photon scattering in the cortical tissue, which affect fluorescence signals differentially depending upon their cortical depth of origin.

      In particular, they built upon the work previously published in eLife by Waters in 2024, which, based on a simulation of photon scattering using a Monte Carlo random-walk model, provided an estimate of the tissue volumes contributing to the fluorescence signals measured from the surface in several mouse lines expressing GCaMP in a layer-specific manner.

      The authors here additionally performed empirical measurements of the point spread function at different cortical depths to determine spatial kernels to be used to deconvolve wide-field imaging data acquired from their three layer-specific GCaMP6f-expressing mouse lines. They assess the added value of this deconvolution approach based on recordings of the cortical responses evoked by whisker stimulation in the barrel cortex, using lightly anesthetized, layer 2/3 and layer 5 GCaMP6f-expressing mice.
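
      To illustrate what deconvolving with a depth-specific kernel involves, here is a minimal sketch. The review does not state which deconvolution algorithm the manuscript uses, so a generic Wiener filter is used here as a stand-in; the Gaussian kernel, its width, the regularization constant k, and all function names are assumptions for illustration only.

```python
import numpy as np

def gaussian_psf(shape, sigma):
    """Normalized 2D Gaussian kernel centered in the array (assumed PSF shape)."""
    y, x = np.indices(shape)
    cy, cx = shape[0] // 2, shape[1] // 2
    psf = np.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
    return psf / psf.sum()

def blur(image, psf):
    """Circular convolution of the image with the PSF via the FFT."""
    H = np.fft.fft2(np.fft.ifftshift(psf))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * H))

def wiener_deconvolve(image, psf, k=1e-3):
    """Wiener deconvolution; k regularizes frequencies the PSF suppresses."""
    H = np.fft.fft2(np.fft.ifftshift(psf))
    G = np.fft.fft2(image)
    F = np.conj(H) / (np.abs(H) ** 2 + k) * G
    return np.real(np.fft.ifft2(F))

# A point source blurred by a wide kernel (standing in for a deep layer),
# then partially sharpened by deconvolution with the same kernel.
truth = np.zeros((64, 64))
truth[20, 30] = 1.0
psf = gaussian_psf(truth.shape, sigma=3.0)
blurred = blur(truth, psf)
restored = wiener_deconvolve(blurred, psf)
print("peak before:", blurred.max(), "peak after:", restored.max())
```

      In practice one kernel per imaging depth would be measured empirically, as the authors did, and the regularization constant would need tuning against the noise level of the recordings.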

      Altogether, these proposed methods aim at optimizing the registration of recorded signals on a common reference frame, allowing comparison of cortical spatiotemporal dynamics recorded from distinct layer-specific GCaMP-expressing mice.

      The manuscript further contains a more neurophysiological contribution, directly utilizing the proposed methods to perform a comparative layer-specific functional connectivity analysis from data collected with the 3 different mouse lines, while the mice were head-fixed below the macroscope.

      Strengths:

      Wide-field 1-photon functional optical imaging, which allows recording cortical spatiotemporal dynamics over a large portion of the dorsal neocortex in mice, has become a tool of choice to study how activity over a wide range of cortical areas is orchestrated in various behavioral contexts. The ever-increasing availability of transgenic mice exhibiting pan-cortical calcium- or voltage-dependent sensors within specific neuronal populations is generating a growing interest in these approaches among the neuroscientific community.

      Nowadays, it is possible to image specifically the activity of excitatory neurons whose cell bodies are located in given cortical layers. However, interpreting fluorescence signals recorded from the surface while originating from deep layers proves difficult due to photon scattering, which reduces image definition, as previously established by Waters et al. (2024).

      The ability to correct for this blurring effect and to place the recorded signals within a common frame of reference is therefore essential not only for comparing activity across layers but also for integrating findings across studies, thereby advancing our collective understanding of neocortical physiology.

      In this sense, this work by Lorenzo and colleagues is definitely both timely and valuable.

      Overall, the manuscript is clearly structured and well-written, and the figures are of excellent graphic quality.

      The proposed approach to correct the blurring of the fluorescent signals, which increases with depth, by means of empirical measurements of point spread functions and deconvolution, seems pertinent and efficient.

      Finally, the authors have collected evoked and spontaneous dynamics of calcium signals from 3 different layer-specific GCaMP mice, which in itself represents a substantial experimental effort, not least because of the need to generate the animals. Out of these data, they provide a unique comparative analysis of layer-specific functional connectivity.

      Weaknesses:

      To fully benefit a large community, some aspects of the proposed methodological advances need to be more detailed in the manuscript and potentially refined. For instance, it is very difficult to evaluate, given the tiny confocal images provided in Figure 1, the potential contribution of GCaMP signal from apical dendrites of layer V neurons in Rbp4-GCaMP6f mice. It is also difficult for the reader to assess the added value of the layer-specific reference maps, given that functional image registration relies on nonlinear transformations and limited detail is provided regarding the procedure used to realign the functional data with these maps (lines 465-467). It is not really clear how the illustrated "composite maps" and the "five functional spots" used for the registration are computed. In addition, one could question the choice of the large time windows used to generate these composite maps/functional landmarks. Since the early component of the evoked responses is more likely to reflect the location of the initial thalamocortical inputs, restricting the analysis to the early phase of the responses might improve the accuracy of primary cortical area identification. This concern regarding the time window used to define specific cortical representation areas may also be relevant to Figure 4, which illustrates the results of the proposed deconvolution approach used to correct for photon scattering (although the time windows used for these analyses are not specified).

      With regard to Figure 4, the reader might wonder why the results are not illustrated similarly for the layer 6 mice. It would therefore be useful to clearly indicate whether these data are not shown because they were not collected, or because it proved impossible to identify single whisker representations, despite the proposed deconvolution procedure.

      Regarding the analysis of layer specificity in terms of functional connectivity, the authors extensively use the term "resting-state" to describe the behavioral context of data collection, given that the animals were not engaged in a goal-directed task. However, because the mice were experiencing head fixation beneath a functional epifluorescence macroscope for only the second time, it is questionable whether this state can truly be classified as "resting." As indicated by the global quantification of body movements, the animals most likely alternated between quiet wakefulness and more active phases.

      To allow the reader to accurately interpret the reported functional connectivity differences, the authors should at least provide a quantification of the time animals spent in the quiet versus active states, and assess whether these proportions were comparable between the different mouse lines. Another way to address this issue would be to perform functional connectivity analyses after splitting the data according to these two states based on body movement quantification, although it is difficult to assess the feasibility of this approach without knowing the temporal distribution of these states within the dataset.

      This seems particularly important since differences in neural cross-regional correlation patterns have been linked to arousal levels, with a comparable optical imaging approach, by Shahsavarani and colleagues (Cell Reports, 2023), who compared initial and prolonged resting periods. In addition, the authors report here that layer differences in functional connectivity are more pronounced in regions associated with the default mode network, whose activity is likely to differ between quiet and active wakefulness.

      Finally, given the richness of the dataset, it would be very interesting to assess how the proposed deconvolution approach affects PCA-ICA-based functional parcellation of spontaneous cortical activity (Reidl et al., NeuroImage, 2007; Makino et al., Neuron, 2017) and whether it enables cross-layer comparisons of independent cortical modules. Such supplementary analyses would substantially increase the impact of this work.

    3. Reviewer #3 (Public review):

      This paper provides valuable technical and theoretical validation of layer-specific wide-field imaging. Here, the authors use specific transgenic lines that provide layer-specific cell body expression (and some superficial dendrites). They then use deconvolution approaches and potentially more accurate atlases based on depth-dependent features to register and resolve what are layer-specific functional GCaMP signals.

      In general, the work is extremely well done, and I have little specific criticism. I think the authors should be commended for their creative solutions, including using the light source at different depths to measure apparent scattering and blurring, allowing them to incorporate the deconvolution approach.

      Throughout the manuscript, they refer to the signals as layer-specific and, for the most part, conclude that functional connectivity is similar across layers, with some noted exceptions. This is an outstanding resource for the community.

      Major Comment:

      I think they should add some caveats that the lines they employ do contain dendrites that extend into more superficial cortical layers. Could they make some estimates of signal contribution from these, say, layer 6 neuron superficial dendrites versus the deep somata? This clarification should be included in the abstract; maybe they could call these apparent somatic signals? Another way of doing this would be a soma-targeted deep indicator, but this is probably beyond the scope of the paper.

      Alternatively, how much of the layer 5 signal would be expected to be recovered?

    1. Reviewer #1 (Public review):

      Summary:

      The current manuscript characterizes in detail the macrophages in the thymus. The authors identify two distinct populations of thymic macrophages and describe their surface marker expression and transcriptional signatures. They also explore their ontogeny and kinetics of settling and persistence in the thymus and find that the TIMD4+ macrophages are derived from embryonic progenitors and self-maintain in the thymus, while the TIMD4- macrophages are derived from monocytes. Most importantly, the authors test the functional importance of thymic macrophages for T cell development using an in vitro depletion system, from which they conclude that macrophages are important for one of the earliest selection steps in T cell development - the beta selection.

      Strengths:

      The authors use state-of-the-art techniques, such as multiple genetically modified mice, multi-color flow cytometry, single-cell RNA sequencing, genetic fate mapping, and fetal thymic organ culture (FTOC) combined with depletion. Their work is in good agreement with prior published studies on the subject, such as Tacke et al. (PMID: 26091486) and Zhou et al. (PMID: 36449334). In addition to reproducing prior knowledge, the authors uncover novel and unexpected facets of thymic macrophage biology, such as their SpiC independence and the fact that TIMD4- thymic macrophages depend on CCR2 (Tacke et al. have shown that the overall thymic macrophage compartment is normal in CCR2-/- mice). Most surprisingly, the authors claim that thymic macrophages control an early checkpoint in T cell development, the beta selection. This has not been reported before, as beta selection is usually considered a cell-autonomous process in thymocytes that does not require input from other cells.

      Weaknesses:

      The thymic macrophage depletion experiments are not well controlled, and the authors' interpretation of the results is a stretch. First, the treatment depletes other cell types, most notably dendritic cells (DCs), which have well-known roles in thymic selection (though not specifically in beta selection). The authors' reasoning that macrophages are abundant in the cortex, where beta selection occurs, while DCs are enriched in the medulla, seems questionable, as the embryonic thymus typically lacks a medulla (or has a very small one). A second salient point is that the authors haven't ruled out direct toxicity of the dimerizer drug AP20187 on thymocytes (specifically DN cells) in MAFIA mice.

      Altogether, this is a solid manuscript that largely confirms the previously established ontogeny and heterogeneity of thymic macrophages. However, the participation of thymic macrophages in beta selection needs stronger evidence.

    2. Reviewer #2 (Public review):

      This manuscript from the Zuniga-Pflucker laboratory reports that thymic macrophages are heterogeneous in flow cytometric and transcriptomic profiles, containing two major populations characterized by TIMD4 and CX3CR1 expression. These macrophage populations are both parenchymal in the thymus but are unequal in developmental ontogeny, Flt3 expression history, and CCR2 dependency. The manuscript further reports the interesting finding that the depletion of thymic macrophages impairs thymocyte development at the DN3 beta-selection checkpoint. These results provide an important advance for further understanding of thymus biology, especially in view of the contribution of heterogeneous thymic macrophage subpopulations.

      However, Zhou et al. previously reported essentially similar heterogeneity in thymic macrophages. It was demonstrated that TIMD4+ macrophages and CX3CR1+ macrophages have distinct origins and are different in developmental characteristics (27). The authors should better clarify what was previously demonstrated and what is newly described in this study. Zhou et al. also demonstrated that TIMD4+ macrophages are localized in the cortex whereas CX3CR1+ macrophages distribute in the medullary region. Whether or not these previous findings are reproduced and supported in the present study is important in view of the new finding that thymic macrophages are important for beta-selection, which is presumed to occur in the thymic cortex. The authors may be able to suggest more strongly that TIMD4+ macrophages regulate beta-selection in the thymic cortex through phagocytic efferocytosis. (Indeed, the Figure 1 legend states that frozen thymic sections were used for immunofluorescent staining to identify the localization of thymic macrophages, without showing the results.)

    1. Reviewer #1 (Public review):

      Summary:

      The authors use Dyngo-4a, a known dynamin inhibitor, to test its influence on caveolar assembly and surface mobility. They investigate whether it incorporates into membranes using a quartz crystal microbalance, and how it is organized in membranes using simulations. Finally, they use lipid-packing-sensitive dyes to investigate lipid packing in the presence of Dyngo-4a, AFM to measure membrane stiffness, and fluorescence microscopy to measure membrane undulation. They also use a measure they call "caveola duration time" to claim that something happens to caveolae after Dyngo-4a addition; using this parameter, they do indeed see an increase in response to Dyngo-4a, which returns to baseline after addition of cholesterol.

      Overall, the authors claim: 1) Dyngo-4a inserts into the membrane and this 2) results in "a dramatic dynamin-independent inhibition of caveola scission". 3) Dyngo-4a was inserted and positioned at the level of cholesterol in the bilayer and 4) Dyngo-4a-treatment resulted in decreased lipid packing in the outer leaflet of the plasma membrane 5) but Dyngo-4a did not affect caveola morphology, caveolae-associated proteins, or the overall membrane stiffness 6) acute addition of cholesterol counteracts the block in caveola scission caused by Dyngo-4a.

      Overall, in this reviewer's opinion, after the additional experiments in the review process, all claims are now well-supported by the presented data from electron and live cell microscopy, QCM-D and AFM.

      Significance:

      A number of small molecule inhibitors for the GTPase dynamin exist, which are commonly used tools in the investigation of endocytosis. This goes so far that the use of some of these inhibitors alone is considered in some publications as sufficient to declare a process to be dynamin-dependent. However, this is not always correct, as there are considerable off-target effects, including the inhibition of caveolar internalization by a dynamin-independent mechanism. This is important, as for example the influence of dynamin small molecule inhibitors on chemotherapy resistance is currently being investigated (see for example Tremblay et al., Nature Communications, 2020).

      The investigation of the true effects of small molecules discovered and used as specific inhibitors, and of their off-target effects, is extremely important, and this reviewer applauds the effort. It is important that inhibitors are not used alone, but that other means of targeting a mechanism are exploited as well in functional studies. The audience here thus includes, besides membrane biophysicists interested in the immediate effect of the small molecule Dyngo-4a, cell biologists and everyone using dynamin inhibitors to investigate cellular function.

      Comments on revised version.

      Overall, in this reviewer's opinion, after the additional experiments in the review process, all claims are now well-supported by the presented data from electron and live cell microscopy, QCM-D and AFM.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors probe the mechanisms by which Dyngo-4a, a dynamin inhibitor used to block endocytosis, impacts caveolae dynamics. They provide compelling evidence that Dyngo-4a inhibits caveolae dynamics and endocytosis (as well as several other aspects of plasma membrane dynamics) by a dynamin-independent mechanism. They also provide strong computational and experimental data showing that Dyngo-4a inserts into membranes and decreases lipid packing in the outer leaflet of the plasma membrane. Finally, they demonstrate that the addition of excess cholesterol to cells reverses the effects of Dyngo-4a on caveolae dynamics, presumably by reversing lipid packing defects. Based on these findings they conclude that lipid packing regulates caveolae dynamics and endocytosis in a cholesterol-dependent manner.

      This work should be of value to cell biologists interested in plasma membrane remodeling and membrane trafficking, biophysicists that study small molecule/membrane interactions and membrane remodeling processes, and chemists interested in designing drugs to target membrane trafficking machinery and pathways.

      Strengths and weaknesses:

      This work addresses the important topic of how a widely used endocytic inhibitor actually works. In the process of addressing this question, the authors uncover unexpected connections between how lipids are packed in cell membranes and membrane dynamics. The methods are appropriate and many of the claims made in this work are well supported by data.

      The authors have also been responsive to comments raised during review by including additional experimental evidence that Dyngo-4a inhibits caveolae endocytosis as well as documenting the effects of Dyngo-4a on caveolae morphology.

      The work also raises some interesting questions for the future. As one example, the authors note that in addition to inhibiting caveolar dynamics, Dyngo-4a inhibits generalized plasma membrane mobility, transferrin uptake, and fusion of fusogenic liposomes to the plasma membrane. More work will be required to determine whether these events are mediated by a common, lipid packing-dependent mechanism.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      This manuscript offers a careful and technically impressive dissection of how subpopulations within the subthalamic nucleus (STN) support reward-biased perceptual decision-making. The authors recorded STN neurons in monkeys performing an asymmetric-reward visual motion discrimination task, then combined single-unit analyses, regression modeling, and drift-diffusion model (DDM) fitting to identify functionally distinct neuronal clusters. Each subpopulation shows unique relationships to computational decision variables - evidence accumulation rate, decision bound, and non-decision time - as well as to post-decision evaluative signals including choice accuracy and reward expectation. The revised manuscript substantially strengthens the original submission by improving both the objectivity of neuron selection and the robustness of the clustering solution.

      Strengths:

      The asymmetric-reward paradigm cleanly separates perceptual and motivational contributions to STN activity, allowing the authors to characterize how neurons blend these distinct sources of information. The dataset is extensive and well-controlled, and the behavioral and neural analyses are tightly integrated. Relating cluster-specific activity to DDM parameters provides an interpretable computational link between population signals and behavior. The clustering solution is now validated across two algorithms, two monkeys, and subsets of trials - establishing that the three-cluster structure is robust. The new Figure 9 offers a conceptually useful, if necessarily speculative, synthesis connecting the identified subpopulations to distinct basal-ganglia pathways (hyperdirect versus indirect). The new Figure 8 documenting the anatomical intermingling of subpopulations is also important, as it directly informs the interpretation of prior and future STN stimulation studies.

      Weaknesses:

      The inferred relationships between neural clusters and DDM parameters remain correlational - the authors now appropriately flag this throughout, and the causal inference gap is acknowledged in the Discussion with concrete proposals for future targeted perturbation strategies. While a generative multi-cluster model would further strengthen mechanistic interpretation, the conceptual framework in Figure 9 provides a reasonable intermediate step given the scope of the study and the absence of simultaneous population recordings, which preclude direct inter-cluster covariation analyses. These remaining limitations are inherent to the experimental design rather than analytical oversights.

      Comments on the previous version:

      The authors have responded thoroughly and constructively to all of my concerns. The revised clustering pipeline - incorporating finer temporal resolution, objective neuron selection, outlier removal, a second clustering algorithm, cross-monkey validation (Rand indices of 0.94 and 1.0 for the two monkeys), and trial-subset stability analysis - substantially increases confidence in the three-cluster solution. The correlational nature of the DDM-activity relationships is now clearly stated, and the Discussion appropriately contextualizes the causal inference gap while suggesting feasible future directions. The new Figure 9 provides the conceptual synthesis I had hoped for, within the realistic scope of the present study. I am satisfied with the authors' responses and have no further requests.

    2. Reviewer #2 (Public review):

      This study uses monkey single-unit recordings to examine the role of the STN in combining noisy sensory information with reward bias during decision-making between saccade directions. Using multiple linear regressions and clustering approaches, the authors overall show that a highly heterogeneous activity in the STN reflects almost all aspects of the task, including choice direction, stimulus coherence, reward context and expectation, choice evaluation, and their interactions. The authors report in particular how three classes of neurons map to different decision processes evaluated via the fitting of a drift-diffusion model. Overall, the study provides evidence for functionally diverse and anatomically intermingled populations of STN neurons, supporting multiple roles in perceptual and reward-based decision-making.

      This study follows up on work conducted in previous years by the same team and complements it. Extracellular recordings in monkeys trained to perform a complex decision-making task remain a remarkable achievement, particularly in brain structures that are difficult to target, such as the subthalamic nucleus. The authors conducted numerous analyses of STN activity, using sophisticated statistical approaches and functional computational modeling.

    1. Reviewer #1 (Public review):

      Summary:

      In the paper, the authors propose a new RNA velocity method, TSvelo, which predicts the transcription rate as a linear function of the RNA expression levels of transcription factors. This framework extends their recent work, TFvelo, by including unspliced reads and designing a coherent neuralODE framework. Improved performance was demonstrated in six diverse datasets.
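
      For readers unfamiliar with this class of model, the sketch below couples a linear TF-driven transcription rate to the standard unspliced/spliced RNA velocity equations, du/dt = α - βu and ds/dt = βu - γs. It is not TSvelo's implementation: the constant TF levels, the clipping of α at zero, the rate constants, the Euler integration, and the function name are all assumptions made for illustration.

```python
import numpy as np

def simulate_gene(tf_levels, weights, beta=1.0, gamma=0.5, dt=0.01, n_steps=5000):
    """Integrate unspliced (u) and spliced (s) abundance for one target gene.

    The transcription rate alpha is a linear combination of TF levels,
    clipped at zero (an assumption here, to keep the rate non-negative).
    """
    alpha = max(float(np.dot(weights, tf_levels)), 0.0)
    u, s = 0.0, 0.0
    for _ in range(n_steps):
        du = alpha - beta * u        # transcription minus splicing
        ds = beta * u - gamma * s    # splicing minus degradation (the "velocity")
        u += dt * du
        s += dt * ds
    return u, s, alpha

u, s, alpha = simulate_gene(tf_levels=[1.0, 2.0], weights=[0.5, 0.25])
print(f"alpha={alpha:.2f} u={u:.3f} s={s:.3f}")
```

      With constant TF input the system relaxes to u = α/β and s = α/γ; in a real dataset the TF levels change along the trajectory, which is what makes the inferred velocities informative.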

      Strengths:

      Overall, this method introduces innovative solutions to link cell differentiation and gene regulation, with a balance between model complexity (neuralODE) and interpretability (raw gene space).

      Comments on revised version:

      The authors have added comprehensive analyses in this revision, and all of my concerns have been very well addressed. Here, I just want to re-emphasize the original points 1 and 3.

      (1) The analysis and clarification are very helpful - thanks! I found Fig. R1 and R2 very insightful, as DoRothEA-only returns much worse performance. Please consider adding these two figures to the supplementary figures and possibly highlighting your setting for edge pruning (down-weights); with that setting, the model is more likely to be affected by false negatives than by false positives in the TF-target prior.

      (3) Please consider adding some discussion on the challenges in capturing cell cycle transitions.

    2. Reviewer #3 (Public review):

      Despite the abundance of RNA velocity tools, there are still major limitations, and there is strong skepticism about the results these methods lead to. In this paper, the authors try to address some limitations of current RNA velocity approaches by proposing a unified framework to jointly infer transcriptional and splicing dynamics. The method is then benchmarked on 6 real datasets against the most popular RNA velocity tools.

      Comments on revised version.

      The Authors addressed all my comments suitably. I'd like to thank them for the time they spent addressing them: the revised paper is much more convincing.

      I have 2 very minor follow-up concerns:

      (1) I appreciated the simulation study; however, no null simulation is present. We know RNA velocity tools are inclined to provide false positives: trajectories even when the data doesn't have any. It'd be helpful to add null simulations where the data has no trajectories and see if methods erroneously identify any.

      (2) Several of the novel analyses are only reported in the Supplementary material and only referenced in the main text (e.g., "A validation of TSvelo on simulated data is provided in Fig. S1 and Fig. S2 in the Supplementary Information."). This is a pity!

      If allowed, I'd add some comments about the new analyses (simulations, computational benchmarks, etc...) also in the main text.
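
      As a rough illustration of the null simulation requested in point (1), the sketch below draws unspliced and spliced counts for cells that all sit at the same steady state, so no trajectory exists by construction. It assumes the standard two-equation splicing kinetics (du/dt = alpha - beta*u, ds/dt = beta*u - gamma*s); the function names, parameter ranges, and Poisson noise model are hypothetical choices and are not taken from the TSvelo manuscript.

```python
import math
import random


def sample_poisson(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_null_counts(n_cells, n_genes, seed=0):
    """Simulate counts with no dynamics: every cell is at steady state.

    At steady state the expected unspliced level is alpha / beta and the
    expected spliced level is alpha / gamma, so the expected velocity
    beta * u - gamma * s is zero for every gene.
    """
    rng = random.Random(seed)
    genes = []
    for _ in range(n_genes):
        alpha = rng.uniform(2.0, 10.0)  # transcription rate (assumed range)
        beta = 1.0                      # splicing rate (assumed fixed)
        gamma = rng.uniform(0.5, 2.0)   # degradation rate (assumed range)
        genes.append((alpha, beta, gamma))
    unspliced = [[sample_poisson(rng, a / b) for (a, b, g) in genes]
                 for _ in range(n_cells)]
    spliced = [[sample_poisson(rng, a / g) for (a, b, g) in genes]
               for _ in range(n_cells)]
    return genes, unspliced, spliced


def mean_velocity(genes, unspliced, spliced):
    """Per-gene average of beta * u - gamma * s using the true rates."""
    n_cells = len(unspliced)
    velocities = []
    for idx, (_, beta, gamma) in enumerate(genes):
        mean_u = sum(row[idx] for row in unspliced) / n_cells
        mean_s = sum(row[idx] for row in spliced) / n_cells
        velocities.append(beta * mean_u - gamma * mean_s)
    return velocities
```

      Counts generated this way could be passed to each velocity method; any coherent trajectory a method reports on such data would be a false positive.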

    1. Reviewer #1 (Public review):

      The authors have conducted substantial additional analyses to address the reviewers' comments. However, several key points still require attention. I was unable to see the correspondence between the model predictions and the data in the added quantitative analysis. In the rebuttal letter, the delta peak speed time displays values in the range of [20, 30] ms, whereas the data were negative for the 45° direction. Should the reader directly compare panel B of Figure 6 with Figure 1E? The correspondence between the model and the data should be made more apparent in Figure 6. Furthermore, the rebuttal states that a quantitative prediction was not expected, yet it subsequently argues that there was a quantitative match. Overall, this response remains unclear.

      A follow-up question concerns the argument about strategic slowing. The authors argue that this explanation can be rejected because the timing of peak speed should be delayed, contrary to the data. However, there appears to be a sign difference between the model and the data for the 45° direction, which means that it was delayed in this case. Did I understand correctly? In that regard, I believe that the hypothesis of strategic slowing cannot yet be firmly rejected and the discussion should more clearly indicate that this argument is based on some, but not all, directions. I agree with the authors on the importance of the mass underestimation hypothesis, and I am not particularly committed to the strategic slowing explanation, but I do not see a strong argument against it. If the conclusion relies on the sign of the delta peak speed, then the authors' claims are not valid across all directions, and greater caution in the interpretation and discussion is warranted. Regarding the peak acceleration time, I would be hesitant to draw firm conclusions based on differences smaller than 10 ms (Figures R3 and 6D).

      The authors state in the rebuttal that the two hypotheses are competing. This is not accurate, as they are not mutually exclusive and could even vary as a function of movement direction. The abstract also claims that the data "refutes" strategic slowing, which I believe is too strong. The main issue is that, based on the authors' revised manuscript, the lack of quantitative agreement between the model and the data for the mass underestimation hypothesis is considered acceptable because a precise quantitative match is not expected, and the predictions overall agree for some (though not all) directions and phases (excluding post-in). That is reasonable, but by the same logic, the small differences between the model prediction and the strategic slowing hypothesis should not be taken as firm evidence against it, as the authors seem to suggest. In practice, I recommend a more transparent and cautious interpretation to avoid giving readers the false impression that the evidence is decisive. The mass underestimation hypothesis is clearly supported, but the remaining aspects are less clear, and several features of the data remain unexplained.

      Comments on revised version.

      The authors have reworked the sections of the text where the narrative was too strong or binary with respect to alternative interpretations. The result is well balanced. No further recommendation.

    2. Reviewer #3 (Public review):

      Summary:

      The authors describe an interesting study of arm movements carried out in weightlessness after a prolonged exposure to the so-called microgravity conditions of orbital spaceflight. Subjects performed radial point-to-point motions of the fingertip on a touch pad. The authors note a reduction in movement speed in weightlessness, which they hypothesize could be due to either an overall strategy of lowering movement speed to better accommodate the instability of the body in weightlessness or an underestimation of body mass. They conclude in favor of the latter, mainly based on two effects. One, slowing in weightlessness is greater for movement directions with higher effective mass at the end effector of the arm. Two, they present evidence for an increased number of corrective submovements in weightlessness. They contend that this provides conclusive evidence to accept the hypothesis of an underestimation of body mass.

      Strengths:

      In my opinion, the study provides a valuable contribution, the theoretical aspects are well presented through simulations, the statistical analyses are meticulous, the applicable literature is comprehensively considered and cited and the manuscript is well written.

      Weaknesses:

      I nevertheless am of the opinion that the interpretation of the observations leaves room for other possible explanations of the observed phenomenon, thus weakening the strength of the arguments.

      I raised the following points in my original review, but I find that the authors have judiciously addressed these points through their various revisions.

      I believe that the article constitutes a valuable contribution and that the results and conclusions are certainly worthy of consideration by the human motor control community.

      (1) The authors model the movement control through equations that derive the input control variable in terms of the force acting on the hand and treating the arm as a second-order low pass filter (Eq. 13). Underestimation of the mass in the computation of a feedforward command would lead to a lower-than-expected displacement in response to that command. But it is not clear if and how the authors account for a potential modification of the time constants of the 2nd order system. The CNS does not effectuate movements with pure torque generators. Muscles have elastic properties that depend on their tonic excitation level, reflex feedback and other parameters. Indeed, Fisk et al.* showed variations of movement characteristics consistent with lower muscle tone, lower bandwidth and lower damping ratio in 0g compared to 1g. Could the variations in the response to the initial feedforward command be explained by a misrepresentation of the limb's damping and natural frequency, leading to greater uncertainty about the consequences of the initial command? This would still be an argument for un-adapted feedforward control of the movement, leading to the need for more corrective movements. But it would not necessarily reflect an underestimation of body mass.

      *Fisk, J., Lackner, J. R., & DiZio, P. (1993). Gravitoinertial force level influences arm movement control. Journal of Neurophysiology, 69(2), 504-511.

      The authors attempt to differentiate their study from previous studies where limb neuromechanical impedance was shown to be modified in weightlessness by emphasizing that in the current study the movements were rapid and the initial movement is "feedforward". But this incorrectly implies that the limb's mechanical response to the motor command is determined only by active feedback mechanisms. In fact:

      (a) All commands to the muscle pass through the motor neurons. These neurons receive descending activations related not only to the volitional movement, but also to the dynamic state of the body and the influence of other sensory inputs, including the vestibular system. A decrease in descending influences from the vestibular organs will lower the background sensitivity to all other neural influences on the motor neuron. Thus, the motor neuron may be less sensitive to the other volitional and reflexive synaptic inputs that it may receive.

      (b) Muscle tone plays a significant role in determining the force and the time course of the muscle contraction. In a weightless environment, where tonic muscle activity is likely to be reduced, there is the distinct possibility that muscles will react more slowly and with lower amplitude to an otherwise equivalent descending motor command, particularly in the initial moments before spinal reflexes come into play. These, and other neuronal mechanisms could lead to the "under-actuation" effect observed in the current study, without necessarily being reflective of an underestimation of mass per se.

      (2) The subject's body in weightlessness is much more sensitive to reaction forces in interactions with the environment in the absence of the anchoring effect of gravity pushing the body into the floor and in the absence of anticipatory postural adjustments that typically accompany upper-limb motions in Earth gravity in order to maintain an upright posture. The authors dismiss this possibility because the taikonauts were asked to stabilize their bodies with the contralateral hand. But the authors present no evidence that this was sufficient to maintain the shoulder and trunk at a strictly constant position, as is supposed by the simplified biomechanical model used in their optimal control framework. Indeed, a small backward motion of the shoulder would result in a smaller acceleration of the fingertip and a smaller extent of the initial ballistic motion of the hand with respect to the measurement device (the tablet), consistent with the observations reported in the study. Note that stability of the base might explain why 45° movements were apparently less affected in weightlessness, according to many of the reported analyses, including those related to corrective movements (Fig. 5 B, C, F; Fig. 6D), than the other two directions. If the trunk is being stabilized by the left arm, the same reaction forces on the trunk due to the acceleration of the hand will result in less effective torque on the trunk, given that the reaction forces act with a much smaller moment arm with respect to the left shoulder (the hand movement axis passes approximately through the left shoulder for the 45° target) compared to either the forward or rightward motions of the hand.

      (3) The above is exacerbated by potential changes in the frictional forces between the fingertip and the tablet. The movements were measured by having the subjects slide their finger on the surface of a touch screen. In weightlessness, the implications of this contact can be expected to be quite different than on the ground. While these forces may be low on Earth, the fact is that we do not know what forces the taikonauts used on orbit. In weightlessness, the taikonauts would need to actively press downward to maintain contact with the screen, while on Earth gravity will do the work. The tangential forces that resist movement due to friction might therefore be different in 0g. Indeed, given the increased instability of the body and the increased uncertainty of movement direction of the hand, taikonauts may have been induced to apply greater forces against the tablet in order to maintain contact in weightlessness, which would in turn slow the motion of the finger on the tablet and increase the reaction forces acting on the trunk. This could be particularly relevant given that the effect of friction would interact with the limb in a direction-dependent fashion, given the anisotropy of the equivalent mass at the fingertip evoked by the authors.
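
      The concern in point (1) can be illustrated with a toy second-order system. This is a minimal sketch under stated assumptions: the normalized equation, the parameter values, and the function name are hypothetical and are not the authors' Eq. 13. Mass underestimation is represented as a scaled-down command, and altered limb impedance as a lower natural frequency with the damping ratio held fixed.

```python
def displacement_at(t_end, command, omega, zeta, dt=1e-4):
    """Displacement at t_end for a step command applied to
    x'' + 2*zeta*omega*x' + omega**2 * x = omega**2 * command,
    starting from rest, integrated with semi-implicit Euler."""
    x, v = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        a = omega ** 2 * (command - x) - 2.0 * zeta * omega * v
        v += a * dt  # update velocity first
        x += v * dt  # then position, using the new velocity
    return x


T_EARLY = 0.15  # seconds; an early point in the movement (assumed)

# Nominal command and nominal limb dynamics.
baseline = displacement_at(T_EARLY, command=1.0, omega=10.0, zeta=0.7)

# Underestimated mass: the feedforward command is scaled down (assumed 30%).
scaled_command = displacement_at(T_EARLY, command=0.7, omega=10.0, zeta=0.7)

# Correct command, but a limb with lower bandwidth (assumed lower omega).
slower_limb = displacement_at(T_EARLY, command=1.0, omega=7.0, zeta=0.7)

print(baseline, scaled_command, slower_limb)
```

      In this toy setting both manipulations yield a smaller early displacement than the baseline, which is why early kinematics alone may not separate an underestimated mass from a change in the limb's natural frequency.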

      I feel that the authors have done an admirable job of exploring how to explain the modifications to movement kinematics that they observed on orbit within the constraints of the optimal control theory applied to a simplified model of the human motor system. While I fully appreciate the value of such models to provide insights into questions of human sensorimotor behaviour, to draw firm conclusions on what humans are actually experiencing based only on manipulations of the computational model, without testing the model's implicit assumptions and without considering the actual neurophysiological and biomechanical mechanisms, can be misleading. One way to do this could be to examine these questions through extensions to the model used in the simulations (changing activation dynamics of the torque generators, allowing for potential backward motion of the shoulder and trunk, etc.). A better solution would be to emulate the physiological and biomechanical conditions on Earth (supporting the arm against gravity to reduce muscle tone, placing the subject on a moveable base that requires that the body be stabilized with the other hand) in order to distinguish the hypothesis of an underestimation of mass vs. other potential sources of under-actuation and other potential effects of weightlessness on the body.

      In sum, my opinion is that the authors are relying too much on a theoretical model as a ground truth and thus overstate their conclusions. But to provide a convincing argument that humans truly underestimate mass in weightlessness, they should consider more judiciously the neurophysiology and biomechanics that fall outside the purview of the simplified model that they have chosen. If a more thorough assessment of this nature is not possible, then I would argue that a more measured conclusion of the paper should be 1) that the authors observed modifications to movement kinematics in weightlessness consistent with an under-actuation for the intended motion, 2) that a simplified model of human physiology and biomechanics that incorporates principles of optimal control suggests that the source of this under-actuation might be an underestimation of mass in the computation of an appropriate feedforward motor command, and 3) that other potential neurophysiological or biomechanical effects cannot be excluded due to limitations of the computational model.

    1. Reviewer #1 (Public review):

      Summary:

      The objective of this study was to infer the population dynamics (rates of differentiation, division and loss) and lineage relationships of NK cell subsets during an acute immune response and under homeostatic conditions.

      Strengths:

      A rich dataset and a detailed analysis of a particular class of stochastic models.

      Weaknesses: (relating to initial submission)

      The stochastic models used are quite simple; each population is considered homogeneous with first-order rates of division, death, and differentiation. In Markov process models such as these there is no dependence of cellular behavior on its history of divisions. In recent years models of clonal expansion and diversification, in the settings of T and B cells, have progressed beyond this picture. So I was a little surprised that there was no mention of the literature exploring the role of replicative history in differentiation (e.g. Bresser Nat Imm 2022), nor of the notion of family 'division destinies' (either in division number, or the time spent proliferating, as described by the Cyton and Cyton2 models developed by Hodgkin and collaborators; e.g. Heinzel Nat Imm 2017). The emerging view is that variability in clone (family) size may arise predominantly from the signals delivered at activation, which dictate each precursor's subsequent degree of expansion, rather than from the fluctuations deriving from division and death modeled as Poisson processes.

      As you pointed out, the Gerlach and Buchholz Science papers showed evidence for highly skewed distributions of family sizes, and correlations between family size and phenotypic composition. Is it possible that your observed correlations could arise if the propensity for immature CD27+ cells to differentiate into mature CD27- cells increases with division number? The relative frequency of the two populations would then also be impacted by differences in the division rates of each subset - one would need to explore this. But depending on the dependence of the differentiation rate on division number, there may be parameter regimes (and timepoints) at which the more differentiated cells can predominate within large clones even if they divide more slowly than their immature precursors. One might not then be able to rule out the two-state model. I would like to see a discussion or rebuttal of these issues.

      Comments on revised version.

      I am happy with the latest revisions that the authors have made.

    1. Reviewer #1 (Public review):

      Summary:

      Kashiwagi et al. undertook a population analysis of dendritic spine nanostructure applied to the objective grouping of 8 mouse models of neuropsychiatric disorders. They report that spine morphology in cultured hippocampal neurons shows a higher similarity among schizophrenia mouse models (compared with autism spectrum disorder (ASD) mouse models) and identify an effect of Ecrg4 (encoding small secretory peptides) on spine dynamics and shape in these models.

      Strengths:

      The study developed a method for objectively comparing spine properties in primary hippocampal neuron cultures from 8 mouse models of psychiatric disorders at the population level using high-resolution structured illumination microscopy (SIM) imaging. This novel technique identified two distinct groups of mouse models according to the population-level spine properties: those with ASD-related gene mutations and those with schizophrenia-related gene mutations. Functional studies, including gene knockdown and overexpression experiments, identified an effect of Ecrg4 on the spine phenotype of the schizophrenia model mice.

      Weaknesses:

      The main weakness is that the study is wholly in vitro, using cultured hippocampal neurons. The authors present this as an advantage, however, arguing that spine morphology as measured in a reduced culture system can demonstrate direct effects of gene mutations on neuronal phenotypes in the absence of indirect influences from nonneuronal cells or specific environments.

    2. Reviewer #2 (Public review):

      Okabe and colleagues build on a super-resolution-based technique they have previously developed in cultured hippocampal neurons, improving the pipeline and using it to analyze spine nanostructure differences across 8 different mouse lines with mutations in autism or schizophrenia (Sz) risk genes/pathways. It is a worthy goal to try to use multiple models to examine potential convergent (or not) phenotypes, and the authors have made a good selection of models. They identify some key differences between the autism versus the Sz risk gene models, primarily that dendritic spines are smaller in Sz models and (mostly) larger in autism risk gene models. They then focus on three models (2 Sz - 22q11.2 deletion, Setd1a; 1 ASD - Nlgn3) for timelapse imaging of spine dynamics, and together with computational modelling provide a mechanistic rationale for the smaller spines in Sz risk models. Bulk RNA sequencing of all 8 model cultures identifies several differentially expressed genes which they go on to test in cultures, finding that Ecrg4 is upregulated in several Sz models and its misexpression recapitulates spine dynamics changes seen in the Sz mutants, while knockdown rescues spine dynamics changes in the Sz mutants. Overall, these have the potential to be very interesting findings and useful for the field. My major concerns from the initial manuscript, especially regarding cherry picking and circularity, have been addressed with revised analytical approaches. I have some remaining minor comments.

      (1) The comparison between two wild-type samples versus wild-type-mutant samples is helpful - I think this could be added to the manuscript.

      (2) For results of timelapse imaging - please spell out in the results section the direction of change (lines 270 - 277).

      (3) Using linear mixed effect models for statistical analysis is a significant improvement. While a sample size (n) of mice = 3 is not ideal, I think given the multiple different mouse lines used and intensity of analysis, this is probably the best that can be done, although further validation in larger samples eventually is to be hoped for.

      (4) The revised text is much improved, but I still think the authors should be upfront somewhere in the text that the schizophrenia-associated genes can only confer biased risk for schizophrenia (and that the clinical phenotype can also include autism). As I said before, I think this is the best we can do and I agree with their choices, but it is important not to overstate the link. The differences they see make it clear that these are still relevant distinctions.

    1. Reviewer #1 (Public review):

      Summary:

      Roseby and colleagues report on a body region-specific sensory control of the fly larval righting response, a body contortion performed by fly larvae to correct their posture from an inverted (dorsal side down) position. This is an important topic because of the general need for animals to locomote in the correct orientation and the clever and broadly useful methodologies used in this paper to uncover the sensory triggers for the behavior, including a body region-specific optogenetic approach along different axial positions of the larva, region-specific manipulation of surface contacts with the substrate, and a 'water unlocking' technique to initiate righting behaviors, all strengths of the manuscript. The authors found that multidendritic neurons, particularly the daIV neurons, are necessary for righting behavior. The contribution of daIV neurons had been shown by the authors in a prior paper (Klann et al, 2021), but that study had used constitutive neuronal silencing. Here the authors used acute inactivation to confirm this finding. Additionally, the authors describe an important role for anterior sensory neurons. They move on to test the genetic basis for righting behavior and, consistent with the regional specificity they observe, implicate sensory neuron expression of Hox genes Antennapedia and Abdominal-B in self-righting.

      Strengths:

      Strengths of this paper include the important question addressed and the elegant and innovative combination of methods, which led to clear insights into the sensory biology of self-righting and links between body plan and nervous system function that will be useful for others in the field. The manuscript is very clearly written and couched in interesting biology.

      Limitations:

      There are several important questions for future study that, left unresolved, do not diminish the significance of this manuscript. These include the cellular and developmental basis for Hox gene action, the contributions of dorsal and ventral regions of the animal in righting, and the regional contributions of other sensory cell types in the righting response.

      Comments on revised version.

      The authors have addressed my major concerns.

    2. Reviewer #2 (Public review):

      Summary

      This work explores the relationship between body structure and behavior by studying self-righting in Drosophila larvae, a conserved behavior that restores proper orientation when turned upside-down. The authors first introduce a novel "water unlocking" approach to induce self-righting behavior in a controlled manner. Then, they develop a method for region-specific inhibition of sensory neurons revealing that anterior, but not posterior, sensory neurons are essential for proper self-righting. Deep-learning-based behavioral analysis shows that anterior inhibition prolongs self-righting by shifting head movement patterns, indicating a behavioral switch rather than a mere delay. Additional genetic and molecular experiments demonstrate that specific Hox genes are necessary in sensory neurons, underscoring how developmental patterning genes shape region-specific sensory mechanisms that enable adaptive motor behaviors.

      Strengths

      The work by Roseby et al. is notable for its elegant experimental design, the development of innovative methods that are likely to benefit the fly behavior community, and the strong experimental support for its conclusions. The manuscript is clearly written, well structured, and presents thoughtfully designed experiments that have been further improved in the revised version. This updated manuscript includes a comprehensive set of behavioral experiments using an additional Gal4 line (ppk-Gal4), which yields confirmatory results and strengthens support for the original hypothesis. It also incorporates quantification of Gal4 line strength, improvements to existing figures, the addition of new figures, and overall refinement of the text.

      Weakness:

      A remaining limitation of this manuscript is the lack of a cellular and mechanistic analysis explaining how Hox genes give rise to the observed behavioral phenotypes. The authors note that this question is being addressed in an ongoing follow-up study, which will expand the project to examine the roles of all Hox genes across the sensory system and to characterize their expression patterns within each of its subcomponents, with the aim of providing mechanistic insight. I look forward to seeing this work in a future manuscript.

      Comments on revised version.

      I have no further recommendations for the authors; most of my comments and questions have been satisfactorily addressed.

    1. Reviewer #2 (Public review):

      Summary:

      The authors sought to characterize the somatic mutation landscape and gene expression profiles of Kenyan breast cancer patients. By comparing Whole Exome Sequencing (WES) and RNA-seq data from 23 paired tumor-normal samples against The Cancer Genome Atlas (TCGA) cohorts, the study specifically aimed to highlight the role of the ZNF gene family.

      Strengths:

      The study addresses a critical gap in genomic research by focusing on an underrepresented African population, which is essential for achieving global health equity in oncology.

      Weaknesses:

      The cohort is relatively small for definitive landscape characterization. The study fails to explore the mechanistic link between identified somatic mutations and observed aberrant gene expression.

      Impact and Utility:

      The impact of this work is currently limited. While the data adds to the growing repository of African genomic samples, the lack of novelty and mechanistic insight reduces its utility for the broader scientific community. To be clinically valuable, the study would need to offer more robust, unbiased profiling that could eventually inform population-specific diagnostics or therapies.

      Additional Context:

      Breast cancer in African populations often presents with different clinical trajectories compared to Western cohorts. While any data from these regions is vital, "landscape" studies require high statistical power and unbiased analysis to differentiate true population-specific drivers from noise or small-sample variance. Without a clear regulatory mechanism linking mutations to phenotypes, the findings remain preliminary observations.

    2. Reviewer #3 (Public review):

      Summary:

      This revised study analyzes the somatic mutational profiles and transcriptomic expression of three zinc-finger genes (ZNF217, ZNF703, ZNF750) in 23 Kenyan women with breast cancer, using whole-exome sequencing and RNA-sequencing of paired tumor-normal tissues. A total of 358 somatic mutations were detected, and all three genes were significantly upregulated in tumors compared to normal tissues (ZNF217 showing the most prominent difference). The findings provide preliminary evidence for the identification of diagnostic/prognostic biomarkers or therapeutic targets in sub-Saharan African populations.

      Strengths:

      The study's key strengths lie in its focus on an underrepresented Kenyan cohort, addressing a critical gap in sub-Saharan African breast cancer genomic research. It integrates DNA-level mutation analysis with RNA-level expression data, leveraging standardized bioinformatics pipelines and rigorous quality control to deliver detailed insights into mutation types, functional impacts, and amino acid changes.

      Comments on revised version:

      After careful revision by the authors, the manuscript has become more rigorous. The limitations including small sample size and lack of functional validation are properly acknowledged, and conclusions are prudently presented as hypothesis‑generating rather than causal claims. Meanwhile, strengthened multi‑omics analyses, TCGA validation, logical reorganization of results and improved figure presentation further enhance the reliability of this work.

    1. Reviewer #1 (Public review):

      Summary:

      This important study performs a theoretical analysis of the evolutionary dynamics of strains under a classical resource competition model to understand how clonal interference and diversification of resource preferences interact to shape microbial population genetic structure. The authors find that in large asexual populations evolving in relevant parameter regimes, where evolutionary and ecological time scales overlap, populations are characterized by a small number of ecotypes, which are groups of strains that share a given resource preference, whose dynamics in the long run are dominated by priority effects.

      Strengths:

      The manuscript constitutes a novel and sound contribution to theory in ecology and evolution, under relevant parameter regimes which have been previously overlooked due to the complexities they bring, i.e. when the weak mutation regime breaks down. Here, the authors make a considerable step forward by taking advantage of analytical advances in the population genetics theory of clonal interference in recent years (a traveling fitness wave moving at a constant average speed v), which they apply to resource competition models typically studied in ecology.

      The main insights in the derivations shown in the supplementary text are clearly summarized in Figure 2 of the main manuscript, where the different phases of the somewhat counterintuitive dynamics of the strategic mutations in the model are quantified.

      Weaknesses:

      Despite its many merits, I believe the manuscript can profit from a few clarifications as I point out below:

      (1) I think the authors should make explicit in the abstract of the paper that they study a stairway-to-heaven fitness landscape and that the rate of beneficial mutations does not slow down.

      (2) Evolution is elegantly incorporated in the resource consumption model by assuming two classes of mutations: strategic mutations and constitutively beneficial mutations. I believe that the biological meaning of these different types should be better explained. Specifically, on pages 3 and 4, the authors state that strategy mutations "alter resource uptake strategy and potentially its overall magnitude as well", whereas the other type is "only tangentially related to resource consumption (e.g. eliminating a pathway that is not necessary in the current environment)." I find this a bit strange since this is a model of resource competition, and I would assume that the latter type of mutations would be neutral. Maybe I am not reading this well, and the meaning of the mutations, as well as their assumed rates, could be clarified with some examples as the authors state that these mutations are routinely observed in microbial evolution experiments.

      (3) The authors discuss the theoretical results obtained in the light of the famous Lenski experiment, where ecotype formation is observed in some populations. However, in the mentioned example, cross-feeding was the mechanism involved. Since in their model, unlike in other models, cross-feeding is not considered, I found this example to be misplaced. In addition, in the Lenski experiment, a single (and essential) resource is present in the environment, so the assumptions of the model do not appear to apply. On the other hand, in Herron and Doebeli's experiments, two resources (substitutable) were present, so a comparison with their experimental results would be more appropriate.

      (4) The paper should also discuss deleterious mutations, which I did not see mentioned anywhere.

    2. Reviewer #2 (Public review):

      Summary:

      In "Ecological diversification in rapidly evolving populations", the authors use a consumer-resource model with competition for 2 different resources to study diversification for cases in which ecology and evolution are separated (weak-mutation limit) and when they overlap. They find the potential for the timing of a mutation (and not just its associated fitness) to confer an advantage against fitter strains (which they call "priority effect"), and the aggregation of dominant trait values that lead to the definition of "ecotypes" that discretize and structure the community.

      Strengths:

      The authors introduce detailed analytical calculations in the limit of overlapping ecology and evolution, which is a case that typically eludes analysis. The work also pays particular attention to the timing of "invasion" by a mutation, whereas most approaches focus on the long-term outcome of evolution (e.g. fixation of a trait value).

      Weaknesses:

      The model makes important assumptions that limit its generality considerably. In particular, the two "evolving traits" defined in the model are very specific and by no means the simplest possible resource competition evolutionary model that the authors claim it to be. The manuscript is not clear enough to be reproducible, and the authors do not discuss in sufficient depth the huge amount of work that is presented in the manuscript. The bibliography omits important work focused on diversification emerging from eco-evolutionary interactions similar to the ones studied in the manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Balasubramaniam and colleagues continue this group's efforts to understand mitochondrial-derived compartments (MDCs) that bud off from yeast mitochondria in response to metabolic stress. In a previous genetic screen, they identified Ups lipid transfer proteins and the AAA-protease Yme1 as components that modulate MDC formation. In this study, the authors link these observations by showing that Yme1 modulates levels of Ups1, Ups2, as well as MICOS complex members in the mitochondrial proteome. Using genetic approaches, they then show that Yme1's role on MDCs is dependent on its catalytic activity (via an inactive mutant) and that YME1 shows genetic interactions with UPS1/2 and MIC10/MIC60. The overall model is that Yme1 activity responds to metabolic cues and acts via proteolysis of these two distinct mitochondrial machineries to regulate MDC biogenesis.

      Strengths:

      The strengths of the study are its integration of mitochondrial proteomics with strong genetic approaches, as well as synergy with the authors' previous studies on the role of lipids in MDC biogenesis. The work is overall well carried out and experiments are thoughtfully discussed.

      Weaknesses:

      The major weaknesses are a lack of mechanistic resolution surrounding the model, e.g., proposed or tested mechanisms by which Yme1 activity is regulated by metabolic cues, or how Ups1/2 activity and the MICOS contribute to MDC generation. The authors acknowledge these as open questions, but addressing them would still enhance the significance of the study.

    2. Reviewer #2 (Public review):

      In this manuscript, the authors report a novel regulation of the outer mitochondrial membrane remodeling domains called mitochondria-derived compartments, MDCs. The team has previously established the main principles behind this recently identified quality control pathway, but the mechanisms that control MDC formation remain incompletely understood. Using the baker's yeast model, the authors identify the conserved mitochondrial protease Yme1 as a crucial factor that regulates MDC formation. Mechanistically, Yme1's proteolytic function controls the levels of Ups1 and Ups2 lipid transfer proteins and the components of the membrane organizing complex called MICOS, thus providing a plausible model as to how Yme1-dependent proteolysis permits MDC formation through the removal of lipid and MICOS-dependent constraints. Finally, the authors show that this Yme1-mediated activity is also defined by metabolic conditions. In principle, this study is interesting and novel, and holds potential to provide new insights into the regulation of the MDC pathway that emerged as a new fundamental mitochondrial quality control mechanism. However, the following points should be carefully addressed.

      Major points:

      (1) Yme1 has been previously shown to regulate mitochondria-specific autophagy through Atg32 processing. Given the high similarity of the MDC pathway to piecemeal autophagy and the fact that both pathways share some of the core components, the authors should address the involvement of Atg32 in their model. It would also be important to include a brief discussion addressing the differences between piecemeal autophagy and the MDC pathway.

      (2) The Rpt3 (P215L) expression experiment is interesting, but appears to be somewhat superficial due to the unclear mechanism by which the mitochondrial network morphology is restored in these cells. Could this result be replicated in the dnm1∆ mgm1∆ double deletion mutant, which is a well-established model for mitochondrial network restoration?

      (3) Figure 3E. The changes in PE levels appear to be minor. While statistically significant, the observed differences may not be physiologically relevant. More in-depth lipidomic analysis data should be presented to substantiate the authors' argument and better address the questions at hand. Related to that, could PE or PA supplementation stimulate MDC formation?

      (4) The connection between rapamycin treatment and Yme1-regulated MDC formation is unclear and puzzling and needs to be explained better.

      (5) The MICOS complex is clearly involved in the regulation of MDC, but the manuscript misses the mark on providing compelling evidence and a clear explanation as to how MICOS contributes to said regulation.

      Minor points:

      (1) The authors should discuss potential reasons for the dramatically different rates of MDC formation in the S288C and W303 background cells. Does this have anything to do with generally more robust mitochondrial functions in the latter cells?

      (2) Proper statistical analyses should be provided for all the graphs presented.

      (3) The authors should include Yme1 immunoblots to confirm the identity of strains being studied and validate the presence or overexpression of Yme1 and its catalytic mutant in their experiments.

    3. Reviewer #3 (Public review):

      Summary:

      Since describing MDCs over a decade ago, the lab of the corresponding author, Hughes, has been at the forefront of further characterizing these structures. Here, they follow up on recent work (PMID: 38497895), where a screen identified Yme1 as a potential regulator of MDCs. After confirming that Yme1-ko prevents MDCs that are usually induced via various established treatments (Rapamycin, cycloheximide, Concanamycin A), the authors confirmed that the proteolytic activity of Yme1 is required. Next, using proteomics, they identified how loss of Yme1 impacts the mitochondrial proteome with and without Rapamycin treatment to induce MDCs. From this result and based on insight from other published data implicating lipids, they focused initially on the lipid transfer protein Ups2, a known target of Yme1. Here, they showed that loss of Ups2 could partially rescue MDC formation in Yme1-ko cells. To look for other Yme1 targets that might also be involved in MDC formation, next, they investigated the MICOS complex, which was also notable in their proteomics data. They then showed that inhibiting MICOS also partially restored MDC formation in Yme1-ko cells. They then tested the combined effects of Ups2 and MICOS inhibition on MDCs, which was limited by the fact that the combination of full MICOS disruption, Ups2-KO, and Yme1-KO was not viable. To circumvent this limitation, they investigated the knockout of individual MICOS subunits in combination with Ups2 and/or Yme1. Finally, they showed that growth conditions also mediate MDC formation in the context of Yme1 overexpression. In rich media, Yme1 overexpression induces MDCs on its own. However, this induction is lost upon amino acid starvation, suggesting that there are still other as-yet-unidentified factors regulating the formation of MDCs.

      Strengths:

      The authors use unbiased approaches and genetic models to begin unraveling a novel regulatory role of Yme1 in the formation of MDCs.

      Weaknesses:

      (1) The authors find both Ups1 and Ups2 in their screens, but only focus on Ups2 in this paper. It would be good to know why they did not also investigate Ups1, and its other protease Atp23, which could potentially act similarly to Yme1, or even rescue the loss of Yme1.

      (2) I'm not convinced that the data support the notion that Ups2 and MICOS have distinct effects on MDCs. In Figure S3C-D, there is no statistical analysis to indicate whether the small differences between the MICOS-ko and the double knockout are significant. If MICOS-ko and Ups2-ko were acting through different mechanisms, one would expect their combination to be additive; this does not appear to be the case, as both single deletions and the double deletion all cause similar levels of MDCs (~30-40%). Rather, this result is what you would expect if they were working through the same mechanism. There also does not appear to be an additive effect in Figure 4F-G, when using the mic60-ko rather than the complete MICOS-ko. In this regard, the authors note in their discussion that 'loss of MICOS may disrupt membrane associations or alter lipid distribution between mitochondrial subcompartments' (lines 390-392). The latter situation seems like it would be the same mechanism as Ups2 and would more accurately explain their findings.

      (3) The manuscript is missing key data confirming the re-expression or overexpression of Yme1 protein (Figure 1 E/G and Figure 5A). It is important to know the relative levels of expression of the re-expressed proteins to each other and to endogenous Yme1.

      (4) Some clarification of the details for metabolically restrictive conditions would be helpful.

      (5) Beyond just the presence/absence of MDCs, does more detailed quantification of their size/shape reveal any subtle differences between conditions?

    1. Reviewer #1 (Public review):

      Summary:

      Combining in vitro refolding, SEC-based assembly assays, peptide-library screening, MALDI-TOF, LC-MS/MS, structural analysis and immunopeptidomics, this manuscript investigates the peptide-binding principles of the promiscuous chicken MHC-I molecule BF2*21:01.

      Strengths:

      Although the peptide motif of BF2*21:01 is highly complex, this manuscript identified several principles, including a preference for 10-mer peptides, co-variation between P2 and Pc-2, effects of P3 and Pc-3, and a strong cellular preference for Leu at Pc. The results are important for avian MHC biology and poultry vaccine epitope prediction.

      Weaknesses:

      The manuscript is sometimes difficult to follow because the authors present a large amount of peptide-library, structural and immunopeptidomics data without always clearly explaining how these datasets support the proposed simplifying principles.

      Major Issues - Points Requiring Clarification or Additional Support:

      (1) (Lines 282-301, 537-545) The immunopeptidomics conclusions are mainly based on one B21 cell line with one biological replicate and at least two technical replicates. Given the complexity of the BF2*21:01 peptide repertoire, this is a major limitation. The authors should either provide additional biological replicates or clearly state this limitation in the Abstract, Results and Discussion.

      (2) (Lines 290-313) The B21 cell preparations contain both BF2 and the lowly expressed BF1 molecule. Some peptides, especially 8-mers or peptides with atypical motifs, may derive from BF1*21:01. The authors should clarify how BF2*21:01-bound peptides were distinguished from possible BF1-derived peptides, or interpret the immunopeptidomics motif more cautiously. The authors should also provide or cite evidence confirming the B21 haplotype identity of the cell line and chicken materials used for immunopeptidomics.

      (3) (Lines 217-221, 243-253) The authors acknowledge that MALDI-TOF cannot reliably distinguish peptide combinations with identical or similar masses, nor determine residue positions in some cases. Therefore, MALDI-TOF results should not be overinterpreted as precise evidence for residue preference. The authors should clearly indicate which conclusions are supported by LC-MS/MS.

      (4) (Lines 297-301, 316-330) The authors suggest that longer peptides may bulge in the middle or extend out of the groove at the C-terminal end. The rationale for the C-terminal extension is not clearly explained. Why is the C-terminal extension considered rather than the N-terminal extension? If the binding register is uncertain, long peptides should be analyzed separately from canonical-length peptides.

      (5) (Lines 406-439) In vitro assembly assays show that several hydrophobic residues can be tolerated at Pc, whereas immunopeptidomics shows a strong Leu preference at this position. The authors should clarify whether this Leu preference reflects intrinsic BF2*21:01 binding specificity, TAP-mediated peptide transport, antigen processing, peptide loading, or a cell-line-specific effect. Additional experimental support, such as TAP transport analysis, would strengthen this conclusion.

      (6) (Lines 172-178, 243-279, 442-457) The structural analysis explains some residue combinations, such as Arg at P2 with Glu at Pc-2 or Trp at Pc. However, the structural interpretation is not fully integrated with the large-scale peptide library and immunopeptidomics results. Representative high- and low-frequency combinations should be discussed structurally.

      (7) The inference of co-variation between P2 and Pc-2, as well as the modulatory effects of P3 and Pc-3, should be better explained. At present, some conclusions appear to be based mainly on residue-frequency patterns, and the logical connection between these observations and the proposed binding principles is not always clear. Statistical analyses, such as mutual information, chi-square tests or permutation tests, and representative structural explanations would strengthen this conclusion.
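
      As a rough illustration of the kind of test suggested in point (7), the sketch below estimates the mutual information between the residues at P2 and Pc-2 and compares it with a null distribution obtained by shuffling Pc-2 across peptides. The peptide list, function names, and number of permutations are all hypothetical and are not taken from the manuscript; the sketch also assumes that P2 is the second residue and Pc-2 the third from the C-terminus, which would not hold if the binding register is uncertain.

```python
import math
import random
from collections import Counter


def mutual_information(pairs):
    """Mutual information (in bits) between the two residues of each pair."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) simplifies to count * n / (left[a] * right[b])
        mi += p_ab * math.log2(count * n / (left[a] * right[b]))
    return mi


def permutation_p_value(peptides, n_perm=2000, seed=0):
    """Permutation test for P2 / Pc-2 co-variation (hypothetical helper)."""
    rng = random.Random(seed)
    p2 = [p[1] for p in peptides]     # second residue from the N-terminus
    pc2 = [p[-3] for p in peptides]   # third residue from the C-terminus
    observed = mutual_information(list(zip(p2, pc2)))
    shuffled = pc2[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)         # breaks any P2 / Pc-2 association
        if mutual_information(list(zip(p2, shuffled))) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero
    return observed, (hits + 1) / (n_perm + 1)


# Invented toy peptides with perfect P2 / Pc-2 pairing (R with E, H with D)
toy_peptides = ["GRAAAAAEAL"] * 20 + ["GHAAAAADAL"] * 20
observed_mi, p_value = permutation_p_value(toy_peptides)
print(f"MI = {observed_mi:.3f} bits, permutation p = {p_value:.4f}")
```

      The same shuffling scheme could in principle be applied to P3 and Pc-3, or restricted to peptides of a single length so that the anchor positions are comparable.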

    2. Reviewer #2 (Public review):

      Summary:

      The study presents an in-depth analysis of the peptide repertoire bound by a promiscuous chicken MHC molecule using mass spectrometry, x-ray crystallography and modelling. While the MHC can bind a very diverse set of peptides, the authors have found some new rules that govern peptide binding to this MHC that could help to build a predictive model to study the repertoire of pathogen-derived peptides.

      Strengths:

      The study uses a range of well-performed experiments across multiple techniques and provides an in-depth analysis of the peptide repertoire, including peptide sequences, length, preferred residues, stability and MHC presentation.

      Weaknesses:

      The data overall support the analysis and conclusion well. The only caveat is linked to Figure 4, which does not describe the stability of the peptide-MHC complex, but instead shows refold yield, and the two are not always linked.

    1. Reviewer #1 (Public review):

      Summary:

      The "multiple-demand" (MD) system is a well-known finding of human brain imaging and is thought to play a central role in cognitive control. To directly compare the MD system in humans and monkeys, Mione et al. used functional magnetic resonance imaging to measure whole-brain activation in a multi-step saccadic maze task. In humans, the authors found a distributed pattern of brain activity that closely matches the canonical MD network and extends to adjacent regions of the dorsal attention and other networks. While there was good correspondence between monkey and human data, differences were also notable in the lateral frontal cortex, the dorsal parietal cortex, and the sensorimotor cortex.

      Strengths:

      Though previous data hint at a corresponding network in the macaque, there has been no direct comparison to human data. This study provides a direct cross-species comparison with whole-brain data from fMRI, and the findings suggest an extended and strongly interconnected brain network recruited by increased cognitive challenge.

      Weaknesses:

      In previous human imaging, the MD system is defined by overlapping activation for many kinds of cognitive demands. In the present work, however, the authors used just a single task. Although there is some overlap between the putative monkey MD network and the canonical MD network identified in human imaging, there should be caution in linking current findings to the MD system based on limited task events.

    2. Reviewer #2 (Public review):

      Summary:

      Mione et al. aim to resolve a long-standing question in comparative neuroscience: whether the macaque brain contains a functional analogue to the distributed human multiple-demand (MD) network. To address this, the authors employ a direct cross-species fMRI comparison using a multi-step saccadic maze task in humans and a simplified two-step version in macaques. By contrasting goal-directed navigation against a control condition that requires similar motor responses but no strategic planning, the study isolates the neural signatures of cognitive control across species.

      Strengths:

      The most compelling aspect of this work is its methodological alignment. Previous attempts to compare these systems often relied on comparisons of human BOLD signals and macaque single-unit recordings. By running parallel fMRI protocols, the authors establish a shared measurement basis that allows for a more direct comparison. The resulting activation maps clearly demonstrate conserved network topology across dorsomedial frontal, lateral and medial parietal, and insular cortices. Combining these results with recent research on functional and structural connectivity further supports the idea that these networks evolved across species and provides a helpful starting point for future comparative studies. The findings will be highly useful for researchers investigating the evolutionary origins of domain-general cognitive control, as well as for neuroimaging methodologists developing cross-species alignment pipelines.

      Weaknesses:

      However, there are several differences in how the two groups were studied that make it harder to compare the results precisely. The human task mixed 2-, 4-, and 6-step trials within the same experimental blocks, whereas macaques performed only 2-step trials. This design difference likely places human participants in a state of sustained proactive cognitive control (Braver, 2012), as they must remain prepared for highly demanding trials at any moment. This elevated baseline arousal may artificially inflate MD network activation during the simpler 2-step trials in humans, making direct magnitude comparisons with the macaque data difficult. Additionally, the general linear model combined correct and error trials into a single regressor. Given that macaques exhibited substantially higher error rates, this approach risks diluting task-specific planning signals with activity related to error monitoring and reward prediction errors. The preprocessing pipeline also applied a 4 mm full-width half-maximum smoothing kernel to macaque data acquired at 1.5 mm resolution. Relative to the smaller size of the macaque brain, this kernel is quite large and likely blurs fine-grained topographical distinctions. This may partly explain why the macaque lateral frontal cortex shows a single dorsal activation patch rather than multiple discrete patches seen in humans. Furthermore, there is concerning inter-individual variability in the macaque data. Normally, a functional network like the MD system is identified by consistent activation across all individuals. In this study, however, the two monkeys show substantially different activation maps and behavioral patterns. This lack of consistency renders the group-level results questionable, as it is unclear whether the group-level map represents a unified biological system or merely an average of disparate individual maps. Finally, the subcortical activations shown in Figure 7 require more precise anatomical localization to confidently distinguish cerebellar nodes from adjacent brainstem structures.
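
      To put the smoothing concern in concrete terms, the short calculation below expresses the 4 mm kernel in units of the 1.5 mm macaque voxels, using the standard relation between the full-width half-maximum of a Gaussian and its standard deviation. Only the two numbers quoted above are used; the function name is a placeholder, and no human acquisition parameters are assumed.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian kernel's FWHM to its standard deviation."""
    # FWHM = 2 * sqrt(2 * ln 2) * sigma for a Gaussian
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


fwhm_mm = 4.0    # smoothing kernel quoted in the review
voxel_mm = 1.5   # macaque acquisition resolution quoted in the review

sigma_mm = fwhm_to_sigma(fwhm_mm)
print(f"FWHM in voxels: {fwhm_mm / voxel_mm:.2f}")
print(f"sigma: {sigma_mm:.2f} mm ({sigma_mm / voxel_mm:.2f} voxels)")
```

      On these numbers the kernel spans roughly two to three voxels at half maximum, which is the sense in which neighbouring activation patches could plausibly merge.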

      The authors demonstrate a broad functional correspondence between human and macaque cognitive control networks, moving the field beyond speculative homology. The data suggest that an extended, interconnected network is recruited by cognitive challenge in both species; however, the strength of this claim is limited by the inter-individual variability and methodological constraints noted above. Assertions of precise topological equivalence should therefore be tempered. The absence of ventrolateral prefrontal and strong dorsal parietal activations in the macaque group analysis may reflect genuine biological differences, but could also stem from limited statistical power, excessive smoothing, or task-design asymmetries. While the overall conclusions are plausible, they would be significantly strengthened by a more explicit discussion of these limitations and additional analytical clarifications regarding individual-level consistency.

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript entitled "Essential function reflected in the phylodynamics of a multigene family - the pir genes of malaria parasites" by Jackson and colleagues investigates the global phylogeny of pir genes across 14 Plasmodium species and one Hepatocystis species. The authors also focus on the functional characterization of the conserved ortholog pirC1 and claim that pirC1 is not the founder of the family and that it plays an essential role in blood-stage growth.

      Strengths:

      Overall, the manuscript is well written and interesting, as it combines comparative genomics and evolutionary analysis with functional experiments. The phylogenetic analysis is rigorous and represents a major strength of the manuscript.

      Weaknesses:

      The general conclusions regarding the potential function of this gene family are not fully supported by the data presented. The manuscript moves too quickly from growth phenotype and localization studies to a specific mechanistic model. The discussion argues that PIRC1 may be involved in nutrient acquisition, host sensing, or metabolic support, but the data provided do not directly support these functions, and the manuscript in its present form remains speculative. Although the manuscript includes some experimental results, it lacks direct mechanistic validation of the specific functions of the pir genes, including pirC1. In its current form, the study does not yet establish a definitive role for pirC1 in metabolic processes.