  1. Last 7 days
    1. Reviewer #1 (Public review):

      Summary:

      In the ecological interactions between wild plants and specialized herbivorous insects, structural innovation-based diversification of secondary metabolites often occurs. In this study, Agrawal et al. utilized two milkweed species (Asclepias curassavica and Asclepias incarnata) and the specialist Monarch butterfly (Danaus plexippus) as a model system to investigate the effects of two N,S-cardenolides - formed through structural diversification and innovation in A. curassavica - on the growth, feeding, and chemical sequestration of D. plexippus, compared to other conventional cardenolides. Additionally, the study examined how cardenolide diversification resulting from the formation of N,S-cardenolides influences the growth and sequestration of D. plexippus. On this basis, the research elucidates the ecophysiological impact of toxin diversity in wild plants on the detoxification and transport mechanisms of highly adapted herbivores.

      Strengths:

      The study is characterized by the use of milkweed plants and the specialist Monarch butterfly, which represent a well-established model in chemical ecology research. On one hand, these two organisms have undergone extensive co-evolutionary interactions; on the other hand, the butterfly has developed a remarkable capacity for toxin sequestration. The authors, building upon their substantial prior research in this field and earlier observations of structural evolutionary innovation in cardenolides in A. curassavica, proposed two novel ecological hypotheses. While experimentally validating these hypotheses, they introduced the intriguing concept of a "non-additive diversity effect" of trace plant secondary metabolites when mixed, contrasting with traditional synergistic perspectives, in their impact on herbivores.

      Weaknesses:

      The manuscript has two main weaknesses. First, as a study reliant on the control of compound concentrations, the authors did not provide sufficient or persuasive justification for their selection of the natural proportions (and concentrations) of cardenolides. The ratios of these compounds likely vary significantly across different environmental conditions, developmental stages, pre- and post-herbivory, and different plant tissues. The ecological relevance of the "natural proportions" emphasized by the authors remains questionable. Furthermore, the same compound may even exert different effects on herbivorous insects at different concentrations. The authors should address this issue in detail within the Introduction, Methods, or Discussion sections.

      Second, the study was conducted using leaf discs in an in vitro setting, which may not accurately reflect the responses of Monarch butterflies on living plants. This limitation undermines the foundation for the novel ecological theory proposed by the authors. If the observed phenomena could be validated using specifically engineered plant lines - such as those created through gene editing, knockdown, or overexpression of key enzymes involved in the synthesis of specific N,S-cardenolides - the findings would be substantially more compelling.

    2. Reviewer #2 (Public review):

      This study examined the effects of several cardenolides, including N,S-ring containing variants, on sequestration and performance metrics in monarch larvae. The authors confirm that some cardenolides, which are toxic to non-adapted herbivores, are sequestered by monarchs and enhance performance. Interestingly, N,S-ring-containing cardenolides did not have the same effects and were poorly sequestered, with minimal recovery in frass, suggesting an alternate detoxification or metabolic strategy. These N,S-containing compounds are also known to be less potent defences against non-adapted herbivores. The authors further report that mixtures of cardenolides reduce herbivore performance and sequestration compared to single compounds, highlighting the important role of phytochemical diversity in shaping plant-herbivore interactions.

      Overall, this study is clearly written, well-conducted and has the potential to make a valuable contribution to the field. However, I have one major concern regarding the interpretations of the mixture results. From what I understand of the methods, all tested mixtures contain all five compounds. As such, it is not possible to determine whether reduced performance and sequestration result from the complete mixture or from the presence of a single compound, such as voruscharin for performance and uscharin for sequestration. For instance, if all compounds except voruscharin (or uscharin) were combined, would the same pattern emerge? I suspect not, since the effects of the individual N,S-containing compounds alone are generally similar to those of the full mixture (Figure S3). By taking the average of all single compounds, the individual effects of the N,S-containing ones are being inflated by the non-N,S-containing ones (in the main text, Figure 4). In the mix, of course, they are not being 'diluted', as they are always present. This interpretation is further supported by the fact that in the equimolar mix, the relative proportion of voruscharin decreases (from 50% in the 'real mix'), and the target measurements of performance and sequestration tend to increase in the equimolar mix compared to the real mix.
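      The averaging concern raised above can be illustrated with a small numerical sketch. All scores below are hypothetical and chosen only to make the arithmetic visible; they are not data from the manuscript, and the three non-N,S compounds carry placeholder labels. The sketch compares the mixture against two baselines: the mean of all five single compounds, and the mean of only the N,S-containing compounds that are always present in the mix.

```python
# Hypothetical performance scores (arbitrary units); NOT data from the study.
single_compound = {
    "non_NS_a": 10.0,     # placeholder label for a non-N,S cardenolide
    "non_NS_b": 10.0,     # placeholder label
    "non_NS_c": 10.0,     # placeholder label
    "uscharin": 6.0,      # N,S-containing
    "voruscharin": 4.0,   # N,S-containing
}
mixture_observed = 5.0    # hypothetical score on the full five-compound mixture
ns_compounds = ("uscharin", "voruscharin")


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Baseline 1: average over all single-compound treatments.
baseline_all = mean(single_compound.values())

# Baseline 2: average over the N,S-containing compounds only.
baseline_ns = mean(v for k, v in single_compound.items() if k in ns_compounds)

# Difference between the mixture and each baseline.
apparent_effect = mixture_observed - baseline_all
effect_vs_ns = mixture_observed - baseline_ns

print(f"baseline (all singles): {baseline_all:.1f}")
print(f"baseline (N,S only):    {baseline_ns:.1f}")
print(f"mixture - all singles:  {apparent_effect:+.1f}")
print(f"mixture - N,S only:     {effect_vs_ns:+.1f}")
```

      In this toy case the mixture looks clearly worse than the all-compound average, yet it is indistinguishable from the N,S-only average, which is the pattern a leave-one-out mixture (omitting voruscharin or uscharin) would help to disentangle.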

      Despite this issue, the discussion of mixtures in the context of plant defence against both adapted and non-adapted herbivores is fascinating and convincing. The rationale that mixtures may serve as a chemical tool-kit that targets different sets of herbivores is compelling. The non-N,S cardenolides are effective against non-adapted herbivores and the N,S-containing cardenolides are effective against adapted herbivores. However, the current experiments focus exclusively on an adapted species. It would be especially interesting to test whether such mixtures reduce overall herbivory when both adapted and non-adapted species are present.

      It remains possible that mixtures, even in the absence of voruscharin or uscharin, genuinely reduce sequestration or performance; however, this would need to be tested directly to address the above-mentioned concern.

    1. Reviewer #1 (Public review):

      Bajohr and colleagues propose a transcription factor-driven approach to generating bona fide oligodendrocyte lineage cells (OLCs) from primary mouse astrocytes. Ectopic expression of Olig2, Sox10, or Nkx6.2 in isolated astrocytes produced a range of OLC-like cell states, with Sox10 emerging from lineage tracing and single-cell RNA sequencing experiments as the most successful transcription factor in driving direct lineage reprogramming. The authors strengthened their claims with an unbiased, deep learning perturbation model to predict genetic drivers of the astrocyte cluster to OLC cluster transition observed in their scRNA seq dataset. Here, Sox10 surfaced in the top ten correlated genes, and as the top transcription factor, mediating this fate shift. Altogether, this paper presents an interesting approach to generate OLCs, a cell type historically difficult to procure, from primary mouse astrocytes to study this lineage in development and disease and perhaps repopulate it in dysmyelinating conditions. While this certainly addresses a technical gap in the field, the authors defined iOLCs as ones with lineage-specific gene expression and morphological characteristics, without any functional analysis to assess the reprogrammed cells' capacity to myelinate. This comment and other critiques are discussed below.

      While Sox10 and Mbp expression in iOLCs, as confirmed by IHC, is a promising result suggesting that ectopic Sox10 instructs transduced cells to develop into cells of myelinating potential, functional confirmation is essential. As mentioned in the discussion, the absence of a substrate for myelination may have also contributed to the low DLR efficiency. Co-culturing Sox10 iOLCs with primary neurons and examining the cells' potential to engage and enwrap axons would greatly strengthen the authors' claim that this could be an effective therapeutic approach to myelin regeneration in vivo, or even a technical approach to studying myelin dynamics in vitro.

      In Figure 1B, it appears that Mbp expression in tdTomato+ cells decreases in Sox10 transduced iOLs during the observed time period. Can the authors elaborate on this result, given that MBP expression is crucial for myelination and should, if anything, increase with time?

      The authors acknowledge that there is a conversion of tdTomato- zsGreen+ cells with an astrocyte-like morphology to OLC cells expressing Mbp following Sox10 induction (Supplementary figure 5C,D). While they note the diversity of the astrocyte lineage in the discussion, further analysis should be applied to this subset of cells to confirm the subset of astrocyte or progenitor-like cell type that gives rise to their cell endpoint of interest (Sox10-driven Mbp+ iOLs).

      Finally, ectopic expression of Olig2 and Sox10 in primary astrocytes resulted in very different OLC subtypes, as evidenced by OLC marker expression seen in IHC and the subclustering of these cell types in scRNA seq. Although this diversity in OLC type and generation efficiency is consistent with previous reports showing that these two transcription factors vary in effect, might the authors further discuss this discrepancy given that the two transcription factors regulate one another (as mentioned in the introduction) and should theoretically give rise to more similar cells? Could it be due to the lower specificity of Olig2 in marking a pure OLC population relative to Sox10?

    1. Reviewer #1 (Public review):

      This study explores the connectivity patterns that could lead to fast and slow undulating swim patterns in larval zebrafish using a simplified theoretical framework. The authors show that a pattern of connectivity based only on inhibition is sufficient to produce realistic patterns with a single frequency. Two such networks coupled with inhibition but with distinct time constants can produce a range of frequencies. Adding excitatory connections further increases the range of obtainable frequencies, albeit at the expense of sudden transitions in the mid-frequency range.

      Strengths:

      (1) This is an eloquent approach to answering the question of how spinal locomotor circuits generate coordinated activity using a theoretical approach based on moving bump models of brain activity.

      (2) The models make specific predictions on patterns of connectivity while discounting the role of connectivity strength or neuronal intrinsic properties in shaping the pattern.

      (3) The models also propose that there is an important association between cell-type-specific intersegmental patterns and the recruitment of speed-selective subpopulations of interneurons.

      (4) Having a hierarchy of models creates a compelling argument for explaining rhythmicity at the network level. Each model builds on the last and reveals a new perspective on how network dynamics can control rhythmicity. I liked that each model can be used to probe questions in the next/previous model.

      Comments on revisions:

      I am very happy to see the simplified biophysical model supporting the original findings. The authors have done an excellent job addressing my comments.

      Just a small note, please change C. Elegans to C. elegans.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aimed to show that connectivity patterns within spinal circuits composed of specific excitatory and inhibitory connectivity and with varying degrees of modularity could achieve tail beats at various frequencies as well as proper left-right coordination and rostrocaudal propagation speeds.

      Strengths:

      The model is simple and the connectivity patterns explored are well supported by the literature.

      The conclusions are intuitive and support many experimental studies on zebrafish spinal circuits for swimming. The simulations provide strong support for the sufficiency of connectivity patterns to produce and control many hallmark features of swimming in zebrafish.

      Weaknesses:

      The authors have addressed my previous concerns well. I have no further concerns.

    3. Reviewer #3 (Public review):

      Summary:

      Central pattern generator (CPG) circuits underlie rhythmic motor behaviors. To date, it is thought that these CPG networks are rather local and multiple CPG circuits are serially connected to allow locomotion across the entire body. Distributed CPG networks that incorporate long-range connections have not been proposed although such connectivity has been experimentally shown for several different spinal populations. In this manuscript, the authors use this existing literature on long-range spinal interneuron connectivity to build a new computational model that reproduces basic features of locomotion like left-right alternation, rostrocaudal propagation and independent control of frequency and amplitude. Interestingly, the authors show that a model solely based on inhibitory neurons can recapitulate these basic locomotor features. Excitatory sources were then added that increased the dynamic range of frequencies generated. Finally, the authors were also able to reproduce experimentally observed consequences of cell-type-specific ablations showing that local and long range, cell-type-specific connectivity could be sufficient for generating locomotion.

      Strengths:

      This work is novel, providing an interesting alternative of distributed CPGs to the local networks traditionally predicted. It shows cell type-specific network connectivity is as important as, if not more important than, intrinsic cell properties for rhythmogenesis and that inhibition plays a crucial role in shaping locomotor features. Given the importance of local CPGs in understanding motor control, this alternative concept will be of broad interest to the larger motor control field including invertebrate and vertebrate species.

      Weaknesses:

      The main weaknesses were addressed in the revision.

    1. Reviewer #1 (Public review):

      The authors aim to predict ecological suitability for transmission of highly pathogenic avian influenza (HPAI) using ecological niche models. This class of models identifies correlations between the locations of species or disease detections and the environment. These correlations are then used to predict habitat suitability (in this work, ecological suitability for disease transmission) in locations where surveillance of the species or disease has not been conducted. The authors fit separate models for HPAI detections in wild birds and farmed birds, for two strains of HPAI (H5N1 and H5Nx) and for two time periods, pre- and post-2020. The authors also validate models fitted to disease occurrence data from pre-2020 using post-2020 occurrence data.

    2. Reviewer #2 (Public review):

      Summary:

      The geographic range of highly pathogenic avian influenza cases changed substantially around the period 2020, and there is much interest in understanding why. Since 2020 the pathogen irrupted in the Americas and the distribution in Asia changed dramatically. This study aimed to determine which spatial factors (environmental, agronomic and socio-economic) explain the change in numbers and locations of cases reported since 2020 (2020-2023). That's a causal question which they address by applying a correlative environmental niche modelling (ENM) approach to the avian influenza case data before (2015-2020) and after 2020 (2020-2023) and separately for confirmed cases in wild and domestic birds. To address their questions they compare the outputs of the respective models, and those of the first global model of the HPAI niche published by Dhingra et al 2016.

      ENM is a correlative approach useful for extrapolating understandings based on sparse geographically referenced observational data over un- or under-sampled areas with similar environmental characteristics in the form of a continuous map. In this case, because the selected covariates about land cover, use, population and environment are broadly available over the entire world, modelled associations between the response and those covariates can be projected (predicted) back to space in the form of a continuous map of the HPAI niche for the entire world.

      Strengths:

      The authors are clear about expected bias in the detection of cases, such as geographic variation in surveillance effort (testing of symptomatic or dead wildlife, testing domestic flocks) and in general more detections near areas of higher human population density (because if a tree falls in a forest and there is no-one there, etc), and take steps to ameliorate those. The authors use boosted regression trees to implement the ENM, which typically feature among the best performing models for this application (also known as habitat suitability models). They ran replicate sets of the analysis for each of their model targets (wild/domestic x pathogen variant), which can help produce stable predictions. Their code and data are provided, though I did not verify that the work was reproducible.

      The paper can be read as a partial update to the first global model of H5Nx transmission by Dhingra and others published in 2016 and explicitly follows many methodological elements. Because they use the same covariate sets as used by Dhingra et al 2016 (including the comparisons of the performance of the sets in spatial cross-validation) and for both time periods of interest in the current work, comparison of model outputs is possible. The authors further facilitate those comparisons with clear graphics and supplementary analyses and presentation. The models can also be explored interactively at a weblink provided in text, though it would be good to see the model training data there too.

      The authors' comparison of ENM model outputs generated from the distinct HPAI case datasets is interesting and worthwhile, though for me, only as a response to differently framed research questions.

      Weaknesses:

      This well-presented and technically well-executed paper has one major weakness to my mind. I don't believe that ENM models were an appropriate tool to address their stated goal, which was to identify the factors that "explain" changing HPAI epidemiology.

      Comments on the revised version from the editors:

      We are extremely grateful to the authors for presenting a thoughtful and respectful point by point rebuttal to the prior reviewers' comments. After reading these comments carefully, we conclude that there is a straightforward strongly held disagreement between the authors and the reviewers as to the validity of the methods (Ecological Niche Modeling) for this particular dataset. Please note that the two reviewers have substantial expertise in the area of Ecological Niche Modeling. We elected not to reach out to the reviewers for a third set of comments as we do not think their overall opinions will change, and wish to be respectful of their time.

      To allow readers a balanced assessment of the paper, we intend to publish your rebuttal comments in full. It is our hope that interested readers can weigh both sides of this respectful and interesting debate in order to reach their own conclusions about the strength of evidence presented in your manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      Davis and co-authors used many mouse models to investigate mechanisms that regulate the contractility of mouse popliteal collecting vessels, primarily chronotropy. Many of the mechanisms studied were previously shown to regulate pressure-induced constriction in small arteries. The authors use prior literature from the vasculature as a framework to test similar concepts in lymphatic vessels. The mouse models used provide evidence for and against the involvement of multiple proteins in regulating chronotropy and other contractile properties in lymphatic vessels. They propose that mechano-activation of GNAQ/GNA11-coupled GPCRs generates IP3, which induces Ca2+ release through IP3R1 and drives depolarization through the activation of ANO1 Cl- channels. Major concerns include the authors' major conclusion that GNAQ/GNA11-coupled GPCRs contribute to chronotropy. This conclusion is not supported by the data presented.

      Strengths:

      One major strength of the study lies in the vast number of mouse knockout models that were used to test the importance of ion channels and G protein signaling pathways in the regulation of lymphatic vessel contractility. In this regard, the study is a valiant effort. The authors achieved several objectives to find that ANO1 and IP3R1 regulate chronotropy, and many other potential proteins do not regulate chronotropy. This study will have a major impact on the field if additional support for G proteins is provided.

      Weaknesses:

      Major conclusions concerning the involvement of G proteins are drawn from the global Gna11 knockout mouse models. This conclusion is weak. Global Gna11 knockout mice are highly likely to have a multifactorial phenotype that could create significant differences in the data. Control experiments need to be performed on vessels from the global knockout mice if these major conclusions are to be made. Similarly, pharmacological tools or alternative approaches to manipulate G proteins should be used to support the data from these mouse models to draw these major conclusions.

      The Gnaq smKO mice are the most specific G protein model studied here. However, there is no phenotype. Do not discuss trends in the data. If the data are not significant, conclude so. If more experiments are required to reach significance, provide more data in the manuscript.

      The conclusions repeatedly refer to a signaling pathway wherein the upstream component is GPCRs, which activate G proteins. While this may be the case, no GPCRs were identified here, and the involvement of G proteins is questionable, as the authors outline in lines 693-695 and noted above. The conclusions should be tempered, including in the abstract, unless additional experiments are performed to support the involvement of G proteins. Perhaps then the authors may be able to infer that GPCRs are involved.

      Line 318. The point regarding the choice to use popliteal vessels versus IALVs will be unclear to the uninitiated, particularly as the authors previously used IALVs. Including additional justification in the text and/or data from IALVs in Figure 1, which compares IALVs to popliteal vessels, would better explain the logic.

      The conclusions drawn for TRPC6 and TRPC3 are less convincing. Germline global knockout mice, which are known to undergo compensation, were used, and high data variability is apparent. Using TRPC3 and TRPC6 blockers in the mouse models studied in Figure 4 would strengthen the arguments made regarding these proteins.

      Did you perform power analysis to ensure that experimental numbers were sufficient to conclude that no statistical difference exists between datasets? If not, this needs to be done. For example, data shown in Figure 5C for tone and 6C for frequency and tone appear to be significantly different, but are concluded not to be so.
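      As an illustration of the kind of power analysis requested above, the following is a minimal sketch using a normal approximation for a two-sided, two-sample comparison of means. The effect size, standard deviation, and group sizes are hypothetical values chosen for illustration, not taken from the manuscript, and for small groups a t-based calculation would be more appropriate.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group needed to detect a mean
    difference `delta` given a common standard deviation `sigma`."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)


def achieved_power(n, delta, sigma, alpha=0.05):
    """Approximate power with `n` samples per group (normal approximation;
    the negligible contribution of the opposite tail is ignored)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    standard_error = sigma * math.sqrt(2 / n)
    return STD_NORMAL.cdf(delta / standard_error - z_alpha)


# Hypothetical scenario: detect a difference of 2 units when the SD is 3 units.
required_n = n_per_group(delta=2.0, sigma=3.0)
power_small = achieved_power(n=8, delta=2.0, sigma=3.0)
power_required = achieved_power(n=required_n, delta=2.0, sigma=3.0)

print(f"required n per group: {required_n}")
print(f"power with n=8:       {power_small:.2f}")
print(f"power with n={required_n}:      {power_required:.2f}")
```

      Under these assumed numbers, a group size of 8 would leave the comparison badly underpowered, so a non-significant result could not be read as evidence of no difference.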

      At the end of each result section, a concluding statement is made regarding the effects on pressure-induced chronotropy. In many cases, there are additional effects of manipulating protein expression on other contractile properties. One example is for TRPC3 and TRPC6 (lines 414-416), but others are TRPV4, TRPV3, ENaC, Kir, Cav3.1/3.2, etc. Some interpretation is in the Discussion, but the concluding statements at the end of each result section should be expanded to summarize what the authors think the other significant differences in the data represent.

      Kv7.4 channels. You state you have data (not shown) with linopirdine and XE991. Why not show those results here to support the experiments with the Kcnq4 smKO mice? Otherwise, I suggest you remove the statement about the unpublished data.

      Figure 13A. Kcnj2 is modestly expressed in LECs, but very little is present in LMCs. This likely underlies the effect of barium. If you remove the endothelium, does the effect of barium disappear? While this is not the major focus of the study, the effects of barium are dramatic, and it should be made clear whether this is due to inhibition of Kir channels in smooth muscle or endothelial cells.

      Figure 18C tone. Several values for losartan look different but are not labelled as such. Please clarify and discuss if different.

      The manuscript should include raw data traces in figures that show the major pathways that you conclude regulate chronotropy.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, Davis et al. embarked on the quest for the molecular elements responsible for the regulation of lymphatic phasic contractile activity in response to variation of transmural pressure, a mechanism (termed pressure-induced lymphatic chronotropy by the authors) critical for drainage of interstitial fluid from the tissue and transport of lymph back to the blood circulation. Their aim was to investigate the mechanism(s) involved in the pressure-induced regulation of lymphatic pumping, and test whether activation of cation channels, shown in other systems to play mechanosensitive roles, are directly at play, and/or whether mechano-activation of GNAQ/GNA11-coupled GPCRs is necessary to generate second messengers to activate those channels, as has been suggested for the regulation of myogenic tone in arteries. To achieve their goal, the authors used their well-described, highly reliable protocols of mouse lymphatic vessel isolation, pressure myography, and data acquisition to obtain frequency-pressure relationships and other contractile function parameters from transgenic mice where specific channels or molecular elements of interest have been ablated. They combined these data with scRNAseq analysis of these gene targets to determine their respective role and levels of expression in lymphatic muscle cells. Their conclusion is that none of the exhaustive list of tested ion channels was critical, except ANO1 Cl- channels, part of the contractile pacemaker mechanism, but that transmural pressure activates GNAQ/GNA11-coupled GPCRs, which generate IP3 to induce SR Ca2+ release through IP3R1 and activate ANO1-mediated depolarization.

      Strengths:

      The manuscript's strengths reside primarily in very robust, clean, and unequivocal pressure myography data and analysis. The research team is mastering these techniques they developed more than a decade ago and have implemented in mouse lymphatics to study their contractile properties, with consistent and convincing outcomes. They also provide data from an impressive list of transgenic mice in order to determine the role of the targeted gene in pressure-induced lymphatic chronotropy, relying on pharmacological small molecule inhibitors only when necessary. Finally, the use of scRNAseq analysis they gathered from previously published datasets brings novelty with respect to the expression of the genes of interest in all populations of cells comprising the lymphatic vessels, but more critically, to validate or contrast the potential impact of genetic alteration of the given gene on the ability of lymphatic muscles to respond to a change in pressure.

      Weaknesses:

      The main weakness may reside in the fact that while the authors provide a convincing demonstration that GNAQ/GNA11 are involved in the regulation of the F-P relationship, they give little evidence of the involvement of "upstream" receptors. Indeed, inhibition of AT1R, shown to be involved in myogenic regulation of arteries (a phenomenon the authors rightfully compare to pressure-induced lymphatic chronotropy), didn't lead to a similar effect (decrease in F-P) in lymphatic vessels. Arguably, other GPCRs might be involved in lymphatic vessels, but as such information is not provided in the manuscript, the authors' conclusions should be dampened. More in-depth discussion would be required. In fact, it can be argued that the discussion is very restricted with respect to the amount of data and information the manuscript provides.

      Overall, the authors convincingly achieved their aim by performing an impressive number of technically challenging experiments, leading to solid datasets. While these support their main conclusions, a more elaborate discussion might be required to refine them.

      This study is likely to have an important impact on the field as it provides some answers to the lingering question of how lymphatic vessels regulate their contractile activity to variation in transmural pressure and certainly proposes an experimental means to further explore and address that question.

    3. Reviewer #3 (Public review):

      In this manuscript, Davis and colleagues aimed to identify the molecular sensors and signaling cascade that enable collecting lymphatic vessels to increase their spontaneous contraction frequency in response to intraluminal pressure (pressure-induced chronotropy). They tested whether the process is similar to blood vessel myogenic constriction by relying on cation channels (TRPC6, TRPM4, PKD2, PIEZO1, etc.) or instead requires the activation of G-protein-coupled receptors (presumably mechanosensitive GNAQ/GNA11-coupled receptors), using ex vivo pressure myography of mouse popliteal lymphatics, smooth muscle-specific conditional knockouts, quantitative PCR validation, and single-cell RNA sequencing for target prioritization. The authors convincingly demonstrate that pressure-induced chronotropy does not require the cation channels implicated in arterial myogenic tone but is blunted by deletion of GNAQ/GNA11 or IP3 receptor 1, supporting a model of GPCR > IP3 > Ca2+ release > Cl⁻ channel activation > depolarization. The core conclusion is robust. The work redefines lymphatic pacemaking as G-protein-coupled receptor-dependent mechanotransduction, distinct from arterial mechanisms, and provides a genetically validated toolkit that is useful for studying lymphatic function and dysfunction.

      Strengths:

      (1) The data are of high quality, with highly sensitive functional readouts

      (2) The systematic genetic targeting is a major strength that overcomes pharmacological artifacts

      (3) Careful quantitative analyses of frequency-pressure slopes

      Weaknesses:

      (1) The use of inguinal-axillary vessels for single-cell RNA sequencing rather than the popliteal segment studied functionally.

      (2) No direct testing of the specific G-protein-coupled receptor involved.

    1. Reviewer #1 (Public review):

      Summary:

      This useful study provides incomplete evidence of an association between atovaquone-proguanil use (as well as toxoplasmosis seropositivity) and reduced Alzheimer's dementia risk. The study reinforces findings that VZ vaccine lowers AD risk and suggests that this vaccine may be an effect modifier of A-P's protective effect. Strengths of the study include two extremely large cohorts, including a massive validation cohort in the US. Statistical analyses are sound, and the effect sizes are significant and meaningful. The CI curves are certainly impressive.

      Weaknesses include the inability to control for potentially important confounding variables. In my view, the findings are intriguing but remain correlative / hypothesis generating rather than causative. Significant mechanistic work needs to be done to link interventions which limit the impact of Toxoplasmosis and VZV reactivation on AD.

      Weaknesses:

      Major:

      (1) Most of the individuals in the study received A-P for malaria prophylaxis as it is not first line for Toxo treatment. Many (probably most) of these individuals were likely to be Toxo negative (~15% seropositive in the US), thereby eliminating a potential benefit of the drug in most people in the cohort. Finally, A-P is not a first line treatment for Toxo because of lower efficacy.

      (2) A-P exposure may be a marker of subtle demographic features not captured in the dataset such as wealth allowing for global travel and/or genetic predisposition to AD. This raises my suspicion of correlative rather than causal relationships between A-P exposure and AD reduction. The size of the cohort does not eliminate this issue, but rather narrows confidence intervals around potentially misleading odds ratios which have not been adjusted for the multitude of other variables driving incident AD.
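      The point about cohort size can be illustrated with a minimal sketch. It assumes the standard Woolf (log-based) interval for an unadjusted odds ratio from a 2x2 table; the counts are hypothetical and are not taken from the manuscript. Scaling every cell by the same factor leaves the odds ratio unchanged and only narrows the interval, so any confounding built into the point estimate is carried over intact.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-based) 95% interval.

    a: exposed with outcome,   b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (illustration only): same proportions, 100x the size.
small = odds_ratio_ci(20, 980, 40, 960)
large = odds_ratio_ci(2000, 98000, 4000, 96000)

print("small cohort:", small)
print("large cohort:", large)
```

      In this sketch the larger table yields the same odds ratio with a tighter interval, which is why precision alone cannot be read as evidence against confounding.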

      (3) The relationship between herpes virus reactivation and Toxo reactivation seems speculative.

      (4) A direct effect of A-P on AD lesions independent of infection is not considered as a hypothesis. Given the limitations above and effects on metabolic pathways, it probably should be. The Toxo hypothesis would be more convincing if the authors could demonstrate an enhanced effect of the drug in Toxo positive individuals with no effect in Toxo negative individuals.

      Minor:

      (5) "Clinically meaningful" should be eliminated from the discussion given that this is correlative evidence.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript examines the association between atovaquone/proguanil use, zoster vaccination, toxoplasmosis serostatus and Alzheimer's Disease, using 2 databases of claims data. The manuscript is well written and concise. The major concerns about the manuscript center around the indications of atovaquone/proguanil use, which would not typically be active against toxoplasmosis at doses given, and the lack of control for potential confounders in the analysis.

      Strengths:

      (1) Use of 2 databases of claims data.

      (2) Unbiased review of medications associated with AD, which identified zoster vaccination associated with decreased risk of AD, replicating findings from other studies.

      Weaknesses:

      (1) Given that atovaquone/proguanil is likely to be given to a healthy population who is able to travel, concern that there are unmeasured confounders driving the association.

      (2) The dose of atovaquone in atovaquone/proguanil is unlikely to be adequate for suppression of toxo (much less for treatment/elimination of toxo), raising questions about the mechanism.

      (3) Unmeasured bias in the small number of people who had toxoplasma serology in the TriNetX cohort.

    1. Reviewer #1 (Public review):

      Summary:

      The work of Bechara Rahme and colleagues provides an explanation as to how bacterially infected flies eventually die. While widespread tissue and multiorgan damage are to be expected in the latest stages of a systemic infection, the mechanisms leading to the host's death remain unresolved. To this end, this work illustrates the role of PrtA, a metalloproteinase found within Outer Membrane Vesicles (OMVs) secreted by Serratia marcescens, in inducing neuronal apoptosis and paralysis before death. Another interesting aspect of the work is the compromise of the blood-brain barrier (BBB) by OMVs. The BBB differs between mammals and flies; however, it merits scientific attention.

      Strengths:

      The strength of evidence lies in a wealth of experiments involving disparate innate immune mechanisms that either contribute (Imd, PPO1/2, Nox, Duox, SOD2) or oppose (hemocytes and Hayan protease) host defense. Moreover, the role of neuronal JNK and apoptotic signaling is shown to contribute to host death.

      Genetics is supported by experiments using chemical treatments (Vitamin C and mito-TEMPO) as host-protecting antioxidants, and the biochemical purification and quantification of OMVs and the PrtA protease.

      Weaknesses:

      However, the reliance on non-isogenised flies to provide quantitative data is unsafe, and at this point, the strength of the evidence is apparently incomplete. The mutant flies used for the genes Key, Myd88, Hayan, and Nos are doubtfully comparable to the control fly strains used in terms of the general genetic background. The latter is of utmost importance in assessing quantitative traits.

      The general background difference between control and test flies is also an issue when using tissue-specific expression via GAL4/UAS, because the UAS lines used are only apparently but not truly isogenic to the w flies used as controls.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors investigate the mechanisms underlying the virulence of OMVs using a Drosophila model. They reveal a complex interplay between host defenses and OMV pathogenicity. Although the study enhances our understanding of Drosophila innate immunity, additional evidence is needed to strengthen the conclusions.

      Strengths:

      (1) In Figure 1, Toll pathway mutants infected with OMVs displayed three distinct phenotypic outcomes: mildly enhanced resistance to OMV infection, a response similar to that of the control, or increased susceptibility. Therefore, in addition to Imd and Kenny mutants from the Imd pathway, further mutants, such as Relish and PGRP-LC, should be examined to assess whether the Imd pathway is involved in host defense against OMVs.

      (2) Plasmatocytes clear particles via phagocytosis or endocytosis. However, flies lacking all hemocytes showed increased resistance to OMV challenge, raising the question of whether hemocytes actually aid the pathogen. To explore this hypothesis, the uptake of fluorescently tagged OMVs should be examined.

      (3) Hayan cleaves PPO into active PO. However, Hayan and PPO mutants exhibit opposite phenotypes upon OMV injection, raising the question of whether OMV-induced pathogenesis is linked to melanization.

      (4) Puckered mRNA levels were used as a read-out for JNK pathway activity. A transient induction of the JNK pathway was observed in head and thorax tissues. It would be beneficial if the authors could directly examine JNK activation in neuronal cells using immunostaining for pJNK.

      (5) In Figure 4B, the kayak was knocked down using the pan-neuronal driver elav-Gal4. To confirm the specificity and validity of this observation, the experiment should be repeated using another neural-specific driver.

      Weaknesses:

      It is unclear how many Serratia marcescens cells a 69 nL injection of 0.1 ng/nL OMVs corresponds to.
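      As a rough sketch of the arithmetic behind this question: the injected OMV mass follows directly from the stated volume and concentration, whereas converting it to bacterial cell equivalents needs an OMV yield per cell that is not given here. The `omv_ng_per_cell` parameter below is therefore a hypothetical input that the authors would have to measure; no value for it is assumed.

```python
def omv_dose_ng(volume_nl, concentration_ng_per_nl):
    """Injected OMV mass in ng from volume (nL) and concentration (ng/nL)."""
    return volume_nl * concentration_ng_per_nl


def cell_equivalents(dose_ng, omv_ng_per_cell):
    """Number of bacterial cells that would produce the injected OMV mass.

    omv_ng_per_cell is a hypothetical, experimentally determined yield
    (ng of OMVs recovered per bacterial cell); it is not reported here.
    """
    return dose_ng / omv_ng_per_cell


dose = omv_dose_ng(69, 0.1)  # about 6.9 ng of OMVs per injection
print(f"Injected OMV mass: {dose:.1f} ng")
```

      Reporting the measured yield alongside the dose would let readers relate the 6.9 ng injection to a bacterial load.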

    3. Reviewer #3 (Public review):

      Summary:

      The authors investigate deficiencies in various immune responses, and also the prtA toxin's role in OMV toxicity. Some key interpretations are that the Imd pathway contributes to preventing OMV toxicity, but not Toll, and that Hayan and Eater somehow mediate OMV or PrtA toxicity. This descriptive effort is a solid set of experiments, although some experimental results may require further validation.

      Strengths:

      The breadth of experiments tests multiple immune parameters, providing a systematic effort that ensures a number of potentially relevant interactions can be recovered. Certain findings, such as the PrtA toxicity to flies, appear solid, and some interesting findings regarding Hayan and eater will be of interest to the fly immunity field.

      Weaknesses:

      It appears almost all results rely on the use of a single mutant representing the deletion of the gene. It's not clear if the mutations are always in the same genetic background, but this can be clarified. There are a couple of results that are confusing and may be internally contradicting, and should be additionally validated and clarified.

    1. Reviewer #1 (Public review):

      The investigators elegantly utilized a single-cell co-assay of RNA and ATAC seq to unveil the heterogeneous gene regulatory networks in Ewing sarcoma. The authors should be commended on their ability to identify multiple unique modules of gene regulation of Ewing sarcoma utilizing complex computational methods between numerous Ewing sarcoma cell lines. Additionally, they complemented their single-cell findings with xenografts as well as primary Ewing sarcoma patient tumors - validating the intratumoral heterogeneous gene regulatory networks of Ewing sarcoma. More importantly, they have revealed that exogenous TGF-β may modify these distinct epigenetic and transcriptional signatures within Ewing sarcoma tumors. Overall, the manuscript highlights an important discovery of the heterogeneous gene regulatory programming of Ewing sarcoma and further highlights the role that TGF-β plays within the tumor microenvironment of Ewing sarcoma. There are some areas of ambiguity that require clarification to increase the impact of the manuscript.

    2. Reviewer #2 (Public review):

      Summary:

      This work by Waltner et al. provides a comprehensive single-cell multiomics analysis of plasticity in gene regulatory networks present in Ewing sarcoma using single-cell RNA-sequencing (scRNA-seq) and single-cell assay for transposase accessible chromatin with sequencing (scATAC-seq). They find that Ewing sarcoma cell line models have distinct patterns of chromatin accessibility compared to non-Ewing sarcoma models, and that there is significant variability across Ewing sarcoma cell lines, and sometimes within a single cell line. These differences across models are linked to 3 distinct gene regulatory modules, 2 of which are present across the range of model systems studied here. The first module present across models is activated when the fusion is expressed and includes genes enriched for the known EWSR1::FLI1 response element, GGAA microsatellites, along with other neural crest transcription factors. The other module primarily consists of genes repressed by EWSR1::FLI1, which are activated in EWSR1::FLI1-low states. Interestingly, EWSR1::FLI1-low cells have already been tied to more migratory and metastatic phenotypes, and the data here suggest these cells are more responsive to external signals from TGF-β, and this may be mediated through FOSL2-mediated gene regulation. While there are some minor additional validation studies that can be performed to strengthen a few individual analyses, this is a technically rigorous study, with a variety of different analytical techniques used to address similar questions, and this approach elevates confidence in the answers provided. This is further strengthened by the diverse set of model systems used, including patient-derived cell lines, cell line xenograft models, patient-derived xenografts, mining available single-cell data from patient samples, and validation of the gene modules identified in a larger set of patient microarray samples.

      On the whole, this study provides a valuable resource for understanding heterogeneity, plasticity, and gene expression networks in Ewing sarcoma. This may be useful for future studies of metastatic disease and may also provide a framework for similar questions in other fusion-driven sarcomas.

      Strengths:

      There are a few core strengths in this study. First is the number and diversity of Ewing sarcoma models studied, spanning commonly used cell lines, patient-derived xenografts, and patient samples. The second is the large array of rigorous and orthogonal approaches used to uncover the identity and function of various gene modules. This includes an array of informatics techniques, as well as specific modulation of cell line models in culture. A third is confirmation that different gene expression programs are present in the same tumor using spatial transcriptomic analysis. Lastly, the authors have made all of their data and code accessible, enabling continued use of this dataset as a resource for others.

      Weaknesses:

      As highlighted by the authors, this study is somewhat limited by the small number of single-cell data from patient samples that are publicly available. Much of the analysis comes from cell lines. Additionally, they focus only on one type of signal that may modulate cell plasticity, and there are likely to be many others. Lastly, there are a few weak spots in the data. Some of this likely arises from the underlying complexity of the data, the generally sparse nature of scATAC data, and the biological heterogeneity present in the cell lines studied. The most pronounced weakness was in the analysis of transcription factors that dictate gene expression in the distinct modules, as well as the response to TGF-β. While some specific transcription factors showed module-specific expression consistent with the computational prediction in Figure 2, others did not, likely due to additional factors not tested here. Likewise, the same transcription factors did not always show consistent enrichment in the gene modules that responded to TGF-β treatment when analyzed across cell lines. On the whole, these are relatively minor weaknesses and do not diminish the value of this study.

    1. Reviewer #1 (Public review):

      Summary:

      The authors investigate the effects of aging on auditory system performance in understanding temporal fine structure (TFS), using both behavioral assessments and physiological recordings from the auditory periphery, specifically at the level of the auditory nerve. This dual approach aims to enhance understanding of the mechanisms underlying observed behavioral outcomes. The results indicate that aged animals exhibit deficits in behavioral tasks for distinguishing between harmonic and inharmonic sounds, which is a standard test for TFS coding. However, neural responses at the auditory nerve level do not show significant differences when compared to those in young, normal-hearing animals. The authors suggest that these behavioral deficits in aged animals are likely attributable to dysfunctions in the central auditory system, potentially as a consequence of aging. To further investigate this hypothesis, the study includes an animal group with selective synaptic loss between inner hair cells and auditory nerve fibers, a condition known as cochlear synaptopathy (CS). CS is a pathology associated with aging and is thought to be an early indicator of hearing impairment. Interestingly, animals with selective CS showed physiological and behavioral TFS coding similar to that of the young normal-hearing group, contrasting with the aged group's deficits. Despite histological evidence of significant synaptic loss in the CS group, the study concludes that CS does not appear to affect TFS coding, either behaviorally or physiologically.

      Strengths:

      This study addresses a critical health concern, enhancing our understanding of mechanisms underlying age-related difficulties in speech intelligibility, even when audiometric thresholds are within normal limits. A major strength of this work is the comprehensive approach, integrating behavioral assessments, auditory nerve (AN) physiology, and histology within the same animal subjects. This approach enhances understanding of the mechanisms underlying the behavioral outcomes and provides confidence in the actual occurrence of synapse loss and its effects. The study carefully manages controlled conditions by including five distinct groups: young normal-hearing animals, aged animals, animals with CS induced through low and high doses, and a sham surgery group. This careful setup strengthens the study's reliability and allows for meaningful comparisons across conditions. Overall, the manuscript is well-structured, with clear and accessible writing that facilitates comprehension of complex concepts.

      Weakness:

      The stimulus and task employed in this study are very helpful for behavioral research, and using the same stimulus setup for physiology is advantageous for mechanistic comparisons. However, I have some concerns about the limitations in auditory nerve (AN) physiology. Due to practical constraints, it is not feasible to record from a large enough population of fibers that covers a full range of best frequencies (BFs) and spontaneous rates (SRs) within each animal. This raises questions about how representative the physiological data are for understanding the mechanism in behavioral data. I am curious about the authors' interpretation of how this stimulus setup might influence results compared to methods used by Kale and Heinz (2010), who adjusted harmonic frequencies based on the characteristic frequency (CF) of recorded units. In this study, by contrast, the harmonic frequencies are fixed across all CFs, meaning that many AN fibers may not be tuned closely to the stimulus frequencies. If units are not responsive to the stimulus, further clarification on detecting mistuning and phase locking to TFS effects within this setup would be valuable. Given the limited number of units per condition (sometimes as few as three for certain conditions), I wonder if CF-dependent variability might impact the results of the AN data in this study; discussing this factor would help in understanding the results. While the use of the same stimuli for both behavioral and physiological recordings is understandable, a discussion on how this choice affects interpretation would be beneficial. In addition, a 60 dB stimulus could saturate high spontaneous rate (HSR) AN fibers, influencing neural coding and phase-locking to TFS. Separating SR groups could potentially help address these issues and improve interpretive clarity.

      A deeper discussion on the role of fiber spontaneous rate could also enhance the study. How might considering SR groups affect AN results related to TFS coding? While some statistical measures are included in the supplement, a more detailed discussion in the main text could help in interpretation.

      Although Figure S2 indicates no change in median SR, the high-dose treatment group lacks LSR fibers, suggesting a different distribution based on SR for different animal groups, as seen in similar studies on other species. A histogram of these results would be informative, as LSR fiber loss with CS, whether induced by ouabain in gerbils or noise in other animals, is well documented (e.g., Furman et al., 2013).

      Although ouabain effects on gerbils have been explored in previous studies, since these data already seem to have been recorded for the animals in this study, a brief description of changes in auditory brainstem response (ABR) thresholds, wave 1 amplitudes, and tuning curves for animals with cochlear synaptopathy (CS) in this study would be beneficial. This would confirm that ouabain selectively affects synapses without impacting outer hair cells (OHCs). For aged animals, since ABR measurements were taken, comparing hearing differences between normal and aged groups could provide insights into the pathologies besides CS in aged animals. Additionally, examining subject variability in treatment effects on hearing and how this correlates with behavior and physiology would yield valuable insights. If space is limited, a brief clarification or inclusion in the supplementary material may suffice.

      Another suggestion is to discuss the potential role of MOC efferent system and effect of anesthesia in reducing efferent effects in AN recordings. This is particularly relevant for aged animals, as CS might affect LSR fibers, potentially disrupting the medial olivocochlear (MOC) efferent pathway. Anesthesia could lessen MOC activity in both young and aged animals, potentially masking efferent effects that might be present in behavioral tasks. Young gerbils with functional efferent systems might perform better behaviorally, while aged gerbils with impaired MOC function due to CS might lack this advantage. A brief discussion on this aspect could potentially enhance mechanistic insights.

      Lastly, although synapse counts did not differ between the low-dose treatment and NH I sham groups, separating these groups rather than combining them with the sham might reveal differences in behavior or AN results, particularly regarding the significance of differences between aged/treatment groups and the young normal-hearing group.

    2. Reviewer #2 (Public review):

      Summary:

      Using a gerbil model, the authors tested the hypothesis that loss of synapses between sensory hair cells and auditory nerve fibers (which may occur due to noise exposure or aging) affects behavioral discrimination of the rapid temporal fluctuations of sounds. In contrast to previous suggestions in the literature, their results do not support this hypothesis; young animals treated with a compound that reduces the number of synapses did not show impaired discrimination compared to controls. Additionally, their results from older animals showing impaired discrimination suggest that age-related changes aside from synaptopathy are responsible for the age-related decline in discrimination.

      Strengths:

      (1) The rationale and hypothesis are well-motivated and clearly presented.

      (2) The study was well conducted with strong methodology for the most part, and good experimental control. The combination of physiological and behavioral techniques is powerful and informative. Reducing synapse counts fairly directly using ouabain is a cleaner design than using noise exposure or age (as in other studies), since these latter modifiers have additional effects on auditory function.

      (3) The study may have a considerable impact on the field. The findings could have important implications for our understanding of cochlear synaptopathy, one of the most highly researched and potentially impactful developments in hearing science in the past fifteen years.

      Weaknesses:

      (1) I have concerns that the gerbils may not have been performing the behavioral task using temporal fine structure information.

      Human studies using the same task employed a filter center frequency that was (at least) 11 times the fundamental frequency (Marmel et al., 2015; Moore and Sek, 2009). Moore and Sek wrote: "the default (recommended) value of the centre frequency is 11F0." Here, the center frequency was only 4 or 8 times the fundamental frequency (4F0 or 8F0). Hence, relative to harmonic frequency, the harmonic spacing was considerably greater in the present study. However, gerbil auditory filters are thought to be broader than those in human. In the revised version of the manuscript, the authors provide modelling results suggesting that the excitation patterns were discriminable for the 4F0 conditions, but may not have been for the 8F0 conditions. These results provide some reassurance that the 8F0 discriminations were dependent on temporal cues, but the description of the model lacks detail. Also, the authors state that "thus, for these two conditions with harmonic number N of 8 the gerbils cannot rely on differences in the excitation patterns but must solve the task by comparing the temporal fine structure." This is too strong. Pulsed tone intensity difference limens (the reference used for establishing whether or not the excitation pattern cues were usable) may not be directly comparable to profile-analysis-like conditions, and it has been argued that frequency discrimination may be more sensitive to excitation pattern cues than predicted from a simple comparison to intensity difference limens (Micheyl et al. 2013, https://doi.org/10.1371/journal.pcbi.1003336).

      I'm also somewhat concerned that the masking noise used in the present study was too low in level to mask cochlear distortion products. Based on their excitation pattern modelling, the authors state (without citation) that "since the level of excitation produced by the pink noise is less than 30 dB below that produced by the complex tones, distortion products will be masked." The basis for this claim is not clear. In human, distortion products may be only ~20 dB below the levels of the primaries (referenced to an external sound masker / canceller, which is appropriate, assuming that the modelling reported in the present paper did not include middle-ear effects; see Norman-Haignere and McDermott, 2016, doi: 10.1016/j.neuroimage.2016.01.050). Oxenham et al. (2009, doi: 10.1121/1.3089220) provide further cautionary evidence on the potential use of distortion product cues when the background noise level is too low (in their case the relative level of the noise in the compromised condition was only a little below that used in the present study). The masking level used in the present study may have been sufficient, but it would be useful to have some further reassurance on this point.

      (2) The synapse reductions in the high ouabain and old groups were relatively small (mean of 19 synapses per hair cell compared to 23 in the young untreated group). In contrast, in some mouse models of the effects of noise exposure or age, a 50% reduction in synapses is observed, and in the human temporal bone study of Wu et al. (2021, https://doi.org/10.1523/JNEUROSCI.3238-20.2021) the age-related reduction in auditory nerve fibres was ~50% or greater for the highest age group across cochlear location. It could be simply that the synapse loss in the present study was too small to produce significant behavioral effects. Hence, although the authors provide evidence that in the gerbil model the age-related behavioral effects are not due to synaptopathy, this may not translate to other species (including human).
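      To put the reported means on the same scale as the ~50% figures cited above, the relative synapse loss can be worked out from the two counts given in this comment. This is a back-of-the-envelope calculation from group means only, ignoring variability across animals and cochlear locations:

```latex
\[
\text{relative loss} \;=\; \frac{23 - 19}{23} \;=\; \frac{4}{23} \;\approx\; 0.17
\quad (\text{about } 17\%)
\]
```

      On these numbers, the loss in the high ouabain and old groups is roughly one third of the ~50% reductions reported in the mouse and human studies mentioned above.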

      (3) The study was not pre-registered, and there was no a priori power calculation, so there is less confidence in replicability than could have been the case. Only three old animals were used in the behavioral study, which raises concerns about the reliability of comparisons involving this group. Statistical analyses on very small samples can be unreliable due to problems of power, generalisability, and susceptibility to outliers.

    3. Reviewer #3 (Public review):

      This study is a part of the ongoing series of rigorous work from this group exploring neural coding deficits in the auditory nerve, and dissociating the effects of cochlear synaptopathy from other age-related deficits. They have previously shown no evidence of phase-locking deficits in the remaining auditory nerve fibers in quiet-aged gerbils. Here, they study the effects of aging on the perception and neural coding of temporal fine structure cues in the same Mongolian gerbil model.

      They measure TFS coding in the auditory nerve using the TFS1 task which uses a combination of harmonic and tone-shifted inharmonic tones which differ primarily in their TFS cues (and not the envelope). They then follow this up with a behavioral paradigm using the TFS1 task in these gerbils. They test young normal hearing gerbils, aged gerbils, and young gerbils with cochlear synaptopathy induced using the neurotoxin ouabain to mimic synapse losses seen with age.

      In the behavioral paradigm, they find that aging is associated with decreased performance compared to the young gerbils, whereas young gerbils with similar levels of synapse loss do not show these deficits. When looking at the auditory nerve responses, they find no differences in neural coding of TFS cues across any of the groups. However, aged gerbils show an increase in the representation of periodicity envelope cues (around f0) compared to young gerbils or those with induced synapse loss. The authors hence conclude that synapse loss by itself doesn't seem to be important for distinguishing TFS cues, and rather that the behavioral deficits with age likely have to do with the misrepresented envelope cues instead.

      The manuscript is well written, and the data presented are robust. Some of the points below will need to be considered while interpreting the results of the study, in its current form. These considerations are addressable if deemed necessary, with some additional analysis in future versions of the manuscript.

      Spontaneous rates - Figure S2 shows no differences in median spontaneous rates across groups. But taking the median glosses over some of the nuances there. Ouabain (in the Bourien study) famously affects low spont rates first, and at a higher degree than median or high spont rates. It seems to be the case (qualitatively) in figure S2 as well, with almost no units in the low spont region in the ouabain group, compared to the other groups. Looking at distributions within each spont rate category and comparing differences across the groups might reveal some of the underlying causes for these changes. Given that overall, the study reports that low-SR fibers had a higher ENV/TFS log-z-ratio, the distribution of these fibers across groups may reveal specific effects of TFS coding by group.

      [Update: The revised manuscript has addressed these issues]

      Threshold shifts - It is unclear from the current version if the older gerbils have changes in hearing thresholds, and whether those changes may be affecting behavioral thresholds. The behavioral stimuli appear to have been presented at a fixed sound level for both young and aged gerbils, similar to the single unit recordings. Hence, age-related differences in behavior may have been due to changes in relative sensation level. Approaches such as using hearing thresholds as covariates in the analysis will help explore if older gerbils still show behavioral deficits.

      [Update: The issue of threshold shifts with aging gerbils is still unresolved in my opinion. From the revised manuscript, it appears that aged gerbils have a 36 dB shift in thresholds. While the revised manuscript provides convincing evidence that these threshold shifts do not affect the auditory nerve tuning properties, the behavioral paradigm was still presented at the same sound level for young and aged animals. But a potential 36 dB change in sensation level may affect behavioral results. The authors may consider adding thresholds as covariates in analyses or presenting any evidence that behavioral thresholds are plateaued along that 30 dB range].

      Task learning in aged gerbils - It is unclear if the aged gerbils really learn the task well in two of the three TFS1 test conditions. The d' of 1, which is usually used as the criterion for learning, was not reached by aged gerbils even at the easiest level in all but one condition (Fig. 5H), and in that condition there do not seem to be any age-related deficits in behavioral performance (Fig. 6B). Hence dissociating the inability to learn the task from the inability to perceive TFS1 cues in those animals becomes challenging.

      [Update: The revised manuscript sufficiently addresses these issues, with the caveat of hearing threshold changes affecting behavioral thresholds mentioned above].
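      For readers less familiar with the d' criterion discussed under "Task learning in aged gerbils", the following minimal sketch shows the textbook signal-detection definition, d' = z(hit rate) - z(false-alarm rate). It is not the authors' analysis code, and the rates used are illustrative values rather than data from the study.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1; rates of exactly 0 or 1
    need a correction before use because the inverse normal CDF is
    undefined there (inv_cdf raises StatisticsError).
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


# Illustrative rates only: roughly 69% hits with 31% false alarms
# sits close to the d' = 1 learning criterion.
print(round(d_prime(0.6915, 0.3085), 2))
```

      Because d' combines hits and false alarms, a fixed criterion of 1 can be compared across animals with different response biases.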

      Increased representation of periodicity envelope in the AN - the mechanisms for increased representation of periodicity envelope cues are unclear. The authors point to some potential central mechanisms, but given that these are recordings from the auditory nerve, it is unclear what central mechanisms these may be. If the authors are suggesting some form of efferent modulation only at the f0 frequency, no evidence for this is presented. It appears more likely that the enhancement may be due to outer hair cell dysfunction (widened tuning, distorted tonotopy). Given this increased envelope coding, the potential change in sensation level for the behavior (from the comment above), and no change in neural coding of TFS cues across any of the groups, a simpler interpretation may be that TFS coding is not affected in remaining auditory nerve fibers after age-related or ouabain-induced synapse loss, but behavioral performance is affected by outer hair cell dysfunction with age.

      [Update: The revised manuscript has addressed these issues]

      Emerging evidence seems to suggest that cochlear synaptopathy and/or TFS encoding abilities might be reflected in listening effort rather than behavioral performance. Measuring some proxy of listening effort in these gerbils (like reaction time) to see if that has changed with synapse loss, especially in the young animals with induced synaptopathy, would make an interesting addition to explore perceptual deficits of TFS coding with synapse loss.

      [Update: The revised manuscript has addressed these issues]

    1. Reviewer #1 (Public review):

      Summary:

      Grasper et al. present a combined analysis of the role of temporal mutagenesis in cancer, which includes both theoretical investigation and empirical analysis of point mutations in TCGA cancer patient cohorts. They find that temporally elevated mutation rates contribute to cancer fitness by allowing fast adaptation when the fitness drops (due to previous deleterious mutations). This may be relevant in the case of tumor suppressor genes (TSGs), which follow the 2-hit hypothesis (i.e., mutations in both alleles are necessary to inactivate the TSG), and in cases where temporal mutagenesis occurs (e.g. high APOBEC, ROS). They provide evidence that this scenario is likely to occur in patients, in some cancer types. This is an interesting and potentially important result that merits the attention of the target audience. Nonetheless, I have some questions (detailed below) regarding the design of the study, the tools and parametrization of the theoretical analysis, and the empirical analysis that, if addressed, would make the paper more solid and the conclusion more substantiated.

      Strengths:

      Combined theoretical investigation with empirical analysis of cancer patients

      Weaknesses:

      Parametrization and systematic investigation of theoretical tools and their relevance to tumor evolution

      Comments on revisions:

      The authors have adequately addressed my suggestions. I think some of the details provided in some of the replies to my comments (specifically with regard to my points 1, 4, 6ii; minor point 6) could be integrated into relevant text in the introduction, discussion and methods, to help readers better follow the model and its interpretation, but it is up to the authors to decide what to emphasize.

    2. Reviewer #2 (Public review):

      This work presents theoretical results concerning the effect of punctuated mutation on multistep adaptation along with empirical analysis of multistep adaptation in cancer. The empirical results are claimed to demonstrate the acceleration of multistep adaptation predicted theoretically. However, there is an important disconnect between the theoretical results and the empirical observations, such that it is not clear that punctuated mutation can produce the phenomena observed empirically. Furthermore, there are other plausible explanations for the empirical observations.

      The theoretical work emphasizes the positive effect of punctuated mutation on the rate of crossing a "fitness valley", i.e., multistep adaptation where the first mutation is deleterious. The empirical work, however, focuses on inactivation of both alleles of a tumor suppressor gene (TSG), for which the first mutation--inactivation of one gene copy--is expected to be neutral or slightly advantageous, not maladaptive as suggested by the authors. Pairs of genes with putative synergistic effects were also analyzed, but there is no indication that these generally involve fitness valleys either.

      This disconnect is most glaring in Figure 4, in which the simulations are supposed to confirm that punctuated mutation can produce the empirical phenomena reported for TSG inactivation. If this is the case, it should be possible to produce such results in simulations in which inactivation of just one allele is neutral. Instead, simulations assuming a substantial fitness penalty (0.05) for the first mutation are presented. Contrary to what is claimed in the text (line 212), this is not a "biologically realistic" parameter value for TSG inactivation. The insensitivity of results to the size of the fitness penalty is irrelevant: a substantial fitness penalty is qualitatively different from no penalty at all.

      The paper does report a small (15%) effect of punctuation on the rate of multistep adaptation in the absence of a fitness valley. This effect is much smaller than the fourfold increase in the presence of a fitness valley. The results presented--a single stochastic run for each condition--are insufficient to establish that there is any effect at all: if we assume that the number of pairs of fixations (about 150-180 in each simulation) is Poisson distributed, the 15% difference is not statistically significant.
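The Poisson argument above can be checked with a short calculation. The sketch below is only an illustration: the counts 150 and 173 are hypothetical values chosen to sit in the "about 150-180" range and differ by roughly 15%, and the function name `poisson_difference_test` is invented here. It uses the normal approximation for the difference of two independent Poisson counts, whose variance is the sum of the two means.

```python
import math


def poisson_difference_test(count_a: int, count_b: int) -> tuple[float, float]:
    """Test whether two independent Poisson counts share the same mean.

    Under the null hypothesis the difference has mean 0 and variance
    approximately count_a + count_b, so the z-score is the difference
    divided by the square root of the summed counts.
    """
    z = (count_b - count_a) / math.sqrt(count_a + count_b)
    # Two-sided p-value from the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Hypothetical counts of fixation pairs (not taken from the manuscript).
baseline_pairs = 150      # assumed run without punctuation
punctuated_pairs = 173    # assumed run with punctuation, about 15% more

z, p = poisson_difference_test(baseline_pairs, punctuated_pairs)
print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

With these assumed counts the z-score is about 1.3, below the conventional 1.96 cutoff, which is consistent with the statement that a single run per condition cannot establish a 15% effect.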

      Assuming that this effect is genuine, it is likely due to a mutation rate that is unrealistically high (considering that "rescue" requires inactivation of a particular gene). Theoretical considerations suggest that punctuated mutation has little or no effect in the absence of a fitness valley in the limit of low mutation rate:

      (A1) The authors' theoretical results for a Galton-Watson process (SI2) imply that there is no effect without a fitness valley in that limit. This is so because there is no effect in the "supercritical" regime. Cancer cells must be supercritical (otherwise there would be no net growth), and a neutral or advantageous mutant would remain in the supercritical regime.

      (A2) Fig. S2D indicates, as far as I can tell from the colors, that punctuation makes little or no difference to the rate of adaptation in the absence of a fitness valley, i.e., for vertical axis values of 1 or more. I am not sure why the authors (line 129) point to this figure as evidence that punctuation speeds two-step adaptation when the first mutation is not maladaptive; the figure appears to say that it does not. The fraction of events due to "stochastic tunneling" of course increases with punctuation, but that does not change the fact that adaptation is no faster.

      (A3) The authors' verbal argument to the contrary (line 124ff) is flawed. Despite the fact that even a mildly advantageous mutant is likely to go extinct, its expected frequency only increases with time, and that of a neutral allele remains constant over time. Thus, the average number of opportunities for a second mutation does not decrease with time since the first mutation, as it does when the first mutation is deleterious.

      (A4) I ran some simulations for a Wright-Fisher population, and they seem to confirm the lack of an effect in the low mutation rate limit.

      Thus, it is unclear whether punctuated mutation can explain the reported phenomena or should be expected to have major effects on the rate or nature of cancer cell adaptation.

      I would also note that routes to inactivation of both copies of a TSG that are not accelerated by punctuation will dilute any effects of punctuation. An example is a single somatic mutation followed by loss of heterozygosity. Such mechanisms are neither included in the theoretical analysis nor assessed empirically. If, for example, 90% of double inactivations were the result of such mechanisms with a constant mutation rate, a factor of two effect of punctuated mutagenesis would increase the overall rate by only 10%. Consideration of the rate of apparent inactivation of just one TSG copy and of deletion of both copies would shed some light on the importance of this consideration.
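The dilution arithmetic in the paragraph above can be written out explicitly. This is a minimal sketch under the same simplifying assumption the example uses, namely that the overall rate is a weighted sum of an unaffected route and a route sped up by punctuation; the function name `overall_rate_multiplier` is hypothetical.

```python
def overall_rate_multiplier(fraction_unaffected: float, speedup: float) -> float:
    """Return the factor by which the total double-inactivation rate changes.

    fraction_unaffected: share of double inactivations arising through routes
        that punctuation does not accelerate (e.g. mutation followed by LOH).
    speedup: fold increase that punctuation gives to the remaining routes.
    """
    fraction_affected = 1.0 - fraction_unaffected
    return fraction_unaffected + fraction_affected * speedup


# The reviewer's example: 90% of events unaffected, twofold speedup elsewhere.
multiplier = overall_rate_multiplier(fraction_unaffected=0.9, speedup=2.0)
print(f"overall rate changes by a factor of {multiplier:.2f}")
```

For the values in the text this gives 0.9 + 0.1 x 2 = 1.1, i.e. the 10% overall increase quoted above.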

      Several factors besides the effects of punctuated mutation might explain or contribute to the empirical observations. Though these are now mentioned in the paper, I will list them here for clarity:

      (B1) High APOBEC3 activity can select for inactivation of TSGs (references in Butler and Banday 2023, PMID 36978147). This could explain the empirical correlations.

      (B2) Without punctuation, the rate of multistep adaptation is expected to rise more than linearly with mutation rate. Thus, if APOBEC signatures are correlated with a high mutation rate due to the action of APOBEC, this alone could explain the correlation with TSG inactivation.

      (B3) The nature of mutations caused by APOBEC might explain the results. Notably, one of the two APOBEC mutation signatures, SBS13, is particularly likely to produce nonsense mutations. The authors count both nonsense and missense mutations, but nonsense mutations are more likely to inactivate the gene, and hence to be selected.

    1. Reviewer #1 (Public review):

      Summary:

      In this study, Jeong and Choi examine neural correlates of behavior during a naturalistic foraging task in which rats must dynamically balance resource acquisition (foraging) with the risk of threat. Rats first learn to forage for sucrose reward from a spout, and when a threat is introduced (an attack-like movement from a "LobsterBot"), they adjust their behavior to continue foraging while balancing exposure to the threat, adopting anticipatory withdrawal behaviors to avoid encounters with the LobsterBot. Using electrode recordings targeting the medial prefrontal cortex (mPFC), they identify heterogeneous encoding of task variables across prelimbic and infralimbic cortex neurons, including correlates of distance to the reward/threat zone and correlates of both anticipatory and reactionary avoidance behavior. Based on analysis of population responses, they show that prefrontal cortex switches between different regimes of population activity to process spatial information or behavioral responses to threat in a context-dependent manner. Characterization of the heterogeneous coding scheme by which frontal cortex represents information in different goal states is an important contribution to our understanding of brain mechanisms underlying flexible behavior in ecological settings.

      Strengths:

      As many behavioral neuroscience studies employ highly controlled task designs, relatively less is generally known about how the brain organizes navigation and behavioral selection in naturalistic settings, where environment states and goals are more fluid. Here, the authors take advantage of a natural challenge faced by many animals - how to forage for resources in an unpredictable environment - to investigate neural correlates of behavior when goal states are dynamic. They investigate how prefrontal cortex (mPFC) activity is structured to support different functional "modes" (here, between a navigational mode and a threat-sensitive foraging mode) for flexible behavior. Overall, an important strength and real value of this study is the design of the behavioral experiment, which is trial-structured, permitting strong statistical methods for neural data analysis, yet still rich enough for unconstrained, natural behavior structured by the animal's volitional goals. The experiment is also phased to measure behavioral changes as animals first encounter a threat, and then learn to adapt their foraging strategy to its presence. Characterization of this adaptation process is itself quite interesting and sets a foundation for further study of threat learning and risk management in the foraging context. Finally, the characterization of single-neuron and population dynamics in mPFC in this naturalistic setting with fluid goal states is an important contribution to the field. Previous studies have identified neural correlates of spatial and behavioral variables in frontal cortex, but how these representations are structured, or how they are dynamically adjusted when animals shift their goals, has been less clear. The authors synthesize their main conclusions into a conceptual model for how mPFC could encode task variables in a context-dependent manner, and provide a useful framework for thinking about circuit-level mechanisms that may support mode switching.

      Weaknesses:

      The task design in this study is intentionally stimulus-rich and places minimal constraint on the animal to preserve naturalistic behavior, and this introduces some confounds that place some limits on the interpretability of neural responses. For example, some variables which are the target of neural correlation analysis, such as spatial/proximity coding and coding of threat and threat-related behaviors, are naturally entwined. In their revisions, the authors have included extensive analyses and control conditions to disambiguate these confounds. Within the limits of their task design, this provides compelling evidence that mPFC neurons encode threat, decision, and spatial information in a context-dependent manner. Future experiment designs, which intentionally separate task contexts (e.g. navigation vs. foraging), could serve to further clarify the structure of coding across contexts and/or goal states.

      While the study provides an important advance in our understanding of mPFC coding structure under naturalistic conditions, the study still lacks functional manipulations to establish any form of causality. This limitation is acknowledged in the text, and the report is careful not to overinterpret suggestions of causal contribution, instead setting a foundation for future investigations.

    2. Reviewer #2 (Public review):

      Summary:

      Jeong & Choi (2023) use a semi-naturalistic paradigm to tackle the question of how the activity of neurons in the mPFC might continuously encode different functions. They offer two possibilities: either there are separate dedicated populations encoding each function, or cells alter their activity dependent on the current goal of the animal. In a threat-avoidance task, rats procured sucrose in an area of a chamber where, after remaining there for some amount of time, a 'Lobsterbot' robot attacked. In order to initiate the next trial, rats had to move through the arena to another area before returning to the robot encounter zone. Therefore the task has two key components: threat avoidance and navigating through space. Recordings in the IL and PL of the mPFC revealed encoding that depended on what stage of the task the animal was currently engaged in. When animals were navigating, neuronal ensembles in these regions encoded distance from the threat. However, whilst animals were directly engaged with the threat and simultaneously consuming reward, it was possible to decode from a subset of the population whether animals would evade the threat. Therefore the authors claim that neurons in the mPFC switched between two functional modes: representing allocentric spatial information, and representing egocentric information pertaining to the reward and threat. Finally, the authors propose a conceptual model based on these data whereby this switching of population encoding is driven by either bottom-up sensory information or top-down arbitration.

      Strengths:

      Whilst these multiple functions of activity in the mPFC have generally been observed in tasks dedicated to the study of a singular function, less work has been done in contexts where animals continuously switch between different modes of behaviour in a more natural way. Being able to assess whether previous findings of mPFC function apply in natural contexts is very valuable to the field, even outside of those interested in the mPFC directly. This also speaks to the novelty of the work; although mixed selectivity encoding of threat assessment and action selection has been demonstrated in some contexts (e.g. Grunfeld & Likhtik, 2018) understanding the way in which encoding changes on-the-fly in a self-paced task is valuable both for verifying whether current understanding holds true and for extending our models of functional coding in the mPFC.

      The authors are also generally thoughtful in their analyses and use a variety of approaches to probe the information encoded in the recorded activity. In particular, they use relatively close analysis of behaviour as well as manipulating the task itself by removing the threat to verify their own results. The use of such a rich task also allows them to draw comparisons, e.g. in different zones of the arena or different types of responses to threat, that a more reduced task would not otherwise allow. Additional in-depth analyses in the updated version of the manuscript, particularly the feature importance analysis, as well as complementary null findings (a lack of cohesive place cell encoding, and no difference in location coding dependent on direction of trajectory) further support the authors' conclusion that populations of cells in the mPFC are switching their functional coding based on task context rather than behaviour per se. Finally, the authors' updated model schematic proposes an intriguing and testable implementation of how this encoding switch may be manifested by looking at differentiable inputs to these populations.

      Weaknesses:

      The main remaining weakness of this study is that its findings are correlational (as the authors highlight in the discussion). Future work might aim to verify and expand the authors' findings - for example, whether the elevated response of Type 2 neurons directly contributes to the decision-making process or just represents fear/anxiety motivation/threat level - through direct physiological manipulation. However, I appreciate the challenges of interpreting data even in the presence of such manipulations, and some of the additional analyses of behaviour, for example the stability of animals' inter-lick intervals in the E-zone, go some way towards ruling out alternative behavioural explanations. Still, the ideal version of this analysis would use a pose estimation method such as DeepLabCut to more fully measure behavioural changes. This, in combination with direct physiological manipulation, would allow the authors to fully validate that the switching of encoding by this population of neurons in the mPFC has the functional attributes claimed here.

    3. Reviewer #3 (Public review):

      Summary:

      This study investigates how various behavioral features are represented in the medial prefrontal cortex (mPFC) of rats engaged in a naturalistic foraging task. The authors recorded electrophysiological responses of individual neurons as animals transitioned between navigation, reward consumption, avoidance, and escape behaviors. Employing a range of computational and statistical methods, including artificial neural networks, dimensionality reduction, hierarchical clustering, and Bayesian classifiers, the authors sought to predict from neural activity distinct task variables (such as distance from the reward zone and the success or failure of avoidance behavior). The findings suggest that mPFC neurons alternate between at least two distinct functional modes, namely spatial encoding and threat evaluation, contingent on the specific location.

      Strengths:

      This study attempts to address an important question: understanding the role of mPFC across multiple dynamic behaviors. The authors highlight the diverse roles attributed to mPFC in previous literature and seek to explain this apparent heterogeneity. They designed an ethologically relevant foraging task that facilitated the examination of complex dynamic behavior, collecting comprehensive behavioral and neural data. The analyses conducted are both sound and rigorous.

      Weaknesses:

      Because the study still lacks experimental manipulation, the findings remain correlational. The authors have appropriately tempered their claims regarding the functional role of the mPFC in the task. The nature of the switch between functional modes encoding distinct task variables (i.e., distance to reward, and threat-avoidance behavior type) is not established. Moreover, the evidence presented to dissociate movement from these task variables is not fully convincing, particularly without single-session video analysis of movement. Specifically, while the new analyses in Figure 7 are informative, they may not fully account for all potential confounding variables arising from changes in context or behavior.

      Comments on revisions:

      The authors have addressed my previous recommendations.

    1. Reviewer #1 (Public review):

      In this manuscript, Chen et al. investigate the role of the membrane estrogen receptor GPR30 in spinal mechanisms of neuropathic pain. Using a wide variety of techniques, they first provide convincing evidence that GPR30 expression is restricted to neurons within the spinal cord, and that GPR30 neurons are well-positioned to receive descending input from the primary sensory cortex (S1). In addition, the authors put their findings in the context of previous knowledge in the field, presenting evidence demonstrating that GPR30 is expressed in the majority of CCK-expressing spinal neurons. Overall, this manuscript furthers our understanding of the neural circuitry that underlies neuropathic pain and will be of broad interest to neuroscientists, especially those interested in somatosensation. Nevertheless, the manuscript would be strengthened by additional analyses and clarification of data that are currently presented.

      Strengths:

      The authors present convincing evidence for expression of GPR30 in the spinal cord that is specific to spinal neurons. Similarly, complementary approaches including pharmacological inhibition and knockdown of GPR30 are used to demonstrate a role for the receptor in driving nerve injury-induced pain in rodent models.

      Weaknesses:

      Although steps were taken to put their data into the broader context of what is already known about the spinal circuitry of pain, more considerations and analyses would help the authors better achieve their goal. For instance, to determine whether GPR30 is expressed in excitatory or inhibitory neurons, more selective markers for these subtypes should be used over CamK2. Moreover, quantitative analysis of the extent of overlap between GPR30+ and CCK+ spinal neurons is needed to understand the potential heterogeneity of the GPR30 spinal neuron population, and to interpret experiments characterizing descending S1 inputs onto GPR30 and CCK spinal neurons. Filling these gaps in knowledge would make their findings more solid.

      Revised Manuscript Update:

      In their revised manuscript, Chen et al. have added data that establish GPR30 spinal neurons as a population of excitatory neurons, half of which express CCK. These data help to position GPR30 neurons in the existing framework of spinal neuron populations that contribute to neuropathic pain, strengthening the authors' findings.

      I have no new recommendations for the authors following this round of revisions.

    2. Reviewer #3 (Public review):

      Summary:

      The authors convincingly demonstrate that a population of CCK+ spinal neurons in the deep dorsal horn express the G protein-coupled estrogen receptor GPR30 to modulate pain sensitivity in the chronic constriction injury (CCI) model of neuropathic pain in mice. Using complementary pharmacological and genetic knockdown experiments, they show that GPR30 inhibition or knockdown reverses mechanical, tactile and thermal hypersensitivity, conditioned place aversion, and c-fos staining in the spinal dorsal horn after CCI. Using slice electrophysiology, they propose that GPR30 mediates an increase in postsynaptic AMPA receptors after CCI, which may underlie the increased behavioral sensitivity. They then use anterograde tracing approaches to show that CCK- and GPR30-positive neurons in the deep dorsal horn may receive direct connections from primary somatosensory cortex. Chemogenetic activation of these dorsal horn neurons proposed to be connected to S1 increased nociceptive sensitivity in a GPR30-dependent manner. Overall, the data are very convincing and the experiments are well conducted and adequately controlled. The potential role of direct connections from S1 for descending modulation of pain and the endogenous mechanism(s) activating GPR30 will be interesting to test in future studies.

      Strengths:

      The experiments are very well executed and adequately controlled throughout the manuscript. The data are nicely presented and supportive of a role for GPR30 signaling in the spinal dorsal horn influencing nociceptive sensitivity following CCI. The authors also did an excellent job of using complementary approaches to rigorously test their hypothesis.

      Weaknesses:

      While the viral tracing demonstrates a potential connection between S1 and CCK+ or GPR30+ spinal neurons, no direct evidence is provided for S1 in facilitating any activity of these neurons in the dorsal horn.

      Comments on the latest version:

      The authors have done a good job addressing previous critiques and have appropriately revised the manuscript and conclusions.

    1. Reviewer #1 (Public review):

      In the Late Triassic (around 230 Ma ago), southern Wales and adjacent parts of England were a karst landscape. The caves and crevices accumulated remains of small vertebrates. These fossil-rich fissure fills are being exposed in limestone quarrying. In 2022 (reference 13 of the article), a partial articulated skeleton and numerous isolated bones from one fissure fill were named Cryptovaranoides microlanius and described as the oldest known squamate - the oldest known animal, by some 50 Ma, that is more closely related to snakes and some extant lizards than to other extant lizards. This would have considerable consequences for our understanding of the evolution of squamates and their closest relatives, especially for its speed and absolute timing, and was supported in the same paper by phylogenetic analyses based on different datasets.

      In 2023, the present authors published a rebuttal (ref. 18) to the 2022 paper, challenging anatomical interpretations and the irreproducible referral of some of the isolated bones to Cryptovaranoides. Modifying the datasets accordingly, they found Cryptovaranoides outside Squamata and presented evidence that it is far outside. In 2024 (ref. 19), the original authors defended most of their original interpretation and presented some new data, some of it from newly referred isolated bones. The present article discusses anatomical features and the referral of isolated bones in more detail, documents some clear misinterpretations, argues against the widespread but not justifiable practice of referring isolated bones to the same species as long as there is merely no known evidence to the contrary, further argues against comparing newly recognized fossils to lists of diagnostic characters from the literature as opposed to performing phylogenetic analyses and interpreting the results, and finds Cryptovaranoides outside Squamata again.

      Although a few of the character discussions can probably still be improved, I see no sign that the discussion is going in circles or otherwise becoming unproductive. I can even imagine that the present contribution will end it.

    2. Reviewer #2 (Public review):

      Congratulations on this revised manuscript on the phylogenetic affinities of Cryptovaranoides, and thank you for your modifications to this manuscript following review.

      This manuscript offers a careful review of the features used to hypothesize the placement of Cryptovaranoides within crown Squamata and instead suggests that this taxon represents an earlier-diverging reptile. This work therefore reconciles morphological and molecular data regarding lizard origins, which is an important contribution to the field of vertebrate paleontology.

      The authors have improved their manuscript following reviewer comments and now provide more thorough comparisons with other early reptiles and archosauromorphs, an improvement over early versions of this paper. Changes to these comparative descriptions provide important rationale concerning the absence of superficially squamate-like features in Cryptovaranoides.

      The evolutionary relationships of Cryptovaranoides among reptiles will certainly be a matter of debate until detailed anatomical descriptions of this taxon and other putative lepidosauromorphs are published. However, it can now be said with confidence that the presence of any crown squamate in the Permian or Triassic is unlikely and should be met with skepticism, the same sort of skepticism provided in this manuscript.

    3. Reviewer #3 (Public review):

      Summary:

      The study provides an interesting contribution to our understanding of Cryptovaranoides relationships, which is a matter of intensive debate among researchers. The authors have modified the manuscript according to most of my suggestions. My main concerns are about the wording of some statements, but in the end the authors have the right to phrase them as they wish. Overall the discussion and data are well prepared. I would recommend publishing the manuscript after very minor revisions.

      Strengths:

      Detailed analysis of the discussed characters. Illustrations of some comparative materials.

      Weaknesses:

      Abstract: "Our team challenged this identification and instead suggested †Cryptovaranoides had unclear affinities to living reptiles"

      Unfortunately I have to disagree again. "unclear affinities to living reptiles" can mean anything including a crown lizard. First, the 2023 paper clearly rejected the squamate hypothesis and presented some evidence that potentially places Cryptovaranoides among Archosauromorpha. In this context "unclear where it would belong within the latter" does not really matter. Second, we are not discussing here if Cryptovaranoides is a squamate or a stem-squamate. We have many more options on the table, so "unclear affinities" is too imprecise. Please change it to "could be an archosauromorph or an indeterminate neodiapsid" in the abstract to show the scale of conflicting evidence.

    1. Reviewer #1 (Public review):

      Summary and Strengths:

      The very well-written manuscript by Lövestam et al. from the Scheres/Goedert groups entitled "Twelve phosphomimetic mutations induce the assembly of recombinant full-length human tau into paired helical filaments" demonstrates the in vitro production of the so-called paired helical filament Alzheimer's disease (AD) polymorph fold of tau amyloids through the introduction of 12 point mutations that attempt to mimic the disease-associated hyper-phosphorylation of tau. The presented work is very important because it enables disease-related scientific work, including seeded amyloid replication in cells, to be performed in vitro using recombinant-expressed tau protein.

      Comments on revised version:

      The manuscript is significantly improved, as also indicated by Reviewer 2, with the 100% formation of the PHF and the additional experiments to elucidate the potential mechanism of the PTMs. This is great work.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript addresses an important impediment in the field of Alzheimer's disease (AD) and tauopathy research by showing that 12 specific phosphomimetic mutations in full-length tau allow the protein to aggregate into fibrils with the AD fold and the fold of chronic traumatic encephalopathy fibrils in vitro. The paper presents comprehensive structural and cell-based seeding data indicating the improvement of their approach over previous in vitro attempts on non-full-length tau constructs. The main weakness of this work results from the fact that only up to 70% of the tau fibrils form the desired fibril polymorphs. In addition, some of the figures are of low quality and confusing.

      Strengths:

      This study provides significant progress towards a very important and timely topic in the amyloid community, namely the in vitro production of tau fibrils found in patients.

      The 12 specific phosphomimetic mutations presented in this work will have an immediate impact in the field since they can be easily reproduced.

      Multiple high-resolution structures support the success of the phosphomimetic mutation approach.

      Additional data show the seeding efficiency of the resulting fibrils, their reduced tendency to bundle, and their ability to be labeled without affecting core structure or seeding capability.

      Comments on revised version:

      Generally, I am satisfied with the revisions. Specifically, the new result showing 100% formation of PHF is a significant improvement.

    1. Reviewer #1 (Public review):

      Summary:

      Activation of thermogenesis by cold exposure and dietary protein restriction are two lifestyle changes that impact health in humans and lead to weight loss in model organisms, here the mouse. How these affect liver and adipose tissues has not been thoroughly investigated side by side. In mice, the authors show that the responses to methionine restriction and cold exposure are tissue-specific while the effects on beige adipose are somewhat similar.

      Strengths:

      The strength of the work is the comparative approach, using transcriptomics and bioinformatic analyses to investigate the tissue-specific impact. The work was performed in mouse models and is state-of-the-art. This represents an important resource for researchers in the field of protein restriction and thermogenesis.

      Weaknesses:

      The findings are descriptive and the conclusions remain associative. The work is limited to mouse physiology and the human implications have not been investigated yet.

    2. Reviewer #2 (Public review):

      Summary:

      This study provides a library of RNA sequencing analysis from brown fat, liver and white fat of mice treated with two stressors - cold challenge and methionine restriction - alone and in combination (interaction between diet and temperature). They characterize the physiologic response of the mice to the stressors, including effects on weight, food intake and metabolism. This paper provides evidence that while both stressors increase energy expenditure, there are complex tissue-specific responses in gene expression, with additive, synergistic and antagonistic responses seen in different tissues.

      Strengths:

      The study design and implementation are solid and well-controlled. Their writing is clear and concise. The authors do an admirable job of distilling the complex transcriptome data into digestible information for presentation in the paper. Most importantly, they do not overreach in their interpretation of their genomic data, keeping their conclusions appropriately tied to the data presented. The discussion is well thought out and addresses some interesting points raised by their results.

      Weaknesses:

      The major weakness of the paper is the almost complete reliance on RNA sequencing data, but it is presented as a transcriptomic resource.

    3. Reviewer #3 (Public review):

      Summary:

      Ruppert et al. present a well-designed 2×2 factorial study directly comparing methionine restriction (MetR) and cold exposure (CE) across liver, iBAT, iWAT, and eWAT, integrating physiology with tissue-resolved RNA-seq. This approach allows a rigorous assessment of where dietary and environmental stimuli act additively, synergistically, or antagonistically. Physiologically, MetR progressively increases energy expenditure (EE) at 22 °C and lowers RER, indicating a lipid utilization bias. By contrast, a 24-hour 4 °C challenge elevates EE across all groups and eliminates MetR-Ctrl differences. Notably, changes in food intake and activity do not explain the MetR effect at room temperature.

      Strengths:

      The data convincingly support the central claim: MetR enhances EE and shifts fuel preference to lipids at thermoneutrality, while CE drives robust EE increases regardless of diet and attenuates MetR-driven differences. Transcriptomic analysis reveals tissue-specific responses, with additive signatures in iWAT and CE-dominant effects in iBAT. The inclusion of explicit diet×temperature interaction modeling and GSEA provides a valuable transcriptomic resource for the field.

      Comments on revisions:

      The authors have addressed any concerns I had.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript describes a study examining the relationship between microsaccades and covert attention. This question has been widely investigated, with numerous studies showing that during sustained fixation, when subjects covertly attend to a peripheral stimulus, microsaccades tend to be biased toward the attended location. Here, the authors ask whether this microsaccade bias reflects a shift of covert attention or the maintenance of covert attention. They conclude that the bias is primarily driven by attention shifts, a finding that also helps reconcile the seemingly conflicting results of prior research, where the bias was questioned in paradigms that largely involved attention maintenance rather than shifting.

      Strengths:

      The paradigm and conclusions appear sound and supported by the results. A large sample size was used.

      Weaknesses:

      Weaknesses are mostly related to how the authors enforced fixation in the task, and clarifications are needed regarding some methodological details. A more direct comparison of the effect in the two experimental conditions is missing.

    2. Reviewer #2 (Public review):

      Summary:

      This study aims to test the hypothesis that microsaccades are linked to the shifting of spatial attention, rather than the maintenance of attention at the cued location. In two experiments, participants were required to judge an orientation change at either a validly cued location (80% of the time) or an invalidly cued location (20% of the time). This change was presented at varying intervals (ranging from 500 to 3,200 ms) after cue onset. Accuracy and reaction times both showed attentional benefits at the valid versus invalid location across the different cue-target intervals. In contrast, microsaccade biases were time-dependent. The authors report a directional bias primarily observed around 400 ms after the cue, with later intervals (particularly in Experiment 2) exhibiting no biases in microsaccade direction towards the cued location. The authors argue that this finding supports their initial hypothesis that microsaccade biases reflect shifts in attention, but that maintaining attention at the cued location after an attention shift is not correlated with microsaccade direction.

      Strengths:

      The results are straightforward given the chosen experimental design. The manuscript is clearly written, and the presentation of the study and its visualisations are both of a high standard.

      Weaknesses:

      The major weakness of this paper is its incremental contribution to a widely studied phenomenon. The link between attention and microsaccades has been the subject of extensive research over the past two decades. This study merely provides a limited overview of the key insights gained from these papers and discussions. In fact, it attempts to summarise previous work by stating that many experiments found a link, while others did not, and provides only a relatively small number of references. To make a significant contribution, I believe the authors should evaluate the field more thoroughly, rather than merely scratching the surface.

      The authors then present a potential solution to the conflicting past findings, arguing that attention should be considered a dynamic process that can be broken down into an attention shift and a sustained attention phase. Although the authors present this as a novel concept, I cannot think of anyone in the field who considers spatial attention to be a static entity. Nevertheless, I was curious to see how the authors would attempt to determine the precise timing of the attention shift and manipulate the different stages individually. However, the authors only varied the interval between the onset of the attention cue and the test stimulus, failing to further pinpoint their dynamic attention concept.

      The current version of the experiment, therefore, takes a correlational approach, similar to initial studies by Engbert and Kliegl (2003) and Hafed and Clark (2002). Meanwhile, we have learned a great deal about the link between microsaccades and attention. Below, I will list just a few of these findings to demonstrate how much we already know. It is important to note that, while the present study cites some of these papers, it does not provide a clear overview of how the current study goes beyond previous research.

      (1) Yuval-Greenberg and colleagues (2014) presented stimuli contingent on online-detected microsaccades. A postcue indicated the target for a visual task, and the target could be congruent or incongruent with the microsaccade direction. The authors showed higher visual accuracy in congruent trials. The authors cited that paper, but it is still important to emphasize how this study already tried to go beyond purely correlational links on a single trial level.

      (2) The Desimone lab (Lowet et al., 2018) showed that firing rates in monkey V4 and IT were increased when a microsaccade was generated in the direction of the attended target.

      (3) However, attention can modulate responses in the superior colliculus even in the absence of microsaccades (Yu et al., 2022).

      (4) Similarly, Poletti, Rucci & Carrasco (2017) observed attentional modulations in the absence of microsaccades, or comparable attention effects irrespective of whether a microsaccade occurred or not (Roberts & Carrasco, 2019).

      Thus, in light of these insights, I believe the current study only adds incrementally to our understanding of the link between microsaccades and spatial attention.

      In general, it is important to have an independent measure of the dynamics of an attention shift. I think a shift of 200-600 ms is quite long, and defining this interval is rather arbitrary. Why consider such a long delay as the shift? Rather than taking a data-driven approach to defining an interval for an attention shift, it would be more convincing to derive an interval of interest based on past research or an independent measure.

      The present analyses report microsaccade statistics across all trials, but do not directly link single-trial microsaccades to accuracy. Similarly, reaction times and accuracy were analyzed only with respect to valid vs. invalid trials. Here, it would be important to link the findings between microsaccades and performance on a single-trial level. For instance, can the authors report reaction times and accuracy also separately for trials with vs. without microsaccades, and for trials with congruent vs. incongruent microsaccades?

      The study would benefit greatly from including a neutral condition to substantiate claims of attentional benefits and costs. It is highly probable that invalid trials would also demonstrate costs in terms of reaction times and accuracy. It would be interesting to observe whether directional biases in microsaccades are also evident when compared to a neutral condition.

    1. Reviewer #1 (Public review):

      Summary:

      The authors report intracranial EEG findings from 12 epilepsy patients performing an associative recognition memory task under the influence of scopolamine. They show that scopolamine administered before encoding disrupts hippocampal theta phenomena and reduces memory performance, and that scopolamine administered after encoding but before retrieval impairs hippocampal theta phenomena (theta power, theta phase reset) and neural reinstatement but does not impair memory performance. This is an important study with exciting, novel results and translational implications. The manuscript is well-written, the analyses are thorough and comprehensive, and the results seem robust.

      Strengths:

      (1) Very rare experimental design (intracranial neural recordings in humans coupled with pharmacological intervention).

      (2) Extensive analysis of different theta phenomena.

      (3) Well-established task with different conditions for familiarity versus recollection.

      (4) Clear presentation of findings and excellent figures.

      (5) Translational implications for diseases with cholinergic dysfunction (e.g., AD).

      (6) Findings challenge existing memory models, and the discussion presents interesting novel ideas.

      Weaknesses:

      (1) One of the most important results is the lack of memory impairment when scopolamine is administered after encoding but before retrieval (scopolamine block 2). The effect goes in the same direction as for scopolamine during encoding (p = 0.15). Could it be that this null effect is simply due to reduced statistical power (12 subjects with only one block per subject, while there are two blocks per subject for the condition with scopolamine during encoding), which may become significant with more patients? Is there actually an interaction effect indicating that memory impairment is significantly stronger when scopolamine is applied before encoding (Figure 1d)? Similar questions apply to familiarity versus recollection (lines 78-80). This is a very critical point that could alter major conclusions from this study, so more discussion/analysis of these aspects is needed. If there are no interaction effects, then the statements in lines 84-86 (and elsewhere) should be toned down.

      (2) Further, could it simply be that scopolamine hadn't reached its major impact during retrieval after administration in block 2? Figure 2e speaks in favor of this possibility. I believe this is a critical limitation of the experimental design that should be discussed.

      (3) It is not totally clear to me why slow theta was excluded from the reinstatement analysis. For example, despite an overall reduction in theta power, relative patterns may have been retained between encoding and recall. What are the results when using 1-128 Hz as input frequencies?

      (4) In what way are the results affected by epileptic artifacts occurring during the task (in particular, IEDs)?
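
      The interaction question raised in point (1) could be examined with a within-subject test on the difference between the two impairments. What follows is only a minimal sketch, assuming each patient contributes one placebo-minus-scopolamine accuracy difference per condition. The function name `sign_flip_p_value` and all numbers are hypothetical placeholders, not data from the study.

```python
import itertools
from statistics import mean

def sign_flip_p_value(diffs):
    """Exact two-sided sign-flip permutation test of mean(diffs) == 0."""
    observed = abs(mean(diffs))
    n_extreme = 0
    n_total = 0
    # Under the null, each within-subject difference is equally likely
    # to be positive or negative, so enumerate every sign assignment.
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        permuted = abs(mean(s * d for s, d in zip(signs, diffs)))
        if permuted >= observed - 1e-12:  # tolerance for float rounding
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total

# Hypothetical per-patient impairments (placebo minus scopolamine accuracy).
impair_encoding = [0.12, 0.08, 0.15, 0.05, 0.10, 0.09,
                   0.14, 0.02, 0.11, 0.07, 0.13, 0.06]
impair_retrieval = [0.04, 0.06, 0.03, 0.05, 0.01, 0.07,
                    0.02, 0.03, 0.05, 0.02, 0.06, 0.04]

# Interaction contrast: is the impairment larger when scopolamine is
# given before encoding than when it is given before retrieval?
contrast = [e - r for e, r in zip(impair_encoding, impair_retrieval)]
p_interaction = sign_flip_p_value(contrast)
print(f"interaction p = {p_interaction:.4f}")
```

      With 12 patients the enumeration covers only 4,096 sign assignments, so an exact test is feasible. The sketch does not account for the unequal number of blocks per condition, which would need to be handled (for example, by averaging blocks within patient) before drawing conclusions.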

    2. Reviewer #2 (Public review):

      Summary:

      In this study, performed in human patients, the authors aimed at dissecting out the role of cholinergic modulation in different types of memory (recollection-based vs familiarity and novelty-based) and during different memory phases (encoding and retrieval). Moreover, their goal was to obtain the electrophysiological signature of cholinergic modulation on network activity of the hippocampus and the entorhinal cortex.

      Strengths:

      The authors combined cognitive tasks and intracranial EEG recordings in neurosurgical epilepsy patients. The study confirms previous evidence regarding the deleterious effects of scopolamine, a muscarinic acetylcholine receptor antagonist, on memory performance when administered prior to the encoding phase of the task. During both encoding and retrieval phases, scopolamine disrupts the power of theta oscillations in terms of amplitude and phase synchronization. These results raise the question of the role of theta oscillations during retrieval and the meaning of scopolamine's effect on retrieval-associated theta rhythm without cognitive changes. The authors clearly discussed this issue in the Discussion section.

      A major point is the finding that the scopolamine-mediated effect is selective for recollection-based memory and not for familiarity- and novelty-based memory.

      The methodology used is powerful, and the data underwent a detailed and rigorous analysis.

      Weaknesses:

      A limited cohort of patients; the age of the patients is not specified in the table.

    1. Joint Public Review:

      Summary

      Non-alcoholic fatty liver disease (NAFLD) is a widespread metabolic disease associated with obesity. Endoplasmic reticulum and calcium dysregulation are hallmarks of NAFLD. Here, the authors explore whether the secreted liver protein transthyretin (TTR), which has been previously shown to modulate calcium signaling in the context of insulin resistance, could also impact NAFLD. The study is motivated by a small cohort of NASH patients who show elevated TTR levels. The authors then overexpress TTR in two mouse obesogenic models, which leads to elevated liver lipid deposition. In contrast, liver-specific TTR knockdown improves some liver lipid levels, reduces inflammation markers, and improves glucose tolerance, overall improving the NAFLD markers. These phenotypic findings are overall convincing and largely consistent in two different diet models.

      Because of TTR's connection to calcium regulation, the authors then assess whether the knockdown affects ER stress and impacts SERCA2 expression. However, the direct mechanistic evidence supporting the central claim that TTR physically interacts with and inhibits the SERCA2 calcium pump is preliminary and requires further validation. Whether the broader effects on lipid accumulation, inflammation markers, and glucose tolerance are mechanistically connected remains to be determined.

      Strengths

      The premise of the study is built on prior work from the authors identifying a link between increased transthyretin secretion and the development of insulin resistance, a related obesity condition. The in vivo studies are comprehensive, using human NASH samples, two distinct diet-induced mouse models (HFD and GAN), and in vitro hepatocyte models. The phenotypic data showing that TTR knockdown alleviates steatosis, inflammation, and insulin resistance are robust and convincing across these systems.

      Weaknesses

      The mechanistic studies in Figures 6-9 are incomplete. There are several issues encompassing experimental design, rigor, and interpretation that, if properly addressed, would make the study much stronger.

      (1) Exogenous TTR that is endocytosed by cells is unlikely to ever find itself inside the lumen of the ER. Conversely, endogenous TTR that is produced in cells and that has not yet been secreted is almost certain to have an ER lumenal localization (as in Figures 7B and 9A, and where an apparent colocalization with SERCA is likely to be incidental). In a model where TTR, acting as a hepatokine, has inhibitory effects on SERCA, these would almost certainly be realized from the cytosolic side of the ER membrane, a region inaccessible to lumenal endogenous TTR. It is possible that the overexpression and knockdown of endogenous TTR have the effects seen due to its secretion and uptake (that is, cell-non-autonomous effects), but this possibility was not directly tested through Transwell or similar assays. Given the identity of TTR as a secretory pathway client protein, the only localization data for TTR that are unexpected are those suggesting an ER localization of exogenously added TTR (Figure 7A), but this localization seems to involve only a minor population of TTR, is hindered by a technical issue with cell permeabilization (see below), and lacks orthogonal approaches to convincingly demonstrate meaningful localization of exogenous TTR at the ER membrane.

      (2) The experimental logic in Figure 8 is problematic. The authors use Thapsigargin (Tg), a potent and specific SERCA inhibitor, to probe SERCA function. However, since both Tg and TTR are proposed to inhibit SERCA2, the design lacks a critical control to demonstrate that TTR's effects are indeed mediated through SERCA2. SERCA2 activity should, in principle, be fully and irreversibly inhibited by Tg treatment, especially using such a high concentration (5 µM). If TTR's effect on calcium flux is exclusively through SERCA2, then SERCA2 impairment by TTR should have no additional effect in the presence of Tg, as Tg would already be maximally inhibiting the pump. The current data (Figures 8G-H) showing an effect of TTR-KD even with Tg present is difficult to interpret and may suggest off-target or compensatory mechanisms.

      (3) The coIP data in Figure 9 need to be better controlled, including by overexpression of FLAG- and MYC-tagged irrelevant proteins, ideally also localized to the ER. The coIP of overexpressed TTR with endogenous SERCA in Figure 9D, in addition to requiring a more rigorous control, is itself of relatively low quality, with the appearance of a possible gel/blotting artifact.

      (4) The ER stress markers in Figure 6 are not convincing. Molecular weight markers and positive controls (for example, livers from animals injected with tunicamycin) are missing. In addition, the species of ATF6 that is purportedly being detected (cleaved or full-length) is not indicated, and this protein is also notoriously difficult to detect with convincing specificity in mouse tissues. As well, CHOP protein is usually not detectable in control normal diet mouse livers, raising questions of whether the band identified as CHOP is, in fact, CHOP. These issues, along with the observation that ER stress-regulated RNAs are not altered (Figure S5), raise the question of whether ER stress is involved at all. Likewise, the quantification of SERCA2 levels from Figure 6 requires more rigor. For all blots, it isn't clear that analyzing only 3 or 4 of the animals provides adequate and unbiased power to detect differences; in addition, in Figure 6C, at least the SERCA2 exposure (assuming SERCA2 is being specifically detected; see above) is well beyond the linear range of quantification.

      In addition, the following important issues were raised:

      (5) n=4 for overexpression might not provide adequate statistical power.

      (6) The error for human NASH samples and controls in Figure 1A is surprisingly small. Larger gene expression data sets from NASH cohorts exist and should be used to test the finding in a larger population.

      (7) For experiments involving two independent variables (e.g., diet and TTR manipulation, as in Figures 2, 3, 4, 5), a two-way ANOVA must be used instead of a one-way ANOVA or t-tests. Also, the ND-TTR-KD group is missing; these data are an essential control to show the specificity of the knockdown and its effects in a non-diseased state.

      (8) Figure 7A: The co-localization signal between TTR-Alexa488 and the ER marker is not strong or convincing. This could be due to the immunofluorescence protocol, in which cells were permeabilized prior to fixation. The standard and recommended order is fixation first (to preserve cellular architecture), followed by permeabilization.
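
      To make the request in point (7) concrete, the following is a minimal sketch of a balanced two-way ANOVA with replication, written in plain Python. The function name `two_way_anova`, the group labels, and all values are hypothetical and only illustrate a full 2×2 design (diet × TTR manipulation) that includes the ND-TTR-KD group; they are not data from the study.

```python
from statistics import mean

def two_way_anova(cells):
    """Balanced two-way ANOVA with replication.

    cells maps (level_a, level_b) -> list of observations; every cell
    must hold the same number of observations (at least two).
    Returns F statistics for factor A, factor B, and their interaction.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # replicates per cell

    grand = mean(x for obs in cells.values() for x in obs)
    a_mean = {a: mean(x for b in b_levels for x in cells[(a, b)]) for a in a_levels}
    b_mean = {b: mean(x for a in a_levels for x in cells[(a, b)]) for b in b_levels}
    cell_mean = {k: mean(v) for k, v in cells.items()}

    # Sums of squares for the main effects, interaction, and residual.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                    for a in a_levels for b in b_levels)
    ss_err = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_err = len(a_levels) * len(b_levels) * (n - 1)
    ms_err = ss_err / df_err
    return {
        "F_A": (ss_a / df_a) / ms_err,
        "F_B": (ss_b / df_b) / ms_err,
        "F_AxB": (ss_ab / (df_a * df_b)) / ms_err,
        "df_error": df_err,
    }

# Hypothetical liver triglyceride values, four animals per group.
example = {
    ("ND", "ctrl"): [10.0, 11.0, 9.5, 10.5],
    ("ND", "KD"): [9.8, 10.4, 10.1, 9.9],
    ("HFD", "ctrl"): [22.0, 24.5, 21.0, 23.5],
    ("HFD", "KD"): [15.0, 16.5, 14.0, 17.0],
}
print(two_way_anova(example))
```

      The interaction term (`F_AxB`) is the quantity that speaks to whether the knockdown effect depends on diet. Converting the F statistics to p-values requires the F distribution (for example, `scipy.stats.f.sf`), which is left out here to keep the sketch dependency-free.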

    1. Reviewer #1 (Public review):

      In this paper, Stanojcic and colleagues attempt to map sites of DNA replication initiation in the genome of the African trypanosome, Trypanosoma brucei. Their approach to this mapping is to isolate 'short-nascent strands' (SNSs), a strategy adopted previously in other eukaryotes (including in the related parasite Leishmania major), which involves isolation of DNA molecules whose termini contain replication-priming RNA. By mapping the isolated and sequenced SNSs to the genome (SNS-seq), the authors suggest that they have identified origins, which they localise to intergenic (strictly, inter-CDS) regions within polycistronic transcription units and suggest display very extensive overlap with previously mapped R-loops in the same loci. Finally, having defined locations of SNS-seq mapping, they suggest they have identified G4 and nucleosome features of origins, again using previously generated data. Though there is merit in applying a new approach to understand DNA replication initiation in T. brucei, where previous work has used MFA-seq and ChIP of a subunit of the Origin Replication Complex (ORC), there are two significant deficiencies in the study that must be addressed to ensure rigour and accuracy.

      (1) The suggestion that the SNS-seq data is mapping DNA replication origins that are present in inter-CDS regions of the polycistronic transcription units of T. brucei is novel and does not agree with existing data on the localisation of ORC1/CDC6, and it is very unclear if it agrees with previous mapping of DNA replication by MFA-seq due to the way the authors have presented this correlation. For these reasons, the findings essentially rely on a single experimental approach, which must be further tested to ensure SNS-seq is truly detecting origins. Indeed, in this regard, the very extensive overlap of SNS-seq signal with RNA-DNA hybrids should be tested further to rule out the possibility that the approach is mapping these structures and not origins.

      (2) The authors' presentation of their SNS-seq data is too limited and therefore potentially provides a misleading view of DNA replication in the genome of T. brucei. The work is presented through a narrow focus on SNS-seq signal in the inter-CDS regions within polycistronic transcription units, which constitute only part of the genome, ignoring both the transcription start and stop sites at the ends of the units and the large subtelomeres, which are mainly transcriptionally silent. The authors must present a fuller and more balanced view of SNS-seq mapping across the whole genome to ensure full understanding and clarity.

    2. Reviewer #2 (Public review):

      Summary:

      Stanojcic et al. investigate the origins of DNA replication in the unicellular parasite Trypanosoma brucei. They perform two experiments, stranded SNS-seq and DNA molecular combing. Further, they integrate various publicly available datasets, such as G4-seq and DRIP-seq, into their extensive analysis. Using this data, they elucidate the structure of the origins of replication. In particular, they find various properties located at or around origins, such as polynucleotide stretches, G-quadruplex structures, regions of low and high nucleosome occupancy, R-loops, and that origins are mostly present in intergenic regions. Combining their population-level SNS-seq and their single-molecule DNA molecular combing data, they elucidate the total number of origins as well as the number of origins active in a single cell.

      Strengths:

      (1) A very strong part of this manuscript is that the authors integrate several other datasets and investigate a large number of properties around origins of replication. Data analysis clearly shows the enrichment of various properties at the origins, and the manuscript concludes with a very well-presented model that clearly explains the authors' understanding and interpretation of the data.

      (2) The DNA combing experiment is an excellent orthogonal approach to the SNS-seq data. The authors used the different properties of the two experiments (one giving location information, one giving single-molecule information) well to extract information and contrast the experiments.

      (3) The discussion is exemplary, as the authors openly discuss the strengths and weaknesses of the approaches used. Further, the discussion serves its purpose of putting the results in both an evolutionary and a trypanosome-focused context.

      Weaknesses:

      I have major concerns about the origin of replication sites determined from the SNS-seq data. As a caveat, I want to state that, before reading this manuscript, SNS-seq was unknown to me; hence, some of my concerns might be misplaced.

      (1) I do not understand why SNS-seq would create peaks. Replication should originate in one locus, then move outward in both directions until the replication fork moving outward from another origin is encountered. Hence, in an asynchronous population average measurement, I would expect SNS data to be broad regions of + and -, which, taken together, cover the whole genome. Why are there so many regions not covered at all by reads, and why are there such narrow peaks?

      (2) I am concerned that up to 96% of all peaks are filtered away. If there is so much noise in the data, how can one be sure that the peaks that remain are real? Specifically, if the authors placed the same number of peaks as was measured randomly in intergenic regions, would 4% of these peaks pass the filtering process by chance?

      (3) There are 3 previous studies that map origins of replication in T. brucei. Devlin et al. 2016, Tiengwe et al. 2012, and Krasiļņikova et al. 2025 (https://doi.org/10.1038/s41467-025-56087-3), all with a different technique: MFA-seq. All three previous studies mostly agree on the locations and number of origins. The authors compared their results to the first two, but not the last study; they found that their results are vastly different from the previous studies (see Supplementary Figure 8A). In their discussion, the authors defend this discrepancy mostly by stating that the discrepancy between these methods has been observed in other organisms. I believe that, given the situation that the other studies precede this manuscript, it is the authors' duty to investigate the differences more than by merely pointing to other organisms. A conclusion should be reached on why the results are different, e.g., by orthogonally validating origins absent in the previous studies.

      (4) Some patterns that were identified to be associated with origins of replication, such as G-quadruplexes and nucleosome phasing, are known to be biases of SNS-seq (see Foulk et al. Characterizing and controlling intrinsic biases of lambda exonuclease in nascent strand sequencing reveals phasing between nucleosomes and G-quadruplex motifs around a subset of human replication origins. Genome Res. 2015;25(5):725-735. doi:10.1101/gr.183848.114).
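
      The random-placement control asked for in point (2) could look roughly like the sketch below. It assumes a stand-in filter (a peak is kept only if a peak on the opposite strand lies within `max_gap` bp) and treats the intergenic space as one contiguous interval. Both are simplifying assumptions; the authors' actual filtering criteria and genome annotation would have to be substituted. All names and parameter values are hypothetical.

```python
import random

def passes_filter(peak, peaks, max_gap):
    """Stand-in filter: keep a peak if an opposite-strand peak is nearby."""
    pos, strand = peak
    return any(s != strand and abs(p - pos) <= max_gap for p, s in peaks)

def null_pass_fraction(n_peaks, region_length, n_sim=20, max_gap=1000, seed=0):
    """Mean fraction of randomly placed peaks that pass the filter."""
    rng = random.Random(seed)
    fractions = []
    for _ in range(n_sim):
        # Place peaks uniformly at random with a random strand.
        peaks = [(rng.randrange(region_length), rng.choice("+-"))
                 for _ in range(n_peaks)]
        kept = sum(passes_filter(pk, peaks, max_gap) for pk in peaks)
        fractions.append(kept / n_peaks)
    return sum(fractions) / n_sim

chance = null_pass_fraction(n_peaks=300, region_length=5_000_000)
print(f"fraction passing by chance: {chance:.3f}")
```

      Comparing this chance fraction with the roughly 4% of peaks retained in the real data would indicate whether the surviving peaks exceed what random placement alone produces.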

      Are the claims well substantiated?:

      My opinion on whether the authors' results support their conclusions depends on whether my concerns about the sites determined from the SNS-seq data can be dismissed. In the case that these concerns can be dismissed, I do think that the claims are compelling.

      Impact:

      If the origins of replication prove to be distributed as claimed, this study has the potential to be important for two fields. Firstly, in research focused on T. brucei as a disease agent, where essential processes that function differently than in mammals are excellent drug targets. Secondly, this study would impact basic research analyzing DNA replication over the evolutionary tree, where T. brucei can be used as an early-divergent eukaryotic model organism.

    1. Reviewer #1 (Public review):

      Summary:

      The novel advance by Wang et al is in the demonstration that, relative to a standard extinction procedure, the retrieval-extinction procedure more effectively suppresses responses to a conditioned threat stimulus when testing occurs just minutes after extinction. The authors provide solid evidence to show that this "short-term" suppression of responding involves engagement of the dorsolateral prefrontal cortex.

      Strengths:

      Overall, the study is well-designed and the results are valuable. There are, however, a few issues in the way that it is introduced and discussed. It would have been useful if the authors could have more explicitly related the results to a theory - it would help the reader understand why the results should have come out the way that they did. More specific comments are presented below.

      Please note: The authors appear to have responded to my original review twice. It is not clear that they observed the public review that I edited after the first round of revisions. As part of these edits, I removed the entire section titled Clarifications, Elaborations and Edits.

      Theory and Interpretation of Results

      (1) It is difficult to appreciate why the first trial of extinction in a standard protocol does NOT produce the retrieval-extinction effect. This applies to the present study as well as others that have purported to show a retrieval-extinction effect. The importance of this point comes through at several places in the paper. E.g., the two groups in study 1 experienced a different interval between the first and second CS extinction trials; and the results varied with this interval: a longer interval (10 min) ultimately resulted in less reinstatement of fear than a shorter interval. Even if the different pattern of results in these two groups was shown/known to imply two different processes, there is nothing in the present study that addresses what those processes might be. That is, while the authors talk about mechanisms of memory updating, there is little in the present study that permits any clear statement about mechanisms of memory. The references to a "short-term memory update" process do not help the reader to understand what is happening in the protocol.

      In reply to this point, the authors cite evidence to suggest that "an isolated presentation of the CS+ seems to be important in preventing the return of fear expression." They then note the following: "It has also been suggested that only when the old memory and new experience (through extinction) can be inferred to have been generated from the same underlying latent cause, the old memory can be successfully modified (Gershman et al., 2017). On the other hand, if the new experiences are believed to be generated by a different latent cause, then the old memory is less likely to be subject to modification. Therefore, the way the 1st and 2nd CS are temporally organized (retrieval-extinction or standard extinction) might affect how the latent cause is inferred and lead to different levels of fear expression from a theoretical perspective." This merely begs the question: why might an isolated presentation of the CS+ result in the subsequent extinction experiences being allocated to the same memory state as the initial conditioning experiences?

      This is not addressed in the paper. I accept that the study was not designed to address this question, and that the question did not need to be addressed for the set of results to be interesting. However, understanding how and why the retrieval-extinction protocol produces the effects that it does in the long-term test of fear expression would greatly inform our understanding of how and why the retrieval-extinction protocol has the effects that it does in the short-term tests of fear expression. To be clear, the results of the present study are very interesting - there is no denying that. I am not asking the authors to change anything in response to this point. It simply stands as a comment on the work that has been done in this paper and the area of research more generally.

      (2) The discussion of memory suppression is potentially interesting but raises many questions. That is, memory suppression is invoked to explain a particular pattern of results but I, as the reader, have no sense of why a fear memory would be better suppressed shortly after the retrieval-extinction protocol compared to the standard extinction protocol; and why this suppression is NOT specific to the cue that had been subjected to the retrieval-extinction protocol. I accept that the present study was not intended to examine aspects of memory suppression, and that it is a hypothesis proposed to explain the results collected in this study. I am not asking the authors to change anything in response to this point. Again, it simply stands as a comment on the work that has been done in this paper.

      (3) The authors have inserted the following text in the revised manuscript: "It should be noted that while our long-term amnesia results were consistent with the fear memory reconsolidation literatures, there were also studies that failed to observe fear prevention (Chalkia, Schroyens, et al., 2020; Chalkia, Van Oudenhove, et al., 2020; Schroyens et al., 2023). Although the memory reconsolidation framework provides a viable explanation for the long-term amnesia, more evidence is required to validate the presence of reconsolidation, especially at the neurobiological level (Elsey et al., 2018). While it is beyond the scope of the current study to discuss the discrepancies between these studies, one possibility to reconcile these results concerns the procedure for the retrieval-extinction training. It has been shown that the eligibility for old memory to be updated is contingent on whether the old memory and new observations can be inferred to have been generated by the same latent cause (Gershman et al., 2017; Gershman and Niv, 2012). For example, prevention of the return of fear memory can be achieved through gradual extinction paradigm, which is thought to reduce the size of prediction errors to inhibit the formation of new latent causes (Gershman, Jones, et al., 2013). Therefore, the effectiveness of the retrieval-extinction paradigm might depend on the reliability of such paradigm in inferring the same underlying latent cause." ***It is perfectly fine to state that "the effectiveness of the retrieval-extinction paradigm might depend on the reliability of such paradigm in inferring the same underlying latent cause..." This is not uninteresting; but it also isn't saying much. Ideally, the authors would have included some statement about factors that are likely to determine whether one is or isn't likely to see a retrieval-extinction effect, grounded in terms of the latent state theories that have been invoked here. 
Presumably, the retrieval-extinction protocol has variable effects because of procedural differences that affect whether subjects infer the same underlying latent cause when shifted into extinction. Surely, the clinical implications of any findings are seriously curtailed unless one understands when a protocol is likely to produce an effect; and why the effect occurs at all? This question is rhetorical. I am not asking the authors to change anything in response to this point. Again, it stands as a comment on the work that has been done in this paper; and remains a comment after insertion of the new text, which is acknowledged and appreciated.

      (4) The authors find different patterns of responses to CS1 and CS2 when they were tested 30 min after extinction versus 24 h after extinction. On this basis, they infer distinct memory update mechanisms. However, I still can't quite see why the different patterns of responses at these two time points after extinction need to be taken to infer different memory update mechanisms. That is, the different patterns of responses at the two time points could be indicative of the same "memory update mechanism" in the sense that the retrieval-extinction procedure induces a short-term memory suppression that serves as the basis for the longer-term memory suppression (i.e., the reconsolidation effect). My pushback on this point is based on the notion of what constitutes a memory update mechanism; and is motivated by what I take to be a rather loose use of language/terminology in the reconsolidation literature and this paper specifically (for examples, see the title of the paper and line 2 of the abstract).

      To be clear: I accept the authors' reply that "The focus of the current manuscript is to demonstrate that the retrieval-extinction paradigm can also facilitate a short-term fear memory deficit measured by SCR". However, I disagree with the claim that any short-term fear memory deficit must be indicative of "update mechanisms other than reconsolidation", which appears on Line 27 in the abstract and very much indicates the spirit of the paper. To make the point: the present study has examined the effectiveness of a retrieval-extinction procedure in suppressing fear responses 30 min, 6 hours and 24 hours after extinction. There are differences across the time points in terms of the level of suppression, its cue specificity, and its sensitivity to manipulation of activity in the dlPFC. This is perfectly interesting when not loaded with additional baggage re separable mechanisms of memory updating at the short and long time points: there is simply no evidence in this study or anywhere else that the short-term deficit in suppression of fear responses has anything whatsoever to do with memory updating. It can be exactly what is implied by the description: a short-term deficit in the suppression of fear responses. Again, this stands as a comment on the work that has been done; and remains a comment for the revised paper.

      (5) It is not clear why thought control ability ought to relate to any aspect of the suppression that was evident in the 30 min tests - that is, I accept the correlation between thought control ability and performance in the 30 min tests but would have liked to know why this was looked at in the first place and what, if anything, it means. The issue at hand is that, as best as I can tell, there is no theory to which the result from the short- and long-term tests can be related. The attempts to fill this gap with reference to phenomena like retrieval-induced forgetting are appreciated but raise more questions than answers. This is especially clear in the discussion, where it is acknowledged/stated: "Inspired by the similarities between our results and suppression-induced declarative memory amnesia (Gagnepain et al., 2017), we speculate that the retrieval-extinction procedure might facilitate a spontaneous memory suppression process and thus yield a short-term amnesia effect. Accordingly, the activated fear memory induced by the retrieval cue would be subjected to an automatic fear memory suppression through the extinction training (Anderson and Floresco, 2022)." There is nothing in the subsequent discussion to say why this should have been the case other than the similarity between results obtained in the present study and those in the literature on retrieval induced forgetting, where the nature of the testing is quite different. Again, this is simply a comment on the work that has been done - no change is required for the revised paper.

    2. Reviewer #2 (Public review):

      Summary

      The study investigated whether memory retrieval followed soon by extinction training results in a short-term memory deficit when tested - with a reinstatement test that results in recovery from extinction - soon after extinction training. Experiment 1 documents this phenomenon using a between-subjects design. Experiment 2 used a within-subject control and found that the effect is also observed in a control condition. In addition, it revealed that if testing is conducted 6 hours after extinction, there is no effect of retrieval prior to extinction, as there is recovery from extinction independently of retrieval prior to extinction. A third group revealed that retrieval followed by extinction attenuates reinstatement when the test is conducted 24 hours later, consistent with previous literature. Finally, Experiment 3 used continuous theta-burst stimulation of the dorsolateral prefrontal cortex and assessed whether inhibition of that region (vs a control region) reversed the short-term effect revealed in Experiments 1 and 2. The results of control groups in Experiment 3 replicated the previous findings (short-term effect), and the experimental group revealed that these can be reversed by inhibition of the dorsolateral prefrontal cortex.

      Strengths

      The work is performed using standard procedures (fear conditioning and continuous theta-burst stimulation) and there is some justification of the sample sizes. The results replicate previous findings - some of which have been difficult to replicate and this needs to be acknowledged - and suggest that the effect can also be observed in a short-term reinstatement test.

      The study establishes links between the memory reconsolidation and retrieval-induced forgetting (or memory suppression) literatures. The explanations that have been developed for these are distinct and the current results integrate them, by revealing that DLPFC activity is involved in the short-term retrieval-extinction effect. There is thus some novelty in the present results, but numerous questions remain unaddressed.

      Weakness

      The fear acquisition data is converted to a differential fear SCR and this is what is analysed (early vs late). However, the figure shows the raw SCR values for CS+ and CS- and therefore it is unclear whether acquisition was successful (despite there being an "early" vs "late" effect - no descriptives are provided).

      In Experiment 1 (Test results) it is unclear whether the main conclusion stems from a comparison of the test data relative to the last extinction trial ("we defined the fear recovery index as the SCR difference between the first test trial and the last extinction trial for a specific CS") or the difference relative to the CS- ("differential fear recovery index between CS+ and CS-"). It would help the reader assess the data if Fig 1e presents all the indexes (both CS+ and CS-). In addition, there is one sentence which I could not understand "there is no statistical difference between the differential fear recovery indexes between CS+ in the reminder and no reminder groups (P=0.048)". The p value suggests that there is a difference, yet it is not clear what is being compared here. Critically, any index taken as a difference relative to the CS- can indicate recovery of fear to the CS+ or absence of discrimination relative to the CS-, so ideally the authors would want to directly compare responses to the CS+ in the reminder and no-reminder groups. In the absence of such comparison, little can be concluded, in particular if SCR CS- data is different between groups. The latter issue is particularly relevant in Experiment 2, in which the CS- seems to vary between groups during the test and this can obscure the interpretation of the result.
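
      The ambiguity described above can be made concrete with a small sketch. It uses the two definitions quoted from the manuscript (recovery index = first test trial minus last extinction trial; differential index = CS+ index minus CS- index). The group means are hypothetical values chosen for illustration, not data from the study, and the function names are assumptions.

```python
def recovery_index(last_extinction_scr, first_test_scr):
    """Fear recovery index for one CS: first test trial minus last extinction trial."""
    return first_test_scr - last_extinction_scr


def differential_recovery_index(cs_plus, cs_minus):
    """Recovery to the CS+ minus recovery to the CS-.

    Each argument is a (last_extinction_scr, first_test_scr) pair.
    """
    return recovery_index(*cs_plus) - recovery_index(*cs_minus)


# Hypothetical group means in arbitrary SCR units (NOT data from the study).
# Both groups show identical recovery to the CS+; only the CS- differs.
group_a = {"cs_plus": (0.20, 0.50), "cs_minus": (0.10, 0.10)}
group_b = {"cs_plus": (0.20, 0.50), "cs_minus": (0.10, 0.35)}

for name, g in (("A", group_a), ("B", group_b)):
    direct = recovery_index(*g["cs_plus"])
    diff = differential_recovery_index(g["cs_plus"], g["cs_minus"])
    print(f"group {name}: CS+ recovery = {direct:.2f}, differential = {diff:.2f}")
```

      In this toy case the differential index separates the groups even though responding to the CS+ recovers equally in both, which is why a direct comparison of CS+ responses is needed.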

      In Experiment 1, the findings suggest that there is a benefit of retrieval followed by extinction in a short-term reinstatement test. In Experiment 2, the same effect is observed to a cue which did not undergo retrieval before extinction (CS2+), a result that is interpreted as resulting from cue-independence, rather than a failure to replicate in a within-subjects design the observations of Experiment 1 (between-subjects). Although retrieval-induced forgetting is cue-independent (the effect on items that are suppressed [Rp-] can be observed with an independent probe), it is not clear that the current findings are similar, and thus the strong parallels drawn may not be warranted. Here, both cues have been extinguished and therefore been equally exposed during the critical stage.

      The findings in Experiment 2 suggest that the amnesia reported in Experiment 1 is transient, in that no effect is observed when the test is delayed by 6 hours. The phenomenon whereby reactivated memories transition to extinguished memories as a function of the amount of exposure (or number of trials) is completely different from the phenomenon observed here. In the former, the manipulation has to do with the number of trials (or total amount of time) that the cues are exposed. In the current Experiment 2, the authors did not manipulate the number of trials but instead the retention interval between extinction and test. The finding reported here is closer to a "Kamin effect", that is, the forgetting of learned information which is observed with intervals of intermediate length (Baum, 1968). Because the Kamin effect has been inferred to result from retrieval failure, it is unclear how this can be explained here. There needs to be much more clarity on the explanations to substantiate the conclusions.

      There are many results (Ryan et al., 2015) that challenge the framework that the authors base their predictions on (consolidation and reconsolidation theory), and these need to be acknowledged. These studies showed that memory can be expressed in the absence of the biological machinery thought to be needed for memory performance. The authors should be careful about statements such as "eliminate fear memories", for which there is little evidence.

      The parallels between the current findings and the memory suppression literature are speculated on in the general discussion, and there is the conclusion that "the retrieval-extinction procedure might facilitate a spontaneous memory suppression process". Because one of the basic tenets of the memory suppression literature is that it reflects an "active suppression" process, there is no reason to believe that the same phenomenon is in place in the current paradigm, where it is instead said to be "automatic". In other words, the conclusions draw strong parallels with the memory suppression (and cognitive control) literature, yet the phenomenon that they observed is thought to be passive (or spontaneous/automatic). Ultimately, it is unclear why 10 mins between the reminder and extinction learning will "automatically" suppress fear memories. Further down in the discussion it is argued that "For example, in the well-known retrieval-induced forgetting (RIF) phenomenon, the recall of a stored memory can impair the retention of related long-term memory and this forgetting effect emerges as early as 20 minutes after the retrieval procedure, suggesting memory suppression or inhibition can occur in a more spontaneous and automatic manner". I did not follow why the time delay between manipulation and test (20 mins) would speak to whether the process is controlled or automatic. In addition, the links with the "latent cause" theoretical framework are weak, if any. There is little reason to believe that one extinction trial, separated by 10 mins from the rest of extinction trials, may lead participants to learn that extinction and acquisition have been generated by the same latent cause.

      Among the many conclusions, one is that the current study uncovers the "mechanism" underlying the short-term effects of retrieval-extinction. There is little in the current report that uncovers the mechanism, even in the most psychological sense of the mechanism, so this needs to be clarified. The same applies to the use of "adaptive".

      Whilst I could access the data on the OSF site, I could not make sense of the Matlab files as there is no signposting indicating what data is being shown in the files. Thus, as it stands, there is no way of independently replicating the analyses reported.

      The supplemental material shows figures with all participants, but only some statistical analyses are provided, and sometimes these are different from those reported in the main manuscript. For example, the test data in Experiment 1 is analysed with a two-way ANOVA with main effects of group (reminder vs no-reminder) and time (last trial of extinction vs first trial of test) in the main report. The analyses with all participants in the supplementary material used a mixed two-way ANOVA with group (reminder vs no-reminder) and CS (CS+ vs CS-). This makes it difficult to assess the robustness of the results when including all participants. In addition, in the supplementary materials there are no figures and analyses for Experiment 3.

      One of the overarching conclusions is that the "mechanisms" underlying reconsolidation (long term) and memory suppression (short term) phenomena are distinct, but memory suppression phenomena can also be observed after a 7-day retention interval (Storm et al., 2012), which then questions the conclusions achieved by the current study.

      References:

      Baum, M. (1968). Reversal learning of an avoidance response and the Kamin effect. Journal of Comparative and Physiological Psychology, 66(2), 495.
      Chalkia, A., Schroyens, N., Leng, L., Vanhasbroeck, N., Zenses, A. K., Van Oudenhove, L., & Beckers, T. (2020). No persistent attenuation of fear memories in humans: A registered replication of the reactivation-extinction effect. Cortex, 129, 496-509.
      Ryan, T. J., Roy, D. S., Pignatelli, M., Arons, A., & Tonegawa, S. (2015). Engram cells retain memory under retrograde amnesia. Science, 348(6238), 1007-1013.
      Storm, B. C., Bjork, E. L., & Bjork, R. A. (2012). On the durability of retrieval-induced forgetting. Journal of Cognitive Psychology, 24(5), 617-629.

      Comments on revisions:

      Thanks to the authors for trying to address my concerns.

      (1 and 2) My point about evidence for learning relates to the fact that in none of the experiments is an increase in SCR to the CSs+ observed during training (in Experiment 1, CS+/CS- differences are even present from the outset); instead, what happens is that participants learn to discriminate between the CS+ and CS- and decrease their SCR responding to the safe CS-. This begs the question as to what is being learned, given that the assumption is that the retrieval-extinction treatment is concerned with the excitatory memory (CS+) rather than the CS+/CS- discrimination. For example, Figures 6A and 6B have short/long-term amnesia on the right axes, but it is unclear from the data what memory is being targeted. In Figure 6C, the right panels depicting Suppression and Reconsolidation mechanisms suggest that it is the CS+ memory that is being targeted. Because the dependent measure (differential SCR) captures how well the discrimination was learned (this relates to point 2, where the authors now acknowledge that there are differences between groups in responding to the CS-), I struggle to see how the data support these CS+ conclusions. The fact that influential papers have used this dependent measure (i.e., differential SCR) does not undermine the point that differences between groups at test are driven by differences in responding to the CS-.

      (3, 4 and 5) The authors have qualified some of the statements, yet I fail to see some of these parallels. Much of the discussion is speculative and ultimately left for future research to address.

      (6) I can now make more sense of the publicly available data, although the files would benefit from an additional column that distinguishes between participants who were included in the final analyses (passed the multiple criteria = 1) and those who were not (did not pass the criteria = 0). Otherwise, anyone who wants to replicate these analyses needs to decipher the multiple inclusion criteria and apply them to the dataset.

    1. Reviewer #1 (Public review):

      Summary:

      The study examines human biases in a regime-change task, in which participants have to report the probability of a regime change in the face of noisy data. The behavioral results indicate that humans display systematic biases, in particular, overreaction in stable but noisy environments and underreaction in volatile settings with more certain signals. fMRI results suggest that a frontoparietal brain network is selectively involved in representing subjective sensitivity to noise, while the vmPFC selectively represents sensitivity to the rate of change.

      Strengths:

      - The study relies on a task that measures regime-change detection primarily based on descriptive information about the noisiness and rate of change. This distinguishes the study from prior work using reversal-learning or change-point tasks in which participants are required to learn these parameters from experiences. The authors discuss these differences comprehensively.

      - The study uses a simple Bayes-optimal model combined with model fitting, which seems to describe the data well. The model is comprehensively validated.

      - The authors apply model-based fMRI analyses that provide a close link to behavioral results, offering an elegant way to examine individual biases.

      Weaknesses:

      The authors have adequately addressed most of my prior concerns.

      My only remaining comment concerns the z-test of the correlations. I agree with the non-parametric test based on bootstrapping at the subject level, providing evidence for significant differences in correlations within the left IFG and IPS.

      However, the parametric test seems inadequate to me. The equation presented is described as the Fisher z-test, but the numerator uses the raw correlation coefficients (r) rather than the Fisher-transformed values (z). To my understanding, the subtraction should involve the Fisher z-scores, not the raw correlations.

      More importantly, the Fisher z-test in its standard form assumes that the correlations come from independent samples, as reflected in the denominator (which uses the n of each independent sample). However, in my opinion, the two correlations are not independent but computed within-subject. In such cases, parametric tests should take into account the dependency. I believe one appropriate method for the current case (correlated correlation coefficients sharing a variable [behavioral slope]) is explained here:

      Meng, X.-l., Rosenthal, R., & Rubin, D. B. (1992). Comparing correlated correlation coefficients. Psychological Bulletin, 111(1), 172-175. https://doi.org/10.1037/0033-2909.111.1.172

      It should be implemented here:

      Diedenhofen B, Musch J (2015) cocor: A Comprehensive Solution for the Statistical Comparison of Correlations. PLoS ONE 10(4): e0121945. https://doi.org/10.1371/journal.pone.0121945

      My recommendation is to verify whether my assumptions hold, and if so, to perform a test that takes correlated correlations into account. Alternatively, the authors could focus exclusively on the non-parametric test.
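
      To illustrate the distinction drawn above, the following is a minimal sketch of both tests, written from the published formulas rather than from the authors' code. `independent_z_test` is the standard Fisher test for two independent samples, with the subtraction performed on Fisher-transformed values; `meng_z_test` follows Meng, Rosenthal & Rubin (1992) for two dependent correlations that share one variable. The function and argument names are assumptions, and the cocor package cited above is the reference implementation to check against.

```python
import math


def fisher_z(r):
    """Fisher r-to-z transform."""
    return math.atanh(r)


def independent_z_test(r1, n1, r2, n2):
    """Fisher z-test for correlations from two INDEPENDENT samples."""
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / se


def meng_z_test(r1, r2, r12, n):
    """Meng, Rosenthal & Rubin (1992) test for two DEPENDENT correlations.

    r1, r2: correlations of each predictor with the shared variable
            (e.g. each neural measure with the behavioral slope).
    r12:    correlation between the two predictors.
    n:      number of subjects.
    """
    r_sq_mean = (r1 ** 2 + r2 ** 2) / 2.0
    f = min((1.0 - r12) / (2.0 * (1.0 - r_sq_mean)), 1.0)  # f is capped at 1
    h = (1.0 - f * r_sq_mean) / (1.0 - r_sq_mean)
    scale = math.sqrt((n - 3) / (2.0 * (1.0 - r12) * h))
    return (fisher_z(r1) - fisher_z(r2)) * scale


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Illustrative inputs only (not values from the manuscript).
z_dep = meng_z_test(r1=0.6, r2=0.2, r12=0.5, n=40)
print(f"z = {z_dep:.3f}, p = {two_sided_p(z_dep):.4f}")
```

      The dependent test differs from the independent one only in how it scales the difference of Fisher z-scores: the scaling depends on the correlation between the two predictors, which the independent-samples denominator ignores.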

      In any case, I recommend a short discussion of these findings and how the authors interpret that some of the differences in correlations are not significant.

    2. Reviewer #3 (Public review):

      This study concerns how observers (human participants) detect changes in the statistics of their environment, termed regime shifts. To make this concrete, a series of 10 balls are drawn from an urn that contains mainly red or mainly blue balls. If there is a regime shift, the urn is changed over (from mainly red to mainly blue) at some point in the 10 trials. Participants report their belief that there has been a regime shift as a % probability. Their judgement should (mathematically) depend on the prior probability of a regime shift (which is set at one of three levels) and the strength of evidence (also one of three levels, operationalized as the proportion of red balls in the mostly-blue urn and vice versa). Participants are directly instructed of the prior probability of regime shift and proportion of red balls, which are presented on-screen as numerical probabilities. The task therefore differs from most previous work on this question in that probabilities are instructed rather than learned by observation, and beliefs are reported as numerical probabilities rather than being inferred from participants' choice behaviour (as in many bandit tasks, such as Behrens 2007 Nature Neurosci).
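
      As a rough sketch of the "mathematical" dependence described above, the posterior probability of a regime shift can be updated ball by ball. The sketch assumes that the block starts in the red regime, that a shift can occur before each draw with probability `q`, and that a shift is irreversible once it has occurred; these details and all names are assumptions made for illustration, and the authors' exact model may differ.

```python
def regime_shift_posterior(samples, q, d):
    """Posterior probability that the regime has shifted, after each ball.

    samples: sequence of "red" / "blue" draws.
    q:       per-draw prior probability of a shift (assumed constant).
    d:       proportion of red balls in the mainly-red urn, which is also the
             proportion of blue balls in the mainly-blue urn (evidence strength).
    """
    p_shift = 0.0  # assumed: the block starts in the red regime
    posteriors = []
    for ball in samples:
        # Prior for this draw: already shifted, or shifting just now.
        prior = p_shift + (1.0 - p_shift) * q
        like_shift = d if ball == "blue" else 1.0 - d
        like_stay = 1.0 - d if ball == "blue" else d
        evidence = prior * like_shift + (1.0 - prior) * like_stay
        p_shift = prior * like_shift / evidence
        posteriors.append(p_shift)
    return posteriors


draws = ["red", "red", "blue", "red", "blue", "blue", "blue", "red", "blue", "blue"]
for t, p in enumerate(regime_shift_posterior(draws, q=0.05, d=0.8), start=1):
    print(t, round(p, 3))
```

      An observer showing "system neglect" would behave as if `q` and `d` were less extreme than the instructed values, so their reports would vary less across environments than these posteriors do.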

      The key behavioural finding is that participants over-estimate the prior probability of regime change when it is low, and under-estimate it when it is high; and participants over-estimate the strength of evidence when it is low and under-estimate it when it is high. In other words, participants make much less distinction between the different generative environments than an optimal observer would. This is termed 'system neglect'. A neuroeconomic-style mathematical model is presented and fit to data.

      Functional MRI results show that strength of evidence for a regime shift (roughly, the surprise associated with a blue ball from an apparently red urn) is associated with activity in the frontal-parietal orienting network. Meanwhile, at time-points where the probability of a regime shift is high, there is activity in another network including vmPFC. Both networks show individual differences effects, such that people who were more sensitive to strength of evidence and prior probability show more activity in the frontal-parietal and vmPFC-linked networks respectively.

      Strengths

      (1) The study provides a different task for looking at change-detection and how this depends on estimates of environmental volatility and sensory evidence strength, in which participants are directly and precisely informed of the environmental volatility and sensory evidence strength rather than inferring them through observation as in most previous studies.

      (2) Participants directly provide belief estimates as probabilities rather than experimenters inferring them from choice behaviour as in most previous studies.

      (3) The results are consistent with well-established findings that surprising sensory events activate the frontal-parietal orienting network whilst updating of beliefs about the world ('regime shift') activates vmPFC.

      Weaknesses

      (1) The use of numerical probabilities (both to describe the environments to participants, and for participants to report their beliefs) may be problematic because people are notoriously bad at interpreting probabilities presented in this way, and show poor ability to reason with this information (see Kahneman's classic work on probabilistic reasoning, and how it can be improved by using natural frequencies). Therefore the fact that, in the present study, people do not fully use this information, or use it inaccurately, may reflect the mode of information delivery.

      (2) Although a very precise model of 'system neglect' is presented, many other models could fit the data.

      For example, you would get similar effects due to attraction of parameter estimates towards a global mean - essentially application of a hyper-prior in which the parameters applied by each participant in each block are attracted towards the experiment-wise mean values of these parameters. For example, the prior probability of regime shift ground-truth values [0.01, 0.05, 0.10] are mapped to subjective values of [0.037, 0.052, 0.069]; this would occur if observers apply a hyper-prior that the probability of regime shift is about 0.05 (the average value over all blocks). This 'attraction to the mean' is a well-established phenomenon and cannot be ruled out with the current data (I suppose you could rule it out by comparing to another dataset in which the mean ground-truth value was different).
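
      The 'attraction to the mean' account can be checked against the numbers quoted above with a short calculation. The sketch assumes a simple linear shrinkage in probability space, subjective = anchor + w * (true - anchor), with the anchor fixed at the mean of the ground-truth values; this functional form is an illustrative assumption, not a model proposed by the authors.

```python
# Values quoted in the review: instructed vs fitted subjective prior probabilities.
true_q = [0.01, 0.05, 0.10]
subjective_q = [0.037, 0.052, 0.069]


def shrinkage_weight(true_vals, subj_vals):
    """Least-squares weight w in: subj = anchor + w * (true - anchor).

    The anchor is fixed at the mean of the ground-truth values (assumption).
    w = 1 means the instructed values are used fully; w = 0 means they are ignored.
    """
    anchor = sum(true_vals) / len(true_vals)
    num = sum((t - anchor) * (s - anchor) for t, s in zip(true_vals, subj_vals))
    den = sum((t - anchor) ** 2 for t in true_vals)
    return anchor, num / den


anchor, w = shrinkage_weight(true_q, subjective_q)
predicted = [anchor + w * (t - anchor) for t in true_q]
print(f"anchor = {anchor:.4f}, w = {w:.3f}")
print([round(p, 4) for p in predicted])
```

      A single weight of roughly one third reproduces all three subjective values closely, which is consistent with the point that the data do not distinguish this account from the system-neglect model.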

      More generally, any model in which participants don't fully use the numerical information they were given would produce apparent 'system neglect'. Four qualitatively different example reasons are: 1. Some individual participants completely ignored the probability values given. 2. Participants did not ignore the probability values given, but combined them with a hyperprior as above. 3. Participants had a reporting bias where their reported beliefs that a regime-change had occurred tend to be shifted towards 50% (rather than reporting 'confident' values such as 5% or 95%). 4. Participants underweighted probability outliers, resulting in underweighting of evidence in the 'high signal diagnosticity' environment (10.1016/j.neuron.2014.01.020).

      In summary I agree that any model that fits the data would have to capture the idea that participants don't differentiate between the different environments as much as they should, but I think there are a number of qualitatively different reasons why they might do this - of which the above are only examples - hence I find it problematic that the authors present the behaviour as evidence for one extremely specific model.

      (3) Despite efforts to control confounds in the fMRI study, including two control experiments, I think some confounds remain.

      For example, a network of regions is presented as correlating with the cumulative probability that there has been a regime shift in this block of 10 samples (Pt). However, regardless of the exact samples shown, doesn't Pt always increase with sample number (as by the time of later samples, there have been more opportunities for a regime shift)? Unless this is completely linear, the effect won't be controlled by including trial number as a co-regressor (which was done).
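
      The concern about linearity can be illustrated with the evidence-free baseline alone. Assuming a constant per-sample shift probability `q` and an irreversible shift (both assumptions made for this sketch), the probability that a shift has occurred by sample t is 1 - (1 - q)^t, which rises with sample number but is not a straight line. The code below fits a straight line in sample number and prints what is left over; the actual Pt also depends on the balls drawn, which this sketch deliberately ignores.

```python
def baseline_shift_probability(q, n_samples=10):
    """Probability that a shift has occurred by each sample, ignoring the evidence."""
    return [1.0 - (1.0 - q) ** t for t in range(1, n_samples + 1)]


def linear_residuals(y):
    """Residuals of y after an ordinary least-squares fit on sample number 1..n."""
    n = len(y)
    x = list(range(1, n + 1))
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


baseline = baseline_shift_probability(q=0.10)
residuals = linear_residuals(baseline)
print([round(b, 3) for b in baseline])
print([round(r, 4) for r in residuals])
```

      The non-zero residuals are the part of this baseline that a linear sample-number regressor would leave in Pt.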

      On the other hand, two additional fMRI experiments are done as control experiments and the effect of Pt in the main study is compared to Pt in these control experiments. Whilst I admire the effort in carrying out control studies, I can't understand how these particular experiments are useful controls. For example, in experiment 3 participants simply type in numbers presented on the screen - how can we even have an estimate of Pt from this task?

      (4) The Discussion is very long, and whilst a lot of related literature is cited, I found it hard to pin down within the discussion, what the key contributions of this study are. In my opinion it would be better to have a short but incisive discussion highlighting the advances in understanding that arise from the current study, rather than reviewing the field so broadly.

      Editors’ note: Reviewer #2 was unavailable to re-review the manuscript. Reviewer #3 was added for this round of review to ensure two reviewers and because of their expertise in the computational and modelling aspects of the work.

    1. Reviewer #1 (Public review):

      Summary:

      Silbaugh, Koster and Hansel investigated how the cerebellar climbing fiber (CF) signals influence neuronal activity and plasticity in mouse primary somatosensory (S1) cortex. They found that optogenetic activation of CFs in the cerebellum modulates responses of cortical neurons to whisker stimulation in a cell-type-specific manner and suppresses potentiation of layer 2/3 pyramidal neurons induced by repeated whisker stimulation. This suppression of plasticity by CF activation is mediated through modulation of VIP- and SST-positive interneurons. Using transsynaptic tracing and chemogenetic approaches, the authors identified a pathway from the cerebellum through the zona incerta and the thalamic posterior medial (POm) nucleus to the S1 cortex, which underlies this functional modulation.

      The authors have addressed all the necessary points.

    2. Reviewer #2 (Public review):

      Summary:

      The authors examined the long-distance influence of climbing fiber (CF) signaling in the somatosensory cortex by manipulating whiskers through stimulation. Also, they examined CF signaling using two-photon imaging and mapped projections from the cerebellum to somatosensory cortex using transsynaptic tracing. As a final manipulation, they used chemogenetics to perturb parvalbumin-positive neurons in the zona incerta and recorded from climbing fibers.

      Strengths:

      There are several strengths to this paper. The recordings were carefully performed and the AAVs used were selective and specific for the cell types and pathways being analyzed. In addition, the authors used multiple approaches that support climbing fiber pathways to distal regions of the brain. This work will impact the field and describes nice methods to target difficult-to-reach brain regions, such as the inferior olive.

      No weaknesses noted.

    3. Reviewer #3 (Public review):

      Summary:

      The authors developed an interesting novel paradigm to probe the effects of cerebellar climbing fiber activation on short-term adaptation of somatosensory neocortical activity during repetitive whisker stimulation (RWS). Normally, RWS potentiated whisker responses in pyramidal cells and weakly suppressed them in interneurons, with these changes lasting for at least 1 h. Crus II optogenetic climbing fiber activation during RWS reduced or inverted these adaptive changes. This effect was generally mimicked or blocked with chemogenetic SST or VIP activation/suppression as predicted based on their "sign" in the circuit.

      Strengths:

      The central finding about CF modulation of S1 response adaptation is interesting, important, and convincing, and provides a jumping-off point for the field to start to think carefully about cerebellar modulation of neocortical plasticity.

      Weaknesses:

      The SST and VIP results appeared slightly weaker statistically, but I do not personally think this detracts from the importance of the initial finding (if there are multiple underlying mechanisms, modulating one may reproduce only a fraction of the effect size). I found the suggestion that zona incerta may be responsible for the cerebellar effects on S1 to be a more speculative result (it is not so easy with existing technology to effectively modulate this type of polysynaptic pathway), but this may be an interesting topic for the authors to follow up on in more detail in the future.

      Comments on revisions:

      The authors have appropriately addressed my comments.

    1. Reviewer #1 (Public review):

      The study aims to determine the role of Slit-Robo signaling in the development and patterning of cardiac innervation, a key process in heart development. Despite the well-studied roles of Slit axon guidance molecules in the development of the central nervous system, their roles in the peripheral nervous system are less clear. Thus, the present study addresses an important question. The study uses genetic knockout models to investigate how Slit2, Slit3, Robo1, and Robo2 contribute to cardiac innervation.

      Using constitutive and cell type-specific knockout mouse models, the authors show that the loss of endothelial-derived Slit2 reduces cardiac innervation. Additionally, Robo1 knockout, but not Robo2 knockout, recapitulated the Slit2 knockout effect on cardiac innervation, leading to the conclusion that Slit2-Robo1 signaling drives sympathetic innervation in the heart. Finally, the authors also show a reduction in isoproterenol-stimulated heart rate but not basal heart rate in the absence of endothelial Slit2.

      The conclusions of this paper are mostly well supported by the data, but there are several limitations:

      (1) It is well established that Slit ligands undergo proteolytic cleavage, generating N- and C-terminal fragments with distinct biological functions. Full-length Slit proteins and their fragments differ in cell association, with the N-terminal fragment typically remaining membrane-bound, while the C-terminal fragment is more diffusible. This distinction is crucial when evaluating the role of Slit proteins secreted by different cell types in the heart. However, this study does not examine or discuss the specific contributions of different Slit2 fragments, limiting its mechanistic insight into how Slit2 regulates cardiac innervation. While these points are mentioned in the discussion, they are not incorporated into the interpretation of the data or the presented model.

      (2) The endothelial-specific deletion of Slit2 leads to its loss in endothelial cells across various organs and tissues in the developing embryo. Therefore, the phenotypes observed in the heart may be influenced by defects in other parts of the embryo, such as the CNS or sympathetic ganglia, and this possibility cannot be ruled out. The data presented in the manuscript does not dissect the relative contributions of endothelial Slit2 loss in the heart versus secondary effects arising from other organ systems. Without tissue-specific rescue or complementary conditional models, it remains unclear whether the observed cardiac phenotypes are a direct consequence of local endothelial Slit2 deficiency or an indirect outcome of broader developmental perturbations.

    2. Reviewer #2 (Public review):

      The aims of investigating Slit-Robo signaling in cardiac innervation were achieved by the experiments designed. The authors demonstrate that endothelial Slit2 signaling through Robo1 drives sympathetic innervation. While questions remain regarding signal regulation and interplay between established axon guidance signals and the further role of other Slit ligands and Robo expression in endothelium, the results strongly support the conclusions drawn.

      Writing and presentation are easy to follow and well structured. Appropriate controls are used, statistical analysis applied appropriately, and experiments directly test aims following a logical story.

      The authors demonstrate a novel mechanism for Slit-Robo signaling in cardiac sympathetic innervation. The data establishes a framework for future studies.

      The authors have updated their discussion to highlight the need for investigation of the role of proteolytic cleavage of Slit2 as well as the potential for defects in other tissues due to endothelial knockout of Slit2 influencing cardiac innervation.

    1. Reviewer #2 (Public review):

      Summary:

      This work presents a modality-agnostic decoder trained on a large fMRI dataset (SemReps-8K), in which subjects viewed natural images and corresponding captions. The decoder predicts stimulus content from brain activity irrespective of the input modality and performs on par with, or even outperforms, modality-specific decoders. Its success depends more on the diversity of brain data (multimodal vs. unimodal) than on whether the feature-extraction models are visual, linguistic, or multimodal. In particular, the decoder shows strong performance in decoding imagery content. These results suggest that the modality-agnostic decoder effectively leverages shared brain information across image and caption tasks.

      Strengths:

      (1) The modality-agnostic decoder compellingly leverages multimodal brain information, improving decoding accuracy (particularly for non-sensory input such as captions) and showing high methodological and application value.

      (2) The dataset is a substantial and well-controlled contribution, with >8,000 image-caption trials per subject and careful matching of stimuli across modalities, making it an essential resource for testing theories about different representational modalities.

      Weakness:

      In the searchlight analysis aimed at identifying modality-invariant representations, although the combined use of four decoding conditions represents a relatively strict approach, the underlying logic remains unclear. The modality-agnostic decoder has demonstrated strong sensitivity in decoding brain activity, as shown earlier in the paper, whereas the cross-decoding with modality-specific decoders is inherently more conservative. If, as the authors note, the modality-agnostic decoder might have learned to leverage different features to project stimuli from different modalities, then taking the union of conditions would seem more appropriate. Conversely, if the goal is to obtain a more conservative result, why not focus solely on the cross-decoding conditions? The relationships among the four decoding conditions are not clearly delineated, and the contrasts between them might themselves yield valuable insights. As it stands, however, the logic of the current approach is not straightforward.

    2. Reviewer #3 (Public review):

      Summary:

      The authors recorded brain responses while participants viewed images and captions. The images and captions were taken from the COCO dataset, so each image has a corresponding caption and each caption has a corresponding image. This enabled the authors to extract features from either the presented stimulus or the corresponding stimulus in the other modality. The authors trained linear decoders to take brain responses and predict stimulus features. "Modality-specific" decoders were trained on brain responses to either images or captions while "modality-agnostic" decoders were trained on brain responses to both stimulus modalities. The decoders were evaluated on brain responses while the participants viewed and imagined new stimuli, and prediction performance was quantified using pairwise accuracy. The authors reported the following results:

      (1) Decoders trained on brain responses to both images and captions can predict new brain responses to either modality.

      (2) Decoders trained on brain responses to both images and captions outperform decoders trained on brain responses to a single modality.

      (3) Many cortical regions represent the same concepts in vision and language.

      (4) Decoders trained on brain responses to both images and captions can decode brain responses to imagined scenes.
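
      The pairwise accuracy metric mentioned in the summary above can be sketched as follows. This is one common formulation and an assumption on my part, not necessarily the authors' exact implementation: for every pair of test stimuli, decoding counts as correct when the predicted feature vectors are more similar to their own targets than to each other's. Cosine similarity and all function names here are illustrative choices.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_accuracy(predicted, targets):
    """Fraction of stimulus pairs for which matched beats swapped assignment.

    predicted[i] is the decoder output for test item i; targets[i] is the
    feature vector of the stimulus actually shown. Chance level is 0.5.
    """
    correct = 0
    total = 0
    for i, j in combinations(range(len(predicted)), 2):
        matched = cosine(predicted[i], targets[i]) + cosine(predicted[j], targets[j])
        swapped = cosine(predicted[i], targets[j]) + cosine(predicted[j], targets[i])
        correct += matched > swapped
        total += 1
    return correct / total


if __name__ == "__main__":
    targets = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    noisy_predictions = [[0.9, 0.2], [0.1, 0.8], [0.7, 0.9]]
    print(pairwise_accuracy(noisy_predictions, targets))
```

      Because the metric only compares relative similarities within each pair, a decoder can score above the 0.5 chance level without reconstructing the features accurately in absolute terms.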

      Strengths:

      This is an interesting study that addresses important questions about modality-agnostic representations. Previous work has shown that decoders trained on brain responses to one modality can be used to decode brain responses to another modality. The authors build on these findings by collecting a new multimodal dataset and training decoders on brain responses to both modalities.

      To my knowledge, SemReps-8K is the first dataset of brain responses to vision and language where each stimulus item has a corresponding stimulus item in the other modality. This means that brain responses to a stimulus item can be modeled using visual features of the image, linguistic features of the caption, or multimodal features derived from both the image and the caption. The authors also employed a multimodal one-back matching task which forces the participants to activate modality-agnostic representations. Overall, SemReps-8K is a valuable resource that will help researchers answer more questions about modality-agnostic representations.

      The analyses are also very comprehensive. The authors trained decoders on brain responses to images, captions, and both modalities, and they tested the decoders on brain responses to images, captions, and imagined scenes. They extracted stimulus features using a range of visual, linguistic, and multimodal models. The modeling framework appears rigorous and the results offer new insights into the relationship between vision, language, and imagery. In particular, the authors found that decoders trained on brain responses to both images and captions were more effective at decoding brain responses to imagined scenes than decoders trained on brain responses to either modality in isolation. The authors also found that imagined scenes can be decoded from a broad network of cortical regions.

      Weaknesses:

      The characterization of "modality-agnostic" and "modality-specific" decoders seems a bit contradictory. There are three major choices when fitting a decoder: the modality of the training stimuli, the modality of the testing stimuli, and the model used to extract stimulus features. However, the authors characterize their decoders based on only the first choice: "modality-specific" decoders were trained on brain responses to either images or captions while "modality-agnostic" decoders were trained on brain responses to both stimulus modalities. I think that this leads to some instances where the conclusions are inconsistent with the methods and results.

      First, the authors suggest that "modality-specific decoders are not explicitly encouraged to pick up on modality-agnostic features during training" (line 137) while "modality-agnostic decoders may be more likely to leverage representations that are modality-agnostic" (line 140). However, whether a decoder is required to learn modality-agnostic representations depends on both the training responses and the stimulus features. Consider the case where the stimuli are represented using linguistic features of the captions. When you train a "modality-specific" decoder on image responses, the decoder is forced to rely on modality-agnostic information that is shared between the image responses and the caption features. On the other hand, when you train a "modality-agnostic" decoder on both image responses and caption responses, the decoder has access to the modality-specific information that is shared by the caption responses and the caption features, so it is not explicitly required to learn modality-agnostic features. As a result, while the authors show that "modality-agnostic" decoders outperform "modality-specific" decoders in most conditions, I am not convinced that this is because they are forced to learn more modality-agnostic features.

      Second, the authors claim that "modality-specific decoders can be applied only in the modality that they were trained on" while "modality-agnostic decoders can be applied to decode stimuli from multiple modalities, even without knowing a priori the modality the stimulus was presented in" (line 47). While "modality-agnostic" decoders do outperform "modality-specific" decoders in the cross-modality conditions, it is important to note that "modality-specific" decoders still perform better than expected by chance (figure 5). It is also important to note that knowing about the input modality still improves decoding performance even for "modality-agnostic" decoders, since it determines the optimal feature space: it is better to decode brain responses to images using decoders trained on image features, and it is better to decode brain responses to captions using decoders trained on caption features.

      Comments on revised version:

      The revised version benefits from clearer claims and more precise terminology (i.e. classifying the decoders as "modality-agnostic" or "modality-specific" while classifying the representations as "modality-invariant" or "modality-dependent").

      While the modality-agnostic decoders outperform the modality-specific decoders, I am still not convinced that this is because they are "explicitly trained to leverage the shared information in modality-invariant patterns of the brain activity". On one hand, the high-level feature spaces may each contain some amount of modality-invariant information, so even modality-specific decoders can capture some modality-invariant information. On the other hand, I do not see how training the modality-agnostic decoders on responses to both modalities necessitates that they learn modality-invariant representations beyond those that are learned by the modality-specific decoders.

    1. Reviewer #1 (Public review):

      This work compiles a comprehensive atlas of ncORFs across mammalian tissues and cell types, derived from reanalysis of ~400 public ribosome profiling datasets. The authors then evaluate cross-species conservation and functional signatures, proposing that evolutionarily ancient ncORFs tend to have higher translation potential, stronger expression, and closer relationships with canonical coding sequences.

      Strengths:

      In general, the study provides a large-scale and timely resource of annotated ncORFs, which could be broadly useful for the community. The authors collected ~400 public ribosome profiling datasets for annotations of ncORFs, which, to my best knowledge, is the largest collection of data for such a purpose. The catalog could facilitate future investigations into ncORF biology and broaden understanding of the coding potential of the "non-coding" genome.

      Weaknesses:

      Based on the ncORF catalog, some of the analyses were not properly done. Some of the results are descriptive.

      (1) Bias and representations of the data source. Public ribo-seq datasets are unevenly distributed across tissues and cell lines, raising concerns about heterogeneity and underrepresentation of certain contexts. This may limit the generalizability of the catalog.

      (2) The discussion on modular domains of ncORFs is unclear, and the claim that they may originate via TE-related mechanisms is not well supported. Stronger evidence or clearer reasoning is needed.

      (3) The conservation comparisons are not fully convincing. Figure S7 shows only mild differences between ncORFs and CDS, and statistical significance is not clearly demonstrated. Comparisons with other non-coding RNAs should be added, and overlapping sequences between ncORFs and CDS should be excluded to avoid bias.

      (4) Figure 3 indicates that some ncORFs are subject to evolutionary constraints, which is not surprising. The authors should provide more granular analyses contrasting the detailed features of these "conserved" ncORFs with the "non-conserved" ones. Informative work of this kind has already been done in Drosophila, worms, mice, and humans; in fact, small ORFs, especially uORFs, have been extensively studied for their functions and cross-species conservation. The authors should explicitly show what is new in their analyses.

      (5) Translation levels are reported using RPF counts. However, translation efficiency (normalized by RNA expression) is a more appropriate measure to account for expression heterogeneity.

      (6) The correlation analyses between ncORF translation levels and PhyloCSF are confusing and largely descriptive. These sections need sharper framing and clearer conclusions.

      (7) Public ribo-seq datasets, generated by different research labs, are known for their strong batch effects. Representations of tissues and cells are also very unbalanced. Therefore, the co-translation analysis between ncORFs and canonical CDS is not well controlled. This should be done by referring to a recent large-scale ribo-seq meta-analysis (Nat Biotechnol. 2025. doi: 10.1038/s41587-025-02718-5).
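
      The normalization requested in point (5) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: translation efficiency is taken as the log2 ratio of library-size-normalized ribosome footprint (RPF) counts to library-size-normalized RNA-seq counts for the same ORF, and the function name, the pseudocount, and the example numbers are all hypothetical. Length normalization is omitted on the simplifying assumption that both counts cover the same region.

```python
import math


def translation_efficiency(rpf_count, rna_count, rpf_total, rna_total, pseudocount=1.0):
    """log2 ratio of normalized RPF abundance to normalized RNA abundance.

    Counts are converted to counts per million using each library's total
    mapped reads. The pseudocount (an illustrative choice) avoids division
    by zero for ORFs with no RNA-seq reads.
    """
    rpf_cpm = (rpf_count + pseudocount) / rpf_total * 1e6
    rna_cpm = (rna_count + pseudocount) / rna_total * 1e6
    return math.log2(rpf_cpm / rna_cpm)


if __name__ == "__main__":
    # Two hypothetical ORFs with identical RPF counts but different mRNA levels.
    low_rna = translation_efficiency(200, 50, rpf_total=2e7, rna_total=3e7)
    high_rna = translation_efficiency(200, 800, rpf_total=2e7, rna_total=3e7)
    print(f"TE with low mRNA abundance:  {low_rna:.2f}")
    print(f"TE with high mRNA abundance: {high_rna:.2f}")
```

      The two hypothetical ORFs would look identical by raw RPF counts, yet their translation efficiencies differ, which is the expression heterogeneity that point (5) asks the authors to account for.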

    2. Reviewer #2 (Public review):

      Summary:

      Chang et al. attempted to analyze a large number of ribo-seq datasets through a standardized pipeline, identifying novel non-canonical ORFs and elucidating their evolutionary and expression characteristics.

      Strengths:

      (1) The datasets analyzed by the authors are sufficiently comprehensive, and the use of standardized pipelines ensures excellent analytical consistency.

      (2) Their analyses of ORF evolution and co-expression further deepen our understanding of these ORFs.

      Weaknesses:

      (1) The authors primarily conducted analyses through bioinformatics, lacking sufficient wet-lab experimental evidence.

      (2) Regarding the evolution of non-canonical ORFs, a considerable amount of prior work already exists. The authors need to further clarify what new insights and discoveries they have made based on the analysis of such a large dataset.

    1. Reviewer #1 (Public review):

      Summary:

      RNA modification has emerged as an important modulator of protein synthesis. Recent studies found that mRNA can be acetylated (ac4c), which can alter mRNA stability and translation efficiency. The role of ac4c mRNA in the brain has not been studied. In this paper, the authors convincingly show that ac4c occurs selectively on mRNAs localized at synapses, but not cell-wide. The ac4c "writer" NAT10 is highly expressed in hippocampal excitatory neurons. In NAT10 conditional KO mice, reduced NAT10 levels decreased ac4c of mRNAs and produced deficits in LTP and spatial memory. These results reveal a potential role for ac4c mRNA in memory consolidation.

      This is a new type of mRNA regulation that seems to act specifically at synapses, which may help elucidate the mechanisms of local protein synthesis in memory consolidation. Overall, the studies are well carried out and presented. There is some confusion over training/learning vs memory, and the precise mRNAs that require ac4c to carry out memory consolidation are not clear. The specificity of changes occurring only at the end of training, rather than after each day of training, is interesting and warrants some investigation. This timeframe is puzzling because the authors show that ac4c can dynamically increase within 1 hour after cLTP.

      Strengths:

      (1) The studies show that mRNA acetylation (ac4c) occurs selectively at mRNAs localized to synaptic compartments (using synaptoneurosome preps).

      (2) The authors identify a few key mRNAs acetylated and involved in plasticity and memory - e.g., Arc.

      (3) The authors show that ac4c is induced by learning and neuronal activity (cLTP).

      (4) The studies show that the ac4c "writer" NAT10 is expressed in hippocampal excitatory neurons and may be relocated to synapses after cLTP/learning induction.

      (5) The authors used floxed NAT10 mice injected with AAV-Cre in the hippocampus (NAT10 cKO) to show that NAT10 may play a role in LTP maintenance and memory consolidation (using the Morris Water Maze).

      Weaknesses:

      (1) The authors use a confusing timeline for their behavioral experiments, i.e., day 1 is the first day of training in the MWM, and day 6 is the probe trial, but in reality, day 6 is the first day after the last training day. So this is really day 1 post-training, and day 20 is 14 days post-training.

      (2) The authors inaccurately use memory as a term. During the training period in the MWM, the animals are learning, while memory is only probed on day 6 (after learning). Thus, day 6 reflects memory consolidation processes after learning has taken place.

      (3) The NAT10 cKO mice are useful to test the causal role of NAT10 in ac4c and plasticity/memory, but all the experiments used AAV-CRE injections in the dorsal hippocampus that showed somewhat modest decreases in total NAT10 protein levels. For these experiments, it would be better to cross the NAT10 floxed animals to CRE lines where a better knockdown of NAT10 can be achieved, with less variability.

      (4) Because knockdown is only modest (~50%), it is not clear if the remaining ac4c on mRNAs is due to remaining NAT10 protein or due to an alternative writer (as the authors pose).

    2. Reviewer #2 (Public review):

      This is an interesting study that shows that mRNA acetylation at synapses is dynamically regulated by spatial memory in the mouse hippocampus. The dynamic changes of ac4C-mRNAs regulated by memory were validated by methods including ac4C dot-blot and liquid chromatography-tandem mass spectrometry (LC-MS/MS).

      Here are some comments for consideration by readers and authors:

      (1) It is known that synaptosomes are contaminated with glial tissue. In the study, the authors also show that NAT10 is expressed in glia. So the candidate mRNAs identified by acRIP-seq might also be mixed with glial mRNAs. Are the GO BP terms shown in Figure 3A specifically chosen, or are they an unbiased list of the top terms?

      (2) Where does NAT10-mediated mRNA acetylation take place within cells generally? Is there evidence that NAT10 can catalyze mRNA acetylation in the cytoplasm?

      "The NAT10 proteins were significantly reduced in the cytoplasm (S2 fraction) but increased in the PSD fraction at day 6 after memory (Figures 5J and 5K)." The authors argue that the translocation of NAT10 from soma to synapses accounts for these changes. The increase of NAT10 protein in the PSD fraction can be understood. However, it is quite surprising that the NAT10 proteins were significantly reduced in the cytoplasm (S2 fraction), considering that NAT10 is much more abundant in the soma than in synapses. The small increase in synaptic NAT10 might not be enough to cause a decrease in soma NAT10 protein level.

      (4) It is difficult to separate the effects on mRNA acetylation from those on protein acetylation in NAT10 loss-of-function experiments.

    1. Reviewer #1 (Public review):

      Summary:

      In their manuscript, Richter and colleagues comprehensively investigate the cell wall recycling pathway in the model alphaproteobacterium Caulobacter crescentus using biochemical, imaging, and genetic approaches. They clearly demonstrate that this organism encodes a functional peptidoglycan recycling pathway and demonstrate the activities of many enzymes and transporters within this pathway. They leverage imaging and growth assays to demonstrate that mutants in peptidoglycan recycling have varying degrees of beta-lactam sensitivity as well as morphological and cell division defects. They propose that, rather than impacting the levels or activity of the major beta-lactamase, BlaA, defects in PG recycling lead to beta-lactam sensitivity by limiting the availability of new cell wall precursors. The findings will be of interest to those in the field of bacterial cell wall biochemistry, antibiotics and antibiotic resistance, and bacterial morphogenesis.

      Strengths:

      Overall, the manuscript is laid out logically, and the data are comprehensive, quantitative, and rigorous. The mutants and their phenotypes will be a valuable resource for Caulobacter researchers.

      Weaknesses:

      The only major missing piece is the complementation of mutants to demonstrate that loss of the targeted gene is responsible for the observed phenotypes.

    2. Reviewer #2 (Public review):

      Summary:

      Pia Richter et al. investigated the peptidoglycan (PG) recycling metabolism in the alpha-proteobacterium Caulobacter crescentus. The authors first identified a functional recycling pathway in this organism, which is similar to the Pseudomonas route, and they characterized two key enzymes (NagZ, AmiR) of this pathway, showing that AmiR differs in specificity from the AmpD counterpart of E. coli. Further, they studied the effects of deletions within the PG recycling pathway (ampG, amiR, nagZ, sdpA, blaA, nagA1, nagA2, amgK, nagK mutants), showing filamentation and cell widening, thereby revealing a link between PG recycling and cell division. Finally, they provide a link between PG recycling and beta-lactam sensitivity in C. crescentus that is not caused by activation of a beta-lactamase, but rather is a result of reduced supply of PG building blocks increasing the sensitivity of penicillin-binding proteins.

      Strengths:

      This work adds to the understanding of the role of PG recycling in alpha-proteobacteria, which significantly differ in their mode of cell wall growth from the better studied gamma-proteobacteria.

      Weaknesses:

      The findings are not entirely novel as recent studies by Modi et al. 2025 mBio (studying C. crescentus) and Gilmore & Cava 2022 Nat. Commun. (studying Agrobacterium tumefaciens) came to similar conclusions.

    1. Reviewer #1 (Public review):

      The paper by Chen et al. describes the role of neuronal thermo-TRPV3 channels in the firing of cortical neurons in the fever temperature range. The authors began by demonstrating that exposure to infrared light, which increases ambient temperature, raises body temperature to fever level, above 38°C. Subsequently, they showed that at the fever temperature of 39°C, the spike threshold (ST) increased in both populations (P12-14 and P7-8) of cortical excitatory pyramidal neurons (PNs). However, the spike number only decreased in P7-8 PNs, while it remained stable in P12-14 PNs at 39°C. In addition, the fever temperature also reduced the late peak postsynaptic potential (PSP) in P12-14 PNs. The authors further characterized the firing properties of cortical P12-14 PNs, identifying two types: STAY PNs that retained spiking at 30°C, 36°C and 39°C, and STOP PNs that stopped spiking upon temperature change. They extended their analysis and characterization to striatal medium spiny neurons (MSNs) and found that STAY MSNs and PNs shared the same ST temperature sensitivity. Using small-molecule tools, they further identified that thermo-TRPV3 currents, but not TRPV4 currents, in cortical PNs increased in response to temperature elevation. The authors concluded that during fever, neuronal firing stability is largely maintained by sensory STAY PNs and MSNs that express functional TRPV3 channels. Overall, this study is well designed and executed, with substantial controls, some interesting findings, and good-quality data.

      Comments on revisions:

      My previous concerns have been addressed in this revised manuscript.

    2. Reviewer #2 (Public review):

      Summary:

      The authors studied the excitability of layer 2/3 pyramidal neurons in response to layer 4 stimulation at temperatures ranging from 30 to 39°C in P7-8, P12-P14, and P22-P24 animals. They also measured brain temperature and spiking in vivo in response to externally applied heat. Some pyramidal neurons continue to fire action potentials in response to stimulation at 39°C and are referred to as "stay neurons." Stay neurons have unique properties, aided by the expression of the TRPV3 channel.

      Strengths:

      The authors focused on layer 2/3 neuronal excitability at three developmental stages: during the window of susceptibility to febrile seizures, before the window opens, and after it closes.

      Electrophysiological experiments are rigorously performed and carefully interpreted.

      The cellular electrophysiology findings are further supported by a comparison of the seizure susceptibility of TRPV3 knockout, heterozygous, and wild-type mice. EEG recordings would have strengthened the study, but they are challenging in this age group.

      Finally, the authors studied TRPV3 expression with immunohistochemistry.

    3. Reviewer #3 (Public review):

      Summary:

      This important study combines in vitro and in vivo recording to determine how the firing of cortical and striatal neurons changes during a fever-range temperature rise (37-40 °C). The authors found that certain neurons will start, stop, or maintain firing during these body temperature changes. The authors further suggested that the TRPV3 channel plays a role in maintaining cortical activity during fever.

      Strengths:

      The topic of how the firing pattern of neurons changes during fever is unique and interesting. The authors carefully used in vitro electrophysiology assays to study this interesting topic.

      Weaknesses:

      (1) In vivo recording is a strength of this study. However, data from in vivo recording are only shown in Fig 5A,B. This reviewer suggests that the authors further expand the analysis of the in vivo Neuropixels recording. For example, they could show single-spike waveforms and raster plots to provide more information on the recording. The authors can also separate the recording by brain region (cortex vs striatum), using the depth of the probe as a landmark, to study the specific firing of cortical neurons and striatal neurons. It is also possible to use published parameters to separate the recording based on spike waveform to identify regular principal neurons vs fast-spiking interneurons. Since the authors studied E/I balance in brain slices, it would be very interesting to see whether the "E/I balance" based on the firing of excitatory neurons vs fast-spiking interneurons is changed in the in vivo condition.

      (2) The authors should propose a potential mechanism for how TRPV3 helps to maintain cortical activity during fever. Would a calcium influx-mediated change of membrane potential be the possible reason? Making a summary figure to put all the findings into perspective and propose a possible mechanism would also be appreciated.

      (3) The authors studied P7-8, P12-14, and P20-26 mice. How do these ages correspond to human ages? It would be nice to provide a comparison to help the reader understand the context better.

      Comments on revisions:

      In this revised version, the authors nicely addressed my critiques. I have no more comments to make.

    1. Reviewer #1 (Public review):

      Summary:

      In the study by Roeder and colleagues, the authors aim to identify the psychophysiological markers of trust during the evaluation of matching or mismatching AI decision-making. Specifically, they aim to characterize through brain activity how the decision made by an AI can be monitored throughout time in a two-step decision-making task. The objective of this study is to unfold, through continuous brain activity recording, the general information processing sequence while interacting with an artificial agent, and how internal as well as external information interact and modify this processing. Additionally, the authors provide a subset of factors affecting this information processing for both decisions.

      Strengths:

      The study addresses a wide and important topic of the value attributed to AI decisions and their impact on our own confidence in decision-making. It especially questions some of the factors modulating the dynamical adaptation of trust in AI decisions. Factors such as perceived reliability, type of image, mismatch, or participants' bias toward one response or the other are very relevant to the question in human-AI interactions.

      Interestingly, the authors also question the processing of more ambiguous stimuli, with no real ground truth. This gets closer to everyday life situations where people have to make decisions in uncertain environments. Having a better understanding of how those decisions are made is very relevant in many domains.

      Also, the method for processing behavioral and especially EEG data is overall very robust and follows what is currently recommended for statistical analyses in group studies. Additionally, the authors provide complete figures with all robustness evaluation information. The results and statistics are very detailed. This promotes confidence in, and also replicability of, the results.

      An additional interesting methodological aspect is that the study addresses a large window of analysis and the interaction between three timeframes (evidence accumulation pre-decision, decision-making, post-AI decision processing) within the same trials. This type of analysis is quite innovative in the sense that it is not yet standard in complex experimental designs. It moves forward from classical short-time windows and baseline ERP analysis.

      Weaknesses:

      This manuscript raises several conceptual and theoretical considerations that are not necessarily answered by the methods (especially the task) used. Even though the authors propose to assess trust dynamics and violations in cooperative human-AI teaming decision-making, I don't believe their task resolves such a question. Indeed, there is no direct link between the human decision and the AI decision. They do not cooperate per se, and the AI decision doesn't seem, from what I understood, to have an impact on the participants' decision making. The authors make several assumptions regarding trust, feedback, response expectation, and "classification" (i.e., match vs. mismatch) which seem far-fetched when considering the scientific literature on these topics.

      Unlike what is done for the data processing, the authors have not managed to capture the big picture of the theoretical implications of their results. A big part of this study's interpretation aims to fit their results into the theoretical box of the neural markers of performance monitoring.

      Overall, the analysis method was very robust and well-managed, but the experimental task they have set up does not support their claim. Here, they seem to be assessing the impact of a mismatch between two independent decisions.

      Nevertheless, this type of work is very important to various communities. First, it addresses topical concerns associated with the introduction of AI in our daily life and decisions, but it also addresses methodological difficulties that the EEG community has been having to move slowly away from the static event-based short-timeframe analyses onto a more dynamic evaluation of the unfolding of cognitive processes and their interactions. The topic of trust toward AI in cooperative decision making has also been raised by many communities, and understanding the dynamics of trust, as well as the factors modulating it, is of concern to many high-risk environments, or even everyday life contexts. Policy makers are especially interested in this kind of research output.

    2. Reviewer #2 (Public review):

      Summary:

      The authors investigated how "AI-agent" feedback is perceived in an ambiguous classification task, and categorised the neural responses to this. They asked participants to classify real or fake faces, and presented an AI-agent's feedback afterwards, where the AI-feedback disagreed with the participants' response on a random 25% of trials (called mismatches). Pre-response ERP was sensitive to participants' classification as real or fake, while ERPs after the AI-feedback were sensitive to AI-mismatches, with stronger N2 and P3a&b components. There was an interaction of these effects, with mismatches after a "Fake" response affecting the N2 and those after "Real" responses affecting P3a&b. The ERPs were also sensitive to the participants' response biases, and their subjective ratings of the AI agent's reliability.

      Strengths:

      The researchers address an interesting question, and extend the AI-feedback paradigm to ambiguous tasks without veridical feedback, which is closer to many real-world tasks. The in-depth analysis of ERPs provides a detailed categorisation of several ERPs, as well as whole-brain responses, to AI-feedback, and how this interacts with internal beliefs, response biases, and trust in the AI-agent.

      Weaknesses:

      There is little discussion of how the poor performance (close to 50% chance) may have affected behaviour on the task, such as by leading to entirely random guessing or overreliance on response biases. This can change how error-monitoring signals present, as they are affected by participants' accuracy, and can also affect how the AI feedback is perceived.

      The task design and performance make it hard to assess how much it was truly measuring "trust" in an AI agent's feedback. The AI-feedback is yoked to the participants' performance, agreeing on 75% of trials and disagreeing on 25% (randomly), which is an important difference from the framing provided of human-AI partnerships, where AI-agents usually act independently from the humans and thus disagreements offer information about the human's own performance. In this task, disagreements are uninformative, and coupled with the at-chance performance on an ambiguous task, it is not clear how participants should be interpreting disagreements, and whether they treat it like receiving feedback about the accuracy of their choices, or whether they realise it is uninformative. Much greater discussion and justification are needed about the behaviour in the task, how participants did/should treat the feedback, and how these affect the trust/reliability ratings, as these are all central to the claims of the paper.
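
      The point about yoked feedback can be illustrated with a minimal simulation sketch. It is not part of the study or of this review's analysis; the function name, trial count, seed, and the 50% accuracy level are assumptions chosen only for illustration. When the agent disagrees on a random 25% of trials regardless of the participant's accuracy, the mismatch rate should be about the same after correct and incorrect responses, so a disagreement carries no information about accuracy.

```python
import random


def simulate_yoked_feedback(n_trials=100_000, p_correct=0.5, p_mismatch=0.25, seed=0):
    """Estimate P(mismatch | correct) and P(mismatch | incorrect).

    Hypothetical illustration: the simulated agent's disagreement is drawn
    independently of whether the participant's response was correct.
    """
    rng = random.Random(seed)
    # correct? -> [number of mismatches, number of trials]
    counts = {True: [0, 0], False: [0, 0]}
    for _ in range(n_trials):
        correct = rng.random() < p_correct    # participant at chance level
        mismatch = rng.random() < p_mismatch  # yoked, accuracy-blind disagreement
        counts[correct][0] += mismatch
        counts[correct][1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


rates = simulate_yoked_feedback()
print("P(mismatch | correct)   =", round(rates[True], 3))
print("P(mismatch | incorrect) =", round(rates[False], 3))
```

      Under these assumptions both conditional rates come out close to the nominal mismatch probability, whereas an independently acting agent with above-chance accuracy would disagree more often after incorrect responses.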

      There are a lot of EEG results presented here, including whole-brain and window-free analyses, so greater clarity on which results were a priori hypothesised should be given, along with details on how electrodes were selected for ERPs and follow-up tests.

    3. Reviewer #3 (Public review):

      The current paper investigates neural correlates of trust development in human-AI interaction, looking at EEG signatures locked to the moment that AI advice is presented. The key finding is that both human-response-locked EEG signatures (the CPP) and post-AI-advice signatures (N2, P3) are modulated by trust ratings. The study is interesting; however, it has some clear and sometimes problematic weaknesses:

      (1) The authors did not include "AI-advice". Instead, a manikin turned green or blue, which was framed as AI advice. It is unclear whether participants viewed this as actual AI advice.

      (2) The authors did not include a "non-AI" control condition in their experiment, such that we cannot know how specific all of these effects are to AI, or just generic uncertain feedback processing.

      (3) Participants perform the task at chance level. This makes it unclear to what extent they even tried to perform the task or just randomly pressed buttons. These situations likely differ substantially from a real-life scenario where humans perform an actual task (which is not impossible) and receive actual AI advice.

      (4) Many of the conclusions in the paper are overstated or very generic.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript provides a comprehensive systematic analysis of envelope-containing Ty3/gypsy retrotransposons (errantiviruses) across metazoan genomes, including both invertebrates and ancient animal lineages. Using iterative tBLASTn mining of over 1,900 genomes, the authors catalog 1,512 intact retrotransposons with uninterrupted gag, pol, and env open reading frames. They show that these elements are widespread (present in most metazoan phyla, including cnidarians, ctenophores, and tunicates), with active proliferation indicated by their multicopy status. Phylogenetic analyses distinguish "ancient" and "insect" errantivirus clades, while structural characterization (including AlphaFold2 modeling) reveals two major env types: paramyxovirus F-like and herpesvirus gB-like proteins. Although both envelope types were identified in previous analyses two decades ago, the understanding of the evolutionary provenance of these envelope genes was rudimentary and almost anecdotal (I can say this because I authored one of these studies). The results in the present study support an ancient origin for env acquisition in metazoan Ty3/gypsy elements, with subsequent vertical inheritance and limited recombination between env and pol domains. The paper also proposes an expanded definition of 'errantivirus' for env-carrying Ty3/gypsy elements outside Drosophila.

      Strengths:

      (1) Comprehensive Genomic Survey: The breadth of the genome search across non-model metazoan phyla yields an impressive dataset covering evolutionary breadth, with clear documentation of search iterations and validation criteria for intact elements.

      (2) Robust Phylogenetic Inference: The use of maximum likelihood trees on both pol and env domains, with thorough congruence analysis, convincingly separates ancient from lineage-specific elements and demonstrates co-evolution of env and pol within clades.

      (3) Structural Insights: AlphaFold2-based predictions provide high-confidence structural evidence that both env types have retained fusion-competent architectures, supporting the hypothesis of preserved functional potential.

      (4) Novelty and Scope: The study challenges previous assumptions of insect-centric or recent env acquisition and makes a compelling case for a Pre-Cambrian origin, significantly advancing our understanding of animal retroelement diversity and evolution. THIS IS A MAJOR ADVANCE.

      (5) Data Transparency: I appreciate that all data, code, and predicted structures are made openly available, facilitating reproducibility and future comparative analyses.

      Major Weaknesses

      (1) Functional Evidence Gaps: The work rests largely on sequence and structure prediction. No direct expression or experimental validation of envelope gene function or infectivity outside Drosophila is attempted, which would be valuable to corroborate the inferred roles of these glycoproteins in non-insect lineages. At least for some of these species, there are RNA-seq datasets that could be leveraged.

      (2) Horizontal Transfer vs. Loss Hypotheses: The discussion argues primarily for vertical inheritance, but the somewhat sporadic phylogenetic distributions and long-branch effects suggest that loss and possibly rare horizontal events may contribute more than acknowledged. Explicit quantitative tests for horizontal transfer, or reconciliation analyses, would strengthen this conclusion. It's also worth pointing out that, unlike retrotransposons that can be found in genomes, any potential related viral envelopes must, by definition, have a spottier distribution due to sampling. I don't think this challenges any of the conclusions, but it must be acknowledged as something that could affect the strength of this conclusion.

      (3) Limited Taxon Sampling for Certain Phyla: Despite the impressive breadth, some ancient lineages (e.g., Porifera, Echinodermata) are negative, but the manuscript does not fully explore whether this reflects real biological absence, assembly quality, or insufficient sampling. A more systematic treatment of negative findings would clarify claims of ubiquity. However, I also believe this falls beyond the scope of this study.

      (4) Mechanistic Ambiguity: The proposed model that env-containing elements exploit ovarian somatic niches is plausible but extrapolated from Drosophila data; for most taxa, actual tissue specificity, lifecycle, or host interaction mechanisms remain speculative and, to me, a bit unreasonable.

      Minor Weaknesses:

      (1) Terminology and Nomenclature: The paper introduces and then generalizes the term "errantivirus" to non-insect elements. While this is logical, it may confuse readers familiar with the established, Drosophila-centric definition if not more explicitly clarified throughout. I also worry about changes being made without any input from the ICTV nomenclature committee, which just went through a thorough reclassification. Nevertheless, change is expected, and calling them all errantiviruses is entirely reasonable.

      (2) Figures and Supplementary Data Navigation: Some key phylogenies and domain alignments are found only in supplementary figures, occasionally hindering readability for non-expert audiences. Selected main-text inclusion of representative trees would benefit accessibility.

      (3) ORF Integrity Thresholds: The cutoff choices for defining "intact" elements (e.g., numbers/placement of stop codons, length ranges) are reasonable but only lightly justified. More rationale or sensitivity analysis would improve confidence in the inclusion criteria. For example, how did changing these criteria change the number of intact elements?

      (4) Minor Typos/Formatting: The paper contains sporadic typographical errors and formatting glitches (e.g., misaligned figure labels, unrendered symbols) that should be addressed.

    2. Reviewer #2 (Public review):

      Summary:

      The authors first surveyed metazoan genomes to identify homologs of Drosophila errantiviruses and classified them into two groups, "insect" and "ancient" elements, supporting the hypothesis of an early evolutionary origin for these retrotransposons. They subsequently identified two distinct types of envelope proteins, one resembling the glycoprotein F of paramyxoviruses and the other akin to the glycoprotein B of herpesviruses. Despite differences in their primary amino acid sequences, these proteins display notable structural similarity in their predicted domain architectures. The congruence between the phylogenies of the envelope and pol genes further supports the ancient origin of the envelope genes, challenging earlier hypotheses that proposed recent recombination events with baculoviruses. Additional analysis of the Pol "bridge region" corroborated the divergence among these elements, consistent with a pattern of limited cross-species recombination. Finally, by comparing these elements with non-envelope-containing Gypsy retrotransposons, the authors concluded that errantiviruses originated from multiple elements independently.

      Strengths:

      The conclusions of this study are based on a comprehensive collection of errantiviruses identified across a wide range of metazoan genomes. These findings are further supported by multiple lines of evidence, including phylogenetic congruence and the diverse evolutionary origins of envelope genes. AlphaFold2-assisted protein domain structure analyses also provided key insights into the characterization of these elements. Together, these results present a compelling case that errantiviruses arose independently through multiple evolutionary events, extending well beyond previous hypotheses.

      Weaknesses:

      It would be beneficial to emphasize in the Abstract the potential impact of this work by more clearly articulating the current knowledge gap in the field. While the second paragraph of the Introduction briefly touches on this point, highlighting the broader significance in the Abstract would better capture readers' interest. Additionally, some methodological choices would benefit from clearer justification and explanation. For instance, in Figure 6, the selection of the bridge region/RNase H domain is not explicitly explained, leaving the rationale for its choice unclear. As a minor point, some figure labels and texts are too small and difficult to read, and improving their legibility would enhance overall clarity.

    3. Reviewer #3 (Public review):

      Summary and Significance:

      In this work, Cary and Hayashi address the important question of when, in evolution, certain mobile genetic elements (Ty3/gypsy-like non-LTR retrotransposons) associated with certain membrane fusion proteins (viral glycoprotein F or B-like proteins), which could allow these mobile genetic elements to be transferred between individual cells of a given host. It is debated in the literature whether the acquisition of membrane fusion proteins by non-LTR retrotransposons is a rather recent phenomenon that separately occurred in the ancestors of certain host species or whether the association with membrane fusion proteins is a much more ancient one, pre-dating the Cambrian explosion. Obviously, this question also touches upon the origin of the retroviruses, which can spread between individuals of a given host but seem restricted to vertebrates. Based on convincing data, Cary and Hayashi argue that an ancient association of non-LTR retrotransposons with membrane fusion proteins is most probable.

      Strengths:

      The authors take the smart approach to systematically retrieve apparently complete, intact, and recently functional Ty3/gypsy-like non-LTR retrotransposons that, next to their characteristic gag and pol genes, additionally carry sequences that are homologous to viral glycoprotein F (env-F) or viral glycoprotein B (env-B). They then construct and compare phylogenetic trees of the host species and individual encoded proteins and protein domains, where 3D-structure calculations and other features explain and corroborate the clustering within the phylogenetic trees. Congruence of phylogenetic trees and correlation of structural features is then taken as evidence for an infrequent recombination and a long-term co-evolution of the reverse transcriptase (encoded by the pol gene) and its respective putative membrane fusion gene (encoded by env-F or env-B). Importantly, the env-F and env-B containing retrotransposons do not form a monophyletic group among the Ty3/gypsy-like non-LTR retrotransposons, but are scattered throughout, supporting the idea of an originally ancient association followed by a random loss of env-F/env-B in individual branches of the tree (and rather rare re-associations via more recent recombinations).

      Overall, this is valuable, stimulating, and important work of general and fundamental interest, but still also somewhat incompletely explored, imprecisely explained, and insufficiently put into context for a more general audience.

      Weaknesses:

      Some points that might be considered and clarified:

      (1) Imprecise explanations, terms, and definitions:

      It might help to add a 'definitions box' or similar to precisely explain how the authors decided to use certain terms in this manuscript, and then use these terms consistently and with precision.

      a) In particular, these are terms such as 'vertebrate retrovirus' vs 'retrovirus' vs 'endogenized retrovirus' vs 'endogenous retrovirus' vs 'non-LTR retrotransposon' and 'Ty3/gypsy-like retrotransposon' vs 'Ty3/gypsy retrotransposon' vs 'errantivirus'.

      b) The comment also applies to the term 'env' used for both 'env-F' and 'env-B', where often it remains unclear which of the two protein types the authors refer to. This is confusing, particularly in the methods, where the search for the respective homologs is described.

      c) Other examples are the use of the entire pol gene vs. pol-RT for the definition of the Ty3/gypsy clade and for the generation of phylogenetic trees (Methods and Figure S1), and the names for various portions of pol that appear without prior definition or explanation (e.g., 'pro' in Figure 1A, 'bridge' in Figure S1C, 'the chromodomain' in the text and Figure 7).

      d) It is unclear from the main text which portions of pol were chosen to define pol-RT and why. The methods name the 'palm-and-fingers', 'thumb', and 'connections' domains to define RT. In the main text, the 'connection' domain is called 'tether' and is instead defined as part of the 'bridge' region following RT, which is not part of RT.

      (2) Insufficient broader context:

      a) The introduction does not state what defines Ty3/gypsy non-LTR retrotransposons as compared to their closest relatives (Ty1/copia retrotransposons, BEL/pao retrotransposons, vertebrate retroviruses). This makes it difficult to judge the significance and generality of the findings.

      b) The various known compositions of Ty3/gypsy-like retrotransposons are not mentioned and explained in the introduction (open reading frames, (poly-)proteins and protein domains, and their variable arrangement, enzymatic activities, and putative functions), and the distribution of Ty3/gypsy-like retrotransposons among eukaryotes remains unclear. The introduction does not mention that Ty3/gypsy-like retrotransposons apparently are absent from vertebrates, and Figure 7 is not very clear about whether or not it includes sequences from plants ('Chromoviridae').

      c) The known association of Ty3/gypsy-like retrotransposons from different metazoan phyla with putative membrane fusion protein (env-like) genes is mentioned in the introduction, but literature information on whether such associations also occur in the context of other retrotransposons (e.g., Ty1/copia or BEL/pao) is not provided. The abstract is somewhat misleading in this respect. Finally, the different known types of env-like genes are not mentioned and explained as part of the introduction ('env-F', 'env-B', 'retroviral env', others?).

      d) Some key references and reviews might be added:

      - Pelisson, A. et al. (1994) https://www.embopress.org/doi/abs/10.1002/j.1460-2075.1994.tb06760.x (next to Song et al. (1994), for the identification of env in Ty3/gypsy)

      - Boeke, J.D. et al. (1999) In Virus Taxonomy: ICTV VIIth report (ed. F.A. Murphy). Springer-Verlag, New York. (cited by Malik et al. (2000), for the definition and first use of the term 'errantivirus')

      - Eickbush, T.H. and Jamburuthugoda, V.K. (2008) https://doi.org/10.1016/j.virusres.2007.12.010 (on the classification of retrotransposons and their env-like genes)

      - Hayward, A. (2017) https://doi.org/10.1016/j.coviro.2017.06.006 (on scenarios of env acquisition)

      (3) Incomplete analysis:

      a) Mobile genetic elements are sometimes difficult to assemble correctly from short-read sequencing data. Did the authors confirm some of their newly identified elements, e.g., by PCR analysis or re-identification in long-read sequencing data?

      b) The authors mention somewhat on the side that there are Ty3/gypsy elements with a different arrangement (gag-env-pol instead of gag-pol-env). Why was this important feature apparently not used and correlated in the analysis? How does it map on the RT phylogenetic tree? Which type of env is found with either arrangement? Is there evidence for a loss of env also in the case of gag-env-pol elements?

      c) Sankey plots are insufficiently explained. How would inconsistencies between trees (recombinations) show up here? Why is there no Sankey plot for the analysis of env-B in Figure 5?

      d) Why are there no trees generated for env-F and env-B like proteins, including closely related homologous sequences that do NOT come from Ty3/gypsy retrotransposons (e.g., from the eukaryotic hosts, from other types of retrotransposons (Ty1/copia or BEL/pao), from viruses such as Herpesvirus and Baculovirus)? It would be informative whether the sequences from Ty3/gypsy cluster together in this case.

      e) Did the authors identify any other env-like ORFs (apart from env-F and env-B) among Ty3/gypsy retrotransposons? Did they identify other, non-env-like ORFs that might help in the analysis? It is not quite clear from the methods if the searches for env-F and env-B - containing Ty3/gypsy elements were done separately and consecutively or somehow combined (the authors generally use 'env', and it is not clear which type of protein this refers to).

      f) Why was the gag protein apparently not used to support the analysis? Are there different, unrelated types of gag among non-LTR retrotransposons? Does gag follow or break the pattern of co-evolution between RT and env-F/env-B?

      g) Data availability. The link given in the paper does not seem to work (https://github.com/RippeiHayashi/errantiviruses_2025/tree/main). It would be useful for the community to have the sequences of the newly identified Ty3/gypsy retrotransposons listed readily available (not just genome coordinates as in table S1), together with the respective annotations of ORFs and features.

  2. Nov 2025
    1. Reviewer #1 (Public review):

      Summary:

      From a forward genetic mosaic mutant screen using EMS, the authors identify mutations in glucosylceramide synthase (GlcT), a rate-limiting enzyme for glycosphingolipid (GSL) production, that result in ee tumors. Multiple genetic experiments strongly support the model that the mutant phenotype caused by GlcT loss is due to a failure to convert ceramide into glucosylceramide. Further genetic evidence suggests that Notch signaling is compromised in the ISC lineage and that endocytosis of Delta may be affected. Loss of GlcT does not affect wing development or oogenesis, suggesting tissue-specific roles for GlcT. Finally, an increase in goblet cells in UGCG knockout mice, not previously reported, suggests a conserved role for GlcT in Notch signaling in intestinal cell lineage specification.

      Strengths:

      Overall, this is a well-written paper with multiple well-designed and executed genetic experiments that support a role for GlcT in Notch signaling in the fly and mammalian intestine. The authors have addressed my concerns from the prior review.

    2. Reviewer #2 (Public review):

      Summary:

      This study genetically identifies two key enzymes involved in the biosynthesis of glycosphingolipids, GlcT and Egh, that act as tumor suppressors in the adult fly gut. Detailed genetic analysis indicates that a deficiency in Mactosyl-ceramide (Mac-Cer) is causing tumor formation. Analysis of a Notch transcriptional reporter further indicates that the lack of Mac-Cer is associated with reduced Notch activity in the gut, but not in other tissues.

      Addressing how a change in the lipid composition of the membranes might lead to defective Notch receptor activation, the authors studied the endocytic trafficking of Delta and claimed that internalized Delta appeared to accumulate faster into endosomes in the absence of Mac-Cer. Further analysis of Delta steady state accumulation in fixed samples suggested a delay in the endosomal trafficking of Delta from Rab5+ to Rab7+ endosomes, which was interpreted to suggest that the inefficient, or delayed, recycling of Delta might cause a loss in Notch receptor activation.

      Finally, the histological analysis of mouse guts following the conditional knock-out of the GlcT gene suggested that Mac-Cer might also be important for proper Notch signaling activity in that context.

      Strengths:

      The genetic analysis is of high quality. The finding that a Mac-Cer deficiency results in reduced Notch activity in the fly gut is important and fully convincing.

      The mouse data, although preliminary, raised the possibility that the role of this specific lipid may be conserved across species.

    1. Reviewer #1 (Public review):

      The study analyzes gastric fluid DNA content as a potential biomarker for human gastric cancer. However, the study lacks overall logical coherence, and several key issues require improvement and clarification. In the opinion of this reviewer, some major revisions are needed:

      (1) This manuscript lacks a comparison of gastric cancer patients' stages with PN and N+PD patients, especially T0-T2 patients.

      (2) The comparison between gastric cancer stages seems to reveal only a difference between T3 patients and early-stage gastric cancer patients. This raises doubts about whether the earlier differences between gastric cancer patients and normal patients are genuine or merely reflect the higher number of T3 patients.

      (3) The prognosis evaluation is too simplistic, only considering staging factors, without taking into account other factors such as tumor pathology and the time from onset to tumor detection.

      (4) The comparison between gfDNA and conventional pathological examination methods should be mentioned, reflecting advantages such as accuracy and patient comfort.

      (5) There are many questions in the figures and tables. Please match the Title, Figure legends, Footnote, Alphabetic order, etc.

      (6) The overall logic of the manuscript is not rigorous enough, the discussion considers few factors, and the analysis cannot support the conclusions drawn.

    2. Reviewer #2 (Public review):

      Summary:

      The authors investigated whether the total DNA concentration in gastric fluid (gfDNA), collected via routine esophagogastroduodenoscopy (EGD), could serve as a diagnostic and prognostic biomarker for gastric cancer. In a large patient cohort (initial n=1,056; analyzed n=941), they found that gfDNA levels were significantly higher in gastric cancer patients compared to non-cancer, gastritis, and precancerous lesion groups. Unexpectedly, higher gfDNA concentrations were also significantly associated with better survival prognosis and positively correlated with immune cell infiltration. The authors proposed that gfDNA may reflect both tumor burden and immune activity, potentially serving as a cost-effective and convenient liquid biopsy tool to assist in gastric cancer diagnosis, staging, and follow-up.

      Strengths:

      This study is supported by a robust sample size (n=941) with clear patient classification, enabling reliable statistical analysis. It employs a simple, low-threshold method for measuring total gfDNA, making it suitable for large-scale clinical use. Clinical confounders, including age, sex, BMI, gastric fluid pH, and PPI use, were systematically controlled. The findings demonstrate both diagnostic and prognostic value of gfDNA, as its concentration can help distinguish gastric cancer patients and correlates with tumor progression and survival. Additionally, preliminary mechanistic data reveal a significant association between elevated gfDNA levels and increased immune cell infiltration in tumors (p=0.001).

      Weaknesses:

      The study has several notable weaknesses. The association between high gfDNA levels and better survival contradicts conventional expectations and raises concerns about the biological interpretation of the findings. The diagnostic performance of gfDNA alone was only moderate, and the study did not explore potential improvements through combination with established biomarkers. Methodological limitations include a lack of control for pre-analytical variables, the absence of longitudinal data, and imbalanced group sizes, which may affect the robustness and generalizability of the results. Additionally, key methodological details were insufficiently reported, and the ROC analysis lacked comprehensive performance metrics, limiting the study's clinical applicability.
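      To make concrete what a fuller set of ROC performance metrics could look like, the following is a minimal sketch in plain Python. It is not the authors' analysis: the function names are hypothetical, and the scores and labels are synthetic toy values that stand in for gfDNA concentrations and cancer status, chosen only to exercise the code. It computes the AUC (via the Mann-Whitney formulation), a Youden-index cutoff, and sensitivity, specificity, PPV, and NPV at that cutoff.

```python
from typing import Dict, List, Optional, Tuple


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float, float]]:
    """Return (cutoff, sensitivity, specificity) for every distinct score.

    A sample is called positive when score >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        points.append((cutoff, tp / n_pos, tn / n_neg))
    return points


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores: List[float], labels: List[int]) -> float:
    """Cutoff maximizing sensitivity + specificity - 1 (lowest such cutoff on ties)."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)[0]


def summary_at(scores: List[float], labels: List[int], cutoff: float) -> Dict[str, Optional[float]]:
    """Sensitivity, specificity, PPV, and NPV at one cutoff (None when undefined)."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)

    def ratio(num: int, den: int) -> Optional[float]:
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Synthetic toy data (NOT from the study): 1 = cancer, 0 = non-cancer.
toy_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]

if __name__ == "__main__":
    cutoff = youden_cutoff(toy_scores, toy_labels)
    print("AUC:", auc(toy_scores, toy_labels))
    print("Youden cutoff:", cutoff)
    print(summary_at(toy_scores, toy_labels, cutoff))
```

      In a real cohort, PPV and NPV depend on disease prevalence in the sampled population, so they would need to be reported alongside the group sizes.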

    1. Reviewer #1 (Public review):

      Summary:

      Most studies in sensory neuroscience investigate how individual sensory stimuli are represented in the brain (e.g., the motion or color of a single object). This study starts tackling the more difficult question of how the brain represents multiple stimuli simultaneously and how these representations help to segregate objects from cluttered scenes with overlapping objects.

      Strengths:

      The authors first document the ability of humans to segregate two motion patterns based on differences in speed. Then they show that a monkey's performance is largely similar; thus establishing the monkey as a good model to study the underlying neural representations.

      Careful quantification of the neural responses in the middle temporal area during the simultaneous presentation of fast and slow speeds leads to the surprising finding that, at low average speeds, many neurons respond as if the slowest speed is not present, while they show averaged responses at high speeds. This unexpected complexity of the integration of multiple stimuli is key to the model developed in this paper.

      One experiment in which attention is drawn away from the receptive field supports the claim that this is not due to the involuntary capture of attention by fast speeds.

      A classifier using the neuronal response and trained to distinguish single speed from bi-speed stimuli shows a similar overall performance and dependence on the mean speed as the monkey. This supports the claim that these neurons may indeed underlie the animal's decision process.

      The authors expand the well-established divisive normalization model to capture the responses to bi-speed stimuli. The incremental modeling (eq 9 and 10) clarifies which aspects of the tuning curves are captured by the parameters.

    2. Reviewer #3 (Public review):

      Summary:

      This study concerns how macaque visual cortical area MT represents stimuli composed of more than one speed of motion.

      Strengths:

      The study is valuable because little is known about how the visual pathway segments and preserves information about multiple stimuli. The study presents compelling evidence that (on average) MT neurons shift from faster-speed-takes-all at low speeds to representing the average of the two speeds at higher speeds. An additional strength of the study is the inclusion of perceptual reports from both humans and one monkey participant performing a task in which they judged whether the stimuli involved one vs two different speeds. Ultimately, this study raises intriguing questions about how exactly the response patterns in visual cortical area MT might preserve information about each speed, since such information is potentially lost in an average response as described here.

      Reviewing Editor comment on revised version:

      The remaining concern was resolved.

    1. Reviewer #1 (Public review):

      The authors focus on the molecular mechanisms by which EMT cells confer resistance to cancer cells. The authors use a wide range of methods to reveal that overexpression of Snail in EMT cells induces cholesterol/sphingomyelin imbalance via transcriptional repression of biosynthetic enzymes involved in sphingomyelin synthesis. The study also revealed that ABCA1 is important for cholesterol efflux and thus for counterbalancing the excess of intracellular free cholesterol in these Snail-EMT cells. Inhibition of ACAT, an enzyme catalyzing cholesterol esterification, also seems essential to inhibit the growth of Snail-expressing cancer cells.

      Overall, the provided data are convincing and enhance our knowledge of cancer biology.

    2. Reviewer #2 (Public review):

      Summary:

      This revised study provides a clearer and more mechanistically grounded explanation of how lipid metabolic imbalance contributes to EMT-associated chemoresistance in renal cancer. In this study, the authors discovered that chemoresistance in RCC cell lines correlates with the expression levels of ABCA1 and the EMT-related transcription factor Snail. They demonstrate that Snail induces ABCA1 expression and chemoresistance, and that inhibition of ABCA1-associated pathways can counteract this resistance. The study also suggests that Snail disrupts the cholesterol-sphingomyelin balance by repressing enzymes involved in VLCFA-sphingomyelin synthesis, leading to excess free cholesterol and activation of the LXR-ABCA1 axis. Importantly, inhibiting cholesterol esterification, which renders free cholesterol inert, selectively suppresses growth of a xenograft model of Snail-positive kidney cancer. These findings provide potential lipid metabolism-targeting strategies for cancer therapy. The revised version includes additional quantitative analyses and new experiments addressing lipid balance and ABCA1 localization, further strengthening the overall mechanistic model.

      Strengths:

      This revised manuscript provides a more comprehensive and convincing mechanistic explanation for how Snail-driven EMT induces chemoresistance through altered lipid homeostasis. The study presents a novel concept in which the Chol/SM balance, rather than individual lipid levels, shapes therapeutic vulnerability. The potential for targeting cholesterol detoxification pathways in Snail-positive cancer cells remains a significant therapeutic implication. In the revised version, the authors provide additional quantitative analyses and complementary experiments - including ABCA1 localization, restoration of VLCFA-SM levels by supplementation with C22:0 ceramide, and membrane-order assays - which further strengthen the mechanistic interpretation and address key concerns raised in earlier reviews.

      Weaknesses:

      The revised version includes new experiments showing that restoring sphingomyelin levels suppresses ABCA1 expression, thereby strengthening the causal link between altered lipid balance and ABCA1 induction. However, the evidence that ABCA1 is directly required for chemoresistance remains somewhat limited, as the phenotype was not reproduced by ABCA1 knockout or knockdown, and CsA may affect additional targets beyond ABCA1.

    1. Reviewer #1 (Public review):

      Summary:

      This study builds off prior work that focused on the molecule AA147 and its role as an activator of the ATF6 arm of the unfolded protein response. In prior manuscripts, AA147 was shown to enter the ER, covalently modify a subset of protein disulfide isomerases (PDIs), and improve ER quality control for the disease-associated mutants of AAT and GABAA. Unsuccessful attempts to improve the potency of AA147 have led the authors to characterize a second hit from the screen in this study: the phenylhydrazone compound AA263. The focus of this study on enhancing biological activity of the AA147 molecule is compelling, and overcomes a hurdle of the prior AA147 drug that proved difficult to modify. The study successfully identifies PDIs as a shared cellular target of AA263 and its analogs. The authors infer, based on the similar target hits previously characterized for AA147, that PDI modification likely accounts for a mechanism of action for AA263.

      Strengths:

      The work establishes the ability to modify the AA263 molecule to create analogs with more potency and efficacy for ATF6 activation. The "next generation" analogs are able to enhance the levels of functional AAT and GABAA receptors in cellular models expressing the Z-variant of AAT or an epilepsy-associated variant of the GABAA receptor, outlining the therapeutic potential for this molecule and laying the foundation for future organism-based studies.

      The authors are able to establish that like AA147, AA263 covalently targets ER PDIs. While it is a likely mechanism that AA263 works through the PDIs, the authors are careful to discuss that this is a potential mechanism that remains to be explicitly proven. The study provides the foundation for future work to further define a role for the PDIs in the actions of AA263.

    2. Reviewer #2 (Public review):

      Modulating the UPR by pharmacological targeting of its sensors (or regulators) provides mostly uncharted opportunities in diseases associated with protein misfolding in the secretory pathway. Spearheaded by the Kelly and Wiseman labs, ATF6 modulators were developed in previous years that act on ER PDIs as regulators of ATF6. However, hurdles in their medicinal chemistry have hampered further developments. In this study, the authors provide evidence that the small molecule AA263 also targets and covalently modifies ER PDIs with the effect of activating ATF6. Importantly, AA263 turned out to be amenable to chemical optimization while maintaining its desired activity. Building on this, the authors show that AA263 derivatives can improve aggregation, trafficking and function of two disease-associated mutants of secretory pathway proteins. Together, this study provides compelling evidence for AA263 (and its derivatives) being interesting modulators of ER proteostasis. Mechanistic details of its mode of action will need more attention in future studies that can now build on this.

      In detail, the authors provide strong evidence that AA263 covalently binds to ER PDIs, which will inhibit the protein disulfide isomerase activity. ER PDIs regulate ATF6, and thus their finding provides a mechanistic interpretation of AA263 activating the UPR. It should be noted, however, that AA263 shows broad protein labeling (Fig. 1G), which may suggest additional targets beyond the ones defined as MS hits in this study. Also, a further direct analysis of the IRE1 and PERK pathways (activated or not by AA263) may be an interesting future direction, as e.g. PDIA1, a target of AA263, directly regulates IRE1 (Yu et al., EMBO J, 2020) and other PDIs also act on PERK and IRE1. The authors interpret modest activation of IRE1/PERK target genes (Fig. 2C) as an effect of target gene overlap. This is indeed the most likely explanation based on their selective analyses of IRE1 (ERdj4) and PERK (CHOP) downstream genes, but direct activation due to the targeting of their PDI regulators is also a possible explanation. Further key findings of this paper are the observed improvement of AAT behavior and GABAA trafficking and function. Further strength to the mechanistic conclusion that ATF6 activation causes this could be obtained by using ATF6 inhibitors/knockouts in the presence of AA263 (as the target PDIs may directly modulate behavior of AAT and/or GABAA). Along the same line, it also warrants further investigation in future studies why the different compounds, even if all were used at concentrations above their EC50, had different rescuing capacities on the clients.

      Together, the study now provides a strong basis for such in-depth mechanistic analyses.

    3. Reviewer #3 (Public review):

      Summary:

      This study aims to develop and characterize phenylhydrazone-based small molecules that selectively activate the ATF6 arm of the unfolded protein response by covalently modifying a subset of ER-resident PDIs. The authors identify AA263 as a lead scaffold and optimize its structure to generate analogs with improved potency and ATF6 selectivity, notably AA263-20. These compounds are shown to restore proteostasis and functional expression of disease-associated misfolded proteins in cellular models involving both secretory (AAT-Z) and membrane (GABAA receptor) proteins. The findings provide valuable chemical tools for modulating ER proteostasis and may serve as promising leads for therapeutic development targeting protein misfolding diseases.

      Strengths:

      The study presents a well-defined chemical biology framework integrating proteomics, transcriptomics, and disease-relevant functional assays.

      Identification and optimization of a new electrophilic scaffold (AA263) that selectively activates ATF6 represents a valuable advance in UPR-targeted pharmacology.

      SAR studies are comprehensive and logically drive the development of more potent and selective analogs such as AA263-20.

      Functional rescue is demonstrated in two mechanistically distinct disease models of protein misfolding (one involving a secretory protein and the other a membrane protein), underscoring the translational relevance of the approach.

      Weaknesses:

      ATF6 activation is primarily inferred from reporter assays and transcriptional profiling; direct biochemical evidence of ATF6 cleavage or nuclear translocation remains missing. However, the authors have added supporting data showing that co-treatment with the ATF6 inhibitor CP7 suppresses target gene induction, which partially strengthens the evidence for ATF6-dependent activity.

      Although the proposed mechanism involving PDI modification and ATF6 activation is plausible, it is still not experimentally demonstrated and remains incompletely characterized.

      In vivo validation is absent, and thus the pharmacological feasibility, selectivity, and bioavailability of these compounds in physiological systems remain untested.

      Comments on revisions:

      The authors have generally addressed my comments.

    1. Reviewer #2 (Public review):

      Summary:

      This paper formulates an individual-based model to understand the evolution of division of labor in vertebrates. The model considers a population subdivided in groups, each group has a single asexually-reproducing breeder, other group members (subordinates) can perform two types of tasks called "work" or "defense", individuals have different ages, individuals can disperse between groups, each individual has a dominance rank that increases with age, and upon death of the breeder a new breeder is chosen among group members depending on their dominance. "Workers" pay a reproduction cost by having their dominance decreased, and "defenders" pay a survival cost. Every group member receives a survival benefit with increasing group size. There are 6 genetic traits, each controlled by a single locus, that control propensities to help and disperse, and how task choice and dispersal relate to dominance. To study the effect of group augmentation without kin selection, the authors cross-foster individuals to eliminate relatedness. The paper allows for the evolution of the 6 genetic traits under some different parameter values to study the conditions under which division of labour evolves, defined as the occurrence of different subordinates performing "work" and "defense" tasks. The authors envision the model as one of vertebrate division of labor.

      The main conclusion of the paper is that group augmentation is the primary factor causing the evolution of vertebrate division of labor, rather than kin selection. This conclusion is drawn because, for the parameter values considered, when the benefit of group augmentation is set to zero, no division of labor evolves and all subordinates perform "work" tasks but no "defense" tasks.

      Strengths:

      The model incorporates various biologically realistic details, including the possibility to evolve age polyethism where individuals switch from "work" to "defense" tasks as they age or vice versa, as well as the possibility of comparing the action of group augmentation alone with that of kin selection alone.

      Weaknesses:

      The model and its analysis are limited, which in my view makes the results insufficient to reach the main conclusion that group augmentation and not kin selection is the primary cause of the evolution of vertebrate division of labour. There are several reasons.

      First, although the main claim is that group augmentation drives the evolution of division of labour in vertebrates, the model is rather conceptual in that it doesn't use quantitative empirical data that applies to all/most vertebrates and vertebrates only. So, I think the approach has a conceptual reach rather than being able to achieve such a conclusion about a real taxon.

      Second, I think that the model strongly restricts the possibility that kin selection is relevant. The two tasks considered essentially differ only by whether they are costly for reproduction or survival. "Work" tasks are those costly for reproduction and "defense" tasks are those costly for survival. The two tasks provide the same benefits for reproduction (eqs. 4, 5) and survival (through group augmentation, eq. 3.1). So, whether one, the other, or both helper types evolve presumably only depends on which task is less costly, not really on which benefits it provides. As the two tasks give the same benefits, there is no possibility that the two tasks act synergistically, where performing one task increases a benefit (e.g., increasing someone's survival) that is going to be compounded by someone else performing the other task (e.g., increasing that someone's reproduction). So, there is very little scope for kin selection to cause the evolution of division of labour in this model. Note that synergy between tasks is not something unusual in division of labour models, but is in fact a basic element in them, so excluding it from the start in the model and then making general claims about division of labour is unwarranted. In their reply, the authors point out that they only consider fertility benefits as this, according to them, is what happens in cooperative breeders with alloparental care; however, alloparental care entails that workers can increase others' survival *without group augmentation*, such as via workers feeding young or defenders reducing predator-caused mortality, as I mentioned in my previous review, but these potentially kin-selected benefits are not allowed here.

      Third, the parameter space is understandably little explored. This is necessarily an issue when trying to make general claims from an individual-based model where only a very narrow parameter region of a necessarily particular model can be feasibly explored. As in this model the two tasks ultimately only differ by their costs, the parameter values specifying their costs should be varied to determine their effects. In the main results, the model sets a very low survival cost for work (yh=0.1) and a very high survival cost for defense (xh=3), the latter of which can be compensated by the benefit of group augmentation (xn=3). Some limited variation of xh and xn is explored, always for very high values, effectively making defense unevolvable except if there is group augmentation. In this revision, additional runs have been included varying yh and keeping xh and xn constant (Fig. S6), so without addressing my comment as xn remains very high. Consequently, the main conclusion that "division of labor" needs group augmentation seems essentially enforced by the limited parameter exploration, in addition to the second reason above.

      Fourth, my view is that what is called "division of labor" here is an overinterpretation. When the two helper types evolve, what exists in the model is some individuals that do reproduction-costly tasks (so-called "work") and survival-costly tasks (so-called "defense"). However, there are really no two tasks that are being completed, in the sense that completing both tasks (e.g., work and defense) is not necessary to achieve a goal (e.g., reproduction). In this model there is only one task (reproduction, equation 4,5) to which both helper types contribute equally and so one task doesn't need to be completed if completing the other task compensates for it; instead, it seems more fitting to say that there are two types of helpers, one that pays a fertility cost and another one a survival cost, for doing the same task. So, this model does not actually consider division of labor but the evolution of different helper types where both helper types are just as good at doing the single task but perhaps do it differently and so pay different types of costs. In this revision, the authors introduced a modified model where "work" and "defense" must be performed to a similar extent. Although I appreciate their effort, this model modification is rather unnatural and forces the evolution of different helper types if any help is to evolve.

      I should end by saying that these comments don't aim to discourage the authors, who have worked hard to put together a worthwhile model and have patiently attended to my reviews. My hope is that these comments can be helpful to build upon what has been done to address the question posed.

    1. Reviewer #1 (Public review):

      Summary:

      The authors investigate how the Drosophila TNF receptor-associated factor Traf4 - a multifunctional adaptor protein with potential E3 ubiquitin ligase activity - regulates JNK signaling and adherens junctions (AJs) in wing disc epithelium. When they overexpress Traf4 in the posterior compartment of the wing disc, many posterior cells express the JNK target gene puckered (puc), apoptose, and are basally extruded from the epithelium. The authors term this process "delamination", but I think that this is an inaccurate description, especially since they can suppress the "delamination" by blocking programmed cell death (by concomitantly overexpressing p35). Through Y2H assays using Traf4 as a bait, they identified the Bearded family proteins E(spl)m4 (and to a lesser extent E(spl)m2), as Traf4 interactors. They use Alphafold to model computationally the interaction between Traf4 and E(spl)m4. They show that co-overexpression of Traf4 with E(spl)m4 in the posterior domain of the wing disc reduces death of posterior cells. They generate a new, weaker hypomorphic allele of Traf4 that is viable (as opposed to the homozygous lethality of null Traf4 alleles). There is some effect of these mutations on wing margin bristles; fewer wing margin bristle defects are seen when E(spl)m4 is overexpressed, suggesting opposite effects of Traf4 and E(spl)m4. Finally, they use the Minute model of cell competition to show that Rp/+ loser clones have greater clone area (indicating increased survival) when they are depleted for Traf4 or when they overexpress E(spl)m4. Only the cell competition results are quantified. Because most of the data in the preprint are not quantified, it is impossible to know how penetrant the phenotypes are. The authors conclude that E(spl)m4 binds the Traf4 MATH/TRAF domain, disrupts Traf4 trimerization, and selectively suppresses Traf4-mediated JNK and caspase activation without affecting its role in AJ destabilization. 
      However, I believe that this is an overstatement. First, there is no biochemical evidence showing that Traf4 binds E(spl)m4 and that E(spl)m4 disrupts Traf4 trimerization. Second, the data on AJs are weak and not quantified; additionally, cells that are being basally extruded lose contact with neighboring cells, so changes in adhesion proteins are to be expected. Related to this, the authors, in my opinion, inaccurately describe basal extrusion of dying cells from the wing disc epithelium as delamination.

      Strengths:

      (1) The authors use multiple approaches to test the model that overexpressed E(spl)m4 inhibits Traf4, including genetics, cell biological imaging, yeast two-hybrid assays, and molecular modeling.

      (2) The authors generate a new Traf4 hypomorphic mutant and use this mutant in cell competition studies, which supports the concept that E(spl)m4 (when overexpressed) can antagonize Traf4.

      Weaknesses:

      (1) Conflation of "delamination" with "basal extrusion of apoptotic cells": Over-expression of Traf4 causes apoptosis in wing disc cells, and this is a distinct process from delamination of viable cells from an epithelium. However, the two processes are conflated by the authors, and this weakens the premise of the paper.

      (2) Dependence on overexpression: The conclusions rely heavily on ectopic expression of Traf4 and E(spl)m4. Thus, the physiological relevance of the interaction remains inferred rather than demonstrated.

      (3) Lack of quantitative rigor: Except for the cell competition studies, phenotypic descriptions (e.g., number of apoptotic cells, puc-LacZ intensity) are qualitative; additional quantification, inclusion of sample size, and statistical testing would strengthen the conclusions.

      (4) Limited biochemical validation: The Traf4-E(spl)m4 binding is inferred from Y2H and in silico models, but no co-immunoprecipitation or in vitro binding assays confirm direct interaction or the predicted disruption of trimerization.

      (5) Specificity within the Bearded family: While E(spl)m2 shows partial binding and Tom shows none, the mechanistic basis for this selectivity is not deeply explored experimentally, leaving questions about motif-context contributions unresolved.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript analyzes the contribution of Traf4 to the fate of epithelial cells in the developing wing imaginal disc tissue. The manuscript is direct and concise and suggests an interesting and valuable hypothesis with dual functions of Traf4 in JNK pathway activation and cell delamination. However, the text is partially speculative, and the evidence is incomplete as the main claims are only partially supported. Some results require validation to support the conclusions.

      Strengths:

      (1) The manuscript is direct and concise, with a well-written and precise introduction.

      (2) It presents an interesting and valuable hypothesis regarding the dual role of Traf4 in JNK pathway activation and cell delamination.

      (3) The study addresses a relevant biological question in epithelial tissue development using a genetically tractable model.

      (4) The use of newly generated Traf4 mutants adds novelty to the experimental approach.

      (5) The manuscript includes multiple experimental strategies, such as genetic manipulation and imaging, to explore Traf4 function.

      Weaknesses:

      (1) The evidence supporting key claims is incomplete, and some conclusions are speculative.

      (2) The use of GFP-tagged Traf4 lacks validation regarding its functional integrity.

      (3) Orthogonal views and additional imaging data are needed to confirm changes in apicobasal localization and cell delamination.

      (4) Experimental conditions and additional methods should be further detailed.

      (5) The interaction between Traf4 and E(spl)m4 remains speculative in Drosophila.

      (6) New mutants require deeper analysis and validation.

      (7) The elimination of Traf4 mutant clones may be due to cell competition, which requires further experimental clarification.

      (8) The role of Traf4 in cell competition is contradictory and needs to be resolved.

    3. Reviewer #3 (Public review):

      Summary:

      This is an important and well-conceived study that identifies the Bearded-type small protein E(spl)m4 as a physical and genetic interactor of TRAF4 in Drosophila. By combining classical genetics, yeast two-hybrid assays, and AlphaFold in silico modeling, the authors convincingly demonstrate that E(spl)m4 acts as an inhibitor of TRAF4-mediated induction of JNK-driven apoptosis in developing larval imaginal wing discs, while not affecting TRAF4's role in adherens junction remodeling.

      Based primarily on modeling, the authors propose that the specificity of E(spl)m4 towards TRAF4-mediated signaling arises from its interference with TRAF4 trimerization, which is likely required for the activation of the JNK signaling arm but not for the maintenance of adherens junctions and stability of the E-cadherin/β-catenin complex.

      Overall, this study is of broad interest to cell and developmental biologists. It also holds potential biomedical relevance, particularly for strategies aimed at modulating TRAF protein activities to dissect and modulate canonical versus non-canonical signaling functions.

      Strengths:

      (1) The work identifies the Bearded-type small protein E(spl)m4 as a physical and genetic interactor of TRAF4 in Drosophila, extending the understanding of E(spl)m4 beyond its established functions in Notch signaling.

      (2) The study is experimentally solid, well-executed, and written, combining classical genetics with protein-protein interaction assays and modeling to reveal E(spl)m4 as a new regulator of TRAF4 signaling.

      (3) The genetic and biochemical data convincingly show the ability of E(spl)m4 overexpression to inhibit TRAF4-induced JNK-dependent apoptosis, while leaving the TRAF4 role in adherens junction remodeling unaffected.

      (4) The findings have important implications for the regulation of cell signaling and apoptosis and may guide pharmacological targeting of TRAF proteins.

      Weaknesses:

      The study is overall strong; however, several aspects could be clarified or expanded to strengthen the proposed mechanism and data presentation:

      (1) The proposed mechanism, in which E(spl)m4 inhibits TRAF4 activation of JNK signaling by affecting TRAF4 trimerization, relies mainly on modeling. Experimental evidence would strengthen this claim. For example, native (non-denaturing) PAGE could be used to assess TRAF4 oligomerization states in the absence or presence of E(spl)m4 overexpression, testing whether E(spl)m4 interferes with high-molecular-weight TRAF4 assemblies.

      (2) The study depends largely on E(spl)m4 overexpression, which may not reflect physiological conditions. It would be valuable to test, or at least discuss, whether loss-of-function or knockdown of E(spl)m4 modulates the strength or duration of JNK-mediated signaling, potentially accelerating apoptosis. Such data would reinforce the model that E(spl)m4 acts as a physiological modulator of TRAF4-JNK signaling in vivo.

      (3) The authors initially identify both E(spl)m4 and E(spl)m2 as TRAF4 interactors, but subsequently focus on E(spl)m4. It would be helpful to clarify or discuss the rationale for prioritizing E(spl)m4 for detailed functional analysis.

      (4) E(spl)m4 overexpression appears to protect RpS3 loser clones (Figure 6H-K), yet caspase-3-positive cells are still visible in mosaic wing discs. Please comment on the nature of these caspase-3-positive cells: are they cell-autonomous to the clone or non-autonomous (Figure 6K)?

      (5) This is a clear, well-executed, and conceptually strong study that significantly advances understanding of TRAF4 signaling specificity and its modulation by the Bearded-type protein E(spl)m4.

    1. Reviewer #1 (Public review):

      Nielsen et al. have identified a new disease mechanism underlying hypoplastic left heart syndrome due to variants in ribosomal protein genes that lead to impaired cardiomyocyte proliferation. This detailed study starts with an elegant screen in stem cell-derived cardiomyocytes and whole genome sequencing of human patients and extends to careful functional analysis of RP gene variants in fly and fish models. Striking phenotypic rescue is seen by modulating known regulators of proliferation, including the p53 and Hippo pathways. Additional experiments suggest that cell type specificity of the variants in these ubiquitously expressed genes may result from genetic interactions with cardiac transcription factors. This work positions RPs as important regulators of cardiomyocyte proliferation and differentiation involved in the etiology of HLHS, and points to potential downstream mechanisms.

      The revised manuscript has been extended, facilitating interpretation and reinforcing the authors' conclusions.

    2. Reviewer #2 (Public review):

      Tanja Nielsen et al. present a novel strategy for identification of candidate genes in Congenital Heart Disease (CHD). Their methodology, which is based on comprehensive experiments across cell models, Drosophila and zebrafish models, represents an innovative, refreshing and very useful set of tools for identification of disease genes, in a field that is struggling with exactly this problem.

      The authors have applied their methodology to investigate the pathomechanisms of Hypoplastic Left Heart Syndrome (HLHS) - a severe and rare subphenotype in the large spectrum of CHD malformations. Their data convincingly implicate ribosomal proteins (RPs) in growth and proliferation defects of cardiomyocytes, a mechanism which is suspected to be associated with HLHS.

      By whole genome sequencing analysis of a small cohort of trios (25 HLHS patients and their parents), the authors investigated a possible association between RP-encoding genes and HLHS.

      Although the possible association between defective RPs and HLHS needs to be verified, the results suggest a novel disease mechanism in HLHS, which is a potentially substantial advance in our understanding of HLHS and CHD. The conclusions of the paper are based on solid experimental evidence from appropriate high- to medium-throughput models, while additional genetic results from an independent patient cohort are needed to verify an association between RP-encoding genes and HLHS in patients.

    1. Reviewer #1 (Public review):

      The study analyzes gastric fluid DNA content as a potential biomarker for human gastric cancer. However, the study lacks overall logical coherence, and several key issues require improvement and clarification. In the opinion of this reviewer, some major revisions are needed:

      (1) This manuscript lacks a comparison of gastric cancer patients' stages with PN and N+PD patients, especially T0-T2 patients.

      (2) The comparison between gastric cancer stages seems to reveal only a difference between T3 patients and early-stage gastric cancer patients. This raises doubts about whether the previously reported differences between gastric cancer patients and normal patients are genuine or merely reflect the higher number of T3 patients.

      (3) The prognosis evaluation is too simplistic, only considering staging factors, without taking into account other factors such as tumor pathology and the time from onset to tumor detection.

      (4) The comparison between gfDNA and conventional pathological examination methods should be mentioned, reflecting advantages such as accuracy and patient comfort.

      (5) There are many questions in the figures and tables. Please match the Title, Figure legends, Footnote, Alphabetic order, etc.

      (6) The overall logic of the manuscript is not rigorous enough; the discussion considers few factors and does not support the conclusions drawn.

      Comments on revisions:

      The authors have addressed all concerns in the revision.

    2. Reviewer #2 (Public review):

      Summary

      The authors aimed to evaluate whether total DNA concentration in gastric fluid (gfDNA) collected during routine endoscopy could serve as a diagnostic and prognostic biomarker for gastric cancer. Using a large cohort (n=941), they reported elevated gfDNA in gastric cancer patients, an unexpected association with improved survival, and a positive correlation with immune cell infiltration.

      Strengths

      The study benefits from a substantial sample size, clear patient stratification, and control of key clinical confounders. The method is simple and clinically feasible, with preliminary evidence linking gfDNA to immune infiltration.

      Weaknesses

      (1) While the study identifies gfDNA as a potential prognostic tool, the evidence remains preliminary. Unexplained survival associations and methodological gaps weaken support for the conclusions.

      (2) The paradoxical association between high gfDNA and better survival lacks mechanistic validation. The authors acknowledge but do not experimentally distinguish tumor vs. immune-derived DNA, leaving the biological basis speculative.

      (3) Pre-analytical variables were noted but not systematically analyzed for their impact on gfDNA stability.

      Comments on revisions:

      To enhance the completeness and credibility of this research, it is essential to clarify the biological origin of gastric fluid DNA and validate these preliminary findings through a prospective, longitudinal study design.

    1. Reviewer #1 (Public review):

      The authors of this study set out to address a central question in the psycholinguistics literature: does the human brain's ability to predict upcoming language come at a cognitive cost, or is it an automatic, "free" process? To investigate this, they employed a dual-task paradigm where participants read texts word-by-word while simultaneously performing a secondary task (an n-back task on font color) designed to manipulate cognitive load. The study examines how this external cognitive load, along with the effects of aging, modulates the impact of word predictability (measured by surprisal and entropy) on reading times. The central finding is that increased cognitive load diminishes the effects of word predictability, supporting the conclusion that language prediction is a resource-dependent process.

      A major strength of the revised manuscript is its comprehensive and parallel analysis of both word surprisal and entropy. The initial submission focused almost exclusively on surprisal, which primarily reflects the cost of integrating a word into its context after it has been perceived. The new analysis now thoroughly investigates entropy as well, which reflects the uncertainty and cognitive effort involved in predicting the next word before it appears. This addition provides a much more complete and theoretically nuanced picture, allowing the authors to address how cognitive load affects both predictive and integrative stages of language processing. This is a significant improvement and substantially increases the paper's contribution to the field.

      Furthermore, the authors have commendably addressed the initial concerns regarding the robustness of their replication findings. The first version of the manuscript presented replication results that were inconsistent, particularly for key interaction effects. In the revision, the authors have adopted a more focused and appropriately powered modeling approach for the replication analysis. This revised analysis now demonstrates a consistent effect of cognitive load on the processing of predictable words across both the original and replication datasets. This strengthens the evidence for the paper's primary claim.

      The initial review also raised concerns that the results could be explained by general cognitive factors, such as task-switching costs, rather than the specific demands on the language prediction system. While the complexity of cognitive load in a dual-task paradigm remains a challenge, the authors have provided sufficient justification in their revisions and rebuttal to support their interpretation that the observed effects are genuinely tied to the process of language prediction.

    2. Reviewer #2 (Public review):

      Summary:

      This paper considers the effects of cognitive load (using an n-back task related to font color), predictability, and age on reading times in two experiments. There were main effects of all predictors, but more interesting effects of load and age on predictability. The effect of load is very interesting, but the manipulation of age is problematic, because we don't know what is predictable for different participants (in relation to their age). There are some theoretical concerns about prediction and predictability, and a need to address literature (reading time, visual world, ERP studies).

      There is a major concern about the effects of age. See the results (155-190): this depends on what is meant by word predictability. It's correct if it means the predictability in the corpus. But it may or may not be correct if it refers to how predictable a word is to an individual participant. The texts are unlikely to be equally predictable to different participants, and in particular to younger vs. older participants, because of their different experience. To put it informally, the newspaper articles may be more geared to the expectations of younger people. But there is also another problem: the LLM may have learned on the basis of language that has largely been produced by young people and so its predictions are based on what young people are likely to say. Both of these possibilities strike me as extremely likely. So it may be that older adults are affected more by words that they find surprising, but it is also possible that the texts are not what they expect, or the LLM predictions from the text are not the ones that they would make. In sum, I am not convinced that the authors can say anything about the effects of age unless they can determine what is predictable for different ages of participants. I suspect that this failure to control is an endemic problem in the literature on aging and language processing and needs to be systematically addressed.

      Overall, I think the paper makes enough of a contribution with respect to load to be useful to the literature. But for discussion of age, we would need something like evidence of how younger and older adults would complete these texts (on a word-by-word basis) and that they were equally predictable for different ages. I assume there are ways to get LLMs to emulate different participant groups, but I doubt if we could be confident about their accuracy without a lot of testing. But without something like this, I think making claims about age would be quite misleading.

      The authors respond to my summary comment by saying that prediction is individual and that they account for age-related effects in their models. But these aren't my concerns. Rather:

      (1) The texts (these edited newspaper articles) could be more predictable for younger than older adults. If so, effects with older adults could simply arise because people are less likely to predict less predictable words than more predictable ones.

      (2) The GPT-2 generated surprisal scores may correspond more closely to younger than older adult responses -- that is, its next word predictions may be more younger- than older-adult-like.

      In my view, the authors have two choices: they could remove the discussion of age-related effects, or they could try to address BOTH (1) and (2).

      As an aside, consider what we would conclude if we drew similar conclusions from a study in which children and adults read the same (children's) texts, but we didn't test what was predictable to each of them separately.

      The paper is really strong in other respects and if my concern is not addressed, the conclusions about age might be generally accepted.

    1. Reviewer #1 (Public review):

      Summary:

      Wojnowska et al. report structural and functional studies of the interaction of Streptococcus pyogenes M3 protein with collagen. They show through X-ray crystallographic studies that the N-terminal hypervariable region of M3 protein forms a T-like structure, and that the T-like structure binds a three-stranded collagen-mimetic peptide. They indicate that the T-like structure is predicted by AlphaFold3 with a moderate confidence level in other M proteins that have sequence similarity to M3 protein and M-like proteins from group C and G streptococci. For some, but not all, of these related M and M-like proteins, AlphaFold3 predicts, with a moderate confidence level, complexes similar to the one observed for M3-collagen. Functionally, the authors show that emm3 strains form biofilms with more mass when surfaces are coated with collagen, and this effect can be blocked by an M3 protein fragment that contains the T-structure. They also show the co-occurrence of emm3 strains and collagen in patient biopsies and a skin tissue organoid. Puzzlingly, M1 protein has been reported to bind collagen, and collagen inhibits biofilm formation in a particular emm1 strain, yet that same emm1 strain colocalizes with collagen in a patient biopsy sample. The implications of the variable actions of collagen on biofilm formation are not clear.

      Strengths:

      The paper is well written and the results are presented in a logical fashion.

      Weaknesses:

      A major limitation of the paper is that it is almost entirely observational and lacks detailed molecular investigation. Insufficient details or controls are provided to establish the robustness of the data.

      Comments on revisions:

      The authors' response to this reviewer's Major issue #1 is inadequate. Their argument is essentially that if they denature the protein, then there is no activity. This does not address the specificity of the structure or its interactions.

      They went only part way to addressing this reviewer's Major issue #2. While Figure 8 - supplement 3 shows 1D NMR spectra for M3 protein (what temperature?), it does not establish that stability is unaltered (to a significant degree).

      This reviewer's Major issue #3 is one of the major reasons for considering this study to be observational. This reviewer agrees that structural biology is by its nature observational, but modern standards require validation of structural observations. The authors' response is that a mechanistic investigation involving mutant bacterial strains and validation involving mutated proteins is beyond their scope. Therefore, the study remains observational.

      Major issue 4 was addressed suitably, but brings up the problematic point that the emm1 2006 strain colocalizes quite well with collagen in a patient biopsy sample but not in other assays. This calls into question the overall interpretability of the patient biopsy data.

      The authors have not provided a point-by-point response. Issues that were indicated to be minor previously were deemed to be minor because this reviewer thought that they could easily be addressed in a revision. It appears that the authors have ignored many of these comments, and these issues are therefore now considered to be major issues. For example, no errors are given for Kd measurements, Table 2 is sloppy and lacks the requested information, negative controls are missing (Figure 10 - figure supplement 1), and there is no indication of how many independent times each experiment was done.

      And "C4-binding protein" should be corrected to "C4b-binding protein."

    2. Reviewer #2 (Public review):

      Streptococcus pyogenes, or group A streptococci (GAS), can cause diseases ranging from skin and mucosal infections to plasma invasion and post-infection autoimmune syndromes. M proteins are essential GAS virulence factors that include an N-terminal hypervariable region (HVR). M proteins are known to bind to numerous human proteins; a small subset of M proteins were reported to bind collagen, which is thought to promote tissue adherence. In this paper, the authors characterize M3 interactions with collagen and its role in biofilm formation. Specifically, they screened different collagen type II and III variants for full-length M3 protein binding using an ELISA-like method, detecting anti-GST antibody signal. By statistical analysis, hydrophobic amino acids and hydroxyproline were found to positively support binding, whereas acidic residues and proline negatively impacted binding. The authors applied X-ray crystallography to determine the structure of the N-terminal domain (amino acids 42-151) of M3 protein (M3-NTD). The M3-NTD dimer (PDB 8P6K) forms a T-shaped structure with three helices (H1, H2, H3), which are stabilized by a hydrophobic core, inter-chain salt bridges and hydrogen bonds on the H1 and H2 helices, and the H3 coiled coil. The conserved Gly113 serves as the turning point between H2 and H3. The M3-NTD was co-crystallized with a 24-residue peptide, JDM238, to determine the structure of the M3-collagen complex. The structure (PDB 8P6J) shows that two copies of collagen in parallel bind to H1 and H2 of M3-NTD. Among the residues involved in binding, the conserved Tyr96 is shown to play a critical role, supported by structure and isothermal titration calorimetry (ITC). The authors also apply a crystal-violet assay and fluorescence microscopy to determine that M3, but not M1 or M28, is involved in collagen type I binding. Tissue biopsy staining indicates that M3 strains co-localize with collagen IV-containing tissue, while M1 strains do not.

      The authors provide generally compelling evidence to show that GAS M3 protein binds to collagen and plays a critical role in forming biofilms, which contribute to disease pathology. This is a very well-executed study and a well-written report relevant to understanding GAS pathogenesis and approaches to combatting disease; the data are also applicable to the emerging human pathogen Streptococcus dysgalactiae. One caveat that was not entirely resolved is if/how different collagen types might impact M3 binding and function. Due to technical constraints, the in vitro structure and other binding assays use type II collagen, whereas the in vivo biofilm formation assays and tissue biopsy staining use type I and IV collagen; it was unclear if this difference is significant. One possibility is that M3 binds all collagen types without bias, and only the tissue distribution of collagens leads to the finding that M3 binds to type IV (basement membrane) and type I (various tissues, including skin), rather than type II (cartilage).

      Comments on revisions:

      We are glad to see that the authors addressed our prior comments on M3 binding to different types of collagens in the Discussion section; adding a prediction of M3 binding to type I collagen (Figure 8-figure supplement 1B and 1C) helps to fill in the gap. Although it would be nice to fill in the gap experimentally by putting all types of collagens into one experiment (for example, as in Figure 9A, using different types of human collagens to test biofilm formation; or, as in Figure 10, using different types of human collagens to compete for biofilm formation), this appears to be beyond the scope of this paper. Meanwhile, the changes they have made are constructive.

      The authors have addressed the majority of our prior comments.

    1. Reviewer #3 (Public review):

      Summary:

      In this well-written manuscript, Unitt and colleagues propose a new, hierarchical nomenclature system for the pathogen Neisseria gonorrhoeae. The proposed nomenclature addresses a longstanding problem in N. gonorrhoeae genomics, namely that the highly recombinant population complicates typing schemes based on only a few loci and that previous typing systems, even those based on the core genome, group strains at only one level of genomic divergence without a system for clustering sequence types together. In this work, the authors have revised the core genome MLST scheme for N. gonorrhoeae and devised Life Identification Number (LIN) codes to describe the N. gonorrhoeae population structure.

      Strengths:

      The LIN codes proposed in this manuscript are congruent with previous typing methods for Neisseria gonorrhoeae like cgMLST groups, Ng-STAR, and NG-MAST. Importantly, they improve upon many of these methods as the LIN codes are also congruent with the phylogeny and represent monophyletic lineages/sublineages. Additionally, LIN code cluster assignment is fixed, and clusters are not fused as is common in other typing schemes.

      The LIN code assignment has been implemented in PubMLST allowing other researchers to assign LIN codes to new assemblies and put genomes of interest in context with global datasets, including in private datasets.

      Weaknesses:

      The authors have defined higher resolution thresholds for the LIN code scheme. However, they do not investigate how these levels correspond to previously identified transmission clusters from genomic epidemiology studies. This will be an important focus of future work, but it may be beyond the scope of the current manuscript.

      Comments on revisions:

      The authors have addressed my previous comments. I have no additional recommendations.

    1. Reviewer #1 (Public review):

      Summary:

      The authors set out to evaluate the regulation of interferon (IFN) gene expression in fish, using mainly zebrafish as a model system. Similar to more widely characterized mammalian systems, fish IFN is induced during viral infection through the action of the transcription factor IRF3, which is activated by phosphorylation by the kinase TBK1. It has been previously shown in many systems that TBK1 is subjected to both positive and negative regulation to control IFN production. In this work, the authors find that the cell cycle kinase CDK2 functions as a TBK1 inhibitor by decreasing its abundance through recruitment of the ubiquitin ligase Dtx4, which has been similarly implicated in the regulation of mammalian TBK1. Experimental data are presented showing that CDK2 interacts with both TBK1 and Dtx4, leading to K48-linked ubiquitinylation of TBK1 on K567 and its subsequent degradation by the proteasome.

      Strengths:

      The strengths of this manuscript are its novel demonstration of the involvement of CDK2 in a process in fish that is controlled by different factors in other vertebrates and its clear and supportive experimental data.

      Weaknesses:

      The weaknesses of the study include the following. 1) It remains unclear how CDK2 is regulated during viral infection and how it specifically recruits the E3 ligase to TBK1. The authors find that its abundance increases during viral infection, an unusual finding given that CDK2 levels are often found to be stable. How this change in abundance might affect cell cycle control was not explored. 2) The implications and mechanisms for a relationship between the cell cycle and IFN production will be a fascinating topic for future studies. In particular, it will be critical to determine if CDK2 catalytic activity is required. An experiment with an inhibitor suggests that this novel action of CDK2 is kinase independent, but the lack of controls showing the efficacy of the inhibitor prevents a firm conclusion. It will also be critical to determine if there is a role for cyclins in this process or if there is competition for binding between TBK1 and cyclin and, if so, if this has an impact on the cell cycle. Likewise, an impact of CDK2 induction by virus infection on normal cell cycling will be important to investigate.

    2. Reviewer #2 (Public review):

      Summary:

      In this paper, the authors describe a novel function involving the cell cycle protein kinase CDK2, which binds to TBK1 (an essential component of the innate immune response) leading to its degradation in a ubiquitin/proteasome-dependent manner. Moreover, the E3 ubiquitin ligase, Dtx4, is implicated in the process by which CDK2 increases the K48-linked ubiquitination of TBK1. This paper presents intriguing findings on the function of CDK2 in lower vertebrates, particularly its regulation of IFN expression and antiviral immunity.

      Strengths:

      (1) The research employs a variety of experimental approaches to address a single question. The data are largely convincing and appear to be well executed.

      (2) The evidence is strong and includes a combination of in vivo and in vitro experiments, including knockout models, protein interaction studies, and ubiquitination analyses.

      (3) This study significantly impacts the field of immunology and virology, particularly concerning the antiviral mechanisms in lower vertebrates. The findings provide new insights into the regulation of IFN expression and the broader role of CDK2 in immune responses. The methods and data presented in this paper are highly valuable for the scientific community, offering new avenues for research into antiviral strategies and the development of therapeutic interventions targeting CDK2 and its associated pathways.

    1. Reviewer #1 (Public review):

      The authors investigated the potential role of IgG N-glycosylation in Haemorrhagic Fever with Renal Syndrome (HFRS), which may offer significant insights for understanding molecular mechanisms and for the development of therapeutic strategies for this infectious disease.

    2. Reviewer #2 (Public review):

      This work sought to explore antibody responses in the context of hemorrhagic fever with renal syndrome (HFRS) - a severe disease caused by Hantaan virus infection. Little is known about the characteristics or functional relevance of IgG Fc glycosylation in HFRS. To address this gap, the authors analyzed samples from 65 patients with HFRS spanning the acute and convalescent phases of disease via IgG Fc glycan analysis, scRNAseq, and flow cytometry. The authors observed changes in Fc glycosylation (increased fucosylation and decreased bisection) coinciding with a 4-fold or greater increase in Hantaan virus-specific antibody titer. The study also includes exploratory analyses linking IgG glycan profiles to glycosylation-related gene expression in distinct B cell subsets, using single-cell transcriptomics. Overall, this is an interesting study that combines serological profiling with transcriptomic data to shed light on humoral immune responses in an underexplored infectious disease. The integration of Fc glycosylation data with single-cell transcriptomic data is a strength.

    1. Reviewer #1 (Public review):

      Summary:

      The microbiota of Dactylorhiza traunsteineri, an endangered marsh orchid, forms complex root associations that support plant health. Using 16S rRNA sequencing, we identified dominant bacterial phyla in its rhizosphere, including Proteobacteria, Actinobacteria, and Bacteroidota. Deep shotgun metagenomics revealed high-quality MAGs with rich metabolic and biosynthetic potential. This study provides key insights into root-associated bacteria and highlights the rhizosphere as a promising source of bioactive compounds, supporting both microbial ecology research and orchid conservation.

      Strengths:

      The manuscript presents an investigation of the bacterial communities in the rhizosphere of D. traunsteineri using advanced metagenomic approaches. The topic is relevant, and the techniques are up-to-date; however, the study has several critical weaknesses.

      Weaknesses:

      (1) Title: The current title is misleading. Given that fungi are the primary symbionts in orchids and were not analyzed in this study (nor were they included among other microbial groups), the use of the term "microbiome" is not appropriate. I recommend replacing it with "bacteriome" to better reflect the scope of the work.

      (2) Line 124: The phrase "D. traunsteineri individuals were isolated" seems misleading. A more accurate description would be "individuals were collected", as also mentioned in line 128.

      (3) Experimental design: The major limitation of this study lies in its experimental design. The number of plant individuals and soil samples analyzed is unclear, making it difficult to assess the statistical robustness of the findings. It is also not well explained why the orchids were collected two years before the rhizosphere soil samples. Was the rhizosphere soil collected from the same site and from remnants of the previously sampled individuals in 2018? This temporal gap raises serious concerns about the validity of the biological associations being inferred.

      (4) Low sample size: In lines 249-251 (Results section), the authors mention that only one plant individual was used for identifying rhizosphere bacteria. This is insufficient to produce scientifically robust or generalizable conclusions.

      (5) Contextual limitations: Numerous studies have shown that plant-microbe interactions are influenced by external biotic and abiotic factors, as well as by plant age and population structure. These elements are not discussed or controlled for in the manuscript. Furthermore, the ecological and environmental conditions of the site where the plants and soil were collected are poorly described. The number of biological and technical replicates is also not clearly stated.

      (6) Terminology: Throughout the manuscript, the authors refer to the "microbiome," though only bacterial communities were analyzed. This terminology is inaccurate and should be corrected consistently.

      Considering the issues addressed, particularly regarding experimental design and data interpretation, significant improvements to the study are needed.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aim to provide an overview of the D. traunsteineri rhizosphere microbiome on a taxonomic and functional level, through 16S rRNA amplicon analysis and shotgun metagenome analysis. The amplicon sequencing shows that the major phyla present in the microbiome belong to phyla with members previously found to be enriched in rhizospheres and bulk soils. Their shotgun metagenome analysis focused on producing metagenome assembled genomes (MAGs), of which one satisfies the MIMAG quality criteria for high-quality MAGs and three those for medium-quality MAGs. These MAGs were subjected to functional annotations focusing on metabolic pathway enrichment and secondary metabolic pathway biosynthetic gene cluster analysis. They find 1741 BGCs of various categories in the MAGs that were analyzed, with the high-quality MAG being claimed to contain 181 SM BGCs. The authors provide a useful, albeit superficial, overview of the taxonomic composition of the microbiome, and their dataset can be used for further analysis.

      The conclusions of this paper are not well-supported by the data, as the paper only superficially discusses the results, and the functional interpretation based on taxonomic evidence or generic functional annotations does not allow drawing any conclusions on the functional roles of the orchid microbiota.

      Weaknesses:

      The authors only used one individual plant to take samples. This makes it hard to generalize about the natural orchid microbiome.

      The authors use both 16S amplicon sequencing and shotgun metagenomics to analyse the microbiome. However, the authors barely discuss the similarities and differences between the results of these two methods, even though comparing these results may be able to provide further insights into the conclusions of the authors. For example, the relative abundance of the ASVs from the amplicon analysis is not linked to the relative abundances of the MAGs.

      Furthermore, the authors discuss that phyla present in the orchid microbiome are also found in other microbiomes and are linked to important ecological functions. However, their results reach further than the phylum level, and a discussion of genera or even species is lacking. The phyla that were found have very large within-phylum functional variability, and reliable functional conclusions cannot be drawn based on taxonomic assignment at this level, or even the genus level (Yan et al. 2017).

      Additionally, although the authors mention their techniques used, their method section is sometimes not clear about how samples or replicates were defined. There are also inconsistencies between the methods and the results section, for example, regarding the prediction of secondary metabolite biosynthetic gene clusters (BGCs).

      The BGC prediction was done with several tools, and the unusually high number of found BGCs (181 in their high-quality MAG) is likely due to false positives or fragmented BGCs. The numbers are much higher than any numbers ever reported in literature supported by functional evidence (Amos et al, 2017), even in a prolific genus like Streptomyces (Belknap et al., 2020). This caveat is not discussed by the authors.

      The authors have generated one high-quality MAG and three medium-quality MAGs. In the discussion, they present all four of these as high-quality, which could be misleading. The authors discuss what was found in the literature about the role of the bacterial genera/phyla linked to these MAGs in plant rhizospheres, but they do not sufficiently link their own analysis results (metabolic pathway enrichment and biosynthetic gene cluster prediction) to this discussion. The results of these analyses are only presented in tables without further explanation in either the results section or the discussion, even though there may be interesting findings. For example, the authors only discuss the class of the BGCs that were found, but don't search for experimentally verified homologs in databases, which could shed more light on the possible functional roles of BGCs in this microbiome.

      In the conclusions, the authors state: "These analyses uncovered potential metabolic capabilities and biosynthetic potentials that are integral to the rhizosphere's ecological dynamics." I don't see any support for this. Mentioning that certain classes of BGCs are present is not enough to make this claim, in my opinion. Any BGC is likely important for the ecological niche the bacteria live in. The fact that rhizosphere bacteria harbour BGCs is not surprising, and it doesn't tell us more than is already known.

      References:

      Belknap, Kaitlyn C., et al. "Genome mining of biosynthetic and chemotherapeutic gene clusters in Streptomyces bacteria." Scientific reports 10.1 (2020): 2003

      Amos GCA, Awakawa T, Tuttle RN, Letzel AC, Kim MC, Kudo Y, Fenical W, Moore BS, Jensen PR. Comparative transcriptomics as a guide to natural product discovery and biosynthetic gene cluster functionality. Proc Natl Acad Sci U S A. 2017 Dec 26;114(52):E11121-E11130.

      Yan Yan, Eiko E Kuramae, Mattias de Hollander, Peter G L Klinkhamer, Johannes A van Veen, Functional traits dominate the diversity-related selection of bacterial communities in the rhizosphere, The ISME Journal, Volume 11, Issue 1, January 2017, Pages 56-66

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript addresses an important methodological issue - the fragility of meta-analytic findings - by extending fragility concepts beyond trial-level analysis. The proposed EOIMETA framework provides a generalizable and analytically tractable approach that complements existing methods such as the traditional Fragility Index and Atal et al.'s algorithm. The findings are significant in showing that even large meta-analyses can be highly fragile, with results overturned by very small numbers of event recodings or additions. The evidence is clearly presented, supported by applications to vitamin D supplementation trials, and contributes meaningfully to ongoing debates about the robustness of meta-analytic evidence. Overall, the strength of evidence is moderate to strong, though some clarifications would further enhance interpretability.

      Strengths:

      (1) The manuscript tackles a highly relevant methodological question on the robustness of meta-analytic evidence.

      (2) EOIMETA represents an innovative extension of fragility concepts from single trials to meta-analyses.

      (3) The applications are clearly presented and highlight the potential importance of fragility considerations for evidence synthesis.

      Weaknesses:

      (1) The rationale and mathematical details behind the proposed EOI and ROAR methods are insufficiently explained. Readers are asked to rely on external sources (Grimes, 2022; 2024b) without adequate exposition here. At a minimum, the definitions, intuition, and key formulas should be summarized in the manuscript to ensure comprehensibility.

      (2) EOIMETA is described as being applicable when heterogeneity is low, but guidance is missing on how to interpret results when heterogeneity is high (e.g., large I²). Clarification in the Results/Discussion is needed, and ideally, a simulation or illustrative example could be added.

      (3) The manuscript would benefit from side-by-side comparisons between the traditional FI at the trial level and EOIMETA at the meta-analytic level. This would contextualize the proposed approach and underscore the added value of EOIMETA.

      (4) Scope of FI: The statement that FI applies only to binary outcomes is inaccurate. While originally developed for dichotomous endpoints, extensions exist (e.g., Continuous Fragility Index, CFI). The manuscript should clarify that EOIMETA focuses on binary outcomes, but FI, as a concept, has been generalized.
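
      For readers unfamiliar with the trial-level Fragility Index referred to in comments (3) and (4), the following is a minimal sketch of the classic binary-outcome version, not the EOIMETA method itself. It assumes a two-sided Fisher exact test at alpha = 0.05 and recodes non-events to events in the arm with fewer events; the function names are hypothetical and chosen for illustration only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Number of non-event -> event recodings needed to lose significance.

    Returns None if the trial is not significant to begin with.
    Assumption: recodings are made in the arm with fewer events.
    """
    def p_value(e1, e2):
        return fisher_exact_two_sided(e1, n_1 - e1, e2, n_2 - e2)

    e1, e2 = events_1, events_2
    if p_value(e1, e2) >= alpha:
        return None
    modify_first = e1 <= e2
    flips = 0
    while p_value(e1, e2) < alpha:
        if modify_first:
            if e1 == n_1:
                raise ValueError("ran out of non-events to recode")
            e1 += 1
        else:
            if e2 == n_2:
                raise ValueError("ran out of non-events to recode")
            e2 += 1
        flips += 1
    return flips


if __name__ == "__main__":
    # Toy trial: 0/10 events in one arm versus 10/10 in the other.
    print(fragility_index(0, 10, 10, 10))
```

      A side-by-side comparison of the kind requested in comment (3) would report this trial-level count next to the meta-analytic EOIMETA result for the same body of evidence.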

    2. Reviewer #2 (Public review):

      Summary:

      The study expands existing analytical tools originally developed for randomized controlled trials with dichotomous outcomes to assess the potential impact of missing data, adapting them for meta-analytical contexts. These tools evaluate how missing data may influence meta-analyses where p-value distributions cluster around significance thresholds, often leading to conflicting meta-analyses addressing the same research question. The approach quantifies the number of recodings (adding events to the experimental group and/or removing events from the control group) required for a meta-analysis to lose or gain statistical significance. The author developed an R package to perform fragility and redaction analyses and to compare these methods with a previously established approach by Atal et al. (2019), also integrated into the package. Overall, the study provides valuable insights by applying existing analytical tools from randomized controlled trials to meta-analytical contexts.

      Strengths:

      The author's results support his claims. Analyzing the fragility of a given meta-analysis could be a valuable approach for identifying early signs of fragility within a specific topic or body of evidence. If fragility is detected alongside results that hover around the significance threshold, adjusting the significance cutoff as a function of sample size should be considered before making any binary decision regarding statistical significance for that body of evidence. Although the primary goal of meta-analysis is effect estimation, conclusions often still rely on threshold-based interpretations, which is understandable. In some of the examples presented by Atal et al. (2019), the event recoding required to shift a meta-analysis from significant to non-significant (or vice versa) produced only minimal changes in the effect size estimation. Therefore, in bodies of evidence where meta-analyses are fragile or where results cluster near the null, it may be appropriate to adjust the cutoff. Conducting such analyses-identifying fragility early and adapting thresholds accordingly-could help flag fragile bodies of evidence and prevent future conflicting meta-analyses on the same question, thereby reducing research waste and improving reproducibility.

      Weaknesses:

      It would be valuable to include additional bodies of conflicting literature in which meta-analyses have demonstrated fragility. This would allow for a more thorough assessment of the consistency of these analytical tools, their differences, and whether this particular body of literature favored one methodology over another. The method proposed by Atal et al. was applied to numerous meta-analyses and demonstrated consistent performance. I believe there is room for improvement, as both the EOI and ROAR appear to be very promising tools for identifying fragility in meta-analytical contexts.

      I believe the manuscript should be improved in terms of reporting, with clearer statements of the study's and methods' limitations, and by incorporating additional bodies of evidence to strengthen its claims.

    3. Reviewer #3 (Public review):

      Summary and strengths:

      In this manuscript, Grimes presents an extension of the Ellipse of Insignificance (EOI) and Region of Attainable Redaction (ROAR) metrics to the meta-analysis setting as metrics for fragility and robustness evaluation of meta-analysis. The author applies these metrics to three meta-analyses of Vitamin D and cancer mortality, finding substantial fragility in their conclusions. Overall, I think this extension/adaptation is a conceptually valuable addition to meta-analysis evaluation, and the manuscript is generally well-written.

      Specific comments:

      (1) The manuscript would benefit from a clearer explanation of in what sense EOIMETA is generalizable. The author mentions this several times, but without a clear explanation of what they mean here.

      (2) The author mentions that the proposed tools assume low between-study heterogeneity. Could the author illustrate mathematically in the paper how the between-study heterogeneity would influence the proposed measures? Moreover, the between-study heterogeneity is high in Zhang et al.'s 2022 study. It would be a good place to comment on the influence of such high heterogeneity on the results, and specifying a practical heterogeneity cutoff would better guide future users.

      (3) I think clarifying the concepts of "small effect", "fragile result", and "unreliable result" would be helpful for preventing misinterpretation by future users. I am concerned that the audience may be confusing these concepts. A small effect may be related to a fragile meta-analysis result. A fragile meta-analysis doesn't necessarily mean wrong/untrustworthy results. A fragile but precise estimate can still reflect a true effect, but whether that size of true effect is clinically meaningful is another question. Clarifying the effect magnitude, fragility, and reliability in the discussion would be helpful.
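
      As background to comment (2), between-study heterogeneity is commonly summarized with Cochran's Q and the I² statistic. The sketch below uses the standard inverse-variance definitions; the function name and the toy inputs are illustrative assumptions, and nothing here is taken from the manuscript's own code or data.

```python
def heterogeneity(effects, variances):
    """Return (pooled_effect, Q, I2_percent) under a fixed-effect model.

    effects   : per-study effect estimates (e.g. log risk ratios)
    variances : per-study sampling variances of those estimates
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2 = max(0, (Q - df) / Q), expressed as a percentage.
    i2 = 0.0 if q <= 0 else max(0.0, (q - df) / q) * 100.0
    return pooled, q, i2


if __name__ == "__main__":
    # Two hypothetical studies with clearly different effects.
    print(heterogeneity([0.0, 2.0], [0.5, 0.5]))
```

      Any practical cutoff on I² for applying the proposed tools would still need to be justified by the author; the code only shows how the quantity is computed.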

    1. Reviewer #1 (Public review):

      The authors used fluorescence microscopy, image analysis, and mathematical modeling to study the effects of membrane affinity and diffusion rates of MinD monomer and dimer states on MinD gradient formation in B. subtilis. To test these effects, the authors experimentally examined MinD mutants that lock the protein in specific states, including Apo monomer (K16A), ATP-bound monomer (G12V) and ATP-bound dimer (D40A, hydrolysis defective), and compared to wild-type MinD. Overall, the experimental results support the conclusions that reversible membrane binding of MinD is critical for the formation of the MinD gradient, but the binding affinities between monomers and dimers are similar.

      The modeling part is a new attempt to use the Monte Carlo method to test the conditions for the formation of the MinD gradient in B. subtilis. The modeling results provide good support for the observations and find that the MinD gradient is sensitive to different diffusion rates between monomers and dimers. This simulation is based on several assumptions and predictions, which raises new questions that need to be addressed experimentally in the future.

    2. Reviewer #3 (Public review):

      This important study by Bohorquez et al examines the determinants necessary for concentrating the spatial modulator of cell division, MinD, at the future site of division and the cell poles. Proper localization of MinD is necessary to bring the division inhibitor, MinC, in proximity to the cell membrane and cell poles where it prevents aberrant assembly of the division machinery. In contrast to E. coli, in which MinD oscillates from pole-to-pole courtesy of a third protein MinE, how MinD localization is achieved in B. subtilis, which does not encode a MinE analog, has remained largely a mystery. The authors present compelling data indicating that MinD dimerization is dispensable for membrane localization but required for concentration at the cell poles. Dimerization is also important for interactions between MinD and MinC, leading to the formation of large protein complexes. Computational modeling, specifically a Monte Carlo simulation, supports a model in which differences in diffusion rates between MinD monomers and dimers lead to concentration of MinD at cell poles. Once there, interaction with MinC increases the size of the complex, further reinforcing diffusion differences. Notably, interactions with MinJ, which has previously been implicated in MinCD localization, are dispensable for concentrating MinD at cell poles, although MinJ may help stabilize the MinCD complex at those locations.

      Comments on revisions:

      I believe the authors put respectable effort into revisions and addressing reviewer comments, particularly those that focused on the strengths of the original conclusions. The language in the current version of the manuscript is more precise and the overall product is stronger.

    1. Reviewer #1 (Public review):

      Summary:

      An outstanding fundamental phenomenon (migrasomes), en route to becoming translationally highly significant.

      Strengths:

      Innovative approach at several levels: Migrasomes, discovered by Dr. Yu's group, are an outstanding biological phenomenon of fundamental interest and now of potentially practical value.

      Weaknesses:

      I feel that the overemphasis on practical aspects (vaccine), however important, eclipses some of the fundamental aspects that may be just as important and actually more interesting. If this can be expanded, the study would be outstanding.

      Comments on revisions: This reviewer feels that the authors have addressed all issues.

    2. Reviewer #2 (Public review):

      Summary:

      The authors' report describes a novel vaccine platform derived from a newly discovered organelle called a migrasome. First, the authors address a technical hurdle for using migrasomes as a vaccine platform. Natural migrasome formation occurs at low levels and is labor intensive; however, by understanding the molecular underpinning of migrasome formation, the authors have designed a method to make engineered migrasomes from cultured cells at higher yields utilizing a robust process. These engineered migrasomes behave like natural migrasomes. Next, the authors immunized mice with migrasomes that either expressed a model peptide or the SARS-CoV-2 spike protein. Antibodies against the spike protein were raised that could be boosted by a 2nd vaccination and these antibodies were functional as assessed by an in vitro pseudoviral assay. This new vaccine platform has the potential to overcome obstacles such as cold chain issues for vaccines like messenger RNA that require very stringent storage conditions.

      Strengths:

      The authors present very robust studies detailing the biology behind migrasome formation and this fundamental understanding was used to form engineered migrasomes, which makes it possible to utilize migrasomes as a vaccine platform. The characterization of engineered migrasomes is thorough and establishes comparability with naturally occurring migrasomes. The biophysical characterization of the migrasomes is well done, including thermal stability and characterization of the particle size (important characterizations for a good vaccine).

      Weaknesses:

      With a new vaccine platform technology, it would be nice to compare them head-to-head against a proven technology. The authors would improve the manuscript if they made some comparisons to other vaccine platforms such as a SARS-CoV-2 mRNA vaccine or even an adjuvanted recombinant spike protein. This would demonstrate a migrasome based vaccine could elicit responses comparable to a proven vaccine technology. Additionally, understanding the integrity of the antigens expressed in their migrasomes could be useful. This could be done by looking at functional monoclonal antibody binding to their migrasomes in a confocal microscopy experiment.

      Updates after revision:

      The revised manuscript has additional experiments that I believe improve the strength of evidence presented in the manuscript and address the weaknesses of the first draft. First, they provide a comparison of the antibody responses induced by their migrasome-based platform with those induced by recombinant protein formulated in an adjuvant and show the response is comparable. Second, they provide evidence that the spike protein incorporated into their migrasomes retains structural integrity by preserving binding to monoclonal antibodies. Together, these results strengthen the paper significantly and support the claims that the novel migrasome-based vaccine platform could be useful in the vaccine development field.

    1. Reviewer #1 (Public review):

      Summary

      This work performed Raman spectral microscopy on E. coli cells under 15 different culture conditions. The authors developed a theoretical framework to construct a regression matrix that predicts proteome composition from Raman data. Specifically, this regression matrix is obtained by statistical inference from various experimental conditions. With this model, the authors categorized co-expressed genes and illustrated how proteome stoichiometry is regulated among different culture conditions. Co-expressed gene clusters were investigated and identified as homeostasis core, carbon-source dependent, and stationary phase dependent genes. Overall, the authors demonstrate a strong and comprehensive data analysis scheme for the joint analysis of Raman and proteome datasets.

      Strengths and major contributions

      Major contributions: (1) Experimentally, the authors contributed Raman datasets of E. coli under various growth conditions. (2) In data analysis, the authors developed a scheme to compare proteome and Raman datasets. Protein co-expression clusters were identified, and their biological meaning was investigated.

      Discussion and impact for the field

      The Raman signature contains both proteomic and metabolomic information and is an orthogonal method to infer the composition of biomolecules. This work is a strong initiative for introducing this powerful technique to systems biology and provides a rigorous pipeline for future data analysis. The regression matrix can be used for cross-comparison among future experimental results on proteome-Raman datasets.

      Comments on revisions:

      The authors addressed all my questions nicely. In particular, the subsampling test demonstrated that with enough "distinct" physiological conditions (even for m=5) one could already explore the major mode of proteome regulation and Raman signature. The main text has been streamlined and the clarity is improved. I have a minor suggestion:

      (i) For equation (1), it is important to emphasize that the formula works for every j=1,...,15, and the regression matrix B is obtained by statistical inference by summarizing data from all 15 conditions.

    1. Reviewer #1 (Public review):

      Summary:

      The authors recorded neural activity using laminar probes while mice engaged in a global/local visual oddball paradigm. The article focuses on oscillatory activity, and the authors found activity differences in theta, alpha/beta, and gamma bands related to predictability and prediction error.

      I think this is an important paper, providing more direct evidence for the role of signals in different frequency bands related to predictability and surprise in the sensory cortex.

      Comments:

      Below are some comments that may hopefully help further improve the quality of this already very interesting manuscript.

      (1) Introduction:

      The authors write in their introduction: "H1 further suggests a role for θ oscillations in prediction error processing as well." Without being fleshed out further, it is unclear what role this would be, or why. Could the authors expand this statement?

      (2) Limited propagation of gamma band signals:

      Some recent work (e.g. https://www.cell.com/cell-reports/fulltext/S2211-1247(23)00503-X) suggests that gamma-band signals reflect mainly entrainment of the fast-spiking interneurons, and don't propagate from V1 to downstream areas. Could the authors connect their findings to these emerging findings, suggesting no role for gamma-band activity in communication outside of the cortical column?

      (3) Paradigm:

      While I agree that the paradigm tests whether a specific type of temporal prediction can be formed, it is not a type of prediction that one would easily observe in mice, or even humans. The regularity that must be learned, in order to be able to see a reflection of predictability, integrates over 4 stimuli, each shown for 500 ms with a 500 ms blank in between (and a 1000 ms interval separating the 4th stimulus from the 1st stimulus of the next sequence). In other words, the mouse must keep in working memory three stimuli, which partly occurred more than a second ago, in order to correctly predict the fourth stimulus (and signal a 1000 ms interval as evidence for starting a new sequence).
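
      The timing argument in this paragraph can be made concrete with a small worked calculation. The sketch below assumes that the 1000 ms interval runs from the offset of the 4th stimulus to the onset of the next sequence; that reading, and all names in the code, are assumptions rather than details confirmed by the manuscript.

```python
STIM_MS = 500        # duration of each stimulus
BLANK_MS = 500       # blank between stimuli within a sequence
INTER_SEQ_MS = 1000  # assumed gap from 4th-stimulus offset to next sequence
N_STIM = 4


def sequence_timeline(stim=STIM_MS, blank=BLANK_MS, gap=INTER_SEQ_MS, n=N_STIM):
    """Return (onsets, offsets, next_sequence_onset) in ms from sequence start."""
    onsets = [i * (stim + blank) for i in range(n)]
    offsets = [on + stim for on in onsets]
    next_onset = offsets[-1] + gap
    return onsets, offsets, next_onset


if __name__ == "__main__":
    onsets, offsets, next_onset = sequence_timeline()
    # How long before the 4th stimulus did each earlier stimulus begin?
    lags = [onsets[-1] - on for on in onsets[:-1]]
    print("onsets:", onsets)
    print("lags to 4th onset:", lags)
    print("next sequence starts at:", next_onset)
```

      Under these assumptions the first stimulus begins 3000 ms, and the third 1000 ms, before the fourth, which is the working-memory span the comment refers to.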

      A problem with this paradigm is that positive findings are easier to interpret than negative findings. If mice do not show a modulation to the global oddball, is it because "predictive coding" is the wrong hypothesis, or simply because the authors generated a design that operates outside of the boundary conditions of the theory? I think the latter is more plausible. Even in more complex animals (e.g., monkeys or humans), I suspect that participants would have trouble picking up this regularity and sequence, unless it is directly task-relevant (which it is not, in the current setting). Previous experiments often used simple pairs (where transitional probability was varied, e.g., Meyer and Olson, PNAS 2012) of stimuli that were presented within an intervening blank period. Clearly, these regularities would be a lot simpler to learn than the highly complex and temporally spread-out regularity used here, facilitating the interpretation of negative findings (especially in early cortical areas, which are known to have relatively small temporal receptive fields).

      I am, of course, not asking the authors to redesign their study. I would like to ask them to discuss this caveat more clearly, in the Introduction and Discussion, and situate their design in the broader literature. For example, Jeff Gavornik has used much more rapid stimulus designs and observed clear modulations of spiking activity in early visual regions. I realize that this caveat may be more relevant for the spiking paper (which does not show any spiking activity modulation in V1 by global predictability) than for the current paper, but I still think it is an important general caveat to point out.

      (4) Reporting of results:

      I did not see any quantification of the strength of evidence of any of the results, beyond a general statement that all reported results pass significance at an alpha=0.01 threshold. It would be informative to know, for all reported results, what exactly the p-value of the significant cluster is; as well as for which performed tests there was no significant difference.

      (5) Cluster test:

      The authors use a three-dimensional cluster test, clustering across time, frequency, and location/channel. I am wondering how meaningful this analytical approach is. For example, there could be clusters that show an early difference at some location in low frequencies, and then a later difference in a different frequency band at another (adjacent) location. It seems a priori illogical to me to want to cluster across all these dimensions together, given that this kind of clustering appears neurophysiologically implausible/not meaningful. Can the authors motivate their choice of three-dimensional clustering, or better, to facilitate interpretability, cluster e.g. across space and time within specific frequency bands (2D clustering)?
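
      To illustrate the 2D alternative raised in this comment, the sketch below shows only the cluster-forming step for one frequency band: it thresholds a channel-by-time statistic map and sums the statistic within each 4-connected supra-threshold region. The permutation step that builds the null distribution is omitted, only positive clusters are handled, and the function name, threshold, and toy map are hypothetical.

```python
from collections import deque


def cluster_masses_2d(stat_map, threshold):
    """Cluster masses of 4-connected cells with stat > threshold.

    stat_map : list of rows (channels), each a list over time points,
               holding e.g. t-values for one frequency band.
    """
    n_rows, n_cols = len(stat_map), len(stat_map[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    masses = []
    for r in range(n_rows):
        for c in range(n_cols):
            if seen[r][c] or stat_map[r][c] <= threshold:
                continue
            # Breadth-first search over neighbouring supra-threshold cells.
            mass, queue = 0.0, deque([(r, c)])
            seen[r][c] = True
            while queue:
                i, j = queue.popleft()
                mass += stat_map[i][j]
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < n_rows and 0 <= nj < n_cols
                            and not seen[ni][nj]
                            and stat_map[ni][nj] > threshold):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            masses.append(mass)
    return masses


if __name__ == "__main__":
    # Toy channel x time map for a single band.
    band_map = [[3.0, 3.0, 0.0],
                [0.0, 0.0, 0.0],
                [0.0, 0.0, 4.0]]
    print(cluster_masses_2d(band_map, threshold=2.0))
```

      Running this separately per frequency band keeps clusters from spanning bands, which is the interpretability gain the comment asks about.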

    2. Reviewer #2 (Public review):

      Summary:

      Sennesh and colleagues analyzed LFP data from 6 regions of rodents while they were habituated to a stimulus sequence containing a local oddball (xxxy) and later exposed to either the same (xxxY) or a deviant global oddball (xxxX). Subsequently, they were exposed to a controlled random sequence (XXXY) or a controlled deterministic sequence (xxxx or yyyy). From these, the authors looked for differences in spectral properties (both oscillatory and aperiodic) between three contrasts (only for the last stimulus of the sequence).

      (1) Deviance detection: unpredictable random (XXXY) versus predictable habituation (xxxy)

      (2) Global oddball: unpredictable global oddball (xxxX) versus predictable deterministic (xxxx), and

      (3) "Stimulus-specific adaptation:" locally unpredictable oddball (xxxY) versus predictable deterministic (yyyy).

      They found evidence for an increase in gamma (and theta in some cases) for unpredictable versus predictable stimuli, and a reduction in alpha/beta, which they consider evidence towards the "predictive routing" scheme.

      While the dataset and analyses are well-suited to test evidence for predictive coding versus alternative hypotheses, I felt that the formulation was ambiguous, and the results were not very clear. My major concerns are as follows:

      (1) The authors set up three competing hypotheses, in which H1 and H2 make directly opposite predictions. However, it must be noted that H2 is proposed for spatial prediction, where the predictability is computed from the part of the image outside the RF. This is different from the temporal prediction that is tested here. Evidence in favor of H2 is readily observed when large gratings are presented, for which there is substantially more gamma than in small images. Actually, there are multiple features in the spectral domain that should not be conflated, namely (i) the transient broadband response, which includes all frequencies, (ii) contribution from the evoked response (ERP), which is often in frequencies below 30 Hz, (iii) narrow-band gamma oscillations which are produced by large and continuous stimuli (which happen to be highly predictive), and (iv) sustained low-frequency rhythms in theta and alpha/beta bands which are prominent before stimulus onset and reduce after ~200 ms of stimulus onset. The authors should be careful to incorporate these in their formulation of PC, and in particular should not conflate narrow-band and broadband gamma.

      (2) My understanding is that any aspect of predictive coding must be present before the onset of stimulus (expected or unexpected). So, I was surprised to see that the authors have shown the results only after stimulus onset. For all figures, the authors should show results from -500 ms to 500 ms instead of zero to 500 ms.

      (3) In many cases, some change is observed in the initial ~100 ms of stimulus onset, especially for the alpha/beta and theta ranges. However, the evoked response contributes substantially in the transient period in these frequencies, and this evoked response could be different for different conditions. The authors should show the evoked responses to confirm the same, and if the claim really is that predictions are carried by genuine "oscillatory" activity, show the results after removing the ERP (as they had done for the CSD analysis).

      (4) I was surprised by the statistics used in the plots. Anything that is even slightly positive or negative is turning out to be significant. Perhaps the authors could use a more stringent criterion for multiple comparisons?

      (5) Since the design is blocked, there might be changes in global arousal levels. This is particularly important because the more predictive stimuli in the controlled deterministic stimuli were presented towards the end of the session, when the animal is likely less motivated. One idea to check for this is to do the analysis on the 3rd stimulus instead of the 4th? Any general effect of arousal/attention will be reflected in this stimulus.

      (6) The authors should also acknowledge/discuss that typical stimulus presentation/attention modulation involves both (i) an increase in broadband power early on and (ii) a reduction in low-frequency alpha/beta power. This could be just a sensory response, without having a role in sending prediction signals per se. So the predictive routing hypothesis should involve testing for signatures of prediction while ruling out other confounds related to stimulus/cognition. It is, of course, very difficult to do so, but at the same time, simply showing a reduction in low-frequency power coupled with an increase in high-frequency power is not sufficient to prove PR.

      (7) The CSD results need to be explained better - you should explain on what basis they are being called feedforward/feedback. Was the LFP taken from Layer 4 (as was done by van Kerkoerle et al, 2014)? The nice ">" and "<" CSD patterns (Figure 3B and 3F of their paper) in that paper are barely observed in this case, especially for the alpha/beta range.

      (8) Figure 4a-c, I don't see a reduction in the broadband signal in a compared to b in the initial segment. Maybe change the clim to make this clearer?

      (9) Figure 5 - please show the same for all three frequency ranges, show all bars (including the non-significant ones), and indicate the significance (p-values or by *, **, ***, etc) as done usually for bar plots.

      (10) Their claim of alpha/beta oscillations being suppressed for unpredictable conditions is not as evident. A figure akin to Figure 5 would be helpful to see if this assertion holds.

      (11) To investigate the prediction and violation or confirmation of expectation, it would help to look at both the baseline and stimulus periods in the analyses.

    3. Reviewer #3 (Public review):

      Summary:

      In their manuscript entitled "Ubiquitous predictive processing in the spectral domain of sensory cortex", Sennesh and colleagues perform spectral analysis across multiple layers and areas in the visual system of mice. Their results are timely and interesting, as they complement a study from the same lab that focussed on firing rates instead of oscillations. The present study argues for a hypothesis called predictive routing, which holds that non-predictable stimuli are gated by gamma oscillations, while alpha/beta oscillations are related to predictions.

      Strengths:

      (1) The study contains a clear introduction, which provides a clear contrast between a number of relevant theories in the field, including their hypotheses in relation to the present data set.

      (2) The study provides a systematic analysis across multiple areas and layers of the visual cortex.

      Weaknesses:

      (1) It is claimed in the abstract that the present study supports predictive routing over predictive coding; however, this claim is not directly substantiated anywhere in the manuscript. The differences between the two accounts are not clearly laid out, much less tested explicitly. While this might be obvious to the authors, it remains completely opaque to the reader, e.g., because it is not part of the different hypotheses addressed. I guess this result is meant in contrast to reference 17, by some of the same authors, which argues against predictive coding, while the present work finds differences in the results, which they relate to spectral vs firing rate analysis (although without direct comparison).

      (2) Most of the claims about a direction of propagation of certain frequency-related activities (made in the context of Figures 2-4) are - to the eyes of the reviewer - not supported by actual analysis but glimpsed from the pictures, sometimes, with very little evidence/very small time differences to go on. To keep these claims, proper statistical testing should be performed.

      (3) Results from different areas are barely presented. While I can see that presenting them in the same format as Figures 2-4 would be quite lengthy, it might be a good idea to contrast the right columns (difference plots) across areas, rather than just the overall averages.

      (4) Statistical testing is treated very generally, which can help to improve the readability of the text; however, in the present case, this is a bit extreme, with even obvious tests not reported or not even performed (in particular in Figure 5).

      (5) The description of the analysis in the methods is rather short and, to my eye, was missing one of the key descriptions, i.e., how the CSD plots were baselined (which was hinted at in the results, but, as far as I know, not clearly described in the analysis methods). Maybe the authors could section the methods more to point out where this is discussed.

      (6) While I appreciate the efforts of the authors to formulate their hypotheses and test them clearly, the text is quite dense at times. Partly this is due to the compared conditions in this paradigm; however, it would help a lot to show a visualization of what is being compared in Figures 2-4, rather than just showing the results.
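
      As an illustration of the kind of test requested in point (2), the following is a minimal sketch of a sign-flip permutation test on paired latency differences. It is not the authors' analysis: the function name, the choice of test, and the example values (hypothetical per-session latency differences in milliseconds) are all assumptions made for illustration.

```python
import numpy as np


def sign_flip_test(latency_diffs, n_perm=10000, seed=0):
    """Two-sided sign-flip permutation test that the mean paired difference is zero."""
    rng = np.random.default_rng(seed)
    diffs = np.asarray(latency_diffs, dtype=float)
    observed = diffs.mean()
    # Under the null hypothesis the sign of each paired difference is exchangeable.
    signs = rng.choice([-1.0, 1.0], size=(n_perm, diffs.size))
    null_means = (signs * diffs).mean(axis=1)
    # The add-one correction keeps the p-value strictly positive.
    p = (np.sum(np.abs(null_means) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, p


# Hypothetical latency differences (ms) between two layers, one value per session.
example_diffs = [4.0, 6.5, 3.0, 5.5, 7.0, 2.5, 4.5, 6.0, 5.0, 3.5]
mean_diff, p_value = sign_flip_test(example_diffs)
print(f"mean difference = {mean_diff:.2f} ms, p = {p_value:.4f}")
```

      A test of this form makes no distributional assumption about the latencies, which may suit the small time differences at issue; a cluster-based correction would still be needed if the test were repeated across frequencies or channels.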

    1. Reviewer #1 (Public review):

      Summary:

      This study develops and validates a neural subspace similarity analysis for testing whether neural representations of graph structures generalize across graph size and stimulus sets. The authors show the method works in rat grid and place cell data, finding that grid but not place cells generalize across different environments, as expected. The authors then perform additional analyses and simulations to show that this method should also work on fMRI data. Finally, the authors test their method on fMRI responses from entorhinal cortex (EC) in a task that involves graphs that vary in size (and stimulus set) and statistical structure (hexagonal and community). They find neural representations of stimulus sets in lateral occipital complex (LOC) generalize across statistical structure and that EC activity generalizes across stimulus sets/graph size, but only for the hexagonal structures.

      Strengths:

      (1) The overall topic is very interesting and timely and the manuscript is well written.

      (2) The method is clever and powerful. It could be important for future research testing whether neural representations are aligned across problems with different state manifestations.

      (3) The findings provide new insights into generalizable neural representations of abstract task states in entorhinal cortex.

      Weaknesses:

      (1) There are two design confounds that are not sufficiently discussed.

      (1.1) First, hexagonal and community structures are confounded by training order. All subjects learned the hexagonal graph before the community graph. As such, any differences between the two graphs could be explained (in theory) by order effects (although this is unlikely). However, because community and hexagonal structures shared the same stimuli, it is possible that subjects had to find ways to represent the community structures separately from the hexagonal structures. This could potentially explain why there was no generalization across graph size for community structures.

      (1.2) Second, subjects had more experience with the hexagonal than with the community structures before and during fMRI scanning. This is another possible reason why there was no generalization for the community structure.

      (2) The authors include the results from a searchlight analysis to show specificity of the effects for EC. A more convincing way (in my opinion) to show specificity would be to test for (and report the results of) a double dissociation between the visual and structural contrast in two independently defined regions (e.g., anatomical ROIs of LOC and EC). This would substantiate the point that EC activity generalizes across structural similarity while sensory regions like LOC generalize across visual similarity.

    2. Reviewer #2 (Public review):

      Summary:

      Mark and colleagues test the hypothesis that entorhinal cortical representations may contain abstract structural information that facilitates generalization across structurally similar contexts. To do so, they use a method called "subspace generalization" designed to measure abstraction of representations across different settings. The authors validate the method using hippocampal place cells and entorhinal grid cells recorded in a spatial task, then perform simulations that support the idea that it might be useful for aggregated responses such as those measured with fMRI. The method is then applied to an fMRI dataset from a task that required participants to learn relationships between images in one of two structural motifs (hexagonal grids versus community structure). They show that the BOLD signal within an entorhinal ROI shows increased measures of subspace generalization across different tasks with the same hexagonal structure (as compared to tasks with different structures) but that there was no evidence for the complementary result (i.e., increased generalization across tasks that share community structure, as compared to those with different structures). Taken together, this manuscript describes and validates a method for identifying fMRI representations that generalize across conditions and applies it to reveal entorhinal representations that emerge across specific shared structural conditions.

      Strengths:

      I found this paper interesting both in terms of its methods and its motivating questions. The question asked is novel and the methods employed are new - and I believe this is the first time that they have been applied to fMRI data. I also found the iterative validation of the methodology to be interesting and important - showing persuasively that the method could detect a target representation - even in the face of random combination of tuning and with the addition of noise, both being major hurdles to investigating representations using fMRI.

      Weaknesses:

      The primary weakness of the paper in terms of empirical results is that the representations identified in EC had no clear relationship to behavior, raising questions about their functional importance.

      The method developed is a clearly valuable tool that can serve as part of a larger battery of analysis techniques, but a small weakness on the methodological side is that for a given dataset, it might be hard to determine whether the method developed here would be better or worse than alternative methods.

    3. Reviewer #3 (Public review):

      Summary:

      The article explores the brain's ability to generalize information, with a specific focus on the entorhinal cortex (EC) and its role in learning and representing structural regularities that define relationships between entities in networks. The research provides empirical support for the longstanding theoretical and computational neuroscience hypothesis that the EC is crucial for structure generalization. It demonstrates that EC codes can generalize across non-spatial tasks that share common structural regularities, regardless of the similarity of sensory stimuli and network size.

      Strengths:

      At first glance, a potential limitation of this study appears to be its application of analytical methods originally developed for high-resolution animal electrophysiology (Samborska et al., 2022) to the relatively coarse and noisy signals of human fMRI. Rather than sidestepping this issue, however, the authors embrace it as a methodological challenge. They provide compelling empirical evidence and biologically grounded simulations to show that key generalization properties of entorhinal cortex representations can still be robustly detected. This not only validates their approach but also demonstrates how far non-invasive human neuroimaging can be pushed. The use of multiple independent datasets and carefully controlled permutation tests further underscores the reliability of their findings, making a strong case that structural generalization across diverse task environments can be meaningfully studied even in abstract, non-spatial domains that are otherwise difficult to investigate in animal models.

      Weaknesses:

      While this study provides compelling evidence for structural generalization in the entorhinal cortex (EC), several limitations remain that pave the way for promising future research. One issue is that the generalization effect was statistically robust in only one task condition, with weaker effects observed in the "community" condition. This raises the question of whether the null result genuinely reflects a lack of EC involvement, or whether it might be attributable to other factors such as task complexity, training order, or insufficient exposure, all possibilities that the authors acknowledge as open questions. Moreover, although the study leverages fMRI to examine EC representations in humans, it does not clarify which specific components of EC coding (such as grid cells versus other spatially tuned but non-grid codes) underlie the observed generalization. While electrophysiological data in animals have begun to address this, the human experiments do not disentangle the contributions of these different coding types. This leaves unresolved the important question of what makes EC representations uniquely suited for generalization, particularly given that similar effects were not observed in other regions known to contain grid cells, such as the medial prefrontal cortex (mPFC) or posterior cingulate cortex (PCC). These limitations point to important future directions for better characterizing the computational role of the EC and its distinctiveness within the broader network supporting learning and decision making based on cognitive maps.

    1. Reviewer #1 (Public review):

      The authors present exciting new experimental data on the antigenic recognition of 78 H3N2 strains (from the beginning of the 2023 Northern Hemisphere season) against a set of 150 serum samples. The authors compare protection profiles of individual sera and find that the antigenic effect of amino acid substitutions at specific sites depends on the immune class of the sera, differentiating between children and adults. Person-to-person heterogeneity in the measured titers is strong, specifically in the group of children's sera. The authors find that the fraction of sera with low titers correlates with the growth rate inferred using multinomial logistic regression (MLR), a correlation that does not hold for pooled sera. The authors then measure the protection profile of the sera against historical vaccine strains and find that it can be explained by birth cohort for children. Finally, the authors present data comparing pre- and post-vaccination protection profiles for 39 (USA) and 8 (Australia) adults. The data show a cohort-specific vaccination effect as measured by the average titer increase, and also a virus-specific vaccination effect for the historical vaccine strains. The generated data are shared by the authors, who also note that these methods can be applied to inform the bi-annual vaccine composition meetings, which could be highly valuable.

      Thanks to the authors for the revised version of the manuscript. A few concerns remain after the revision:

      (1) We appreciate the additional computational analysis the authors have performed on normalizing the titers with the geometric mean titer for each individual, as shown in the new Supplemental Figure 6. We agree with the authors' statement that, after averaging again within specific age groups, "there are no obvious age group-specific patterns." A discussion of this should be added to the revised manuscript, for example in the section "Pooled sera fail to capture the heterogeneity of individual sera," referring to the new Supplemental Figure 6.

      However, we also suggested that after this normalization, patterns might emerge that are not necessarily defined by birth cohort. This possibility remains unexplored and could provide an interesting addition to support potential effects of substitutions at sites 145 and 275/276 in individuals with specific titer profiles, which as stated above do not necessarily follow birth cohort patterns.

      (2) Thank you for elaborating further on the method used to estimate growth rates in your reply to the reviewers. To clarify: the reason that we infer from Fig. 5a that A/Massachusetts has a higher fitness than A/Sydney is not because it reaches a higher maximum frequency, but because it seems to have a higher slope. The discrepancy between this plot and the MLR inferred fitness could be clarified by plotting the frequency trajectories on a log-scale.

      For the MLR, we understand that the initial frequency matters in assessing a variant's growth. However, when starting points of two clades differ in time (i.e., in different contexts of competing clades), this affects comparability, particularly between A/Massachusetts and A/Ontario, as well as for other strains. We still think that mentioning these time-dependent effects, which are not captured by the MLR analysis, would be appropriate. To support this, it could be helpful to include the MLR fits as an appendix figure, showing the different starting and/or time points used.

      (3) Regarding my previous suggestion to test an older vaccine strain than A/Texas/50/2012 to assess whether the observed peak in titer measurements is virus-specific: We understand that the authors want to focus the scope of this paper on the relative fitness of contemporary strains, and that this additional experimental effort would go beyond the main objectives outlined in this manuscript. However, the authors explicitly note that "Adults across age groups also have their highest titers to the oldest vaccine strain tested, consistent with the fact that these adults were first imprinted by exposure to an older strain." This statement gives the impression that imprinting effects increase titers for older strains, whereas this does not seem to be true from their results, but only true for A/Texas. It should be modified accordingly.
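
      The log-scale argument in point (2) can be sketched with a toy calculation. The code below is not the authors' MLR model: it assumes each strain grows logistically against a fixed background, and the function names, initial frequencies, and fitness values are hypothetical. It shows that the slope of the log-odds frequency recovers the growth advantage, while the frequency reached by the end of the window mainly reflects the starting point.

```python
import numpy as np


def logistic_frequency(t, f0, s):
    """Frequency of a strain with growth advantage s starting from frequency f0."""
    odds = f0 / (1.0 - f0) * np.exp(s * t)
    return odds / (1.0 + odds)


def logit_slope(t, freq):
    """Least-squares slope of the log-odds frequency against time."""
    logit = np.log(freq / (1.0 - freq))
    return np.polyfit(t, logit, 1)[0]


t = np.arange(0, 20, dtype=float)                    # hypothetical time axis
strain_a = logistic_frequency(t, f0=0.001, s=0.30)   # starts rare, grows fast
strain_b = logistic_frequency(t, f0=0.05, s=0.15)    # starts common, grows slowly

slope_a = logit_slope(t, strain_a)
slope_b = logit_slope(t, strain_b)
print(f"final frequency: A = {strain_a[-1]:.3f}, B = {strain_b[-1]:.3f}")
print(f"log-odds slope:  A = {slope_a:.3f}, B = {slope_b:.3f}")
```

      In this toy setting the strain with the lower final frequency has the higher log-odds slope, which is why trajectories plotted on a log (or logit) scale are easier to compare with inferred growth rates than trajectories plotted on a linear scale.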

    2. Reviewer #2 (Public review):

      This is an excellent paper. The ability to measure the immune response to multiple viruses in parallel is a major advancement for the field, that will be relevant across pathogens (assuming the assay can be appropriately adapted). I only had a few comments, focused on maximising the information provided by the sera. These concerns were all addressed in the revised paper.

    3. Reviewer #3 (Public review):

      The authors use high throughput neutralisation data to explore how different summary statistics for population immune responses relate to strain success, as measured by growth rate during the 2023 season. The question of how serological measurements relate to epidemic growth is an important one, and I thought the authors present a thoughtful analysis tackling this question, with some clear figures. In particular, they found that stratifying the population based on the magnitude of their antibody titres correlates more with strain growth than using measurements derived from pooled serum data. The updated manuscript has a stronger motivation, and there is substantial potential to build on this work in future research.

      Comments on revisions:

      I have no additional recommendations. There are several areas where the work could be further developed, which were not addressed in detail in the responses, but given this is a strong manuscript as it stands, it is fine that these aspects are for consideration only at this point.

    1. Reviewer #1 (Public review):

      Summary:

      This study provides evidence that neuropeptide signaling, particularly via the CRH-CRHBP pathway, plays a key role in regulating the precision of vocal motor output in songbirds. By integrating gene expression profiling with targeted manipulations in the song vocal motor nucleus RA, the authors demonstrate that altering CRH and CRHBP levels bidirectionally modulates song variability. These findings reveal a previously unrecognized neuropeptidergic mechanism underlying motor performance control, supported by molecular and functional evidence.

      Strengths:

      Neural circuit mechanisms underlying motor variability have been intensively studied, yet the molecular bases of such variability remain poorly understood. The authors address this important gap using the songbird (Bengalese finch) as a model system for motor learning, providing experimental evidence that neuropeptide signaling contributes to vocal motor variability. They comprehensively characterize the expression patterns of neuropeptide-related genes in brain regions involved in song vocal learning and production, revealing distinct regulatory profiles compared to non-vocal regions, as well as developmental and behavioral dependencies, including altered expression following deafening and correlations with singing activity over the two days preceding sampling. Through these multi-level analyses spanning anatomy, development, and behavior, the authors identify the CRH-CRHBP pathway in the vocal motor nucleus RA as a candidate regulator of song variability. Functional manipulations further demonstrate that modulation of this pathway bidirectionally alters song variability.

      Overall, this work represents an effective use of songbirds: through a well-established neuroethological framework, it uncovers how previously uncharacterized molecular pathways shape behavioral output at the individual level.

      Weaknesses:

      (1) This study uses Bengalese finches (BFs) for all experiments (bulk RNA-seq, in situ hybridization across developmental stages, deafening, gene manipulation, and CRH microinfusion) except for the sc/snRNA-seq analysis. BFs differ from zebra finches (ZFs) in several important ways, including faster song degradation after deafening and greater syllable sequence complexity. This study makes effective use of these unique BF characteristics and should be commended for doing so.

      However, the major concern lies in the use of the single-cell/single-nucleus RNA-seq dataset from Colquitt et al. (2021), which combines data from both ZFs and BFs for cell-type classification. Based on our reanalysis of the publicly available dataset used in both Colquitt et al. (2021) and the present study, my lab identified two major issues:

      (a) The first concern is that the quality of the single-cell RNA-seq data from BFs is extremely poor, and the number of BF-derived cells is very limited. In other words, most of the gene expression information at the single-cell (or "subcellular type") level in this study likely reflects ZF rather than BF profiles. In our verification of the authors' publicly annotated data, we found that in the song nucleus RA, only about 18 glutamatergic cells (2.3%) of a total of 787 RA_Glut (RA_Glut1+2+3) cells were derived from BFs. Similarly, in HVC, only 53 cells (4.1%) out of 1,278 Glut1+Glut4 cells were BF-derived. This clearly indicates that the cell-subtype-level expression data discussed in this study are predominantly based on ZF, not BF, expression profiles.

      Recent studies have begun to report interspecies differences in the expression of many genes in the song control nuclei. It is therefore highly plausible that the expression patterns of CRHBP and other neuropeptide-signaling-related genes differ between ZFs and BFs. Yet, the current study does not appear to take this potential species difference into account. As a result, analyses such as the CellChat results (Fig. 2F and G) and the model proposed in Fig. 6G are based on ZF-derived transcriptomic information, even though the rest of the experimental data are derived from BF, which raises a critical methodological inconsistency.

      (b) The second major concern involves the definition of "subcellular types" in the sc/snRNA-seq dataset. Specifically, the RA_Glut1, 2, and 3 and HVC_Glu1 and 4 clusters, classified as glutamatergic projection neuron subtypes, may in fact represent inter-individual variation within the same cell type rather than true subtypes. Following Colquitt et al. (2021), Toji et al. (PNAS, 2024) demonstrated clear individual differences in the gene expression profiles of glutamatergic projection neurons in RA.

      In our reanalysis of the same dataset, we also observed multiple clusters representing the same glutamatergic projection neurons in UMAP space. This likely occurs because Seurat integration (anchor-based mutual nearest neighbor integration) was not applied, and because cells were not classified based on individual SNP information using tools such as Souporcell. When classified by individual SNPs, we confirmed that the RA_Glut1-3 and HVC_Glu1 and 4 clusters correspond simply to cells from different individuals rather than distinct subcellular types. (Although images cannot be attached in this review system, we can provide our analysis results if necessary.)

      This distinction is crucial, as subsequent analyses and interpretations throughout the manuscript depend on this classification. In particular, Figure 6G presents a model based on this questionable subcellular classification. Similarly, the ligand-receptor relationships shown in Figure 2G, such as the absence of SST-SSTR1 signaling in RA_Glut3 but its presence in RA_Glut1 and 2, are more plausibly explained by inter-individual variation than by subcellular-type specificity.

      Whether these differences are interpreted as individual variation within a single cell type or as differences in projection targets among glutamatergic neurons has major implications for understanding the biological meaning of neuropeptide-related gene expression in this system.

      (2) Based on the important finding that "CRHBP expression in the song motor pathway is correlated with singing," it is necessary to provide data showing that the observed changes in CRHBP and other neuropeptide-related gene expression during the song learning period or after deafening are not merely due to differences in singing amount over the two days preceding brain sampling.

      Without such data, the following statement cannot be justified: "Regarding CRHBP expression in the song motor pathway increases during song acquisition and decreases following deafening."

      (3) In Figure 5B, the authors should clearly distinguish between intact and deafened birds and show the singing amount for each group. In practice, deafening often leads to a reduction in both the number of song bouts and the total singing time. If, in this experiment, deafened birds also exhibited reduced singing compared to intact birds, then the decreased CRHBP expression observed in HVC and RA (Figures 3 and 4) may not reflect song deterioration, but rather a simple reduction in singing activity.

      As a similar viewpoint, the authors report that CRHBP expression levels in RA and HVC increase with age during the song learning period. However, this change may not be directly related to age or the decline in vocal plasticity. Instead, it could correlate with the singing amount during the one to two days preceding brain sampling. The authors should provide data on the singing activity of the birds used for in situ hybridization during the two days prior to sampling.
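
      The BF-derived cell fractions quoted in point (1a) can be re-derived directly from the counts given there. The short sketch below uses only those counts; the dictionary labels and function name are illustrative.

```python
def percent(part, total):
    """Return part as a percentage of total."""
    return 100.0 * part / total


# (BF-derived cells, total cells) as quoted in point (1a) of the review.
bf_counts = {
    "RA glutamatergic (RA_Glut1+2+3)": (18, 787),
    "HVC glutamatergic (Glut1+Glut4)": (53, 1278),
}

for label, (bf_cells, total_cells) in bf_counts.items():
    print(f"{label}: {bf_cells}/{total_cells} = {percent(bf_cells, total_cells):.1f}% BF-derived")
```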

    2. Reviewer #2 (Public review):

      Summary:

      The results presented here are a useful extension of two of their previous papers (Colquitt et al 2021, Colquitt et al 2023), where they used single-cell transcriptomics to characterize the inhibitory and excitatory cell types and gene expression patterns of the song circuit, comparing them to mammalian and reptilian brains, and characterized the effect of deafening on these gene expression patterns. In this paper, they focus on the differential expression of various neuropeptidergic systems in the songbird brain. They discover a role for the CRHBP gene in song performance and causally show its influence on song variability.

      Strengths:

      The authors leverage the advantages of the 'nucleated' structure of the songbird neural circuitry and use a robust approach to compare neuropeptidergic gene expression patterns in these circuits. Their analysis of the expression patterns of the CRHBP gene in different cell types supports their conclusion that interneurons are particularly amenable to this modulation. Their use of a knockdown strategy along with pharmacological manipulation provides strong support for a causal role of neuropeptidergic modulation on song behaviour. These results have important implications as they bring into focus neuropeptide modulation of the song-motor circuit and pave the way for future studies focussing on how this signalling pathway regulates plasticity during song learning and maintenance.

      Weaknesses:

      While the results demonstrating the bidirectional modulation of CRH and CRHBP on song performance shed light on their role in song plasticity, it would be important to show this in juvenile finches during sensorimotor learning. We also don't get a clear picture of the 'causal' role of this signalling pathway on the song pre-motor area, HVC, as the knockdown and pharmacological manipulation studies were done in RA, whereas we see a modulation of CRHBP expression during deafening and song learning in both RA and HVC. Given the role of interneurons in the HVC in song acquisition (e.g., Vallentin et al. 2016, Science), it would have been interesting to see the results of HVC-specific manipulation of this neuropeptidergic pathway and/or how it affects the song learning process. Perhaps a short discussion of this would help to give the readers some perspective. Finally, a more direct demonstration of the neurophysiological effect of the signalling pathway would also strengthen our understanding of precisely how these modulate the song circuit plasticity, which I understand might be beyond the scope of this study.

      Technical/minor:

      In the Methods section, several clarifications would be beneficial. For instance, the description of the design matrices would benefit from being presented in a more general statistical form (e.g., linear model equations) rather than using R syntax. This would make the modeling approach more accessible to readers unfamiliar with software-specific syntax. In addition, while some variables (e.g., cdr_scale, frac_mito_scale) are briefly defined, others (e.g., tags, cut3, nsongs_last_two_days_cut3) could be more clearly described. The same applies to the descriptions of both the gene set enrichment analysis and the neuropeptide-receptor analysis, which rely heavily on package-specific terminology (e.g., fgseaMultilevel, computeCommunProb), making it difficult for readers to understand the conceptual or statistical basis of the analyses. A complete list of variable definitions, types (categorical or continuous), and any scaling/transformations applied would enhance clarity and reproducibility.
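
      As one illustration of what a software-agnostic description could look like, a model including the two scaled covariates named above might be written as follows. The response $y_{gc}$ (expression of gene $g$ in cell $c$), the coefficient symbols, the Gaussian error term, and the choice of terms are assumptions made for illustration, not the authors' actual model.

```latex
y_{gc} = \beta_{0g}
       + \beta_{1g}\,\mathrm{cdr\_scale}_{c}
       + \beta_{2g}\,\mathrm{frac\_mito\_scale}_{c}
       + \varepsilon_{gc},
\qquad \varepsilon_{gc} \sim \mathcal{N}\!\left(0, \sigma_{g}^{2}\right)
```

      Writing each model in this form, with every covariate defined once in a table, would let readers follow the analysis without knowledge of R formula syntax.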

    3. Reviewer #3 (Public review):

      Summary:

      The stable production of learned vocalizations like human language and birdsong requires auditory feedback. What happens in the brain areas that generate stable vocalizations as performance deteriorates is not well understood. Using a species of songbird, the current study investigates individual cells within the evolutionarily conserved brain regions that generate learned vocalizations and proposes that the complement of neuropeptide (short protein) signals may be a key feature of behavioral change. Because neuropeptides are important across species, these findings may help explain diminishing stability in learned behaviors even in humans.

      Strengths:

      The experiments are solid and follow a strong progression from description through manipulation. The songbird model is appropriate and powerful to inform on generalizable biological mechanisms of precisely learned behaviors, including human speech.

      Weaknesses:

      While it is always possible to perform more experiments, most of the weaknesses are in the presentation of the project, not in the evidence or analysis, which are leading-edge and appropriate. Generally, the ability to follow the findings and to independently assess rigor would be enhanced with increased explicit mention of the statistical thresholds and subjective descriptions. In addition, two prior pieces of relevant work seem to be omitted, including one performing deafening, gene expression measures, and behavioral assessment in zebra finches, and another describing neuropeptide complements in zebra finch singing nuclei based largely on mass spectrometry. The former in particular should be related to the current findings.

    1. Reviewer #1 (Public review):

      Summary:

      The authors aim to investigate the mechanisms underlying Kupffer cell death in metabolic-associated steatotic liver disease (MASLD). The authors propose that KCs undergo massive cell death in MASLD and that glycolysis drives this process. However, there appears to be a discrepancy between the reported high rates of KC death and the apparent maintenance of KC homeostasis and replacement capacity.

      Strengths:

      This is an in vivo study.

      Weaknesses:

      There are discrepancies between the authors' observations and previous reports, as well as inconsistencies among their own findings.

      Before presenting the percentage of CLEC4F⁺TUNEL⁺ cells, the authors should have first shown the number of CLEC4F⁺ cells per unit area in Figure 1. At 16 weeks of age, the proportion of TUNEL⁺ KCs is extremely high (~60%), yet the flow cytometry data indicate that nearly all F4/80⁺ KCs are TIMD4⁺, suggesting an embryonic origin. If such extensive KC death occurred, the proportion of embryonically derived TIMD4⁺ KCs would be expected to decrease substantially. Surprisingly, the proportion of TIMD4⁺ KCs is comparable between chow-fed and 16-week HFHC-fed animals. Thus, the immunostaining and flow cytometry data are inconsistent, making it difficult to explain how massive KC death does not lead to their replacement by monocyte-derived cells.

      These data suggest that despite the reported high rate of cell death among CLEC4F⁺TIMD4⁺ KCs, the population appears to self-maintain, with no evidence of monocyte-derived KC generation in this model, which contradicts several recent studies in the field.

      Moreover, there is no evidence that TIMD4⁺CLEC4F⁺ KCs increase their proliferation rate to compensate for such extensive cell death. If approximately 60% of KCs are dying and no monocyte-derived KCs are recruited, one would expect a much greater decrease in total KC numbers than what is reported.

      It is also unexpected that the maximal rate of KC death occurs at early time points (8 weeks), when the mice have not yet gained substantial weight (Figure 1B). Previous studies have shown that longer feeding periods are typically required to observe the loss of embryo-derived KCs.

      Furthermore, it is surprising that the HFD induces as much KC death as the HFHC and MCD diets. Earlier studies suggested that HFD alone is far less effective than MASH-inducing diets at promoting the replacement of embryonic KCs by monocyte-derived macrophages.

      In Figure 2D, TIMD4 staining appears extremely faint, making the results difficult to interpret. In contrast, the TUNEL signal is strikingly intense and encompasses a large proportion of liver cells (approximately 60% of KCs, 15% of hepatocytes, 20% of hepatic stellate cells, 30% of non-KC macrophages, and a proportion of endothelial cells is also likely affected). This pattern closely resembles that typically observed in mouse models of acute liver failure. Given this apparent extent of cell death, it is unexpected that ALT and AST levels remain low in MASH mice, which is highly unusual.

      No statistical analysis is provided for Figure 5D, and it is unclear which metabolites show statistically significant changes in Figure 5C.

      In addition, there is no evaluation of liver pathology in Clec4f-Cre × Chil1flox/flox mice. It remains possible that the observed effects on KC death result from aggravated liver injury in these animals. There is also no evidence that Chil1 deficiency affects glucose metabolism in KCs in vivo.

      Finally, the authors should include a more direct experimental approach to modulate glycolysis in KCs and assess its causal role in KC death in MASH.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, He et al. set out to investigate the mechanisms behind Kupffer Cell death in MASLD. As has been previously shown, they demonstrate a loss of resident KCs in MASLD in different mouse models. They then go on to show that this correlates with alterations in genes/metabolites associated with glucose metabolism in KCs. To investigate the role of glucose metabolism further, they subject isolated KCs in vitro to different metabolic treatments and assess cleaved caspase 3 staining, demonstrating that KCs show increased Cl. Casp 3 staining upon stimulation of glycolysis. Finally, they use a genetic mouse model (Chil1KO) where they have previously reported that loss of this gene leads to increased glycolysis and validate this finding in BMDMs (KO). They then remove this gene specifically from KCs (Clec4fCre) and show that this leads to increased macrophage death compared with controls.

      Strengths:

      As we do not yet understand why KCs die in MASLD, this manuscript provides some explanation for this finding. The metabolomics is novel and provides insight into KC biology. It could also lead to further investigation; here, it will be important that the full dataset is made available.

      Weaknesses:

      Different diets are known to induce different amounts of KC loss, yet here, all models examined appear to result in 60% KC death. One small field of view of liver tissue is shown as representative to make these claims, but this is not sufficient, as anything can be claimed based on one field of view. Rather, a full tissue slice should be included to allow readers to really assess the level of death. Additionally, there is no consistency between the markers used to define KCs and moMFs, with CLEC4F being used in microscopy, TIM4 in flow, while the authors themselves acknowledge that moKCs are CLEC4F+TIM4-. As moKCs are induced in MASLD, this limits interpretation. Additionally, Iba1 is referred to as a moMF marker but is also expressed by KCs, which again prevents an accurate interpretation of the data. Indeed, the authors show 60% of KCs are dying but only 30% of IBA1+ moMFs. As KCs are also IBA1+, this would mean that KCs die much more than moMFs, which would then limit the relevance of the BMDM studies performed if the phenotype is KC specific. Therefore, this needs to be clarified. The claim that periportal KCs die preferentially is not supported, given that the majority of KCs are periportal. Rather, these results would need to be normalised to KC numbers in PP vs PC regions to make meaningful conclusions. Additionally, KCs are known to be notoriously difficult to keep alive in vitro, and for these studies, the authors only examine Cl. Casp 3 staining. To fully understand these data, a full analysis of the viability of the cells and whether they retain the KC phenotype in all conditions is required. Finally, in the Cre-driven KO model, there does not seem to be any death of KCs in the controls (rather numbers trend towards an increase with time on diet, Figure 6E), contrary to what had been claimed in the rest of the paper, again making it difficult to interpret the overall results. Additionally, there is no validation that the increased death observed in vivo in KCs is due to further promotion of glycolysis.

    3. Reviewer #3 (Public review):

      This manuscript provides novel insights into altered glucose metabolism and KC status during early MASLD. The authors propose that hyperactivated glycolysis drives a spatially patterned KC depletion that is more pronounced than the loss of hepatocytes or hepatic stellate cells. This concept significantly enhances our understanding of early MASLD progression and KC metabolic phenotype.

      Through a combination of TUNEL staining and MS-based metabolomic analyses of KCs from HFHC-fed mice, the authors show increased KC apoptosis alongside dysregulation of glycolysis and the pentose phosphate pathway. Using in vitro culture systems and KC-specific ablation of Chil1, a regulator of glycolytic flux, they further show that elevated glycolysis can promote KC apoptosis.

      However, it remains unclear whether the observed metabolic dysregulation directly causes KC death or whether secondary factors, such as low-grade inflammation or macrophage activation, also contribute significantly. Nonetheless, the results, particularly those derived from the Chil1-ablated model, point to a new potential target for the early prevention of KC death during MASLD progression.

      The manuscript is clearly written and thoughtfully addresses key limitations in the field, especially the focus on glycolytic intermediates rather than fatty acid oxidation. The authors acknowledge the missing mechanistic link between increased glycolysis and KC death. Still, several interpretations require moderation to avoid overstatement, and certain experimental details, particularly those concerning flow cytometry and population gating, need further clarification.

      Strengths:

      (1) The study presents the novel observation of profound metabolic dysregulation in KCs during early MASLD and identifies these cells as undergoing apoptosis. The finding that Chil1 ablation aggravates this phenotype opens new avenues for exploring therapeutic strategies to mitigate or reverse MASLD progression.

      (2) The authors provide a comprehensive metabolic profile of KCs following HFHC diet exposure, including quantification of individual metabolites. They further delineate alterations in glycolysis and the pentose phosphate pathway in Chil1-deficient cells, substantiating enhanced glycolytic flux through 13C-glucose tracing experiments.

      (3) The data underscore the critical importance of maintaining balanced glucose metabolism in both in vitro and in vivo contexts to prevent KC apoptosis, emphasizing the high metabolic specialization of these cells.

      (4) The observed increase in KC death in Chil1-deficient KCs demonstrates their dependence on tightly regulated glycolysis, particularly under pathological conditions such as early MASLD.

      Weaknesses:

      (1) The novelty is questionable. The presented work has considerable overlap with a study by the same lab, which is currently under review (citation 17), and the authors should consider whether the data would be better presented in a single paper.

      (2) The authors report that 60% of KCs are TUNEL-positive after 16 weeks of HFHC diet and confirm this by cleaved caspase-3 staining. Given that such marker positivity typically indicates imminent cell death within hours, it is unexpected that more extensive KC depletion or monocyte infiltration is not observed. Since Timd4 expression on monocyte-derived macrophages takes roughly one month to establish, the authors should consider whether these TUNEL-positive KCs persist in a pre-apoptotic state longer than anticipated. Alternatively, fate-mapping experiments could clarify the dynamics of KC death and replacement.

      (3) The mechanistic link between elevated glycolytic flux and KC death remains unclear.

      (4) The study does not address the polarization or ontogeny of KCs during early MASLD. Given that pro-inflammatory macrophages preferentially utilize glycolysis, such data could provide valuable insight into the reason for increased KC death beyond the presented hyperreliance on glycolysis.

      (5) The gating strategy for monocyte-derived macrophages (moMFs) appears suboptimal and may include monocytes. A more rigorous characterization of myeloid populations by including additional markers would strengthen the study's conclusions.

      (6) While BMDMs from Chil1 knockout mice are used to demonstrate enhanced glycolytic flux, it remains unclear whether Chil1 deficiency affects macrophage differentiation itself.

      (7) The authors use the PDK activator PS48 and the ATP synthase inhibitor oligomycin to argue that increased glycolytic flux at the expense of OXPHOS promotes KC death. However, given the high energy demands of KCs and the fact that OXPHOS yields 15-16 times more ATP per glucose molecule than glycolysis, the increased apoptosis observed in Figure 4C-F could primarily reflect energy deprivation rather than a glycolysis-specific mechanism.

      (8) In Figure 1C, KC numbers are significantly reduced after 4 and 16 weeks of HFHC diet in WT male mice, yet no comparable reduction is seen in Clec4f-Cre control mice, which should theoretically exhibit similar behavior under identical conditions.

    1. Reviewer #1 (Public review):

      Summary:

      This study addresses the emerging role of fungal pathogens in colorectal cancer and provides mechanistic insights into how Candida albicans may influence tumor-promoting pathways. While the work is potentially impactful and the experiments are carefully executed, the strength of evidence is limited by reliance on in vitro models, small patient sample size, and the absence of in vivo validation, which reduces the translational significance of the findings.

      Strengths:

      (1) Comprehensive mechanistic dissection of intracellular signaling pathways.

      (2) Broad use of pharmacological inhibitors and cell line models.

      (3) Inclusion of patient-derived organoids, which increases relevance to human disease.

      (4) Focus on an emerging and underexplored aspect of the tumor microenvironment, namely fungal pathogens.

      Weaknesses:

      (1) Clinical association data are inconsistent and based on very small sample numbers.

      (2) No in vivo validation, which limits the translational significance.

      (3) Species- and cell type-specificity claims are not well supported by the presented controls.

      (4) Reliance on colorectal cancer cell lines alone makes it difficult to judge whether findings are specific or general epithelial responses.

    2. Reviewer #2 (Public review):

      In this manuscript, the authors studied the role of Candida albicans in colorectal cancer progression. They have undertaken a thorough investigation using several methods. The topic is highly relevant, given the increasing burden of colon cancer globally and the urgent need for innovative treatment options.

      However, there are some inconsistencies in the figures and some missing details in the figures, including:

      (1) The authors should clearly explain in the results section which patient samples are shown in Figure 1B.

      (2) What do the letters a, ab, b, b written above the bars in Figure 1F represent? The authors should consider removing them, because they create confusion and are not explained in the figure legend.

      (3) The authors should submit all the raw Western blot images, with appropriate labels indicating the bands of the protein of interest along with molecular weight markers.

      (4) The authors should do the quantification of data in Figure 2d and include it in the figure.

      (5) In Figure 2h, the authors should indicate if the quantification represents VEGF expression after 6h or 12h of C. albicans co-culture with cells.

      (6) In Figure 2i, quantification of VEGF should be done and data from three independent experiments should be submitted. The authors should also mention the time point.

    1. Reviewer #1 (Public review):

      This study presents an exploration of PPGL tumour bulk transcriptomics and identifies three clusters of samples (labeled as subtypes C1-C3). Each subtype is then investigated for the presence of somatic mutations, metabolism-associated pathway and inflammation correlates, and disease progression.

      The proposed subtype descriptions are presented as an exploratory study. The proposed potential biomarkers from this subtype are suitably caveated and will require further validation in PPGL cohorts together with mechanistic study.

      The first section uses WGCNA (a method to identify clusters of samples based on gene expression correlations) to discover three transcriptome-based clusters of PPGL tumours using a new cohort of n=87 PPGL samples from various locations in the body.

      The second section inspects a previously published snRNAseq dataset, assigning the published samples to subtypes C1-C3 using a pseudo-bulk approach.

      The tumour samples are obtained from multiple locations in the body, summarised in Fig1A. It will be important to see further investigation of how the sample origin is distributed among the C1-C3 clusters, and whether there is a sample-origin association with mutational drivers and disease progression.

      Comments on revisions:

      In SupplFile3 (pdf) - please correct the table format. The contents are obscured due to the narrowness of the table columns.

      Deposit the new RNAseq data (N=87 cases, N=5 controls) in an appropriate repository; see "Data on human genotypes and phenotypes" at https://elife-rp.msubmit.net/html/elife-rp_author_instructions.html#dataavailability

    2. Reviewer #2 (Public review):

      Summary:

      A study that furthers the molecular definition of PPGL (where prognosis is variable) and provides a wide range of sub-experiments to back up the findings. One of the key premises of the study is that identification of driver mutations in PPGL is incomplete and that compromises characterisation for prognostic purposes. This is a reasonable starting point on which to base some characterisation based on different methods.

      Strengths:

      The cohort is a reasonable size, and a useful validation cohort in the form of TCGA is used. Whilst it would be resource-intensive (though plausible given the rarity of the tumour type) to perform RNAseq on all PPGL samples in clinical practice, some potential proxies are proposed.

      Weaknesses:

      Performance of some of the proxy markers for transcriptional subtype is not presented.

      Limited prognostic information available.

      Comments on revisions:

      Having reviewed the responses to my comments and associated revisions, I am satisfied that they have been addressed.

    1. Reviewer #1 (Public review):

      This paper examines how geometric regularities in abstract shapes (e.g., parallelograms, kites) are perceived and processed in the human brain. The manuscript contains multimodal data (behavior, fMRI, MEG) from adults and additional fMRI data from 6-year-old children. The key findings show that (1) processing geometric shapes leads to reduced activity in ventral areas in comparison to complex stimuli and increased activity in intraparietal and inferior temporal regions, (2) the degree of geometric regularity modulates activity in intraparietal and inferior temporal regions, (3) similarity in neural representation of geometric shapes can be captured early by using CNN models and later by models of geometric regularity. In addition to these novel findings, the paper also includes a replication of behavioral data, showing that the perceptual similarity structure amongst the geometric stimuli used can be explained by a combination of visual similarities (as indexed by a feedforward CNN model of the ventral visual pathway) and geometric features. The paper comes with openly accessible code in a well-documented GitHub repository and the data will be published with the paper on OpenNeuro.

      In the revised version of this manuscript, the authors clarified certain aspects of the task design, added critical detail to the description of the methods, and updated the figures to show unsmoothed data and variability across participants. Importantly, the authors thoroughly discussed potential task effects (for the fMRI data only) and added additional analyses that indicate that the effects are unlikely to be driven by linguistic labels/name availability of the stimuli.

      Comments on the revision:

      Thank you for carefully addressing all my concerns and especially for clarifying the task design.

    2. Reviewer #2 (Public review):

      Summary

      The current study seeks to understand the neural mechanisms underlying geometric reasoning. Using fMRI with both children and adults, the authors found that contrasting simple geometric shapes with naturalistic images (faces, tools, houses) led to responses in the dorsal visual stream, rather than ventral regions that are generally thought to represent shape properties. The authors followed up on this result using computational modeling and MEG to show that geometric properties explain variance in the neural response that is distinct from what is captured by a CNN.

      Strengths

      These findings contribute much-needed neural and developmental data to the ongoing debate regarding shape processing in the brain and offer additional insights into why CNNs may have difficulty with shape processing. The motivation and discussion for the study are appropriately measured, and I appreciate the authors' use of multiple populations, neuroimaging modalities, and computational models in exploring this question.

      Weaknesses

      The presence of activation in aIPS led the authors to interpret their results to mean that geometric reasoning draws on the same processes as mathematical thinking. However, there is only weak and indirect evidence in the current study that geometric reasoning, as it is tested here, draws on the same circuits as math.

    3. Reviewer #3 (Public review):

      Summary:

      The authors report converging evidence from behavioral studies as well as several brain-imaging techniques that geometric figures, notably quadrilaterals, are processed differently in visual (lower activation) and spatial (greater) areas of the human brain than representative figures. Comparison of mathematical models to fit activity for geometric figures shows the best fit for abstract geometric features like parallelism and symmetry. The brain areas active for geometric figures are also active in processing mathematical concepts even in blind mathematicians, linking geometric shapes to abstract math concepts. The effects are stronger in adults than in 6-year-old Western children. Similar phenomena do not appear in great apes, suggesting that this is uniquely human and developmental.

      Strengths:

      Multiple converging techniques of brain imaging and testing of mathematical models showing special status of perception of abstract forms. Careful reasoning at every step of research and presentation of research, anticipating and addressing possible reservations. Connecting these findings to other findings, brain, behavior, and historical/anthropological to suggest broad and important fundamental connections between abstract visual-spatial forms and mathematical reasoning.

      Weaknesses:

      I have reservations about the authors' use of "symbolic." They seem to interpret "symbolic" as relying on "discrete, exact, rule-based features." Words are generally considered to be symbolic (that is their major function), yet words do not meet those criteria. Depictions of objects can be regarded as symbolic because they represent real objects; they are not the same as the object (as Magritte observed). If so, then perhaps depictions of quadrilaterals are also symbolic, but then they do not differ from depictions of objects on that quality. Relatedly, calling abstract or generalized representations of forms a distinct "language of thought" doesn't seem supportable by the current findings. Minimally, a language has elements that are combined more or less according to rules. The authors present evidence for geometric forms as elements but nowhere is there evidence for combining them into meaningful strings.

      Further thoughts

      Incidentally, there have been many attempts at constructing visual languages from visual elements combined by rules, that is, mapping meaning to depictions. Many written languages, like Egyptian hieroglyphics or Mayan or Chinese, began that way; there are current attempts using emoji. Apparently, mapping sound to discrete letters, alphabets, is more efficient and was invented once but spread. That said, for restricted domains like maps, circuit diagrams, networks, chemical interactions, mathematics, and more, visual "languages" work quite well.

      The findings are striking and as such invite speculation about their meaning and limitations. The images of real objects seem to be interpreted as representations of 3D objects as they activate the same visual areas as real objects. By contrast, the images of 2D geometric forms are not interpreted as representations of real objects but rather seemingly as 2D abstractions. It would be instructive to investigate stimuli that are on a continuum from representational to geometric, e.g., real objects that have simple geometric forms like table tops or boxes under various projections or balls or buildings that are rectangular or triangular. Objects differ from geometric forms in many ways: 3D rather than 2D; more complicated shapes; internal features as well as outlines. The geometric figures used are flat, 2D, but much geometry is 3D (e.g., cubes) with similar abstract features. The feature space of geometry is more than parallelism and symmetry; angles are important, for example. Listing and testing features would be fascinating.

      Can we say that mathematical thinking began with the regularities of shapes or with counting, or both? External representations of counting go far back into prehistory; tallies are frequent and widespread. Infants are sensitive to number across domains as are other primates (and perhaps other species). Finding overlapping brain areas for geometric forms and number is intriguing but doesn't show how they are related.

      Categories are established in part by contrast categories; are quadrilaterals and triangles and circles different categories? As for quadrilaterals, the authors say some are "completely irregular." Not really; they are still quadrilaterals, if atypical. See Eleanor Rosch's insightful work on (visual) categories. One wonders about distinguishing squashed quadrilaterals from squashed triangles.

      What in human experience but not the experience of close primates would drive the abstraction of these geometric properties? It's easy to make a case for elaborate brain processes for recognizing and distinguishing things in the world, shared by many species, but the case for brain areas sensitive to abstracting geometric figures is harder. The fact that these areas are active in blind mathematicians and that they are parietal areas suggests that what is important is spatial far more than visual. Could these geometric figures and their abstract properties be connected in some way to behavior, perhaps with fabrication, construction or use of objects? Or with other interactions with complex objects and environments where symmetry and parallelism (and angles and curvature, and weight and size) would be important? Manual dexterity and fabrication also distinguish humans from great apes (quantitatively not qualitatively) and action drives both visual and spatial representations of objects and spaces in the brain. I certainly wouldn't expect the authors to add research to this already packed paper, but raising some of the conceptual issues would contribute to the significance of the paper.

    1. Reviewer #1 (Public review):

      Summary:

      This paper presents three experiments. Experiments 1 and 2 use a target detection paradigm to investigate the speed of statistical learning. The first experiment is a replication of Batterink, 2017, in which participants are presented with streams of uniform-length, trisyllabic nonsense words and asked to detect a target syllable. The results replicate previous findings, showing that learning (in the form of response time facilitation to later-occurring syllables within a nonsense word) occurs after a single exposure to a word. In the second experiment, participants are presented with streams of variable-length nonsense words (two trisyllabic words and two disyllabic words), and perform the same task. A similar facilitation effect was observed as in Experiment 1. In Experiment 3 (newly added in the revised manuscript), an adult version of the study by Johnson and Tyler is included. Participants were exposed to streams of words of either uniform length (all disyllabic) or mixed length (two disyllabic, two trisyllabic) and then asked to perform a familiarity judgment on a 1-5 scale on two words from the stream and two part-words. Performance was better in the uniform length condition.

      The authors interpret these findings as evidence that target detection requires mechanisms different from segmentation. They present results of a computational model to simulate results from the target detection task, and find that a bigram model can produce facilitation effects similar to the ones observed in human participants in Experiments 1 and 2 (though this model was not directly applied to test whether human-like effects were also produced to account for the data in Experiment 3). PARSER was also tested and produced results that differed from those observed in humans across all three experiments. The authors conclude that the mechanisms involved in the target detection task are different from those involved in the word segmentation task.

      Strengths:

      The paper presents multiple experiments that provide internal replication of a key experimental finding, in which response times are facilitated after a single exposure to an embedded pseudoword. Both experimental data and results from a computational model are presented, providing converging approaches for understanding and interpreting the main results. The data are analyzed very thoroughly using mixed effects models with multiple explanatory factors. The addition of Experiment 3 provides direct evidence that the profile of performance for familiarity ratings and target detection differ as a function of word length variability.

      Weaknesses:

      (1) The concept of segmentation is still not quite clear. The authors seem to treat the testing procedure of Experiment 3 as synonymous with segmentation. But the ability to more strongly endorse words from the stream versus part-words as familiar does not necessarily mean that they have been successfully "segmented", as I elaborated on in my earlier review. In my view, it would be clearer to refer to segmentation as the mechanism or conceptual construct of segmenting continuous speech into discrete words. This ability to accurately segment component words could support familiarity judgments but is not necessary for above-chance familiarity or recognition judgments, which could be supported by more general memory signals. In other words, segmentation as an underlying ability is sufficient but not necessary for above-chance performance on familiarity-driven measures such as the one used in Experiment 3.

      (2) The addition of Experiment 3 is an added strength of the revised paper and provides more direct evidence of dissociations as a function of word length on the two tasks (target detection and familiarity ratings), compared to the prior strategy of just relying on previous work for this claim. However, it is not clear why the authors chose not to use the same stimuli as used in Experiments 1 and 2, which would have allowed for more direct comparisons to be made. It should also be specified whether test items in the UWL and MWL conditions were matched for overall frequency during exposure. Currently, the text does not specify whether test words in the UWL condition were taken from the high frequency or low frequency group; if they were taken from the high frequency group this would of course be a confound when comparing to the MWL condition. Finally, the definition of part-words should also be clarified.

      (3) The framing and argument for a prediction/anticipation mechanism was dropped in the Revised manuscript, but there are still a few instances where this framing and interpretation remain. E.g. Abstract - "we found that a prediction mechanism, rather than clustering, could explain the data from target detection." Discussion page 43 "Together, these results suggest that a simple prediction-based mechanism can explain the results from the target detection task, and clustering-based approaches such as PARSER cannot, contrary to previous claims."

      Minor:

      (4) It was a bit unclear as to why a conceptual replication of Batterink 2017 was conducted, given that the target syllables at the beginning and end of the streams were immediately dropped from further analysis. Why include syllable targets within these positions in the design if they are not analyzed?

      (5) Figures 3 and 4 are plotted on different scales, which makes it difficult to visually compare the effects between word length conditions.

    2. Reviewer #2 (Public review):

      Summary:

      The valuable study investigates how statistical learning may facilitate a target detection task and whether the facilitation effect is related to statistical learning of word boundaries. Solid evidence is provided that target detection and word segmentation rely on different statistical learning mechanisms.

      Strengths:

      The study is well designed, using the contrast between the learning of words of uniform length and words of variable length to dissociate general statistical learning effects and effects related to word segmentation.

      Weaknesses:

      The study relies on the contrast between word length effects on target detection and word learning. However, the study only tested the target detection condition and did not attempt to replicate the word segmentation effect. It is true that the word segmentation effect has been replicated before but it is still worth reviewing the effect size of previous studies.

      The paper seems to distinguish prediction, anticipation, and statistical learning, but it is not entirely clear what each term refers to.

      Comments on revisions:

      The authors did not address my concerns...they only replied to reviewer 1.

    1. Reviewer #1 (Public review):

      This study investigates how ant group demographics influence nest structures and group behaviors of Camponotus fellah ants, a ground-dwelling carpenter ant species (found locally in Israel) that builds subterranean nest structures. Using a quasi-2D cell filled with artificial sand, the authors perform two complementary sets of experiments to try to link group behavior and nest structure: first, the authors place a mated queen and several pupae into their cell and observe the structures that emerge both before and after the pupae eclose (i.e., "colony maturation" experiments); second, the authors create small groups (of 5, 10, or 15 ants, each including a queen) within a narrow age range (i.e., "fixed demographic" experiments) to explore the dependence of age on construction. Some of the fixed demographic instantiations included a manually induced catastrophic collapse event; the authors then compared emergency repair behavior to natural nest creation. Finally, the authors introduce a modified logistic growth model to describe the time-dependent nest area. The modification introduced parameters that allow for age-dependent behavior, and the authors use their fixed demographic experiments to set these parameters, and then apply the model to interpret the behavior of the colony maturation experiments. The main results of this paper are that for natural nest construction, nest areas and morphologies depend on the age demographics of ants in the experiments: younger ants create larger nests and angled tunnels, while older ants tend to dig less and build predominantly vertical tunnels; in contrast, emergency response seems to elicit digging in ants of all ages to repair the nest.

      The experimental results are convincing, providing new information and important insights into nest and colony growth in a social insect species. A model, inspired by previous work but modified to capture experimental results, is in reasonable agreement with experiments and is more biologically relevant than previous models.

    2. Reviewer #2 (Public review):

      I enjoyed this paper and its examination of the relationship between overall density and age polyethism to reduce the computational complexity required to match nest size with population. I had some questions about the requirement that growth is infinite in such a solution, but these have been addressed by the authors in the responses and updated manuscript. I also enjoyed the discussion of whether collective behaviour is an appropriate framework in systems in which agents (or individuals) differ in the behavioural rules they employ, according to age, location, or information state. This is especially important in a system like social insects, typically held as a classic example of individual-as-subservient to whole, and therefore most likely to employ universal rules of behaviour. The current paper demonstrates a potentially continuous age-related change in target behaviour (excavation), and suggests an elegant and minimal solution to the requirement for building according to need in ants, avoiding the invocation of potentially complex cognitive mechanisms, or information states that all individuals must have access to in order to have an adaptive excavation output.

      The authors have addressed questions I had in the review process and the manuscript is now clear in its communication and conclusions.

      The modelling approach is compelling, also allowing extrapolation to other group sizes and even other species. This to me is the main strength of the paper, as the answer to the question of whether it is younger or older ants that primarily excavate nests could have been answered by an individual tracking approach (albeit there are practical limitations to this, especially in the observation nest setup, as the authors point out). The analysis of the tunnel structure is also an important piece of the puzzle, and I really like the overall study.

    1. Reviewer #1 (Public review):

      In this manuscript, the authors aimed to identify the molecular target and mechanism by which α-Mangostin, a xanthone from Garcinia mangostana, produces vasorelaxation that could explain the antihypertensive effects. Building on prior reports of vascular relaxation and ion channel modulation, the authors convincingly show that large-conductance potassium BK channels are the primary site of action. Using electrophysiological, pharmacological, and computational evidence, the authors achieved their aims and showed that BK channels are the critical molecular determinant of mangostin's vasodilatory effects, even though the vascular studies are quite preliminary in nature.

      Strengths:

      (1) The broad pharmacological profiling of mangostin across potassium channel families, revealing BK channels - and the vascular BK-alpha/beta1 complex - as the potently activated target in a concentration-dependent manner.

      (2) Detailed gating analyses showing large negative shifts in voltage-dependence of activation and altered activation and deactivation kinetics.

      (3) High-quality single-channel recordings for open probability and dwell times.

      (4) Convincing activation in reconstituted BKα/β1-Caᵥ nanodomains mimicking physiological conditions and functional proof-of-concept validation in mouse aortic rings.

      Weaknesses are minor:

      (1) Some mutagenesis data (e.g., partial loss at L312A) could benefit from complementary structural validation.

      (2) While Cav-BK nanodomains were reconstituted, direct measurement of calcium signals after mangostin application onto native smooth muscle could be valuable.

      (3) The work has an impact on ion channel physiology and pharmacology, providing a mechanistic link between a natural product and vasodilation. Datasets include electrophysiology traces, mutagenesis scans, docking analyses, and aortic tension recordings. The latter, however, are preliminary in nature.

    2. Reviewer #2 (Public review):

      Summary:

      In the present manuscript, Cordeiro et al. show that α-mangostin, a xanthone obtained from the fruit of the Garcinia mangostana tree, behaves as an agonist of the BK channels. The authors arrive at this conclusion through the effect of mangostin on macroscopic and single-channel currents elicited by BK channels formed by the α subunit and α + β1 subunits, as well as αβ1 channels coexpressed with voltage-dependent Ca2+ (CaV1.2) channels. The single-channel experiments show that α-mangostin produces a robust increase in the probability of opening without affecting the single-channel conductance. The authors contend that α-mangostin activation of the BK channel is state-independent and molecular docking and mutagenesis suggest that α-mangostin binds to a site in the internal cavity. Importantly, α-mangostin (10 μM) alleviates the contracture promoted by noradrenaline. Mangostin is ineffective if the contracted muscles are pretreated with the BK toxin iberiotoxin.

      Strengths:

      The set of results combining electrophysiological measurements, mutagenesis, and molecular docking reveals α-mangostin as a potent activator of BK channels and the putative location of the α-mangostin binding site. Moreover, experiments conducted on aortic preparations from mice suggest that α-mangostin can aid in developing drugs to treat a myriad of diverse diseases involving the BK channel.

      Weaknesses:

      Major:

      (1) Although the results indicate that α-mangostin is modifying the closed-open equilibrium, the conclusion that this can be due to a stabilization of the voltage sensor in its active configuration may prove to be wrong. It is more probable that, as has been demonstrated for other activators, the α-mangostin is increasing the equilibrium constant that defines the closed-open reaction (L in the Horrigan, Aldrich allosteric gating model for BK). The paper will gain much if the authors determine the probability of opening in a wide range of voltages, to determine how the drug is affecting (or not), the channel voltage dependence, the coupling between the voltage sensor and the pore, and the closed-open equilibrium (L).

      (2) Apparently, the molecular docking was performed using the truncated structure of the human BK channel. However, it is unclear which one, since the PDB ID given in the Methods (6vg3), according to what I could find, corresponds to the unliganded, inactive PTK7 kinase domain. Be that as it may, the apo and Ca2+ bound structures show that there is a rotation and a displacement of the S6 transmembrane domain. Therefore, the positions of the residues I308, L312, and A316 in the closed and open configurations of the BK channel are not the same. Hence, it is expected that the strength of binding will be different whether the channel is closed or open. This point needs to be discussed.
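
      For reference, the open probability in the Horrigan-Aldrich allosteric model invoked in point (1) is usually written as below. Only L and the coupling factor E are named in these reviews; the remaining symbols (J, K, C, D, and the zero-voltage constants and charges) follow the model's customary notation and are given here as an assumption about the parameterization the reviewer has in mind, not as the authors' analysis.

```latex
% Horrigan-Aldrich open probability (customary notation; symbols other than L and E
% are assumptions not spelled out in the review)
\[
P_o \;=\; \frac{L\,(1 + KC + JD + JKCDE)^{4}}
               {L\,(1 + KC + JD + JKCDE)^{4} + (1 + J + K + JKE)^{4}}
\]
\[
L = L_0\, e^{z_L F V / RT}, \qquad
J = J_0\, e^{z_J F V / RT}, \qquad
K = \frac{[\mathrm{Ca}^{2+}]}{K_D}
\]
% L: closed-open equilibrium constant of the gate
% J: voltage-sensor activation equilibrium constant
% K: Ca2+-binding equilibrium
% C, D, E: allosteric coupling factors (Ca2+ sensor-gate, voltage sensor-gate,
%          voltage sensor-Ca2+ sensor, respectively)
```

      In this form, a drug that acts mainly on L raises the open probability at very negative voltages, where the voltage sensors are at rest, whereas a drug that stabilizes the active voltage sensor changes J or D. This is why measuring the open probability over a wide voltage range can help separate the two possibilities.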

      Minor:

      (1) From Figure 3A, it is apparent that the increase in Po is at the expense of the long periods (seconds) that the channel remains closed. One might suggest that α-mangostin increases the burst periods. It would be beneficial if the authors measured both closed and open dwell times to test whether α-mangostin primarily affects the burst periods.

      (2) In several places, the authors draw parallels between the mode of action of other BK activators and that of α-mangostin; however, the work of Gessner et al. PNAS 2012 indicates that NS1619 and Cym04 interact with the S6/RCK linker, and Webb et al. demonstrated that GoSlo-SR-5-6 agonist activity is abolished when residues in the S4/S5 linker and in the S6C region are mutated. These findings indicate that these agonists do not bind near the selectivity filter, which is where the authors' results suggest that α-mangostin binds.

      (3) The sentence starting in line 452 states that there is a pronounced allosteric coupling between the voltage sensors and Ca2+ binding. If the authors are referring to the coupling factor E in the Horrigan-Aldrich gating model, the references cited, in particular, Sun and Horrigan, concluded that the coupling between those sensors is weak.

    3. Reviewer #3 (Public review):

      Summary:

      This research shows that α-mangostin, a proposed nutraceutical with cardiovascular protective properties, could act through the activation of large conductance potassium permeable channels (BK). The authors provide convincing electrophysiological evidence that the compound binds to BK channels and induces a potent activation, increasing the magnitude of potassium currents. Since these channels are important modulators of the membrane potential of smooth muscle in vascular tissue, this activation leads to muscle relaxation, possibly explaining cardiovascular protective effects.

      Strengths:

      The authors present evidence based on several lines of experiments that α-mangostin is a potent activator of BK channels. The quality of the experiments and the analysis is high and represents an appropriate level of analysis. This research is timely and provides a basis to understand the physiological effects of natural compounds with proposed cardio-protective effects.

      Weaknesses:

      The identification of the binding site is not the strongest point of the manuscript. The authors show that the binding site is probably located in the hydrophobic cavity of the pore and show that point mutations reduce the magnitude of the negative voltage shift of activation produced by α-mangostin. However, these experiments do not demonstrate binding to these sites, and could be explained by allosteric effects on gating induced by the mutations themselves.

    1. Reviewer #1 (Public review):

      Summary:

      This study identifies three redundant pathways, namely the glycine cleavage system (GCS), serine hydroxymethyltransferase (GlyA), and formate-tetrahydrofolate ligase/FolD, that feed the one-carbon tetrahydrofolate (1C-THF) pool essential for Listeria monocytogenes growth and virulence. Reactivation of the normally inactive fhs gene rescues 1C-THF deficiency, revealing metabolic plasticity and vulnerability for potential antimicrobial targeting.

      Strengths:

      (1) Novel evolutionary insight - reversible reactivation of a pseudogene (fhs) shows adaptive metabolic plasticity, relevant for pathogen evolution.

      (2) They systematically combine targeted gene deletions with suppressor screening to dissect the folate/one-carbon network (GCS, GlyA, Fhs/FolD).

      Weaknesses:

      (1) The study infers 1C-THF depletion mostly genetically and indirectly (growth rescue with adenine) without direct quantification of folate intermediates or fluxes. Biochemical confirmation, LC-MS-based metabolomics of folates/1C donors, or isotopic tracing would strengthen mechanistic claims.

      (2) In multiple result sections, the authors report data from technical triplicates but do not mention independent biological replicates (e.g., Figure 2C, Figure 4A-B, Figure 6D). In addition, some results mention statistical significance but without a detailed description of the specific statistical tests used or replicates, such as Figure 2A-C, Figure 2E, and Figure 2G-I.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript by Freier et al. examines the impact of deletion of the glycine cleavage system (GCS) GcvPAB enzyme complex in the facultative intracellular bacterial pathogen Listeria monocytogenes. GcvPAB mediates the oxidative decarboxylation of glycine as a first step in a pathway that leads to the generation of N5,N10-methylene-tetrahydrofolate (THF) to replenish the 1-carbon THF (1C-THF) pool. 1C-THF species are important for the biosynthesis of purines and pyrimidines as well as for the formation of serine, methionine, and N-formylmethionine, and the authors have previously demonstrated that gcvPAB is important for bacterial replication within macrophages. A significant defect for growth is observed for the gcvPAB deletion mutant in defined media, and this growth defect appears to stem from the sensitivity of the mutant strain to excess glycine, which is hypothesized to further deplete the 1C-THF pool. Selection of suppressor mutations that restored growth of gcvPAB deletion mutants in synthetic media with high glycine yielded mutants that reversed stop codon inactivation of the formate-tetrahydrofolate ligase (fhs) gene, supporting the premise that generation of N10-formyl-THF can restore growth. Mutations within the folK, codY, and glyA genes (the last encoding serine hydroxymethyltransferase) were also identified, although the functional impact of these mutations is somewhat less clear. Overall, the authors report that their work identifies three pathways that feed the 1C-THF pool to support the growth and virulence of L. monocytogenes and that this work represents the first example of the spontaneous reactivation of a L. monocytogenes gene that is inactivated by a premature stop codon.

      Strengths:

      This is an interesting study that takes advantage of a naturally existing fhs mutant Listeria strain to reveal the contributions of different pathways leading to 1C-THF synthesis. The defects observed for the gcvPAB mutant in terms of intracellular growth and virulence are somewhat subtle, indicating that bacteria must be able to access host sources (such as adenine?) to compensate for the loss of purine and fMet synthesis. Overall, the authors do a nice job of assessing the importance of the pathways identified for 1C-THF synthesis.

      Weaknesses:

      (1) Line 114 and Figure 1: The authors indicate that the gcvPAB deletion forms significantly fewer plaques in addition to forming smaller plaques (although this is a bit hard to see in the plaque images). A reduction in the overall number of plaques sounds like a bacterial invasion defect - has this been carefully assessed? The smaller plaque size makes sense with reduced bacterial replication, but I'm not sure I understand the reduction in plaque number.

      (2) Do other Listeria strains contain the stop codon in fhs? How common is this mutation? That would be interesting to know.

      (3) Based on the observation that fhs+ ΔgcvPAB ΔglyA mutant is only possible to isolate in complex media, and fhs is responsible for converting formate to 1C-THF with the addition of FolD, have the authors thought of supplementing synthetic media with formate and assessing mutant growth?

    3. Reviewer #3 (Public review):

      Summary:

      In this study, Freier et al. demonstrate that 3 distinct metabolic pathways are critical for the synthesis of 1C-THF, a metabolite that is crucial for the growth and virulence of Listeria monocytogenes. Using an elegant suppressor screen, they also demonstrate the hierarchical importance of these metabolic pathways with respect to the biosynthesis of 1C-THF.

      Strengths:

      This study uses elegant bacterial genetics to confirm that 3 distinct metabolic pathways are critical for 1C-THF synthesis in L. monocytogenes, and the lack of either one of these pathways compromises bacterial growth and virulence. The study uses a combination of in vitro growth assays, macrophage-CFU assays, and murine infection models to demonstrate this.

      Weaknesses:

      (1) The primary finding of the study is that the perturbation of any of the 3 metabolic pathways important for the synthesis of 1C-THF results in reduced growth and virulence of L. monocytogenes. However, there is no evidence demonstrating the levels of 1C-THF in the various knockouts and suppressor mutants used in this study. It is important to measure the levels of this metabolite (ideally using mass spectrometry) in the various knockouts and suppressor mutants, to provide strong causality.

      (2) The story becomes a little hard to follow since macrophage-CFU assays and murine infection model data precede the in vitro growth assays. The manuscript would benefit from a reorganization of Figures 2,3, and 4 for better readability and flow of data.

    1. Reviewer #1 (Public review):

      Summary:

      This important study functionally profiled ligands targeting the LXR nuclear receptors using biochemical assays in order to classify ligands according to pharmacological functions. Overall, the evidence is solid, but nuances in the reconstituted biochemical assays and cellular studies and terminology of ligand pharmacology limit the potential impact of the study. This work will be of interest to scientists interested in nuclear receptor pharmacology.

      Strengths:

      (1) The authors rigorously tested their ligand set in CRTs for several nuclear receptors that could display ligand-dependent cross-talk with LXR cellular signaling and found that all compounds display LXR selectivity when used at ~1 µM.

      (2) The authors tested the ligand set for selectivity against two LXR isoforms (alpha and beta). Most compounds were found to be LXRbeta-specific.

      (3) The authors performed extensive LXR CRTs, correlation analysis to cellular transcription and gene expression, and classification profiling using heatmap analysis, seeking to use relatively easy-to-collect biochemical assays with purified ligand-binding domain (LBD) protein to explain the complex activity of full-length LXR-mediated transcription.

      Weaknesses:

      (1) The descriptions of some observations lack detail, which limits understanding of some key concepts.

      (2) The presence of endogenous NR ligands within cells may confound the correlation of ligand activity of cellular assays to biochemical assay data.

      (3) The normalization of biochemical assay data could confound the classification of graded activity ligands.

      (4) The presence of >1 coregulator peptide in the biplex (n=2 peptides) CRT (pCRT) format will bias the LBD conformation towards the peptide-bound form with the highest binding affinity, which will impact potency and interpretation of TR-FRET data.

      (5) Correlation graphical plots lack sufficient statistical testing.

      (6) Some of the proposed ligand pharmacology nomenclature is not clear and deviates from classifications used currently in the field (e.g., hard and soft antagonist; weak vs. partial agonist, definition of an inverse agonist that is not the opposite function to an agonist).

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript by Laham and co-workers, the authors profiled structurally diverse LXR ligands via a coregulator TR-FRET (CRT) assay for their ability to recruit coactivators and kick off corepressors, while identifying coregulator preference and LXR isoform selectivity.

      The relative ligand potencies measured via CRT for the two LXR isoforms were correlated with ABCA1 induction or lipogenic activation of SRE, depending on cellular contexts (i.e., astrocytoma or hepatocarcinoma cells). While these correlations are interesting, there is some leeway to improve the quantitative presentation of these correlations. Finally, the CRT signatures were correlated with the structural stabilization of the LXR:coregulator complexes. In aggregate, this study curated a set of LXR ligands with disparate agonism signatures that may guide the design of future nonlipogenic LXR agonists with potential therapeutic applications for cardiovascular disease, Alzheimer's, and type 2 diabetes, without inducing mechanisms that promote fat/lipid production.

      Strengths:

      This study has many strengths, from curating an excellent LXR compound set to the thoughtful design of the CRT and cellular assays. The design of a multiplexed precision CRT (pCRT) assay that detects corepressor displacement as a function of ligand-induced coactivator recruitment is quite impressive, as it allows measurement of ligand potencies to displace corepressors in the presence of coactivators, which cannot be achieved in a regular CRT assay that looks at coactivator recruitment and corepressor dissociation in separate experiments.

      Weaknesses:

      I did not identify any major weaknesses.

    1. Reviewer #1 (Public review):

      Summary:

      This study presents a high-throughput screening platform to identify nanobodies capable of recruiting chromatin regulators and modulating gene expression. The authors utilize a yeast display system paired with mammalian reporter assays to validate candidate nanobodies, aiming to create a modular resource for synthetic epigenetic control.

      Strengths:

      (1) The overall screening design combining yeast display with mammalian functional assays is innovative and scalable.

      (2) The authors demonstrate proof-of-concept that nanobody-based recruitment can repress or activate reporter expression.

      (3) The manuscript contributes to the growing toolkit for epigenome engineering.

      Weaknesses:

      (1) The manuscript does not investigate which endogenous factors are recruited by the nanobodies. While repression activity is demonstrated at the reporter level, there is no mechanistic insight into what proteins are being brought to the target site by each nanobody. This limits the interpretability and generalizability of the findings. Related to this, Figure S1B reports sequence similarity among complementarity-determining regions (CDRs) of nanobodies that scored highly in the DNMT3A screen. However, it remains unclear whether this similarity reflects convergence on a common molecular target or is coincidental. Without functional or proteomic validation, the relationship between sequence motifs and effector recruitment remains speculative.

      (2) The epigenetic consequences of nanobody recruitment are also left unexplored. Despite targeting epigenetic regulators, the study does not assess changes such as DNA methylation or histone modifications. This makes it difficult to interpret whether the observed reporter repression is due to true chromatin remodeling or secondary effects.

    2. Reviewer #2 (Public review):

      Summary:

      Wan, Thurm et al. use a yeast nanobody library that is thought to have diverse binders to isolate those that specifically bind to proteins of their interest. The yeast nanobody library collection in general carries enormous potential, but the challenge is to isolate binders that have specific activity. The authors posit that one reason for this isolation challenge is that the negative binders, in general, dampen the signal from the positive binders. This is a classic screening problem (one that geneticists have faced over decades) and, in general, underscores the value of developing a good secondary screen. Over many years, the authors have developed an elegant platform to carry out high-throughput silencing-based assays, thus creating the perfect secondary screen platform to isolate nanobodies that bind to chromatin regulators.

      Strengths:

      Highlights the enormous value of a strong secondary screen when identifying binders that can be isolated from the yeast nanobody library. This insight is generalizable, and I expect that this manuscript should help inspire many others to design such approaches.

      Provides new cell-based reagents that can be used to recruit epigenetic activators or repressors to modulate gene expression at target loci.

      Weaknesses:

      The authors isolate DNMT3A and TET1/2 enzymes directly from cell lysates and bind these proteins to beads. It is not clear what proteins are, in fact, bound to beads at the end of the IP. Epigenetic repressors are part of complexes, and it would be helpful to know if the IP is specific and whether the IP pulls down only DNMT3A or other factors. While this does not change the underlying assumptions about the screen, it does alter the authors' conclusions about whether the nanobody exclusively recruits DNMT3A or potentially binds to other co-factors.

      Using IP-MS to validate the pull-down would be a helpful addition to the manuscript, although one could very reasonably make the case that other co-factors get washed away during the course of the selection assay. Nevertheless, if there are co-factors that are structural and remain bound, these are likely to show up in the MS experiment.

    1. Reviewer #1 (Public review):

      Summary:

      In this work, Okell et al. describe the imaging protocol and analysis pipeline pertaining to the arterial spin labeling (ASL) MRI protocol acquired as part of the UK Biobank imaging study. In addition, they present preliminary analyses of the first 7000+ subjects in whom ASL data were acquired, and this represents the largest such study to date. Careful analyses revealed expected associations between ASL-based measures of cerebral hemodynamics and non-imaging-based markers, including heart and brain health, cognitive function, and lifestyle factors. As it measures physiology and not structure, ASL-based measures may be more sensitive to these factors compared with other imaging-based approaches.

      Strengths:

      This study represents the largest MRI study to date to include ASL data in a wide age range of adult participants. The ability to derive arterial transit time (ATT) information in addition to cerebral blood flow (CBF) is a considerable strength, as many studies focus only on the latter.

      Some of the results (e.g., relationships with cardiac output and hypertension) are known and expected, while others (e.g., lower CBF and longer ATT correlating with hearing difficulty in auditory processing regions) are more novel and intriguing. Overall, the authors present very interesting physiological results, and the analyses are conducted and presented in a methodical manner.

      The analyses regarding ATT distributions and the potential implications for selecting post-labeling delays (PLD) for single PLD ASL are highly relevant and well-presented.

      Weaknesses:

      At a total scan duration of 2 minutes, the ASL sequence utilized in this cohort is much shorter than that of a typical ASL sequence (closer to 5 minutes as mentioned by the authors). However, this implementation also included multiple (n=5) PLDs. As currently described, it is unclear how many repetitions were acquired at each PLD and whether these were acquired efficiently (i.e., with a Look-Locker readout) or whether individual repetitions within this acquisition were dedicated to a single PLD. If the latter, the number of repetitions per PLD (and consequently signal-to-noise ratio, SNR) is likely to be very low. Have the authors performed any analyses to determine whether the signal in individual subjects generally lies above the noise threshold? This is particularly relevant for white matter, which is the focus of several findings discussed in the study.

      Hematocrit is one of the variables regressed out in order to reduce the effect of potential confounding factors on the image-derived phenotypes. The effect of this, however, may be more complex than accounting for other factors (such as age and sex). The authors acknowledge that hematocrit influences ASL signal through its effect on longitudinal blood relaxation rates. However, it is unclear how the authors handled the fact that the longitudinal relaxation of blood (T1Blood) is explicitly needed in the kinetic model for deriving CBF from the ASL data. In addition, while it may reduce false positives related to the relationships between dietary factors and hematocrit, it could also mask the effects of anemia present in the cohort. The concern, therefore, is two-fold: (1) Were individual hematocrit values used to compute T1Blood values? (2) What effect would the deconfounding process have on this?

      The authors leverage an observed inverse association between white matter hyperintensity volume and CBF as evidence that white matter perfusion can be sensitively measured using the imaging protocol utilized in this cohort. The relationship between white matter hyperintensities and perfusion, however, is not yet fully understood, and there is disagreement regarding whether this structural imaging marker necessarily represents impaired perfusion. Therefore, it may not be appropriate to use this finding as support for validation of the methodology.

    2. Reviewer #2 (Public review):

      Summary:

      Okell et al. report the incorporation of arterial spin-labeled (ASL) perfusion MRI into the UK Biobank study and preliminary observations of perfusion MRI correlates from over 7000 acquired datasets, which is the largest sample of human perfusion imaging data to date. Although a large literature already supports the value of ASL MRI as a biomarker of brain function, this important study provides compelling evidence that a brief ASL MRI acquisition may lead to both fundamental observations about brain health as manifested in CBF and valuable biomarkers for use in diagnosis and treatment monitoring.

      ASL MRI noninvasively quantifies regional cerebral blood flow (CBF), which reflects both cerebrovascular integrity and neural activity, and hence serves as a measure of brain function and a potential biomarker for a variety of CNS disorders. Despite a highly abbreviated ASL MRI protocol, significant correlations with both expected and novel demographic, physiological, and medical factors are demonstrated. In many such cases, ASL was also more sensitive than other MRI-derived metrics. The ASL MRI protocol implemented also enables quantification of arterial transit time (ATT), which provides stronger clinical correlations than CBF for some factors. The results demonstrate both the feasibility and the efficacy of ASL MRI in the UK Biobank imaging study, which expects to complete ASL MRI in up to 60,000 richly phenotyped individuals.

      Strengths:

      A key strength of this study is the use of an ASL MRI protocol incorporating balanced pseudocontinuous labeling with a background-suppressed 3D readout, which is the current state-of-the-art. To compensate for the short scan time, voxel resolution was intentionally only moderate. The authors also elected to acquire these data across five post-labeling delays, enabling ATT and ATT-corrected CBF to be derived using the BASIL toolbox, which is based on a variational Bayesian framework. The resulting CBF and ATT maps shown in Figure 1 are quite good, especially when combined with such a large and deeply phenotyped sample.

      Another strength of the study is the rigorous image analysis approach, which included covariation for a number of known CBF confounds as well as correction for motion and scanner effects. In doing so, the authors were able to confirm expected effects of age, sex, hematocrit, and time of day on CBF values. These observations lend confidence in the veracity of novel observations, for example, significant correlations between regional ASL parameters and cardiovascular function, height, alcohol consumption, depression, and hearing, as well as with other MRI features such as regional diffusion properties and magnetic susceptibility. They also provide valuable observations about ATT and CBF distributions across a large cohort of middle-aged and older adults.

      Weaknesses:

      This study primarily serves to illustrate the efficacy and potential of ASL MRI as an imaging parameter in the UK Biobank study, but some of the preliminary observations will be hypothesis-generating for future analyses in larger sample sizes. However, a weakness of the manuscript is that some of the reported observations are difficult to follow. In particular, the associations between ASL and resting fMRI illustrated in Figure 7 and described in the accompanying Results text are difficult to understand. It could also be clearer whether the spatial maps showing ASL correlates of other image-derived phenotypes in Figure 6B are global correlations or confined to specific regions of interest. Finally, while addressing partial volume effects in gray matter regions by covarying for cortical thickness is a reasonable approach, the Methods section seems to imply that a global mean cortical thickness is used, which could be problematic given that cortical thickness changes may be localized.

    3. Reviewer #3 (Public review):

      Summary:

      This is an extremely important manuscript in the evolution of cerebral perfusion imaging using Arterial Spin Labelling (ASL). The number of subjects that were scanned has provided the authors with a unique opportunity to explore many potential associations between regional cerebral blood flow (CBF) and clinical and demographic variables.

      Strengths:

      The major strength of the manuscript is the access to an unprecedentedly large cohort of subjects. It demonstrates the sensitivity of regional tissue blood flow in the brain as an important marker of resting brain function. In addition, the authors have demonstrated a thorough analysis methodology and good statistical rigour.

      Weaknesses:

      This reviewer did not identify any major weaknesses in this work.

    1. Reviewer #1 (Public review):

      In this work, Zhang et al., through a series of well-designed experiments, present a comprehensive study exploring the roles of the neuropeptide Corazonin (CRZ) and its receptor in controlling the female post-mating response (PMR) in the brown planthopper (BPH) Nilaparvata lugens and Drosophila melanogaster. Through a series of behavioural assays, micro-injections, gene knockdowns, CRISPR/Cas gene editing, and immunostaining, the authors show that both CRZ and CrzR play a vital role in the female post-mating response, with impaired expression of either leading to quicker female remating and reduced ovulation in BPH. Notably, the authors find that this signaling is entirely endogenous in BPH females, with immunostaining of male accessory glands (MAGs) showing no evidence of CRZ expression. Further, the authors demonstrate that while CRZ is not expressed in the MAGs, BPH males with Crz knocked out show transcriptional dysregulation of several seminal fluid proteins and functionally link this dysregulation to an impaired PMR in BPH. Relatedly, the authors also find that in CrzR mutants, the injection of neither MAG extracts nor maccessin peptide triggered the PMR in BPH females. Finally, the authors extend this study to D. melanogaster, albeit on a more limited scale, and show that CRZ plays a vital role in maintaining PMR in D. melanogaster females, with impaired CRZ signaling once again leading to quicker female remating and reduced ovulation. The authors must be commended for their expansive set of complementary experiments. The manuscript is also generally well written. Given the seemingly conserved nature of CRZ, this work is a significant addition to the literature, opening several avenues for testing the molecular and neurobiological mechanisms by which CRZ triggers the PMR.

      However, I have some broad concerns and comments about this manuscript. The authors provide clear evidence that CRZ signaling plays a major role in the PMR of D. melanogaster; however, they provide no evidence that CRZ signaling is endogenous, as they did not check for expression in the MAGs of D. melanogaster males. Additionally, while the authors show that manipulating Crz in males leads to dysregulated seminal fluid expression and impaired PMR in BPH, the authors also find that CRZ injection in males in and of itself impairs PMR in BPH. The authors do not really address what this seemingly contradictory result could mean. While many of the figures report replicate numbers, the authors do not include replicate as an effect in their models, which they ideally should do.

      Finally, while the discussion is generally well-written, it lacks a broader conclusion about the wider implications of this study and what future work building on this could look like.

    2. Reviewer #2 (Public review):

      Summary:

      The work presented by Zhang and coauthors in this manuscript presents the study of the neuropeptide corazonin in modulating the post-mating response of the brown planthopper, with further validation in Drosophila melanogaster. To obtain their results, the authors used several different techniques that orthogonally demonstrate the involvement of corazonin signalling in regulating the female post-mating response in these species.

      They first injected synthetic corazonin peptide into female brown planthoppers, showing altered mating receptivity in virgin females and a higher number of eggs laid after mating. The role of corazonin in controlling these post-mating traits has been further validated by knocking down the expression of the corazonin gene by RNA interference and through CRISPR-Cas9 mutagenesis of the gene. Further proof of the importance of corazonin signalling in regulating the female post-mating response has been achieved by knocking down the expression or mutagenizing the gene coding for the corazonin receptor.

      Similar results have been obtained in the fruit fly Drosophila melanogaster, suggesting that corazonin signalling is involved in controlling the female post-mating response in multiple insect species.

      Notably, the authors also show that corazonin controls gene expression in the male accessory glands and that disruption of this pathway in males compromises their ability to elicit normal post-mating responses in their mates.

      Strengths:

      The study of the signalling pathways controlling the female post-mating response in insects other than Drosophila is scarce, and this limits the ability of biologists to draw conclusions about the evolution of the post-mating response in female insects. This is particularly relevant in the context of understanding how sexual conflict might work at the molecular and genetic levels, and how, ultimately, speciation might occur at this level. Furthermore, the study of the post-mating response could have practical implications, as it can lead to the development of control techniques, such as sterilization agents.

      The study, therefore, expands the knowledge of one of the signalling pathways that control the female post-mating response, the corazonin neuropeptide. This pathway is involved in controlling the post-mating response in both Nilaparvata lugens (the brown planthopper) and Drosophila melanogaster, suggesting its involvement in multiple insect species.

      The study uses multiple molecular approaches to convincingly demonstrate that corazonin controls the female post-mating response.

      Weaknesses:

      The data supporting the main claims of the manuscript are solid and convincing. The statistical analysis of some of the data might be improved, particularly by tailoring the analysis to the type of data that has been collected.

      In the case of the corazonin effect in females, all the data are coherent; in the case of CRISPR-Cas9-induced mutagenesis, the analysis of the behavioural trait in heterozygotes might have helped in understanding the haplosufficiency of the gene and would have further proved the authors' point.

      Less consistency was achieved in males (Figure 5): the authors show that injection of CRZ and RNAi of crz, or mutant crz, has the same effect on male fitness. However, the CRZ injection should activate the pathway, and crz RNAi and mutant crz should inhibit the pathway, yet they have the same effect. A comment about this discrepancy would have improved the clarity of the manuscript, highlighting open questions that need to be clarified and opening new scientific discussion.

    1. Reviewer #1 (Public review):

      In this study, the authors investigated a specific subtype of SST-INs (layer 5 Chrna2-expressing Martinotti cells) and examined its functional role in motor learning. Using endoscopic calcium imaging combined with chemogenetics, they showed that activation of Chrna2 cells reduces the plasticity of pyramidal neuron (PyrN) assemblies but does not affect the animals' performance. However, activating Chrna2 cells during re-training improved performance. The authors claim that activating Chrna2 cells likely reduces PyrN assembly plasticity during learning and possibly facilitates the expression of already acquired motor skills.

      There are many major issues with the study. The findings across experiments are inconsistent, and it is unclear how the authors performed their analyses or why specific time points and comparisons were chosen. The study requires major re-analysis and additional experiments to substantiate its conclusions.

      Major Points:

      (1a) Behavior task - the pellet-reaching task is a well-established paradigm in the motor learning field. Why did the authors choose to quantify performance using "success pellets per minute" instead of the more conventional "success rate" (see PMID 19946267, 31901303, 34437845, 24805237)? It is also confusing that the authors describe sessions 1-5 as being performed on a spoon, while from session 6 onward, the pellets are presented on a plate. However, in lines 710-713, the authors define session 1 as "naïve," session 2 as "learning," session 5 as "training," and "retraining" as a condition in which a more challenging pellet presentation was introduced. Does "naïve session 1" refer to the first spoon session or to session 6 (when the food is presented on a plate)? The same ambiguity applies to "learning session 2," "training session 5," and so on. Furthermore, what criteria did the authors use to designate specific sessions as "learning" versus "training"? Are these definitions based on behavioral performance thresholds or some biological mechanisms? Clarifying these distinctions is essential for interpreting the behavioral results.

      (1b) Judging from Figures 1F and 4B, even in WT mice, it is not convincing that the animals have actually learned the task. In all figures, the mice generally achieve ~10-20 pellets per minute across sessions. The only sessions showing slightly higher performance are session 5 in Figure 1F ("train") and sessions 12 and 13 in Figure 4B ("CLZ"). In the classical pellet-reaching task, animals are typically trained for 10-12 sessions (approximately 60 trials per session, one session per day), and a clear performance improvement is observed over time. The authors should therefore present performance data for each individual session to determine whether there is any consistent improvement across days. As currently shown, performance appears largely unchanged across sessions, raising doubts about whether motor learning actually occurred.

      (1c) The authors also appear to neglect existing literature on the role of SST-INs in motor learning and local circuit plasticity (e.g., PMID 26098758, 36099920). Although the current study focuses on a specific subpopulation of SST-INs, the results reported here are entirely opposite to those of previous studies. The authors should, at a minimum, acknowledge these discrepancies and discuss potential reasons for the differing outcomes in the Discussion section.

      (2a) Calcium imaging - The methodology for quantifying fluorescence changes is confusing and insufficiently described. The use of absolute ΔF values ("detrended by baseline subtraction," lines 565-567) for analyses that compare activity across cells and animals (e.g., Figure 1H) is highly unconventional and problematic. Calcium imaging is typically reported as ΔF/F₀ or z-scores to account for large variations in baseline fluorescence (F₀) due to differences in GCaMP expression, cell size, and imaging quality. Absolute ΔF values are uninterpretable without reference to baseline intensity - for example, a ΔF of 5 corresponds to a 100% change in a dim cell (F₀ = 5) but only a 1% change in a bright cell (F₀ = 500). This issue could confound all subsequent population-level analyses (e.g., mean or median activity) and across-group comparisons. Moreover, while some figures indicate that normalization was performed, the Methods section lacks any detailed description of how this normalization was implemented. The critical parameters used to define the baseline are also omitted. The authors should reprocess the imaging data using a standardized ΔF/F₀ or z-score approach, explicitly define the baseline calculation procedure, and revise all related figures and statistical analyses accordingly.
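
      The arithmetic behind this point can be sketched in a few lines. The example below is a minimal illustration, not the authors' analysis: the function names, the choice of the median of the first frames as F₀, and the toy traces are all assumptions made here. It reproduces the dim-cell versus bright-cell case described above and shows a baseline-referenced z-score as the alternative.

```python
from statistics import mean, median, pstdev


def delta_f_over_f0(trace, baseline_frames):
    """Return (F - F0) / F0, taking F0 as the median of the baseline frames."""
    f0 = median(trace[:baseline_frames])
    if f0 <= 0:
        raise ValueError("baseline fluorescence must be positive")
    return [(f - f0) / f0 for f in trace]


def z_score(trace, baseline_frames):
    """Return the trace in units of baseline standard deviations."""
    baseline = trace[:baseline_frames]
    mu, sigma = mean(baseline), pstdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has zero variance")
    return [(f - mu) / sigma for f in trace]


# Two hypothetical cells with the same absolute change (delta F = 5).
dim_cell = [5.0, 5.0, 5.0, 5.0, 10.0]
bright_cell = [500.0, 500.0, 500.0, 500.0, 505.0]

dim_peak = max(delta_f_over_f0(dim_cell, baseline_frames=4))
bright_peak = max(delta_f_over_f0(bright_cell, baseline_frames=4))
print(f"dim cell: {dim_peak:.0%}, bright cell: {bright_peak:.0%}")

# A noisy hypothetical trace, expressed relative to its own baseline noise.
noisy_cell = [9.0, 11.0, 10.0, 10.0, 14.0]
noisy_peak_z = max(z_score(noisy_cell, baseline_frames=4))
```

      Whichever normalization is chosen, the baseline window (here the hypothetical `baseline_frames` argument) is the parameter that needs to be reported in the Methods, since it determines F₀ and therefore every downstream comparison.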

      (2b) Figure 1G - It is unclear why neural activity during successful trials is already lower one second before movement onset. Full traces with longer duration before and after movement onset should also be shown. Additionally, only data from "session 2 (learning)" and a single neuron are presented. The authors should present data across all sessions and multiple neurons to determine whether this observation is consistent and whether it depends on the stage of learning.

      (2c) Figure 1H - The authors report that chemogenetic activation of Chrna2 cells induces differential changes in PyrN activity between successful and failed trials. However, one would expect that activating all Chrna2 cells would strongly suppress PyrN activity rather than amplifying the activity differences between trials. The authors should clarify the mechanism by which Chrna2 cell activation could exaggerate the divergence in PyrN responses between successful and failed trials. Perhaps, performing calcium imaging of Chrna2 cells themselves during successful versus failed trials would provide insight into their endogenous activity patterns and help interpret how their activation influences PyrN activity during successful and failed trials.

      (2d) Figure 1H - Also, in general, the Cre⁺ (red) data points appear consistently higher in activity than the Cre⁻ (black) points. This is counterintuitive, as activating Chrna2 cells should enhance inhibition and thereby reduce PyrN activity. The authors should clarify how Cre⁺ animals exhibit higher overall PyrN activity under a manipulation expected to suppress it. This discrepancy raises concerns about the interpretation of the chemogenetic activation effects and the underlying circuit logic.

      (3) The statistical comparisons throughout the manuscript are confusing. In many cases, the authors appear to perform multiple comparisons only among the N, L, T, and R conditions within the WT group. However, the central goal of this study should be to assess differences between the WT and hM3D groups. In fact, it is unclear why the authors only provide p-values for some comparisons but not for the majority of the groups.

      (4a) Figure 4 - It is hard to understand why the authors introduce LFP experiments here, and the results are difficult to interpret in isolation. The authors should consider combining LFP recordings with calcium imaging (as in Figure 1) or, alternatively, repeating calcium imaging throughout the entire re-training period. This would provide a clearer link between circuit activity and behavior and strengthen the conclusions regarding Chrna2 cell function during re-training.

      (4b) It is unclear why CLZ has no apparent effect in session 11, yet induces a large performance increase in sessions 12 and 13. Even then, the performance in sessions 12 and 13 (~30 successful pellets) is roughly comparable to session 5 in Figure 1F. Given this, it is questionable whether the authors can conclude that Chrna2 cell activation truly facilitates previously acquired motor skills.

      (5) Figure 5 - The authors report decreased performance in the pasta-handling task (presumably representing a newly learned skill) but observe no difference in the pellet-reaching task (presumably an already acquired skill). This appears to contradict the authors' main claim that Chrna2 cell activation facilitates previously acquired motor skills.

      (6) Supplementary Figure 1 - The c-fos staining appears unusually clean. Previous studies have shown that even in home-cage mice, there are substantial numbers of c-fos⁺ cells in M1 under basal conditions (PMID 31901303). Additionally, the authors should present Chrna2 cell labeling and c-fos staining in separate channels. As currently shown, it is difficult to determine whether the c-fos⁺ cells are truly Chrna2⁺ cells.

      Overall, the authors selectively report statistical comparisons only for findings that support their claims, while most other potentially informative comparisons are omitted. Complete and transparent reporting is necessary for proper interpretation of the data.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Malfatti et al. study the role of Chrna2 Martinotti cells (Mα2 cells), a subset of SST interneurons, for motor learning and motor cortex activity. The authors trained mice on a forelimb prehension task while recording neuronal activity of pyramidal cells using calcium imaging with a head-mounted miniscope. While chemogenetically increasing Mα2 cell activity did not affect motor learning, it changed pyramidal cell activity such that activity peaks became sharper and differently timed than in control mice. Moreover, co-active neuronal assemblies become more stable with a smaller spatial distribution. Increasing Mα2 cell activity in previously trained mice did increase performance on the prehension task and led to increased theta and gamma band activity in the motor cortex. On the other hand, genetic ablation of Mα2 cells affected fine motor movements on a pasta handling task while not affecting the prehension task.

      Strengths:

      The proposed question of how Chrna2-expressing SST interneurons affect motor learning and motor cortex activity is important and timely. The study employs sophisticated approaches to record neuronal activity and manipulate the activity of a specific neuronal population in behaving mice over the course of motor learning. The authors analyze a variety of neuronal activity parameters, comparing different behavior trials, stages of learning, and the effects of Mα2 cell activation. The analysis of neuronal assembly activity and stability over the course of learning by tracking individual neurons throughout the imaging sessions is notable, since technically challenging, and yielded the interesting result that neuronal assemblies are more stable when activating Mα2 cells.

      Overall, the study provides compelling evidence that Mα2 cells regulate certain aspects of motor behaviors, likely by shaping circuit activity in the motor cortex.

      Weaknesses:

      The main limitation of the study lies in its small sample sizes and the absence of key control experiments, which substantially weaken the strength of the conclusions.

      Core findings of this paper, such as the lack of effect of Mα2 cell activation on motor learning, as well as the altered neuronal activity, rely on a sample size of n=3 mice per condition, which is likely underpowered to detect differences in behavior and contributes to the somewhat disconnected results on calcium activity, activity timing, and neuronal assembly activity.

      More comprehensive analyses and data presentation are also needed to substantiate the results. For example, examining calcium activity and behavioral performance on a trial-by-trial basis could clarify whether closely spaced reaching attempts influence baseline signals and skew interpretation.

      The study uses Cre-negative mice as controls for hM3Dq-mediated activation, which does not account for potential effects of Cre-dependent viral expression that occur only in Cre-positive mice.

      This important control would be necessary to substantiate the conclusion that it is increased Mα2 cell activity that drives the observed changes in behavior and cortical activity.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates how human temporal voice areas (TVA) respond to vocalizations from nonhuman primates. Using functional MRI during a species-categorization task, the authors compare neural responses to calls from humans, chimpanzees, bonobos, and macaques while modeling both acoustic and phylogenetic factors. They find that bilateral anterior TVA regions respond more strongly to chimpanzee than to other nonhuman primate vocalizations, suggesting that these regions are sensitive not only to human voices but also to acoustically and evolutionarily related sounds.

      The work provides important comparative evidence for continuity in primate vocal communication and offers a strong empirical foundation for modeling how specific acoustic features drive TVA activity.

      Strengths:

      (1) Comparative scope: The inclusion of four primate species, including both great apes and monkeys, provides a rare and valuable cross-species perspective on voice processing.

      (2) Methodological rigor: Acoustic and phylogenetic distances are carefully quantified and incorporated into the analyses.

      (3) Neuroscientific significance: The finding of TVA sensitivity to chimpanzee calls supports the view that human voice-selective regions are evolutionarily tuned to certain acoustic features shared across primates.

      (4) Clear presentation: The study is well organized, the stimuli well controlled, and the imaging analyses transparent and replicable.

      (5) Theoretical contribution: The results advance understanding of the neural bases of voice perception and the evolutionary roots of voice sensitivity in the human brain.

      Weaknesses:

      (1) Acoustic-phylogenetic confound: The design does not fully disentangle acoustic similarity from phylogenetic proximity, as species co-vary along both dimensions. A promising way to address this would be to include an additional model focusing on the acoustic features that specifically differentiate bonobo from chimpanzee calls, which share equal phylogenetic distance to humans.

      (2) Selectivity vs. sensitivity: Without non-vocal control sounds, the study cannot determine whether TVA responses reflect true selectivity for primate vocalizations or general auditory sensitivity.

      (3) Task demands: The use of an active categorization task may engage additional cognitive processes beyond auditory perception; a passive listening condition would help clarify the contribution of attention and task performance.

      (4) Figures and presentation: Some results are partially redundant; keeping only the most representative model figure in the main text and moving others to the Supplementary Material would improve clarity.

    2. Reviewer #2 (Public review):

      Summary:

      This study investigated how the human brain responds to vocalizations from multiple primate species, including humans, chimpanzees, bonobos, and rhesus macaques. The central finding - that subregions of the temporal voice areas (TVA), particularly in the bilateral anterior superior temporal gyrus, show enhanced responses to chimpanzee vocalizations - suggests a potential neural sensitivity to calls from phylogenetically close nonhuman primates.

      Strengths:

      The authors employed three analytical models to consistently demonstrate activation in the anterior superior temporal gyrus that is specific to chimpanzee calls. The methodology was logical and robust, and the results supporting these findings appear solid.

      Weakness:

      The interpretation of the findings in this paper regarding the evolutionary continuity of voice processing lacks sufficient evidence. A simple explanation is that the observed effects can be attributed to the similarity in low-level acoustic features, rather than effects specific to phylogenetically close species. The authors only tested vocalizations from three non-human primate species, other than humans. In this case, the species specificity of the effect does not fully represent the specificity of evolutionary relatedness.

    3. Reviewer #3 (Public review):

      Summary:

      Ceravolo et al. employed functional magnetic resonance imaging (fMRI) to examine how the temporal voice areas (TVA) in the human brain respond to vocalizations from different nonhuman primate species. Their findings reveal that the human TVA is not only responsive to human vocalizations but also exhibits sensitivity to the vocalizations of other primates, particularly chimpanzee vocalizations sharing acoustic similarities with human voices, which offers compelling evidence for cross-species vocal processing in the human auditory system. Overall, the study presents intellectually stimulating hypotheses and demonstrates methodological originality. However, the current findings are not yet solid enough to fully support the proposed claims, and the presentation could be enhanced for clarity and impact.

      Strengths:

      The study presents intellectually stimulating hypotheses and demonstrates methodological originality.

      Weaknesses:

      (1) The analysis of the fMRI data does not account for the participants' behavioral performance, specifically their reaction times (RTs) during the species categorization task.

      (2) The figure organization/presentation requires significant revision to avoid confusion and redundancy.

    1. Reviewer #1 (Public review):

      In this manuscript, the authors used a coarse-grained DNA model (cgNA+) to explore how DNA sequences and CpG methylation/hydroxymethylation influence nucleosome wrapping energy and the probability density of optimal nucleosomal configuration. Their findings indicate that both methylated and hydroxymethylated cytosines lead to increased nucleosome wrapping energy. Additionally, the study demonstrates that methylation of CpG islands increases the probability of nucleosome formation.

      The major strength of this method is that the model explicitly includes the phosphate group as DNA-histone binding site constraints, enhancing CG model accuracy and computational efficiency and allowing comprehensive calculations of DNA mechanical properties and deformation energies.

      The revised version has addressed the concerns raised previously, significantly strengthening the study.

    2. Reviewer #2 (Public review):

      Summary:

      This study uses a coarse-grained model for double-stranded DNA, cgNA+, to assess nucleosome sequence affinity. cgNA+ coarse-grains DNA at the level of bases and also accounts explicitly for the positions of the backbone phosphates. It has been shown to reproduce all-atom MD data very accurately. It is also ideally suited to be incorporated into a nucleosome model because it is known that DNA is bound to the protein core of the nucleosome via the phosphates.

      It is still unclear whether this harmonic model, parametrized for unbound DNA, is accurate enough to describe DNA inside the nucleosome. Previous models by other authors, using more coarse-grained models of DNA, have been rather successful in predicting base-pair sequence-dependent nucleosome behavior. This is at least the case as far as DNA shape is concerned, whereas assessing the role of DNA bendability (something this paper focuses on) has been consistently challenging in all nucleosome models, to my knowledge.

      It is thus of major interest whether this more sophisticated model is also more successful in handling this issue. As far as I can tell the work is technically sound and properly accounts for not only the energy required in wrapping DNA but also entropic effects, namely the change in entropy that DNA experiences when going from the free state to the bound state. The authors make an approximation here which seems to me to be a reasonable first step.

      Of interest is also that the authors have the parameters at hand to study the effect of methylation of CpG steps. This is especially interesting as it allows the study of a scenario where changes in the physical properties of base pair steps via methylation might influence nucleosome positioning and stability in a cell-type-specific way.

      Overall, this is an important contribution to the questions of how sequence affects nucleosome positioning and affinity. The findings suggest that cgNA+ has something new to offer. But the problem is complex, also on the experimental side, so many questions remain open. Despite this, I highly recommend publication of this manuscript.

      Strengths:

      The authors use their state-of-the-art coarse-grained DNA model, which seems ideally suited to be applied to nucleosomes as it accounts explicitly for the backbone phosphates.

      Weaknesses:

      The authors introduce penalty coefficients c_i to avoid steric clashes between the two DNA turns in the nucleosome. This requires c_i-values that are so high that standard deviations in the fluctuations of the simulation are smaller than in the experiments.

    3. Reviewer #3 (Public review):

      Summary:

      In this study, authors utilize biophysical modeling to investigate differences in free energies and nucleosomal configuration probability density of CpG islands and nonmethylated regions in the genome. Toward this goal, they develop and apply the cgNA+ coarse-grained model, an extension of their prior molecular modeling framework.

      Strengths:

      The study utilizes biophysical modeling to gain mechanistic insight into nucleosomal occupancy differences in CpG and nonmethylated regions in the genome.

      Weaknesses:

      Although the overall study is interesting, the manuscript needs more clarity in places. Moreover, the rationale and conclusion for some of the analyses are not well described.

      Comments on revised version:

      The authors have addressed my concerns.

    1. Reviewer #1 (Public review):

      Summary:

      This study examines letter-shape knowledge in a large cohort of children with minimal formal reading instruction. The authors report that these children can reliably distinguish upright from inverted letters despite limited letter naming abilities. They also show a visual-search advantage for upright over inverted letters, and this advantage correlates with letter-shape familiarity. These findings suggest that specialized letter-shape representations can emerge with very limited letter-sound mapping practice.

      Strengths:

      This study investigates whether children can develop letter-shape knowledge independently of letter-sound mapping abilities. This question is theoretically important, especially in light of functional subdivisions within the visual word form area (VWFA), with posterior regions associated with letter/orthographic shape and anterior regions with linguistic features of orthography (Caffarra et al., 2021; Lerma-Usabiaga et al., 2018). The study also includes a large sample of children at the very beginning of formal reading instruction, thereby minimizing the influence of explicit instruction on the formation of letter-shape knowledge.

      Weakness:

      A central concern is that a production task (naming) is used to index letter-name knowledge, whereas letter-shape knowledge is assessed with recognition. Production tasks impose additional demands (motor planning, articulation) and typically yield lower performance than recognition tasks (e.g., letter-sound verification). Thus, comparisons between letter-shape and letter-name knowledge are confounded by task type. The authors' partial-correlation and multiple-regression analyses linking familiarity (but not production) to the upright-search advantage are informative; however, they do not resolve the recognition-versus-production mismatch. Consequently, the current data cannot unambiguously support the claim that letter-shape representations are independent of letter-name knowledge.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, the authors propose that there are two types of letter knowledge: knowledge about letter sound and knowledge about letter shape. Based on previous studies on implicit statistical learning in adults and babies, the authors hypothesized that passive exposure to letters in the environment allows early readers to acquire knowledge of letter shapes even before knowledge of letter-sound association. Children performed a set of experiments that measures letter shape familiarity, letter-sound association performance, visual processing of letters, and a reading-related cognitive skill. The results show that even the children who have little to no knowledge of letter names are familiar with letter shapes, and that this letter shape familiarity is predictive of performance in visual processing of letters.

      Strengths:

      The authors' hypothesis is based on widely accepted findings in vision science that repeated exposure to certain stimuli promotes implicit learning of, for example, statistical properties of the stimuli. They used simple and well-established tasks in large-scale experiments with a special population (i.e., children). The data analysis is quite comprehensive, accounting for alternative explanations when needed. The data support at least a part of their hypothesis that the knowledge of letter shapes is distinct from, and precedes, the knowledge of letter-sound association, and is associated with performance in visual processing of the letters. This study sheds light on a rather overlooked aspect of letter knowledge, i.e., letter shapes, challenging the idea that letters are learned only through formal instruction and calling for future research on the role of passive exposure to letters in reading acquisition.

      Weaknesses:

      Although the authors have successfully identified the knowledge of letter shapes as another type of letter knowledge other than the knowledge of letter-sound association, the question of whether it drives the subsequent reading acquisition remains largely unanswered, despite it being strongly implied in the Introduction. The authors collected a RAN score, which is known to robustly predict future reading fluency, but it did not show a significant partial correlation with familiarity accuracy (i.e., familiarity accuracy is not necessary to predict RAN score). The authors discussed that the performance in visual processing of letters might capture unique variance in reading fluency unexplained by RAN scores, but currently, this claim seems speculative.

      Since even children without formal literacy instruction were highly familiar with letter shapes, it would be reasonable to assume that they had obtained the knowledge through passive exposure. However, the role of passive exposure was not directly tested in the study.

      Given the superimposed straight lines in Figure 2, I assume the authors computed Pearson correlation coefficients. Testing the statistical significance of the Pearson correlation coefficient requires the assumption of bivariate normality (and therefore constant variance of a variable across the range of the other). According to Figure 2, this doesn't seem to be met, as the familiarity accuracy is hitting the ceiling. The ceiling effect might not be critical in Figure 2, since it tends to attenuate correlation, not inflate it. But in Figures 3 and 4, the authors' conclusion depends on the non-significant partial correlation. In fact, the authors themselves wrote that the ceiling effect might lead to a non-significant correlation even if there is an actual effect (line 404).

    3. Reviewer #3 (Public review):

      Summary:

      This study examined how young children with minimal reading instruction process letters, focusing on their familiarity with letter shapes, knowledge of letter names, and visual discrimination of upright versus inverted letters. Across four experiments, kindergarten and Grade 1 children could identify the correct orientation of letters even without knowing their names.

      Strengths:

      This study addresses an important research gap by examining whether children develop letter familiarity prior to formal literacy instruction and how this skill relates to reading-related cognitive abilities. By emphasizing letter familiarity alongside letter recognition, the study highlights a potentially overlooked yet important component of emergent literacy development.

      Weaknesses:

      The study's methods and results do not effectively test its stated research goals. Reading ability was not directly measured; instead, the authors inferred its relationship with reading from correlations between letter familiarity and reading-related cognitive measures, which limits the validity of their conclusions. Furthermore, the analytical approach was rather limited, relying primarily on simple and partial correlations without employing more advanced statistical methods that could better capture the underlying relationships.

      Major Comments:

      (1) Limited Novelty and Unclear Theoretical Contribution:

      The authors aim to challenge the view that children acquire letter shape knowledge only through formal literacy instruction, but similar questions regarding letter familiarity have already been explored in previous research. The manuscript does not clearly articulate how the present study advances beyond existing findings or why examining letter familiarity specifically before formal instruction provides new theoretical insight. Moreover, if letter familiarity and letter recognition are treated as distinct constructs, the authors should better justify their differentiation and clarify the theoretical significance of focusing on familiarity as an independent component of emergent literacy.

      (2) Overgeneralization to Reading Ability:

      Although the study measured several literacy-related cognitive skills and examined correlations with letter familiarity, it did not directly assess children's reading ability, as participants had not yet received formal literacy instruction. Therefore, the conclusion that letter familiarity influences reading skills (e.g., Line 519: "Our results are broadly consistent with previous work that has highlighted print letter knowledge as a strong predictor of future reading skills") is not fully supported and should be clarified or revised. To draw conclusions about the impact on reading ability, a longitudinal study would be more appropriate, assessing the relationship between letter familiarity and reading skills after children have received formal literacy instruction. If a longitudinal study is not feasible, measuring familial risk for dyslexia could provide an alternative approach to infer the potential influence of letter familiarity on later reading development.

      (3) Confusing and Limited Analytical Approach with Potential for More Sophisticated Modeling:

      The study employs a confusing analytical approach, alternating between simple correlational analyses and group-based comparisons, which may introduce circularity - for example, defining high vs. low familiarity groups partly based on performance differences in upright versus inverted letters and then observing a visual search advantage for upright letters within these groups. Moreover, the analyses are relatively simple: although multiple linear regression is mentioned, the results are not fully reported. These approaches may not fully capture the complex relationships among letter familiarity, recognition, visual search performance, RAN, and other covariates. More sophisticated modeling, such as mixed-effects models to account for repeated measures, structural equation modeling to examine latent constructs, or multivariate approaches jointly modeling familiarity and recognition effects, could provide a clearer understanding of the unique contribution of letter shape familiarity to early literacy outcomes. In addition, a large number of correlations were conducted without correction for multiple comparisons, which may increase the risk of false positives and raise concerns about the reliability of some significant findings.
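
      As an illustration of the multiple-comparison concern in point (3), the following is a minimal sketch of the Benjamini-Hochberg false discovery rate procedure, one common correction among several. The p-values are hypothetical and are not taken from the manuscript; the function name is likewise an assumption made for this example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the null is rejected
    while controlling the false discovery rate at level alpha."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values from six correlation tests (illustrative only).
example_p = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
print(benjamini_hochberg(example_p))
```

      With these illustrative values, four tests fall below an uncorrected threshold of 0.05, but only the two smallest survive the correction.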

    1. Reviewer #1 (Public review):

      In this manuscript, Domingo et al. present a novel perturbation-based approach to experimentally modulate the dosage of genes in cell lines. Their approach is capable of gradually increasing and decreasing gene expression. The authors then use their approach to perturb three key transcription factors and measure the downstream effects on gene expression. Their analysis of the dosage response curve of downstream genes reveals marked non-linearity.

      One of the strengths of this study is that many of the perturbations fall within the physiological range for each cis gene. This range is presumably between a single-copy state of heterozygous loss-of-function (log fold change of -1) and a three-copy state (log fold change of ~0.6). This is in contrast with CRISPRi or CRISPRa studies that attempt to maximize the effect of the perturbation, which may result in downstream effects that are not representative of physiological responses.
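
      The two bounds quoted above follow directly from copy number, assuming that expression scales proportionally with the number of gene copies relative to the normal two-copy state:

      $$\log_2\!\left(\tfrac{1}{2}\right) = -1, \qquad \log_2\!\left(\tfrac{3}{2}\right) \approx 0.585$$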

      Another strength of the study is that various points along the dosage-response curve were assayed for each perturbed gene. This allowed the authors to effectively characterize the degree of linearity and monotonicity of each dosage-response relationship. Ultimately, the study revealed that many of these relationships are non-linear, and that the response to activation can be dramatically different than the response to inhibition.

      To test their ability to gradually modulate dosage, the authors chose to measure three transcription factors and around 80 known downstream targets. As the authors themselves point out in their discussion about MYB, this biased sample of genes makes it unclear how this approach would generalize genome-wide. In addition, the data generated from this small sample of genes may not represent genome-wide patterns of dosage response. Nevertheless, this unique data set and approach represents a first step in understanding dosage-response relationships between genes.

      Another point of general concern in such screens is the use of the immortalized K562 cell line. It is unclear how the biology of these cell lines translates to the in vivo biology of primary cells. However, the authors do follow up with cell-type-specific analyses (Figures 4B, 4C, and 5A) to draw correspondence between their perturbation results and the relevant biology in primary cells and complex diseases.

      The conclusions of the study are generally well supported with statistical analysis throughout the manuscript. As an example, the authors utilize well-known model selection methods to identify when there was evidence for non-linear dosage response relationships.

      Gradual modulation of gene dosage is a useful approach to model physiological variation in dosage. Experimental perturbation screens that use CRISPR inhibition or activation often use guide RNAs targeting the transcription start site to maximize their effect on gene expression. Generating a physiological range of variation will allow others to better model physiological conditions.

      There is broad interest in the field to identify gene regulatory networks using experimental perturbation approaches. The data from this study provides a good resource for such analytical approaches, especially since both inhibition and activation were tested. In addition, these data provide a nuanced, continuous representation of the relationship between effectors and downstream targets, which may play a role in the development of more rigorous regulatory networks.

      Human geneticists often focus on loss-of-function variants, which represent natural knock-down experiments, to determine the role of a gene in the biology of a trait. This study demonstrates that dosage response relationships are often non-linear, meaning that the effect of a loss-of-function variant may not necessarily carry information about increases in gene dosage. For the field, this implies that others should continue to focus on both inhibition and activation to fully characterize the relationship between gene and trait.

      Comments on revisions:

      Thank you for responding to our comments. We have no further comments for the authors.

    2. Reviewer #2 (Public review):

      Summary:

      This work investigates transcriptional responses to varying levels of transcription factors (TFs). The authors aim for gradual up- and down-regulation of three transcription factors GFI1B, NFE2 and MYB in K562 cells, by using a CRISPRa- and a CRISPRi line, together with sgRNAs of varying potency. Targeted single-cell RNA sequencing is then used to measure gene expression of a set of 90 genes, which were previously shown to be downstream of GFI1B and NFE2 regulation. This is followed by an extensive computational analysis of the scRNA-seq dataset. By grouping cells with the same perturbations, the authors can obtain groups of cells with varying average TF expression levels. The achieved perturbations are generally subtle, not reaching half or double doses for most samples, and up-regulation is generally weak below 1.5-fold in most cases. Even in this small range, many target genes exhibit a non-linear response. Since this is rather unexpected, it is crucial to rule out technical reasons for these observations.

      Strengths:

      The work showcases how a single dataset of CRISPRi/a perturbations with scRNA-seq readout and an extended computational analysis can be used to estimate transcriptome dose-responses, a general approach that likely can be built upon in the future.

      Moreover, the authors highlight tiling of sgRNAs +/-1000bp around TSS as a useful approach. Compared with conventional direct TSS-targeting (+/- 200 bp), the larger sequence window allows placing more sgRNAs. Also it requires little prior knowledge of CREs, and avoids using "attenuated" sgRNAs which would require specialized sgRNA design.

      Weaknesses:

      The experiment was performed in a single replicate, and it would have been reassuring to see an independent validation of the main findings, for example through measuring individual dose-response curves.

      Much of the analysis depends on the estimation of log-fold changes between groups of single cells with non-targeting controls and those carrying a guide RNA driving a specific knockdown. Generally, biological replicates are recommended for differential gene expression testing (Squair et al. 2021, https://doi.org/10.1038/s41467-021-25960-2). When using the FindMarkers function from the Seurat package, the authors deviate from the recommendations for pseudo-bulk analysis to aggregate the raw counts (https://satijalab.org/seurat/articles/de_vignette.html). Furthermore, differential gene expression analysis of scRNA-seq data can suffer from mis-estimations (Nguyen et al. 2023, https://doi.org/10.1038/s41467-023-37126-3), and different computational tools or versions can affect these estimates strongly (Pullin et al. 2024, https://doi.org/10.1186/s13059-024-03183-0 and Rich et al. 2024, https://doi.org/10.1101/2024.04.04.588111). Therefore, it would be important to describe more precisely in the Methods how this analysis was performed, any deviations from default parameters, package versions, and at which point which values were aggregated to form "pseudobulk" samples.
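
      To make the aggregation step concrete, the sketch below shows one generic way to form pseudobulk samples by summing raw counts over all cells in a group. It is written in plain Python for illustration only and is not the authors' pipeline or the Seurat implementation; the function name, gene names, and group labels are hypothetical.

```python
from collections import defaultdict


def pseudobulk(cell_counts, groups):
    """Sum raw per-cell counts within each group.

    cell_counts: list of dicts mapping gene name -> raw count, one per cell
    groups:      list of group labels (e.g. guide identity), one per cell
    """
    aggregated = defaultdict(lambda: defaultdict(int))
    for counts, group in zip(cell_counts, groups):
        for gene, count in counts.items():
            aggregated[group][gene] += count
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {group: dict(genes) for group, genes in aggregated.items()}


# Hypothetical toy data: three cells, two genes, two groups.
cells = [{"geneA": 2, "geneB": 1}, {"geneA": 3}, {"geneB": 4}]
labels = ["NTC", "NTC", "sgRNA_1"]
print(pseudobulk(cells, labels))
```

      The point at which this summation happens (before or after normalization, per guide or per replicate) is exactly the detail that the Methods should state.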

      Two different cell lines are used to construct dose-response curves, where a CRISPRi line allows gene down-regulation and the CRISPRa line allows gene upregulation. Although both lines are derived from the same parental line (K562), the expression analysis of Tet2, which is absent in the CRISPRi line but expressed in the CRISPRa line (Fig. S1F, S3A), suggests clonal differences between the two lines. Similarly, the UMAP in S3C and the PCA in S4A suggest batch effects between the two lines. These might confound this analysis, even though all fold changes are calculated relative to the baseline expression in the respective cell line (NTC cells). Combining log2-fold changes from the two cell lines with different baseline expression into a single curve (e.g. Fig. 3) remains misleading, because different data points could be normalized to different baseline expression levels.

      The study estimates the relationship between TF dose and target gene expression. This requires a system that allows quantitative changes in TF expression. The data provided do not convincingly show that this condition is met, although it is an essential prerequisite for the presented conclusions. Specifically, Fig. S3A shows that upon stronger knock-down, a subpopulation of cells appears in which the targeted TF is no longer detected (drop-outs). Fig. 3B (top) also suggests that the knock-down is either subtle (similar to NTCs) or strong, but intermediate knock-down (log2-FC of 0.5-1) does not occur. Although the authors argue that this is a technical effect of the scRNA-seq protocol, it is also possible that this represents a binary behavior of the CRISPRi system. Previous work has shown that CRISPRi systems with the KRAB domain largely result in binary repression and not in gradual down-regulation as suggested in this study (Bintu et al. 2016 (https://doi.org/10.1126/science.aab2956), Noviello et al. 2023 (https://doi.org/10.1038/s41467-023-38909-4)).

      One of the major conclusions of the study is that non-linear behavior is common. It would be helpful to show that this observation does not arise from the technical concerns described in the previous points. This could be done for instance with independent experimental validations.

      Did the authors achieve their aims? Do the results support the conclusions?:

      Some of the most important conclusions, such as the claim that non-linear responses are common, are not well supported because they rely on accurately determining the quantitative responses of trans genes, which suffers from the previously mentioned concerns.

      Discussion of the likely impact of the work on the field, and the utility of the methods and data to the community:

      Together with other recent publications, this work emphasizes the need to study transcription factor function with quantitative perturbations. The computational code repository contains all the valuable code with inline comments, but would have benefited from a readme file explaining the repository structure, package versions, and instructions to reproduce the analyses, including which input files or directory structure would be needed.

    1. Reviewer #1 (Public review):

      Summary:

      In the paper, the authors propose a new RNA velocity method, TSvelo, which predicts the transcription rate as a linear function of the RNA expression levels of transcription factors. This framework extends the authors' recent work, TFvelo, by including unspliced reads and designing a coherent neuralODE framework. Improved performance was demonstrated in six diverse datasets.

      Strengths:

      Overall, this method introduces innovative solutions to link cell differentiation and gene regulation, with a balance between model complexity (neuralODE) and interpretability (raw gene space).

      Weaknesses:

      While it seems to provide convincing results, there are multiple technical concerns for the authors to clarify and double-check.

      (1) The authors should clarify and discuss the TF-target map: here, the TF-target genes map is predefined by the TF binding's ChIP-seq data. This annotation is largely incomplete and mostly compiled from a set of bulk tissues. Therefore, for a certain population, the TF-target relation may change. This requires clarification and discussion, possibly exploring how to address this in the model. In addition, a regulon database could be added, e.g., DoRothEA?

      (2) The authors should clarify how example genes are selected. This is particularly unclear in Figure 2d.

      (3) The authors should clarify confidence in the statement in lines 179-180, that ANXA4 should initially decrease. This is particularly concerning, as TSvelo didn't capture the cell cycle transitions well during the initial part.

      (4) A support reference should be added for the statement in line 260 that "neuron migrations are inside-out manner". There is no reference supporting this, and this statement is critical for the model assessment.

      (5) The comparison to scMultiomics data is particularly interesting, as MultiVelo uses ATAC data to predict the transcription rate. It would be very insightful to add a direct comparison of the estimated transcription rate between using ATAC and directly using TFs' RNA expressions.

      (6) In Figure 6g, it should be clarified how the lineage was determined. Did the authors use the LARRY barcodes, predicted cell fate, or any other methods? Here, the best way is probably using the LARRY barcodes for individual clones.

    2. Reviewer #2 (Public review):

      Summary:

      Li et al. propose TSvelo, a computational framework for RNA velocity inference that models transcriptional regulation and gene-specific splicing using a neural ODE approach. The method is intended to improve trajectory reconstruction and capture dynamic gene expression changes in scRNA-seq data. However, the manuscript in its current form falls short in several critical areas, including rigorous validation, quantitative benchmarking, clarity of definitions, proper use of prior knowledge, and interpretive caution. Many of the authors' claims are not fully supported by the evidence.

      Major comments:

      (1) Modeling comments

      (a) Lines 512-513: How does the U-to-S delay validate the accuracy of pseudotime? Using only a single gene as an example is not sufficient for "validation."

      (b) Lines 512-518: The authors propose a strategy for selecting the initial state, but do not benchmark how accurate this selection procedure is, nor do they provide sufficient rationale. While some genes may indeed exhibit U-to-S delay during lineage differentiation, why does the highest U-to-S delay score indicate the correct initiation states? Please provide mathematical justification and demonstrate accuracy beyond using a single gene example. Maybe a simulation with ground truth could help here, too.

      (c) Equation (8): The formulation looks to be incorrect. If $$W \in \mathbb{R}^{G\times G}$$ and $$W' - \Gamma' \in \mathbb{R}^{K\times K}$$, how can they be aligned within the same row? Please clarify.

      (d) The use of prior knowledge graphs from ENCODE or ChEA to constrain regulation raises concerns. Much of the regulatory information in these databases comes from cell lines. How can such cell-line-based regulation be reliably applied to primary tissues, as is done throughout the manuscript? Additional experiments are needed to test the robustness of TSvelo with respect to prior knowledge.

      (e) Lines 579-580: How is the grid search performed? More methodological details are required. If an existing method was used, please provide a citation.

      (2) Application on pancreatic endocrine datasets

      (a) Lines 140-141: What is the definition of the final pseudotime-fitted time t or velocity pseudotime?

      (b) Lines 143-144: The use of the velocity consistency metric to benchmark methods in multi-lineage datasets is incorrect. In multi-lineage differentiation systems, cells (e.g., those in fate priming stages) may inherently show inconsistency in their velocity. Thus, it is difficult to distinguish inconsistency caused by estimation error from that arising from biological signals. Velocity consistency metrics are only appropriate in systems with unidirectional trajectories (e.g., cell cycling). The abnormally high consistency values here raise concerns about whether the estimated velocities meaningfully capture lineage differences.

      (c) The improvement of TSvelo over other methods in terms of cross-boundary direction correctness looks marginal; a statistical test would help to assess its significance.

      (d) Lines 177-178: Based on the figure, TSvelo does not appear to clearly distinguish cell types. A quantitative metric, such as Adjusted Rand Index (ARI), should be provided.

      (e) Lines 179-183: The claim that traditional methods cannot capture dynamics in the unspliced-spliced phase portrait is vague. What specific aspect is not captured-the fitted values or something else? Evidence is lacking. Please provide a detailed explanation and quantitative metrics to support this claim.
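
      For the quantitative metric requested in (2d), the Adjusted Rand Index can be computed from the pair counts of two labelings. The sketch below is a minimal, dependency-free illustration using the standard Hubert-Arabie formula; the labels are hypothetical, and in practice a library implementation would normally be used instead.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same cells."""
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the labelings trivially agree
    # Pairs that are grouped together in both labelings.
    contingency = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in contingency.values())
    # Pairs that are grouped together within each labeling separately.
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / total_pairs
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings have one cluster
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical annotated cell types versus clusters derived from a method.
annotated = ["alpha", "alpha", "beta", "beta"]
clustered = [1, 1, 0, 0]
print(adjusted_rand_index(annotated, clustered))
```

      The index is invariant to how clusters are named, so a perfect match with permuted labels still scores 1, while agreement no better than chance scores around 0.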

      (3) Application to gastrulation erythroid datasets

      (a) Lines 191-194: The observation that velocity genes are enriched for erythropoiesis-related pathways is trivial, since the analysis is restricted to highly variable genes (HVGs) from an erythropoiesis dataset. This enrichment is expected and therefore not informative.

      (b) Lines 227-228: It remains unclear how TSvelo "accurately captures the dynamics." What is the definition of dynamics in this context? Figure 3g shows unspliced/spliced vs. fitted time plots and phase portraits, but without a quantitative definition or measure, the claim of superiority cannot be supported. Visualization of a single gene is insufficient; a systematic and quantitative analysis is needed.

      (4) Application to the mouse brain and other datasets

      (a) Lines 280-281: The authors cannot claim that velocity streams are smoother in TSvelo than in MultiVelo based solely on 2D visualization. Similarly, claiming that one model predicts the correct differentiation trajectory from a 2D projection is over-interpretation, as has been discussed in prior literature (see PMID: 37885016).

      (b) Lines 304-306: Beyond transcriptional signal estimation, how is regulation inferred solely from scRNA-seq data validated, especially compared with scATAC-seq data? Are there cases where transcriptome-based regulatory inference is supported by epigenomic evidence, thereby demonstrating TSvelo's GRN inference accuracy?

      (c) The claim that TSvelo can model multi-lineage datasets hinges on its use of PAGA for lineage segmentation, followed by independent modeling of dynamics within each subset. However, the procedure for merging results across subsets remains unclear.

    3. Reviewer #3 (Public review):

      Despite the abundance of RNA velocity tools, there are still major limitations, and there is strong skepticism about the results these methods lead to. In this paper, the authors try to address some limitations of current RNA velocity approaches by proposing a unified framework to jointly infer transcriptional and splicing dynamics. The method is then benchmarked on 6 real datasets against the most popular RNA velocity tools.

      While the approach has the potential to be of interest for the field, and may present improvements compared to existing approaches, there are some major limitations that should be addressed, particularly concerning the benchmark (see major comment 1).

      Major comments:

      (1) My main criticism concerns the benchmarking: real data lack a ground truth, and are absolutely not ideal for comparing methods, because one can only speculate what results appear to be more plausible.

      A solid and extensive simulation study, which covers various scenarios and possibly distinct data-generating models, is needed for comparing approaches. The authors should check, for example, the simulation studies in the BayVel approach (Section 4, BayVel: A Bayesian Framework for RNA Velocity Estimation in Single-Cell Transcriptomics). Clearly, all methods should be included in the simulation.

      (2) Related to the above: since a ground truth is missing, the real data analyses need to be interpreted with caution. I recommend avoiding strong statements, such as "successfully captures the correct gene dynamics", or "accurately infer", in favour of milder statements supported by the data, such as "... aligns with the biological processes described" (as in page 12), or "results are compatible with current biological knowledge", etc...

      (3) Many methods perform RNA velocity analyses. While there is a brief description, I think it'd be useful to have a schematic summary (e.g., via a Table) of the main conceptual, mathematical, and computational characteristics of each approach.

      (4) Related to the above: I struggled to identify the main conceptual novelty of TSvelo, compared to existing approaches. I recommend explaining this aspect more extensively.

      (5) A computational benchmark is missing; I'd appreciate seeing the runtime and memory cost of all methods in a couple of datasets.

      (6) I think BayVel (mentioned above) should be added to the list of competing methods (both in the text and in the benchmarks). The package can be found here: https://github.com/elenasabbioni/BayVel_pkgJulia .
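
      To make the request in comment (1) concrete, here is a minimal sketch of a ground-truth simulation. It assumes the standard two-step splicing kinetics (constant transcription rate alpha, splicing rate beta, degradation rate gamma), which is not necessarily the generating model used by TSvelo or BayVel. The function names, rate values, and time grid are hypothetical and chosen only for illustration.

```python
import math

def simulate_gene(alpha, beta, gamma, t):
    """Closed-form solution of the assumed kinetics
         du/dt = alpha - beta * u
         ds/dt = beta * u - gamma * s
    with u(0) = s(0) = 0 and beta != gamma.
    Returns (unspliced, spliced, true velocity ds/dt) at time t."""
    eb = math.exp(-beta * t)
    eg = math.exp(-gamma * t)
    u = alpha / beta * (1.0 - eb)
    s = alpha / gamma * (1.0 - eg) + alpha / (gamma - beta) * (eg - eb)
    return u, s, beta * u - gamma * s

def mean_abs_error(v_est, v_true):
    """Average absolute gap between estimated and simulated velocities."""
    return sum(abs(e - t) for e, t in zip(v_est, v_true)) / len(v_true)

# Hypothetical rates and latent times; a real study would vary these across
# scenarios and add count noise to u and s before fitting any method.
times = [0.25 * k for k in range(1, 41)]
v_true = [simulate_gene(5.0, 1.0, 0.5, t)[2] for t in times]
```

      Because the true velocity is known for every simulated cell, each method's estimates can be scored directly (for example with `mean_abs_error`) instead of being judged by plausibility.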

    1. Reviewer #1 (Public review):

      Summary:

      Stemming from the previous research on the adaptation of methylotrophic microbes in the phyllosphere environment, this paper tested a novel hypothesis on the molecular and cellular mechanisms by which yeast uses biomolecular condensates as unique niches for the regulation of methanol-induced mRNAs. While a few in vivo experiments were conducted in the phyllosphere, more assays were carried out on plates to mimic various stress conditions, diminishing the reliability of the conclusions in supporting the main hypothesis.

      Strengths:

      This study addressed an interesting and important biological question. Some of the experiments were conducted methodically and carefully. The visualization of both the biomolecular condensates and the mRNAs was helpful in addressing the questions. The results are expected to be useful in paving the way for the future study to directly test its main hypothesis. The results of this study could also have a general implication for the adaptation of a huge population of microbes in the enormous space of the phyllosphere on Earth.

      Weaknesses:

      The results were often over- and misinterpreted. Given that many hypotheses were tested indirectly on plates, the correlative results could only be used to carefully suggest the likelihood of the hypotheses. For example, a single edc3 mutant was used to represent a P-body-defective strain, although it is well known that EDC3 is a critical component in mRNA decapping; hence, the mutant should display a pleiotropic phenotype, rather than a mere reduced P-body phenotype. Using a similar reductionist approach, the study went on to employ a series of plate assays to argue that the conditions were mimicking the phyllosphere, which could be misleading under these circumstances. Furthermore, the low percentage of the colocalization between P-bodies and mimRNA granules and the similar results from negative control mRNAs do not convincingly support the idea that mimRNAs are sequestered between two biomolecular condensates, and P-bodies could serve as regulatory hubs. Given that the abundance of mimRNA granules was positively correlated with the transcript abundance of mimRNAs, and P-body abundance did not change too much under methanol induction, the results could not support an active mimRNA sequestration mechanism from mimRNA granules to P-bodies with a proportional increase of the overlap between the two condensates. More direct experiments conducted in the phyllosphere using multiple P-body defective yeast strains should strengthen the manuscript, assuming all the results turned out to be supportive.

    2. Reviewer #2 (Public review):

      Summary:

      This article aims to elucidate the potential roles of P-bodies in yeast adaptation to complex environmental conditions, such as the plant leaf phyllosphere. The authors demonstrated that yeast mutants defective in one of the P-body-localized proteins failed to grow in the Arabidopsis thaliana phyllosphere. They conducted detailed imaging analyses, focusing particularly on the co-localization of P-bodies and mRNAs (DAS1) related to the methanol metabolism pathway under various environmental conditions. The study newly revealed that these mRNAs form dot-like structures that occasionally co-localize with a P-body marker. Furthermore, the authors showed that the number of P-body-labeled dots increases under stress conditions, such as H₂O₂ treatment, and that mRNA dots are more frequently localized to P-body-like structures. Based on these detailed observations, the authors hypothesize that P-bodies function to protect mRNAs from degradation, particularly under stress conditions.

      Strengths:

      I think the authors' attempt to elucidate the potential roles of P-bodies in yeast under stress conditions is novel, and the imaging data are overall very nice.

      Weaknesses:

      I believe the authors could make additional efforts to more clearly demonstrate that P-bodies are indeed required for yeast proliferation in the phyllosphere, as described below, since this represents the most novel aspect of the study.

    3. Reviewer #3 (Public review):

      Summary:

      The authors use fluorescent microscopy and fluorescent markers to investigate the requirement of P-bodies during growth on methanol, a common substrate available on plant leaves, by using a yeast edc3 mutant defective in P-body formation. Growth on methanol upregulates the transcription of methanol metabolic genes, which accumulate in granular structures, as observed by microscopy. Co-localization of P-bodies and granules was quantified and described as dynamically enhanced during oxidative stress. Ultimately, the authors suggest a model where methanol induces the accumulation of methanol-induced mRNAs in cytosolic granules, which dynamically interact with P-bodies, especially during oxidative stress, to protect the mRNAs from degradation. However, this model is not strongly supported by the provided data, as the quantification of the co-localization between different markers (of organelles and between P-body and granules) is not well presented or described in the text.

      Considering that there is only a small EDC3-dependent overlap between P-bodies and mimRNA granules, the claim that P-bodies regulate mimRNAs is not fully justified. Rather, EDC3 could also be involved in mimRNA granule formation, independent of P-bodies.

      Strengths:

      (1) The authors could show convincingly that P-bodies (using a P-body-deficient edc3-KO strain) are important for colonizing the plant phyllosphere and for the regulation of methanol-induced mRNAs (mimRNA).

      (2) The visualization of mimRNA granules and P-bodies using fluorescent markers is interesting and was validated by alternative methods, such as FISH staining.

      (3) The dynamic formation of mimRNA granules and P-bodies was demonstrated during growth on leaves and in artificial medium during oxidative stress. The mimRNA granules showed a similar dynamic as the abundances of several mimRNAs and their corresponding proteins.

      (4) A role of EDC3 in the formation of mimRNA granules was demonstrated. However, the link between P-bodies and mimRNA granules was not clearly shown.

      Weaknesses:

      (1) The study largely relies on fluorescent microscopy and co-localization measurements. However, the subcellular resolution is not very high; it is unclear how dot-like structures were measured and, importantly, how co-localization was quantified.

      (2) The text does not clarify to what degree P-bodies and mimRNA granules are different structures. Based on the images, the size of P-bodies and granules seems to be vastly different, making it unclear whether these structures are fused or separate, even if their markers are reported to overlap.

      (3) The evidence that mimRNA granules contain ribosome-free and ribosome-associated RNA is only based on inhibitors and microscopy, without providing further evidence measuring granule content by isolation and sequencing approaches.

      (4) Similarly, the co-localization with other organelle markers is not supported by quantitative data.
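
      As an illustration of what an explicit co-localization measure could look like for weaknesses (1) and (4), the sketch below computes Manders-style overlap coefficients from two intensity channels. This is one common option, not the quantification used in the manuscript. The function name, the flat-list pixel representation, and the threshold parameters are assumptions made for this example.

```python
def manders(channel_a, channel_b, thr_a=0.0, thr_b=0.0):
    """Manders-style overlap coefficients for two equally sized pixel lists.
    m1: fraction of channel-A intensity in pixels where B exceeds thr_b.
    m2: fraction of channel-B intensity in pixels where A exceeds thr_a."""
    total_a = sum(channel_a)
    total_b = sum(channel_b)
    a_in_b = sum(a for a, b in zip(channel_a, channel_b) if b > thr_b)
    b_in_a = sum(b for a, b in zip(channel_a, channel_b) if a > thr_a)
    m1 = a_in_b / total_a if total_a else 0.0
    m2 = b_in_a / total_b if total_b else 0.0
    return m1, m2

# Toy four-pixel example: only the second pixel contains both signals.
granule = [0.0, 10.0, 20.0, 0.0]   # hypothetical mimRNA granule channel
p_body = [5.0, 5.0, 0.0, 0.0]      # hypothetical P-body marker channel
m1, m2 = manders(granule, p_body)
```

      Reporting such coefficients per cell, together with the thresholds used to define a dot, would let readers judge how much of each signal actually overlaps.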

    1. Reviewer #1 (Public review):

      Summary

      The manuscript by Ma et al. provides robust and novel evidence that the noctuid moth Spodoptera frugiperda (Fall Armyworm) possesses a complex compass mechanism for seasonal migration that integrates visual horizon cues with Earth's magnetic field (likely its horizontal component). This is an important and timely study: apart from the Bogong moth, no other nocturnal Lepidoptera has yet been shown to rely on such a dual-compass system. The research therefore expands our understanding of magnetic orientation in insects with both theoretical (evolution and sensory biology) and applied (agricultural pest management, a new model of magnetoreception) significance.

      The study uses state-of-the-art methods and presents convincing behavioural evidence for a multimodal compass. It also establishes the Fall Armyworm as a tractable new insect model for exploring the sensory mechanisms of magnetoreception, given the experimental challenges of working with migratory birds. Overall, the experiments are well-designed, the analyses are appropriate, and the conclusions are generally well supported by the data.

      Strengths

      (1) Novelty and significance: First strong demonstration of a magnetic-visual compass in a globally relevant migratory moth species, extending previous findings from the Bogong moth and opening new research avenues in comparative magnetoreception.

      (2) Methodological robustness: Use of validated and sophisticated behavioural paradigms and magnetic manipulations consistent with best practices in the field. The use of 5-minute bins to study the dynamic nature of the magnetic compass, which is anchored to a visual cue but updated with a latency of several minutes, is an important finding and a new methodological aspect in insect orientation studies.

      (3) Clarity of experimental logic: The cue-conflict and visual cue manipulations are conceptually sound and capable of addressing clear mechanistic questions.

      (4) Ecological and applied relevance: Results have implications for understanding migration in an invasive agricultural pest with an expanding global range.

      (5) Potential model system: Provides a new, experimentally accessible species for dissecting the sensory and neural bases of magnetic orientation.

      Weaknesses

      While the study is strong overall, several recommendations should be addressed to improve clarity, contextualisation, and reproducibility:

      (1) Structure and presentation of results

      The visual-cue experiments should be reordered to move from simpler (no cues) to more complex (cue-conflict) conditions, improving narrative logic and accessibility for non-specialists.

      (2) Ecological interpretation

      (a) The authors should discuss how their highly simplified, static cue setup translates to natural migratory conditions where landmarks are dynamic, transient or absent.

      (b) Further consideration is required regarding how the compass might function when landmarks shift position, are obscured, or are replaced by celestial cues. Also, more consolidated (one section) and concrete suggestions for future experiments are needed, with transient, multiple, or more naturalistic visual cues to address this.

      (3) Methodological details and reproducibility

      (a) It would be better to move critical information (e.g., electromagnetic noise measurements) from the supplementary material into the main Methods.

      (b) Specifying luminance levels and spectral composition at the moth's eye is required for all visual treatments.

      (c) Details are needed on the sex ratio/reproductive status of tested moths, and a map of the experimental site and migratory routes (spring vs. fall) should be included.

      (d) Expanding on activity-level analyses is required, replacing "fatigue" with "reduced flight activity," and clarifying if such analyses were performed.

      (4) Figures and data presentation

      (a) The font sizes on circular plots should be increased; compass labels (magnetic North), sample sizes, and p-values should be included.

      (b) More clarity is required on what "no visual cue" conditions entail, and schematics or photos should be provided.

      (c) The figure legends should be adjusted for readability and consistency (e.g., replace "magnetic South" with magnetic North; for box plots, use asterisks for significance and report confidence intervals).

      (5) Conceptual framing and discussion

      (a) Generalisations across species should be toned down, given the small number of systems tested by overlapping author groups.

      (b) It should be highlighted that, unlike some vertebrates, moths require both magnetic and visual cues for orientation.

      (c) It should be emphasised that this study addresses direction finding rather than full navigation.

      (d) Future Directions should be integrated and consolidated into one coherent subsection proposing realistic next steps (e.g., more complex visual environments, temporal adaptation to cue-field relationships).

      (e) The limitations due to the artificiality of the visual cue should be better discussed, and earlier in the Discussion.

      (6) Technical and open-science points

      • Appropriate circular statistics should be used instead of t-tests for angular data shown in the supplementary material.

      • Details should be provided on light intensities, power supplies, and improvements to the apparatus.

      • The derivation of individual r-values should be clarified.

      • Share R code openly (e.g., GitHub).

      • Some highly relevant recent citations are missing and should be added, and some less relevant ones removed.
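
      To illustrate the points above on circular statistics and r-values, here is a minimal sketch of how a mean heading, a mean vector length r, and a Rayleigh test p-value can be computed for angular data. It is not the authors' analysis code. The function names are hypothetical, and the p-value uses a standard closed-form approximation to the Rayleigh test rather than an exact calculation.

```python
import math

def circular_summary(angles_deg):
    """Return (mean direction in degrees in [0, 360), mean vector length r)."""
    n = len(angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    r = math.hypot(c, s)  # r near 0: headings scattered; r near 1: tightly clustered
    return math.degrees(math.atan2(s, c)) % 360.0, r

def rayleigh_p(r, n):
    """Approximate p-value of the Rayleigh test for non-uniformity."""
    big_r = n * r  # resultant vector length
    return math.exp(math.sqrt(1 + 4 * n + 4 * (n * n - big_r * big_r)) - (1 + 2 * n))

# Headings of 350 and 10 degrees straddle North: the arithmetic mean is 180,
# whereas the circular mean points (up to rounding) to 0 degrees.
mean_dir, r = circular_summary([350.0, 10.0])
```

      The example shows why linear tests such as t-tests mislead for headings near the 0/360 boundary: the arithmetic mean points in the opposite direction to the true mean heading.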

    2. Reviewer #2 (Public review):

      Summary:

      This work provided experimental evidence on how geomagnetic and visual cues are integrated, and visual cues are indispensable for magnetic orientation in the nocturnal fall armyworm.

      Strengths:

      Although it has been demonstrated previously that the Australian Bogong moth could integrate global stellar cues with the geomagnetic field for long-distance navigation, the study presented in this manuscript is still fundamentally important to the field of magnetoreception and sensory biology. It clearly shows that the integration of geomagnetic and visual cues may represent a conserved navigational mechanism broadly employed across migratory insects. I find the research very important, and the results are presented very well.

      Weaknesses:

      The authors developed an indoor experimental system to study the influence of magnetic fields and visual cues on insect orientation, which is certainly a valuable approach for this field. However, the ecological relevance of the visual cue may be limited or unclear based on the current version. The visual cues were provided "by a black isosceles triangle (10 cm high, 10 cm base) made from black wallpaper and fixed to the horizon at the bottom of the arena". It is difficult to conceive how such a stimulus (intended to represent a landmark like a mountain) could provide directional information for LONG-DISTANCE navigation in nocturnal fall armyworms, particularly given that these insects would have no prior memory of this specific landmark. It might be a good idea to make a more detailed explanation of this question.

    1. Reviewer #1 (Public review):

      Summary:

      Zhou and colleagues introduce a series of generalized Gaussian process models for genotype-phenotype mapping. The goal was to develop models that were more powerful than standard linear models, while retaining explanatory power as opposed to neural network approaches. The novelty stems from choices of prior distributions (and I suppose fitted posteriors) that model epistasis based on some form of site/allele-specific modifier effect and genotype distance. The authors then apply their models to three empirical datasets, the GB1 antibody-binding dataset, the human 5' splice site dataset, and a yeast meiotic cross dataset, and find substantially improved variance explained while retaining strong explanatory power when compared to linear models.

      Strengths:

      The main strength of the manuscript lies in the development of the modeling approaches, as well as the evidence from the empirical dataset that the variance explained is improved.

      Weaknesses:

      The main weakness of the paper is that none of the models were tested on an in silico dataset where the ground truth is known. Therefore, it is unclear if their model actually retains any explanatory power.

      Impact:

      Genotype-phenotype mapping is a central point of genetics. However, the function is complex and unknown. Simple linear models can uncover some functional link between genes and their effects, but do so through severe oversimplification of the system. On the other hand, neural networks can, in principle, model the function perfectly, but they do so without easy interpretation. Gaussian regression is another approach that improves on linear regression, allowing better fitting of the data while allowing interpretation of the underlying alleles and their effects. This approach, now computable with state-of-the-art algorithms, will advance the field of genotype-to-phenotype associations.

    2. Reviewer #2 (Public review):

      This paper builds on prior work by some of the same authors on how to model fitness landscapes in the presence of epistasis. They have previously shown how simply writing general expansions of fitness in terms of one-body plus two-body plus three-body, etc., terms often fails to generalize to good predictions. They have also previously introduced a Gaussian process regression approach regarding how much epistasis there should be of each order.

      This paper contains several main advances:

      (1) They implement a more efficient form of the Gaussian process model fitting that uses GPUs and related algorithmic advances to enable better fitting of these models to datasets for larger sequences.

      (2) They provide a software package implementing the above.

      (3) They generalize the models to allow the extent of epistasis associated with changes in sequence to depend on specific sites, alleles, and mutations.

      (4) They show modest improvements in prediction and substantial improvements in interpretability with the more generalized models above.

      Overall, while this paper is quite technical, my assessment is that it represents a substantial conceptual and algorithmic advance for the above reasons, and I would recommend only modest revisions. The paper seems well-written and clear, given the inherent complexity of this topic.

    3. Reviewer #3 (Public review):

      Summary:

      The authors propose three types of Gaussian process kernels that extend and generalize standard kernels used for sequence-function prediction tasks, giving rise to the connectedness, Jenga, and general product models. The associated hyperparameters are interpretable and represent epistatic effects of varying complexity. The proposed models significantly outperform the simpler baselines, including the additive model, pairwise interaction model, and Gaussian process with a geometric kernel, in terms of R^2.

      Strengths:

      (1) The demonstrated performance boost and improved scaling with increasing training data are compelling.

      (2) The hyperparameter selection step using the marginal likelihood, as implemented by the authors, seems to yield a reasonable hyperparameter combination that lends itself to biologically plausible interpretations.

      (3) The proposed kernels generalize existing kernels in domain-interpretable ways, and can correspond to cases that would not be "physical" in the original models (e.g., $\mu_p>1$ in the original connectedness model that allows modeling of anticorrelated phenotypes).

      Weaknesses:

      (1) While enabling uncertainty quantification is a key advantage of Gaussian processes, the authors do not present metrics specific to the predicted uncertainties; all metrics seem to concern the mean predictions only. It would be helpful to evaluate coverage metrics and maybe include an application of the uncertainties, such as in active learning or Bayesian optimization.

      (2) The more complex models, like the general product model, place a heavier burden on the hyperparameter selection step. Explicitly discussing the optimization routine used here would be helpful to potential users of the method and code.

    1. Reviewer #1 (Public review):

      Summary:

      This paper presents an ambitious and technically impressive attempt to map how well humans can discriminate between colours across the entire isoluminant plane. The authors introduce a novel Wishart Process Psychophysical Model (WPPM) - a Bayesian method that estimates how visual noise varies across colour space. Using an adaptive sampling procedure, they then obtain a dense set of discrimination thresholds from relatively few trials, producing a smooth, continuous map of perceptual sensitivity. They validate their procedure by comparing actual and predicted thresholds at an independent set of sample points. The work is a valuable contribution to computational psychophysics and offers a promising framework for modelling other perceptual stimulus fields more generally.

      Strengths:

      The approach is elegant and well-described (I learned a lot!), and the data are of high quality. The writing throughout is clear, and the figures are clean (elegant in fact) and do a good job of explaining how the analysis was performed. The whole paper is tremendously thorough, and the technical appendices and attention to detail are impressive (for example, a huge amount of data about calibration, variability of the stim system over time, etc). This should be a touchstone for other papers that use calibrated colour stimuli.

      Weaknesses:

      Overall, the paper works as a general validation of the WPPM approach. Importantly, the authors validate the model for the particular stimuli that they use by testing model predictions against novel sample locations that were not part of the fitting procedure (Figure 2). The agreement is pretty good, and there is no overall bias (perhaps local bias?), but they do note a statistically-significant deviation in the shape of the threshold ellipses. The data also deviate significantly from historical measurements, and I think the paper would be considerably stronger with additional analyses to test the generality of its conclusions and to make clearer how they connect with classical colour vision research. In particular, three points could use some extra work:

      (1) Smoothness prior. The WPPM assumes that perceptual noise changes smoothly across colour space, but the degree of smoothness (the eta parameter) must affect the results. I did not see an analysis of its effects - it seems to be fixed at 0.5 (line 650). The authors claim that because the confidence intervals of the MOCS and the model thresholds overlap (line 223), the smoothing is not a problem, but this might just be because the thresholds are noisy. A systematic analysis varying this parameter (or at least testing a few other values), and reporting both predictive accuracy and anisotropy magnitude, would clarify whether the model's smoothness assumption is permitting or suppressing genuine structure in the data. Is the gamma parameter also similarly important? In particular, does changing the underlying smoothness constraint alter the systematic deviation between the model and the MOCS thresholds? The authors have thought about this (of course! - line 224), but also note a discrepancy (line 238). I also wonder if it would be possible to do some analysis on the posterior, which might also show if there are some regions of color space where this matters more than others? The reason for doing this is, in part, motivated by the third point below - it's not clear how well the fits here agree with historical data.

      (2) Comparison with simpler models. It would help to see whether the full WPPM is genuinely required. Clearly, the data (both here and from historical papers) require some sort of anisotropy in the fitting - the sensitivities decrease as the stimuli move away from the adaptation point. But it's >not< clear how much the fits benefit from the full parameterisation used here. Perhaps fits for a small hierarchy of simpler models - starting with isotropic Gaussian noise (as a sort of 'null baseline') and progressing to a few low-dimensional variants - would reveal how much predictive power is gained by adding spatially varying anisotropy. This would demonstrate that the model's complexity is justified by the data.

      (3) Quantitative comparison to historical data. The paper currently compares its results to MacAdam, Krauskopf & Gegenfurtner, and Danilova & Mollon only by visual inspection. It is hard to extract and scale actual data from historical papers, but from the quality of the plotting here, it looks like the authors have achieved this, and so quantitative comparisons are possible. The MacAdam data comparisons are pretty interesting - in particular, the orientations of the long axes of the threshold ellipses do not really seem to line up between the two datasets - and I thought that the orientation of those ellipses was a critical feature of the MacAdam data. Quantitative comparisons (perhaps overall correlations, which should be immune to scaling issues, axis-ratio, orientation, or RMS differences) would give concrete measures of the quality of the model. I know the authors spend a lot of time comparing to the CIE data, and this is great. But re-expressing the fitted thresholds in CIE or DKL coordinates, and comparing them directly with classical datasets, would make the paper's claims of "agreement" much more convincing.
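
      The axis-ratio and orientation comparisons suggested in point (3) could be computed along the following lines. This is a sketch that assumes each threshold ellipse is summarised by a symmetric, positive-definite 2x2 matrix [[a, b], [b, c]] expressed in a common coordinate system. The function names are hypothetical, and nothing here is taken from the authors' code.

```python
import math

def ellipse_summary(a, b, c):
    """Summarise the ellipse of the symmetric positive-definite matrix
    [[a, b], [b, c]]. Returns (major-axis orientation in degrees in
    [0, 180), ratio of major to minor axis length)."""
    half_trace = (a + c) / 2.0
    spread = math.hypot((a - c) / 2.0, b)
    lam_max = half_trace + spread  # larger eigenvalue
    lam_min = half_trace - spread  # smaller eigenvalue, must be > 0
    theta = math.degrees(0.5 * math.atan2(2.0 * b, a - c)) % 180.0
    return theta, math.sqrt(lam_max / lam_min)

def orientation_difference(theta1, theta2):
    """Smallest angle between two axes; axes are equivalent modulo 180 degrees."""
    d = abs(theta1 - theta2) % 180.0
    return min(d, 180.0 - d)

# Toy comparison between a fitted ellipse and a hypothetical historical one.
theta_fit, ratio_fit = ellipse_summary(2.0, 1.0, 2.0)
theta_old, ratio_old = ellipse_summary(4.0, 0.0, 1.0)
mismatch = orientation_difference(theta_fit, theta_old)
```

      Orientation differences and axis ratios are unaffected by an overall rescaling of the thresholds, so they remain comparable even when absolute threshold sizes differ between studies.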

      Overall, this is a creative and technically sophisticated paper that will be of broad interest to vision scientists. It is probably already a definitive methods paper showing how we can sample sensitivity accurately across colour space (and other visual stimulus spaces). But I think that until the comparison with historical datasets is made clear (and, for example, how the optimal smoothness parameters are estimated), it has slightly less to tell us about human colour vision. This might actually be fine - perhaps we just need the methods?

      Related to this, I'd also note that the authors chose a very non-standard stimulus to perform these measurements with (a rendered 3D 'Greebley' blob). This does have the advantage of some sort of ecological validity. But it has the significant >disadvantage< that it is unlike all the other (much simpler) stimuli that have been used in the past - and this is likely to be one of the reasons why the current (fitted) data do not seem to sit in very good agreement with historical measurements.

    2. Reviewer #2 (Public review):

      Summary:

      Hong et al. present a new method that uses a Wishart process to dramatically increase the efficiency of measuring visual sensitivity as a function of stimulus parameters for stimuli that vary in a multidimensional space. Importantly, they have validated their model against their own hold-out data and against 3 published datasets, as well as against colour spaces aimed at 'perceptual uniformity' by equating JNDs. Their model achieves high predictive success and could be usefully applied in colour vision science and psychophysics more generally, and to tackle analogous problems in neuroscience featuring smooth variation over coordinate spaces.

      Strengths:

      (1) This research makes a substantial contribution by providing a new method to very significantly increase the efficiency with which inferences about visual sensitivity can be drawn, so much so that it will open up new research avenues that were previously not feasible. Secondly, the methods are well thought out and unusually robust. The authors made a lot of effort to validate their model, but also to put their results in the context of existing results on colour discrimination, transforming their results to present them in the same colour spaces as used by previous authors to allow direct comparisons. Hold-out validation is a great way to test the model, and this has been done for an unusually large number of observers (by the standards of colour discrimination research). Thirdly, they make their code and materials freely available with the intention of supporting progress and innovation. These tools are likely to be widely used in vision science, and could of course be used to address analogous problems for other sensory modalities and beyond.

      Weaknesses:

      It would be nice to better understand what constraints the choice of basis functions puts on the space of possible solutions. More generally, could there be particular features of colour discrimination (e.g., rapid changes near the white point) that the model captures less well? The substantial individual differences evident in Figure S20 (comparison with Krauskopf and Gegenfurtner, 1992) are interesting in this context. Some observers show radial biases for the discrimination ellipses away from the white point, some show biases along the negative diagonal (with major axes oriented parallel to the blue-yellow axis), and others show a mixture of the two biases. Are these genuine individual differences, or could the model be performing less accurately in this desaturated region of colour space?

    3. Reviewer #3 (Public review):

      Summary:

      This study presents a powerful and rigorous approach for characterizing stimulus discriminability throughout a sensory manifold, and is applied to the specific context of predicting color discrimination thresholds across the chromatic plane.

      Strengths:

      Color discrimination has played a fundamental role in studies of human color vision and for color applications, but as the authors note, it remains poorly characterized. The study leverages the assumption that thresholds should vary smoothly and systematically within the space, and validates this with their own tests and comparisons with previous studies.

      Weaknesses:

      The paper assumes that threshold variations are due to changes in the level of intrinsic noise at different stimulus levels. However, it's not clear to me why they could not also be explained by nonlinearities in the responses, with fixed noise. Indeed, most accounts of contrast coding (which the study is at least in part measuring because the presentation kept the adapt point close to the gray background chromaticity, and thus measured increment thresholds), assume a nonlinear contrast response function, which can at least as easily explain why the thresholds were higher for colors farther from the gray point. It would be very helpful if a section could be added that explains why noise differences rather than signal differences are assumed and how these could be distinguished. If they cannot, then it would be better to allow for both and refer to the variation in terms of S/N rather than N alone.

      Related to this point, the authors note that the thresholds should depend on a number of additional factors, including the spatial and temporal properties and the state of adaptation. However, many of these again seem to be more likely to affect the signal than the noise.

      An advantage of the approach is that it makes no assumptions about the underlying mechanisms. However, the choice to sample only within the equiluminant plane is itself a mechanistic assumption, and these could potentially be leveraged for deciding how to sample to improve the characterization and efficiency. For example, given what we know about early color coding, would it be more (or less) efficient to select samples based on a DKL space, etc?

    1. Reviewer #1 (Public review):

      In this paper, the authors wished to determine human visuomotor mismatch responses in EEG in a VR setting. Participants were required to walk around a virtual corridor, where a mismatch was created by halting the display for 0.5s. This occurred every 10-15 seconds. They observe an occipital mismatch signal at 180 ms. They determine the specificity of this signal to visuomotor mismatch by subsequently playing back the same recording passively. They also show qualitatively that the mismatch response is larger than one generated in a standard auditory oddball paradigm. They conclude that humans therefore exhibit visuomotor mismatch responses like mice, and that this may provide an especially powerful paradigm for studying prediction error more generally.

      Asking about the role of visuomotor prediction in sensory processing is of fundamental importance to understanding perception and action control, but I wasn't entirely sure what to conclude from the present paradigm or findings. Visuomotor prediction did not appear to have been functionally isolated. I hope the comments below are helpful.

      (1) First, isolating visuomotor prediction by contrasting against a condition where the same video stream is played back subsequently does not seem to isolate visuomotor prediction. This condition always comes second, and therefore, predictability (rather than specifically visuomotor predictability) differs. Participants can learn to expect these screen freezes every 10-15 s, even precisely where they are in the session, and this will reduce the prediction error across time. Therefore, the smaller response in the passive condition may be partly explained by such learning. It's impossible to fully remove this confound, because the authors currently play back the visual specifics from the visuomotor condition, but given that the visuomotor correspondences are otherwise pretty stable, they could have an additional control condition where someone else's visual trace is played back instead of their own, and order counterbalanced. Learning that the freezes occur every 10-15 s, or even precisely where they occur, therefore, could not explain condition differences. At a minimum, it would be nice to see the traces for the first and second half of each session to see the extent to which the mismatch response gets smaller. This won't control for learning about the specific separations of the freezes, but it's a step up from the current information.

      (2) Second, the authors admirably modified their visual-only condition to remove nausea from 6 df of movement (3D position, pitch, yaw, and roll). However, despite the fact it's far from ideal to have nauseous participants, it would appear from the figures that these modifications may have changed the responses (despite some pairwise lack of significance with small N). Specifically, the trace in S3 (6DOF) and 2E look similar - i.e., comparing the visuomotor condition to the visual condition that matches. Mismatch at 4/5 microvolts in both. Do these significantly differ from each other?

      (3) It generally seems that if the authors wish to suggest that this paradigm can be used to study prediction error responses, they need to have controlled for the actions performed and the visual events. This logic is outlined in Press, Thomas, and Yon (2023), Neurosci Biobehav Rev, and Press, Kok, and Yon (2020) Trends Cogn Sci ('learning to perceive and perceiving to learn'). For example, always requiring Ps to walk and always concurrently playing similar visual events, but modifying the extent to which the visual events can be anticipated based on action. Otherwise, it seems more accurately described as a paradigm to study the influence of action on perception, which will be generated by a number of intertwined underlying mechanisms.

      More minor points:

      (1) I was also wondering whether the authors may consider the findings in frontal electrodes more closely. Within the statistical tests of the frontal electrodes against 0, as displayed in Figure 3c, the insignificance of the effect of Fp2 seems attributable to the small included sample size of just 13 participants for this electrode, as listed in Table S1, in combination with a single outlier skewing the result. The small sample size stands out especially in comparison to the sample size at occipital electrodes, which is double and therefore enjoys far more statistical power. It looks like the selected time window is not perfectly aligned for determining a frontal effect, and also the distribution in 3B looks like responses are absent in more central electrodes but present in occipital and frontal ones. I realise the focus of analysis is on visual processing, but there are likely to be researchers who find the frontal effect just as interesting.

      (2) It is claimed throughout the manuscript that the 'strongest predictor (of sensory input) - by consistency of coupling - is self-generated movement'. This claim is going to be hard to validate, and I wonder whether it might be received better by the community to be framed as an especially strong predictor rather than necessarily the strongest. If I hear an ambulance siren, this is an especially strong predictor of subsequent visual events. If I see a traffic light turn red, then yellow, I can be pretty certain what will happen next. Etc.

      (3) The checkerboard inversion response at 48 ms is incredibly rapid. Can the authors comment more on what may drive this exceptionally fast response? It was my understanding that responses in this time window can only be isolated with human EEG by presenting spatially polarized events (cf. C1, e.g., Alilovic, Timmermans, Reteig, van Gaal, Slagter, 2019, Cerebral Cortex).

    2. Reviewer #2 (Public review):

      Summary:

      This study investigates whether visuomotor mismatch responses can be detected in humans. By adapting paradigms from rodent studies, the authors report EEG evidence of mismatch responses during visuomotor conditions and compare them to visual-only stimulation and mismatch responses in other modalities.

      Strengths:

      (1) The authors use a creative experimental design to elicit visuomotor mismatch responses in humans.

      (2) The study provides an initial dataset and analytical framework that could support future research on human visuomotor prediction errors.

      Weaknesses:

      (1) Methodological issues (e.g., volume conduction, channel selection, lack of control for eye movements) make it difficult to confidently attribute the observed mismatch responses to activity in visual cortical regions.

      (2) A very large portion of the data was excluded due to motion artefacts, raising concerns about statistical power and representativeness. The criteria for trial inclusion and the number of accepted trials per participant appear arbitrary and not justified with reference to EEG reliability standards.

      (3) The comparison across sensory modalities (e.g., auditory vs. visual mismatch responses) is conceptually interesting, but due to the choice of analyzing auditory mismatch responses over occipital channels, it has limited interpretability.

      The authors successfully demonstrate that visuomotor mismatch paradigms can, in principle, be applied in human EEG. However, due to the issues outlined above, the current findings are relatively preliminary. If validated with improved methodology, this approach could significantly advance our understanding of predictive processing in the human visual system and provide a translational bridge between rodent and human work.

    3. Reviewer #3 (Public review):

      Summary:

      Solyga, Zelechowski, and Keller present a concise report of an innovative study demonstrating clear visuomotor mismatch responses in ambulating humans, using a mobile EEG setup and virtual reality. Human subjects walked around a virtual corridor while EEGs were recorded. Occasionally, motion and visual flow were uncoupled, and this evoked a mismatch response that was strongest in occipitally placed electrodes and had a considerable signal-to-noise ratio. It was robust across participants and could not be explained by the visual stimulus alone.

      Strengths:

      This is an important extension of their prior work in mice, and represents an elegant translation of those previous findings to humans, where future work can inform theories of e.g., psychiatric diseases that are believed to involve disordered predictive processing. For the most part, the authors are appropriately circumspect in their interpretations and discussions of the implications. I found the discussion of the polarity differences they found in light of separate positive and negative prediction errors, intriguing.

      Weaknesses:

      The primary weaknesses rest in how the results are sold and interpreted.

      Most notably, the interpretation of the comparison between visuomotor mismatch responses and passive auditory oddball-induced mismatch responses is inappropriate, owing to suboptimal electrode choices, unclear matching of trial numbers, and other factors. To clarify, regarding the auditory oddball portion in Figure 5, the data quality is a concern for the auditory ERPs, and the choice of occipital electrodes is a likely culprit. Typically, auditory evoked responses are maximal at Cz or FCz, although these contacts don't seem to be available with this setup. In general, caution is warranted in comparing ERP peaks between two different sensory modalities - especially if attention is directed elsewhere (to a silent movie) during one recording and not during the other. The authors discuss this as a purely "qualitative" comparison in the text, which is appreciated, and do acknowledge the limitations within the results section, but the figure title and, importantly, the abstract set a different tone. At least, for comparisons between auditory mismatch and visuomotor mismatch, trial numbers need to be equated, as ERP magnitude can be augmented by noise (which reduces with increased numbers of trials in the average). And more generally, the size of the mismatch event at the scalp does not scale one-to-one with the size at the level of the neural tissue. One can imagine a number of variables that impact scalp-level magnitudes but are orthogonal to actual cortex-level activation: the size, spread, and polarity variance of the activated source (which all would diminish amplitude at the scalp due to polyphasic summation/cancelation); the variance of phase to a stimulus across trials (cross-trial phase locking) vs the magnitude of underlying power (the former, in theory, relates to bottom-up activity, and the latter can reflect feedback, which has more variability in time across trials); and the distance of the scalp electrode from the activated tissue (which, for the auditory system, would be larger (FCz to superior temporal gyrus) than for the visual system (O1 to V1/2)). None of this precludes the inclusion of the auditory mismatch, which is a strength of the study, but interpretations about this supporting a supremacy of sensory-motor mismatch - regardless of validity - are not warranted. I would recommend changing the way this is presented in the abstract.
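
      To illustrate why trial numbers matter for peak comparisons, here is a minimal sketch using simulated noise only. The array shapes, trial counts, and helper names (equate_trials, peak_amplitude) are hypothetical and are not taken from the authors' pipeline. With no true evoked signal at all, the peak of the average is still larger when fewer trials are averaged, because residual noise shrinks roughly with the square root of the trial count.

```python
import numpy as np

def equate_trials(a, b, rng):
    """Randomly subsample both (trials x samples) arrays to the smaller trial count."""
    n = min(a.shape[0], b.shape[0])
    idx_a = rng.choice(a.shape[0], size=n, replace=False)
    idx_b = rng.choice(b.shape[0], size=n, replace=False)
    return a[idx_a], b[idx_b]

def peak_amplitude(trials):
    """Peak absolute amplitude of the trial-averaged waveform."""
    return np.max(np.abs(trials.mean(axis=0)))

rng = np.random.default_rng(0)
n_samples = 200

# Hypothetical data: pure unit-variance noise, no evoked response in either condition.
few = rng.normal(size=(30, n_samples))    # condition with few accepted trials
many = rng.normal(size=(300, n_samples))  # condition with many accepted trials

peak_few = peak_amplitude(few)    # inflated by residual noise
peak_many = peak_amplitude(many)  # residual noise is about sqrt(10) times smaller

# After subsampling, both conditions carry the same expected noise contribution.
few_eq, many_eq = equate_trials(few, many, rng)
peak_few_eq = peak_amplitude(few_eq)
peak_many_eq = peak_amplitude(many_eq)
```

      In practice the subsampling would be repeated many times and the peaks averaged across draws, so that the comparison does not hinge on one random subset.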

      Otherwise, the data are of adequate quality to derive most of their conclusions.

      The authors claim that the mismatch responses emanate from within the occipital cortex, but I would require denser scalp coverage or a demonstration of consistent impedances across electrodes and across subjects to make conclusions about the underlying cortical sources (especially given the latencies of their peaks). In EEG, the distribution of voltage on the scalp is, of course, related to but not directly reflective of the distribution of the underlying sources. The authors are mostly careful in their discussion of this, but I would strongly recommend changing the word choice of "in occipital cortex" to "over occipital cortex" or even "posteriorly distributed". Even with very dense electrode coverage and co-registration to MRIs for the generation of forward models that constrain solutions, source localization of EEG signals is very challenging and not a simple problem. Given the convoluted and interior nature of human V1, the ability to reliably detect early evoked responses (which show the mismatch in mouse models) at the scalp in ERP peaks is challenging - especially if one is collapsing ERPs across subjects. And - given the latency of the mismatch responses, I'd imagine that many distributed cortical regions contribute to the responses seen at the scalp.

      I think that Figure 3C, but as a difference of visual mismatch vs halting flow alone (in the open loop) might be additionally informative, as it clarifies exactly where the pure "mismatch" or prediction error is represented.

      As a suggestion, the authors are encouraged to analyse time-frequency power and phase locking for these mismatch responses, as is common in much of the literature (see Roach et al 2008, Schizophrenia Bulletin). This is not to say that doing so will yield insights into oscillations per se, but converting the data to the time-frequency domain provides another perspective that has some advantages. It fosters translations to rodent models, as ERP peaks do not map well between species, but e.g., delta-theta power does (see Lee et al 2018, Neuropsychopharmacology; Javitt et al 2018, Schizophrenia research; Gallimore et al 2023, Cereb Ctx). Further, ERP peaks can be influenced by the actual neuroanatomy of an individual (especially for quantifying V1 responses). Time-frequency analyses may aid in interpreting the "early negative deflection with a peak latency of 48 ms" finding as well.
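
      As a rough sketch of the suggested analysis, the following computes single-frequency power and inter-trial phase coherence (ITPC) with a complex Morlet wavelet. It uses NumPy only; the sampling rate, the 6 Hz test frequency, the trial count, and the function names are assumptions chosen for illustration, and simulated data stand in for the authors' recordings.

```python
import numpy as np

def morlet_tf(trials, sfreq, freq, n_cycles=5.0):
    """Convolve each trial (rows) with a complex Morlet wavelet at one frequency."""
    sigma_t = n_cycles / (2.0 * np.pi * freq)          # temporal s.d. of the envelope
    t = np.arange(-4 * sigma_t, 4 * sigma_t, 1.0 / sfreq)
    wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2 * sigma_t**2))
    wavelet /= np.sum(np.abs(wavelet))                 # unit-gain normalisation
    return np.array([np.convolve(tr, wavelet, mode="same") for tr in trials])

def power_and_itpc(trials, sfreq, freq):
    """Return trial-averaged power and inter-trial phase coherence over time."""
    coeffs = morlet_tf(trials, sfreq, freq)
    power = np.mean(np.abs(coeffs) ** 2, axis=0)
    unit = coeffs / np.maximum(np.abs(coeffs), 1e-12)  # keep phase, discard magnitude
    itpc = np.abs(np.mean(unit, axis=0))               # 0 = random phase, 1 = identical phase
    return power, itpc

rng = np.random.default_rng(1)
sfreq, n_trials = 250.0, 60
time = np.arange(0, 2.0, 1.0 / sfreq)

# Hypothetical data: a 6 Hz component that is phase-locked across trials ...
locked = np.sin(2 * np.pi * 6 * time) + rng.normal(size=(n_trials, time.size))
# ... versus the same component with a random phase on every trial.
phases = rng.uniform(0, 2 * np.pi, size=(n_trials, 1))
jittered = np.sin(2 * np.pi * 6 * time + phases) + rng.normal(size=(n_trials, time.size))

power_locked, itpc_locked = power_and_itpc(locked, sfreq, 6.0)
power_jittered, itpc_jittered = power_and_itpc(jittered, sfreq, 6.0)
```

      The two simulated conditions have similar power at 6 Hz but very different ITPC, which is the distinction between phase-locked and non-phase-locked activity that an ERP peak alone cannot make.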

      Finally, the sentence in the abstract that this paradigm "can trigger strong prediction error responses and consequently requires shorter recording times would simplify experiments in a clinical setting" is a nice setup to the paper, but the very fact that one third of recordings had to be removed due to movement artifact, and that hairstyle modulates the recording SNR, is reason to think that this paradigm, using the reported equipment, may have limited clinical utility in its current form. Further, auditory oddball paradigms are of great clinical utility because they do not require explicit attention and can be recorded very quickly with no behavioral involvement of a hospitalized patient. This should be discussed, although it does not detract from the overall scientific importance of the study. The authors should reconsider putting this statement in the abstract.

    1. Reviewer #1 (Public review):

      Summary:

      Goicoechea et al. conducted a timely and thorough meta-analysis on the potential for indirect hippocampal targeted transcranial magnetic stimulation (TMS) to improve episodic memory. The authors included additional factors of interest in their meta-analysis, which can be used to inform the next generation of studies using this intervention. Their analysis revealed critical factors for consideration: TMS should be applied pre-encoding, individualized spatial targeting improves efficacy, and improvement of recollection was stronger than recognition.

      Strengths:

      As mentioned previously, the meta-analysis is timely and summarizes an emerging set of studies (over the past decade since Wang et al., Science 2014). Those outside of the field may not be aware of the robustness of improvements in episodic memory from hippocampal targeted TMS. The authors were quite thorough in including additional factors that are important for the interpretation of these findings. These factors also address the differences in approach across studies. The evidence that individualized spatial targeting improves TMS efficacy is consistent with recent advances in TMS for major depressive disorder. The specificity of the cognitive improvements to recollection of episodic memory and not for other cognitive domains is consistent with hippocampal targeting. The authors also plan to post the complete dataset on an open-source repository, which enables additional analysis by other researchers.

      Weaknesses:

      The write-up is succinct and emphasizes the scientific decisions that underlie key differences in the various experimental designs. While the manuscript is written for a scientific audience, the authors are likely aware that findings like this will be of broad appeal to the field of neurology, where treatments for memory loss are desperately needed. For this reason, the authors could consider including a statement regarding an interpretation of this meta-analysis from a clinical standpoint. Statements such as 'safe and effective' imply a clinical indication, and yet the manuscript does not engage with clinical trials terminology such as blinding, parallel arm versus crossover design, and trial phase. While the authors might prefer not to engage with this terminology, it can be confusing when studies delivering intervention-like five days of consecutive TMS (e.g., Wang et al., 2014) are clustered with studies that delivered online rhythmic TMS, which tests target engagement (e.g., Hermiller et al., 2020). While the 'sessions' variable somewhat addresses the basic-science versus intervention-like approach, adding an explicit statement regarding this in the discussion might help the reader navigate the broad scope of approaches that are utilized in the meta-analysis.

    2. Reviewer #2 (Public review):

      Summary:

      In 2014, Wang et al. showed that noninvasive stimulation of a parietal site, connected functionally to the hippocampus, increased resting state connectivity throughout a canonical network associated with episodic memory. It also produced a memory boost, which correlated with the connectivity increase across subjects. Their discovery that an imaging biomarker could be used to target a network (rather than a single cortical site) in individual subjects and provide a scaling measure of target modulation should have revolutionized the noninvasive neuromodulation field. This meta-analysis by members of the same group covers memory effects from noninvasive stimulation of various nodes of the "hippocampal" network.

      Strengths:

      This is a very timely summary and meta-analysis of this very promising application of TMS. To the limited extent of my expertise in meta-analysis, the methodology seems rigorous, and the central finding, that high-frequency stimulation of nodes in the hippocampal network reproducibly improves event recall, is amply supported. This should provide impetus for larger clinical trials and further quantification of the optimal dose, duration of effect, etc.

      Weaknesses:

      My critical comments are mainly on the framing and argument:

      (1) While the introduction centers on the role of the hippocampus in episodic memory and posits hippocampal neuromodulation by TMS as causative, the true mechanism may be more complex. Clean hippocampal lesions in primates cause focal loss of spatial and place memory, and I am aware of no specific evidence that the hippocampus does more than this in humans. Moreover, there is evidence that lateral parietal TMS also reaches neighboring temporal lobe regions, which contribute to episodic memory. The hippocampus may, therefore, be a reliable deep seed for connectivity-based targeting of the episodic memory network, but might not be the true or only functional target.

      (2) The meta-analysis combines studies with confirmation of targeting and target-network engagement from fMRI and studies without independent evidence of having stimulated the putative target (e.g., Koch et al). That seems like a more important methodological distinction than merely the use of any individual targeting method. In my experience, atlas-based estimates are at least as accurate as eyeballing cortical areas in individuals. Hence, entering individual functional targeting as a factor might reveal an effect on efficacy.

      (3) The funnel plot and Egger's regression for episodic memory outcomes suggested possible bias, and the average sample size of 23 is small, contributing to the likelihood of false positive results. It would be informative, therefore, to know how many or which studies had formal power estimates and what the predicted effect sizes were.
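
      For readers unfamiliar with the asymmetry test mentioned in this point, here is a minimal sketch of the usual formulation of Egger's regression: the standardized effect (effect divided by its standard error) is regressed on precision (one over the standard error), and an intercept far from zero suggests small-study effects. The effect sizes and standard errors below are simulated, hypothetical values, not data from the meta-analysis.

```python
import numpy as np

def egger_intercept(effects, std_errors):
    """Egger's regression: returns the intercept and its t statistic (n - 2 d.o.f.)."""
    effects = np.asarray(effects, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    y = effects / se                      # standardized effect
    x = 1.0 / se                          # precision
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[0], beta[0] / np.sqrt(cov[0, 0])

rng = np.random.default_rng(2)
se = np.linspace(0.1, 0.5, 15)            # hypothetical standard errors of 15 studies

# Hypothetical biased literature: less precise studies report larger effects.
biased_effects = 0.2 + 1.5 * se + rng.normal(scale=0.02, size=se.size)
intercept, t_stat = egger_intercept(biased_effects, se)
```

      The t statistic would be compared against a t distribution with n - 2 degrees of freedom. With an average sample size of 23 per study, the test itself has limited power, so a non-significant intercept should not be read as evidence of no bias.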

      (4) In the Discussion, the authors might provide a comparison between the effect size for memory improvement found here with those reported for other brain-targeted interventions and behavioral strategies. It may also be worthwhile pointing out that HITS/memory is one of the very few, or perhaps the only, neuromodulatory effects on cognition that has been extensively reproduced and survived rigorous meta-analysis.

      (5) The section of the Discussion on specificity compares HITS to transcranial electrical stimulation without specifying an anatomical target or intended outcome. A better contrast might be the enormous variety of cognitive and emotional effects claimed for TMS of the dorsolateral prefrontal cortex.

      (6) With reference to why other nodes in the episodic memory network have not been tested, current flow modeling shows TMS of the medial prefrontal cortex is unlikely to be achievable without stronger stimulation of the convexity under the coil, in addition to being uncomfortable. The lateral temporal lobe has been stimulated without undue discomfort.

      (7) Finally, a critical question hanging over the clinical applicability of HITS and other neuromodulation techniques is how well they will work on a damaged substrate. Functional and/or anatomical imaging might answer this question and help screen for likely responders. The authors' opinion on this would be informative.

    3. Reviewer #3 (Public review):

      Summary:

      The manuscript by Goicoechea et al. assesses the influence of hippocampal-network targeted TMS to parietal cortex on episodic memory using a meta-analytic approach. This is an important contribution to the literature, as the number of studies using this approach to modulate memory/hippocampal function has clearly increased since the initial publication by Wang et al. 2014. In general, the analysis is straightforward and the conclusions are well-supported by the results; I have mostly minor comments/concerns.

      Strengths:

      (1) A meta-analysis across published work is used to evaluate the influence of hippocampal-network-targeted TMS in parietal cortex on episodic memory. By pooling results across studies, the meta-analytic effects demonstrate an influence of TMS on memory across the diversity of many details in study design (specific tasks, stimuli, TMS protocols, study populations).

      (2) Selectivity with regard to episodic memory vs. non-episodic memory tasks is evaluated directly in the meta-analysis.

      (3) Supplemental factors were tested as predictors of TMS's influence on memory. This is helpful given the diversity of study designs in the literature, and it helps to shed light on which study designs, e.g., TMS protocols, etc., are most effective in memory modulation.

      Weaknesses:

      (1) My only significant concern is how studies are categorized in the 'Timing' factor (when stimulation is applied). Currently, protocols in which TMS is administered across days are categorized as 'pre-encoding' in the Timing factor. This has the potential to be misleading and may lead to inaccurate conclusions. When TMS is administered across multiple days, followed by memory encoding and retrieval (often on a subsequent day), it is not possible to attribute the influence of TMS to a specific memory phase (i.e., encoding or retrieval) per se. Thus, labeling multi-day TMS studies as 'pre-encoding' may be misleading to readers, as it may imply that the influence of TMS is due to modulation of encoding mechanisms per se, which cannot be concluded. For example, multi-day TMS protocols could be labeled as 'pre-retrieval' and be similarly accurate. This approach also pools results from TMS protocols with temporal specificity (i.e., those applied immediately during encoding and not on board during memory testing) and without temporal specificity (i.e., the case of multi-day TMS) regarding TMS timing. Given the variety of paradigms employed in the literature, and to maximize the utility/accuracy of this analysis, one suggestion is to modify the categories within the Timing factor, e.g., using labels like 'Temporally-Specific' and 'Temporally Non-specific'. The 'Temporally-Specific' category could be subdivided based on the specific memory process affected: 'encoding', 'retrieval', or 'consolidation' (if possible). I think this would improve the accuracy of the approach and help to reach more meaningful conclusions, given the variety of protocols employed in the literature.

      (2) As the scope of the meta-analysis is limited to TMS applied to parietal or superior occipital cortex, it is important to highlight this in the Introduction/Abstract. The 'HITS' terminology suggests a general approach that would not necessarily be restricted to parietal/nearby cortical sites.

      Minor:

      (1) To reduce the number of study factors tested, data reduction was performed via Lasso regression to remove factors that were not unique predictors of the influence of TMS on memory. This approach is reasonable; however, one limitation is that factors strongly correlated with others (and predict less unique variance) will be dropped. This may result in a misrepresentation, i.e., if readers interpret factors left out of this analysis as not being strongly related to the influence of TMS on memory. I do see and appreciate the paragraph in the Discussion which appropriately addresses this issue. However, it may be worth also considering an alternative analysis approach, if the authors have not already done so, which explicitly captures the correlation structure in the data (i.e., shown in Figure S2) using a tool like PCA or an appropriate factor analysis. Then, this shared covariance amongst factors can be tested as predictors of the influence of TMS - e.g., by testing whether component scores for dominant PCs are indeed predictive of the influence of TMS. This complementary approach would capture rather than obfuscate the extent to which different factors are correlated and assess their joint (rather than independent) influence on memory, potentially resulting in more descriptive conclusions. For example, TMS intensity and protocol may jointly influence memory.
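
      A minimal sketch of the complementary analysis suggested in this point: run PCA on the standardized study factors, then regress the TMS effect on the component scores. The factor names (intensity, protocol, sessions), the sample of 40 studies, and all values are hypothetical; a real analysis would also need to handle categorical factors and weight studies by precision.

```python
import numpy as np

def pca_scores(X, n_components):
    """PCA via SVD on z-scored columns; returns scores, loadings, explained variance ratio."""
    Xc = X - X.mean(axis=0)
    Xz = Xc / Xc.std(axis=0, ddof=1)
    U, S, Vt = np.linalg.svd(Xz, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    scores = U[:, :n_components] * S[:n_components]
    return scores, Vt[:n_components], explained[:n_components]

def ols(scores, y):
    """Ordinary least squares of y on an intercept plus the component scores."""
    X = np.column_stack([np.ones(len(y)), scores])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(3)
n_studies = 40

# Hypothetical factors: intensity and protocol share a latent cause; sessions is independent.
latent = rng.normal(size=n_studies)
intensity = latent + 0.3 * rng.normal(size=n_studies)
protocol = latent + 0.3 * rng.normal(size=n_studies)
sessions = rng.normal(size=n_studies)
factors = np.column_stack([intensity, protocol, sessions])

# Hypothetical effect sizes driven by the shared latent cause.
effect = 0.5 * latent + 0.1 * rng.normal(size=n_studies)

scores, loadings, explained = pca_scores(factors, n_components=2)
beta = ols(scores, effect)   # beta[0] intercept, beta[1] PC1 slope, beta[2] PC2 slope
```

      In this simulated case the first component loads on both intensity and protocol and carries the predictive signal, whereas Lasso would tend to keep one of the two correlated factors and drop the other.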

      (2) Given the specific focus on TMS applied to parietal cortex to modulate hippocampal and related network function, it would be fruitful if the authors could consider adding discussion/speculation regarding whether this approach may be effectively broadened using other stimulation methods (e.g., tACS, tDCS), how it may compare to other non-invasive brain stimulation methods with depth penetration to target hippocampal function directly (transcranial temporal interference, or transcranial focused ultrasound), and/or how or whether other stimulation sites may or may not be effective.

      (3) Studies were only included in the meta-analysis if they contained objective episodic memory tests. How were studies handled that included both objective and subjective memory, or other non-episodic memory measures? For example, Yazar et al. 2014 showed no influence of TMS on objective recall, but an impairment in subjective confidence. I assume confidence was not included in the meta-analysis. Similarly, Webler et al. 2024 report results from both the mnemonic similarity task (presumably included) and a fear conditioning paradigm (presumably excluded). Please clarify in the methods how these distinctions were handled.

      (4) The analysis comparing memory to non-memory measures is important, showing the specificity of stimulation. Did the authors consider further categorizing the non-memory tasks into distinct domains (i.e., language, working memory, etc.)? If possible, this could provide a finer detail regarding the selectivity of influences on memory vs. other aspects of cognition. It is likely that other aspects of cognition dependent on hippocampal function may be modulated as well, i.e., tasks with high relational/associative processing demands.

      (5) In the analysis of the Intensity factor, how were studies using Active (rather than resting) MT categorized? Only resting MT is mentioned in Table S1. This is important as the original theta-burst TMS protocol from Huang et al. 2005 determines intensity based on Active Motor Threshold.

      (6) Is there a reason why the study by Koen et al. 2018 (Cognitive Neuroscience) was not included? TMS was performed during encoding to the left AG, and objective memory was assessed, so it would seemingly meet the inclusion criterion.

      (7) It would be helpful to briefly differentiate the current meta-analysis from that performed by Yeh & Rose (How can transcranial magnetic stimulation be used to modulate episodic memory?: A systematic review and meta-analysis, 2019, Frontiers in Psychology) (other than being more current).

      (8) For transparency and to facilitate further understanding of the literature and potential data re-use, it would be great if the authors consider sharing a supplementary table or file that describes how individual studies/memory measures were categorized under the factors listed in Table S1.

    1. Reviewer #1 (Public review):

      Summary:

      The authors show that the lower frequency (~5Hz) stimulation of the intermittent theta-burst stimulation (iTBS) via repetitive transcranial magnetic stimulation (rTMS) serves as a more effective stimulation paradigm than the high-frequency protocols (HF-rTMS, ~10Hz) with enhancing plasticity effects via long-term potentiation (LTP) and depression (LTD) mechanisms. They show that the 5 Hz patterned pulse structure of the iTBS is an exact subharmonic of the 10 Hz high-frequency rTMS, creating a connection between the two paradigms and acting upon the same underlying synchrony mechanism of the dominant alpha-rhythm of the corticothalamic circuit.

      First, the authors create a corticothalamic neural population model consisting of 4 populations: cortical excitatory pyramidal and inhibitory interneuron, and thalamic excitatory relay and inhibitory reticular populations. Second, the authors include a calcium-dependent plasticity model, in which calcium-related NMDAR-dependent synaptic changes are implemented using a BCM metaplasticity rule. The rTMS-induced fluctuations in intracellular calcium concentrations determine the synaptic plasticity effects.

      Strengths:

      The model (corticothalamic neural population with calcium-dependent plasticity, with TBS input for rTMS) is thoroughly built and analyzed.

      The conclusions seem sound and justified. The authors justifiably link stimulation parameters (especially the alpha subharmonics iTBS frequency) with fluctuations in calcium concentration and their effects on LTP and LTD in relevant parts of the corticothalamic circuit populations leading to a dampening of corticothalamic loop gains and enhancement of intrathalamic gains with an overall circuit-wide feedforward inhibition (= inhibitory activity is enhanced via excitatory inputs onto inhibitory neurons) and a resulting suppression of the activity power. In other words: alpha-resonant iTBS protocols achieve broadband power suppression via selective modulation of corticothalamic FFI.

      (1) The model is well-described, with the model equations in the main text and the parameters in well-formatted tables.

      (2) The relationship between iTBS timing and the phase of rhythms is well explained conceptually.

      (3) Metaplasticity and feedforward inhibition regulation as a driver for the efficacy of iTBS are well explored in the paper.

      (4) Efficacy of TBS, being based on mimicry of endogenous theta patterns, seems well supported by this simulation.

      (5) Recovery between periods of calcium influx as an explanation for why intermittency produces LTP effects where continuous stimulation fails is a good justification for calcium-based metaplasticity, as well as for the role of specific pulse rate.

      (6) Circuit resonance conclusion is interesting as a modulating factor; the paper supports this hypothesis well.

      (7) The analysis of corticothalamic dampening and intrathalamic enhancement in the 3D XYZ loop gain space is a strong aspect of the paper.

      Weaknesses:

      (1) Overall, the paper is difficult to follow narratively - the motivation (formulated as a specific research question) for each section can be a bit unclear. The paper could benefit from a minor rewrite at the start of each section to justify each section's reasoning. The Discussion is too long and should be shortened and limited to the main points.

      (2) While the paper refers to modelling and data in discussion, there is no direct comparison of the simulations in the figures to data or other models, so it's difficult to evaluate directly how well the modelling fits either the existing model space or data from this region. Where exactly the model/plasticity parameters from Table 5 and the NFTsim library come from is not easy to find. The authors should make the link from those parameters to experimental data clearer. For example, which clinical or experimental data are their simulations of the resting-state broadband power suppression based on?

      (3) The figures should be modified to make them more understandable and readable.

      (4) The claim in the abstract that the paper introduces "a novel paradigm for individualizing iTBS treatments" is too strong and sounds like overselling. The paper is not the first computational modelling of TBS, as the authors themselves acknowledge when citing previous mean-field plasticity modelling articles. In addition, the authors could briefly mention and reference biophysically more detailed multi-scale approaches such as https://doi.org/10.1016/j.brs.2021.09.004 and https://doi.org/10.1101/2024.07.03.601851 and https://doi.org/10.1016/j.brs.2018.03.010

      (5) The modelling assumes the same CaDP model/mechanism for all excitatory synapses/afferents. How well is this supported by experimental evidence? Have all excitatory synaptic connections in the cortico-thalamic circuit been shown to express CaDP and metaplasticity? If not, these limitations (or predictions of the model) should be mentioned. Why were LTP calcium volumes never induced within thalamic relay-afferent connections se and sr? What about inhibitory synapses in the circuit model? Were they plastic or fixed?

      (6) Minor point: Metaplasticity is modelled as an activity-dependent shift in NMDAR conductance, which is supported by some evidence, but there are other metaplasticity mechanisms. Altering the NMDA synapse also directly affects the synaptic AMPA/NMDA weight and ratio (which has not been modelled in the paper). Would the model still work using another, more phenomenological implementation of the sliding threshold, e.g. based on shifting calcium-dependent LTP/LTD windows or thresholds (for a phenomenological model of spike/voltage-based STDP-BCM rules, see https://doi.org/10.1007/s10827-006-0002-x and https://doi.org/10.1371/journal.pcbi.1004588), perhaps using a metaplasticity extension of the Graupner and Brunel CaDP model? A brief discussion of these issues might be added to the manuscript, but this is just a suggestion.

      (7) Short-term plasticity (depression/facilitation) of synapses is neglected in the model. This limitation should be mentioned because adding short-term synaptic dynamics might strongly affect the circuit model dynamics.

    2. Reviewer #2 (Public review):

      Transcranial magnetic stimulation is used in several medical conditions to alter brain activity, probably by induction of synaptic plasticity. The authors pursue the idea to personalise parameters of the stimulation protocol by adapting the stimulation frequency to an individual's brain rhythm. The authors test this approach in a population model connecting the cortex with deeper brain areas, the thalamocortical loop, which includes calcium-dependent plasticity for the connections within and between brain regions. While the authors relate literature-based experimental findings with their results, their results are so far not supported by experimental work.

      The authors successfully highlight in their model that personalization of the rTMS stimulation frequency to the brain's intrinsic frequency has the potential to improve stimulation impact, and they relate this to specific changes in the network. Their arguments that this resonance improves efficacy are intuitive, and their finding that inhibition and excitation are selectively modulated is a good starting point for analysing the underlying mechanism.

      As rTMS is used in clinical contexts, and the idea of aligning intrinsic and stimulation frequency is relatively easy to implement, the paper is conceptually of interest for the rTMS community, despite its weak points on the mechanistic explanation. The authors made the simulation code publicly available, which is a useful resource for further studies on the effects of metaplasticity. The same stimulation parameters have been tested in experiments, and a reanalysis of the experimental results following the idea of this paper could be influential for clinical optimisation of stimulation protocols.

      A strength of the paper is that it also takes into account deeper brain areas and their interaction with the cortex. The paper carefully measures system changes in response to different frequency differences between the thalamocortical loop and the stimulation. By explicitly modelling changes to connections, the authors do start to dissect the mechanism underlying the observed effect. Unfortunately, the dissection of the mechanistic underpinning in the current version of the manuscript does not yet fully exploit the possibilities of a computational model. Here are a couple of points related to this critique:

      (1) The study reports that connections between thalamus and cortex as well as within the thalamus change, but the model is not used to separate the influence of both.

      (2) The paper reports that a resonance between stimulation and brain increases stimulation effectiveness. This conclusion is solely based on the observation of strong reactions in the network to subharmonics of the brain's frequency, and lacks further support such as alternative measures of resonance, or an analysis of the role of the phase difference between stimulation and brain oscillation, which is likely changed by the stimulation. For example, for harmonic oscillators, resonance leads to a 90 degree phase difference between driving force and system response, and for rTMS, phase locking has been shown to be relevant.

      (3) The authors claim that over-engagement of plasticity for HF-rTMS makes their intermittent protocol more effective. Yet, the study lacks a direct comparison between stimulation protocols that shows over-engagement of plasticity for the HF-protocol. The study also does not explore which time-scale of the plasticity mechanism determines the optimal stimulation protocol. Moreover, the study reports that only a small number of pulses per burst shows a good effect. This should depend on how strongly a single pulse changes the calcium volume, but this relation was not explored in the model.

      (4) The authors report on the frequency spectrum of the cortical excitatory population, with the argument that the power of this population is most closely related to EEG measurements. A report of the other neuronal populations is missing, which might be informative on what is going on in the network.
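
      Regarding point (2), the textbook case can be written out as a reference point. This is a generic driven, damped harmonic oscillator, not the corticothalamic model of the manuscript; the symbols $\omega_0$ (natural frequency), $\gamma$ (damping rate), $F_0/m$ (drive amplitude per unit mass) and $\varphi$ (phase lag of the response behind the drive) are introduced here only for illustration.

```latex
\begin{gather*}
\ddot{x} + 2\gamma\,\dot{x} + \omega_0^{2}\,x = \frac{F_0}{m}\cos(\omega t),
\qquad
x(t) = A(\omega)\cos\bigl(\omega t - \varphi(\omega)\bigr), \\
A(\omega) = \frac{F_0/m}{\sqrt{(\omega_0^{2}-\omega^{2})^{2} + 4\gamma^{2}\omega^{2}}},
\qquad
\tan\varphi(\omega) = \frac{2\gamma\,\omega}{\omega_0^{2}-\omega^{2}} .
\end{gather*}
```

      At $\omega = \omega_0$ the denominator of $\tan\varphi$ vanishes, so $\varphi = \pi/2$: the 90 degree lag mentioned above. An analogous phase measure between the stimulation and the simulated oscillation would be one possible additional indicator of resonance.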

      Statistics:

      (1) The authors do not state whether they test for assumptions of the multiple regression analysis, such as whether errors have equal variance or that residuals are normally distributed.

      (2) For the statistical analysis, the authors ignore about half of their model simulations for which the change in the power was negligible. It is not clear to me which statistical analysis is meant: whether the figures show all model simulations, whether regression lines were evaluated ignoring them, and whether the multiple regression analysis used only half of the data points.
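
      To make point (1) concrete, the following is a minimal sketch of two standard residual diagnostics, written with the Python standard library only. It is not the authors' analysis: the function names and the synthetic data are hypothetical, the normality check is the Jarque-Bera test, and the equal-variance check is a studentized (Koenker) Breusch-Pagan test that uses the fitted values as its only auxiliary regressor.

```python
import math
import random


def jarque_bera(residuals):
    """Jarque-Bera normality test; returns (statistic, p_value)."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    stat = n / 6.0 * (skew ** 2 + excess_kurtosis ** 2 / 4.0)
    # Survival function of a chi-square distribution with 2 degrees of freedom.
    return stat, math.exp(-stat / 2.0)


def koenker_bp(residuals, fitted):
    """Studentized Breusch-Pagan test: n * R^2 of squared residuals on fitted values."""
    n = len(residuals)
    squared = [r * r for r in residuals]
    mx, my = sum(fitted) / n, sum(squared) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(fitted, squared))
    sxx = sum((x - mx) ** 2 for x in fitted)
    syy = sum((y - my) ** 2 for y in squared)
    stat = n * sxy ** 2 / (sxx * syy)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Synthetic example (assumption): residual spread grows with the fitted value.
random.seed(0)
fitted_values = [i / 100 for i in range(1, 501)]
hetero_residuals = [random.gauss(0.0, f) for f in fitted_values]

print("Jarque-Bera (stat, p):", jarque_bera(hetero_residuals))
print("Koenker-BP  (stat, p):", koenker_bp(hetero_residuals, fitted_values))
```

      Small p-values flag a violation of the corresponding assumption. For a multiple regression, the same functions would be applied to the residuals and fitted values of the full model.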

    3. Reviewer #3 (Public review):

      Summary:

      This article presented a novel computer model to address an important question in the field of brain stimulation: how stimulation parameters, frequency in particular, interfere with the intrinsic brain oscillations via plasticity mechanisms. The magnetic stimulation iTBS protocol was used as an example. Brain oscillation is a critical feature of functional brains and its alteration signals the onset of many neuropsychiatric diseases or certain brain states. The authors suggested with their model that harmonic and subharmonic stimulations close to the individual alpha frequency achieved strong broadband power suppression.

      Strengths:

      The authors focused on the cortico-thalamic circuitry and managed to generate alpha oscillations in their four-population model. By adding the non-monotonic calcium-based BCM rule, they have also achieved both homeostasis and plasticity in response to magnetic stimulation. This work combined computer simulations and statistical analysis to demonstrate the changes in network architecture and network dynamics triggered by varied magnetic stimulation parameters. By delivering the iTBS protocol to the cortical excitatory population, the key findings are that harmonic and subharmonic stimulations close to the individual alpha frequency (IAF) achieved strong broadband power suppression. This resulted from increased synaptic weights of the corticothalamic feed-forward inhibitory projections, which were mediated by the calcium dynamics perturbed by iTBS magnetic stimulation. This finding endorsed the importance of applying customized stimulation to patients based on their IAFs and suggested the underlying mechanism at the circuitry level.

      Weaknesses:

      The drawbacks of this work are also obvious. Model validation and biological feasibility justification should be better addressed. The primary outcome of their model is the broadband power suppression and the optimal effects of (sub)harmonic stimulation frequency, but it lacks immediate empirical support in the literature. To the best of my knowledge, many alpha frequency tACS studies reported increases rather than suppression of the power of certain brain oscillations. A review by Wang et al., 2024 (Frontiers in Systems Neuroscience) suggested hybrid changes to different brain oscillations by magnetic stimulation. Developing a model to fully capture such changes might be out of the scope of the present study and challenging in the entire field, but it undermines the quality of the present work if not extensively discussed and justified. Clarity and reproducibility of the work can be improved. Although it is intriguing to see how the calcium-dependent BCM plasticity mediates such changes, the writing of the methods part is hard to follow. It was also not clear why only two populations were considered in the thalamus, how the entire network was connected, or how the LTP/LTD threshold alters with calcium dynamics. The figures were unfortunately prepared in a nested manner. The crowded layout and the tiny font sizes reduce the clarity. The third point concerns contextualization and comparison to existing models. It would strengthen the work if the authors compared their work to other TMS modeling work with plasticity rules, e.g., Anil et al., 2024. Besides, magnetic stimulation is unique in being supra-threshold and having focality compared to other brain stimulation modalities, e.g., tDCS and tACS, but they may share certain basic neural mechanisms if accounting for certain parameters, e.g., frequency. A solid literature review and discussion on this part may help the field better perceive the value and potential limitations of this work.

    1. Reviewer #1 (Public review):

      Tamao et al. aimed to quantify the diversity and mutation rate of influenza virus (PR8 strain) in order to establish a high-resolution method for studying intra-host viral evolution. To achieve this, the authors combined RNA sequencing with single-molecule unique molecular identifiers (UMIs) to minimize errors introduced during technical processing. They proposed an in vitro infection model with a single viral particle to represent biological genetic diversity, alongside a control model using in vitro transcribed RNA for two viral genes, PB2 and HA.

      Through this approach, the authors demonstrated that UMIs reduced technical errors by approximately tenfold. By analyzing four viral populations and comparing them to in vitro transcribed RNA controls, they estimated that ~98.1% of observed mutations originated from viral replication rather than technical artifacts. Their results further showed that most mutations were synonymous and introduced randomly. However, the distribution of mutations suggested selective pressures that favored certain variants. Additionally, comparison with a closely related influenza strain (A/Alaska/1935) revealed two positively selected mutations, though these were absent in the strain responsible for the most recent pandemic (CA01).

      Overall, the study is well-designed, and the interpretations are strongly supported by the data.

      The authors have addressed all the comments from the previous round of reviews. No further concerns.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript presents a technically oriented application of UMI-based long-read sequencing to study intra-host diversity in influenza virus populations. The authors aim to minimize sequencing artifacts and improve the detection of rare variants, proposing that this approach may inform predictive models of viral evolution. While the methodology appears robust and successfully reduces sequencing error rates, key experimental and analytical details are missing, and the biological insight is modest. The study includes only four samples, with no independent biological replicates or controls, which limits the generalizability of the findings. Claims related to rare variant detection and evolutionary selection are not fully supported by the data presented.

      Strengths:

      The study addresses an important technical challenge in viral genomics by implementing a UMI-based long-read sequencing approach to reduce amplification and sequencing errors. The methodological focus is well presented, and the work contributes to improving the resolution of low-frequency variant detection in complex viral populations.

      Weaknesses:

      The application of UMI-based error correction to viral population sequencing has been established in previous studies (e.g., in HIV), and this manuscript does not introduce a substantial methodological or conceptual advance beyond its use in the context of influenza.

      The study lacks independent biological replicates or additional viral systems that would strengthen the generalizability of the conclusions. Potential sources of technical error are not explored or explicitly controlled. Key methodological details are missing, including the number of PCR cycles, the input number of molecules, and UMI family size distributions. These are essential to support the claimed sensitivity of the method.

      The assertion that variants at ≥0.1% frequency can be reliably detected is based on total read count rather than the number of unique input molecules. Without information on UMI diversity and family sizes, the detection limit cannot be reliably assessed.
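
      A back-of-the-envelope calculation illustrates why the number of unique input molecules matters here. The sketch below is not the authors' method: it assumes simple binomial sampling of templates, treats each UMI family as one independent template, and uses a hypothetical requirement that a variant be seen in at least two families (`min_families`). The molecule counts are illustrative, not values from the manuscript.

```python
from math import comb


def detection_probability(n_molecules: int, freq: float, min_families: int = 2) -> float:
    """Probability that at least `min_families` of `n_molecules` unique
    templates carry a variant present at true frequency `freq`
    (binomial sampling assumed)."""
    p_fewer = sum(
        comb(n_molecules, k) * freq ** k * (1.0 - freq) ** (n_molecules - k)
        for k in range(min_families)
    )
    return 1.0 - p_fewer


# Illustrative template counts (assumptions), for a variant at 0.1% frequency.
for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} unique molecules: P(detect) = {detection_probability(n, 0.001):.3f}")
```

      Under these assumptions, the chance of capturing a 0.1% variant is governed by the number of distinct templates, no matter how many reads are generated from them, which is why UMI diversity and family-size distributions are needed to support the stated limit.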

      Although genetic variation is described, the functional relevance of observed mutations in HA and NA is not addressed or discussed in the context of known antigenic or evolutionary features of influenza. The manuscript is largely focused on technical performance, with limited exploration of the biological implications or mechanistic insights into influenza virus evolution.

      The experimental scale is small, with only four viral populations derived from single particles analyzed. This limited sample size restricts the ability to draw broader conclusions about quasispecies dynamics or evolutionary pressures.

      Comments on revisions:

      The revised manuscript provides additional methodological detail and clearer presentation, which improves transparency. However, the main limitations persist: the study remains small in scale, lacks independent validation, and relies on theoretical rather than empirical support for its claimed detection sensitivity. As a result, the work represents a modest technical advance rather than a substantive contribution to understanding influenza virus evolution.

    1. Reviewer #2 (Public review):

      The authors present a combined experimental and theoretical workflow to study partitioning noise arising during cell division. Such quantifications usually require time-lapse experiments, which are limited in throughput. To bypass these limitations, the authors propose to use flow-cytometry measurements instead and analyse them using a theoretical model of partitioning noise. The problem considered by the authors is relevant and the idea to use statistical models in combination with flow cytometry to boost statistical power is elegant. The authors demonstrate their approach using experimental flow cytometry measurements and validate their results using time-lapse microscopy. The approach focuses on a particular case, where the dynamics of the labelled component depends predominantly on partitioning, while turnover of components is not taken into account. The description of the methods is significantly clearer than in the previous version of the manuscript.

    2. Reviewer #1 (Public review):

      Summary:

      The aim of this paper is to develop a simple method to quantify fluctuations in the partitioning of cellular elements. In particular, they propose a flow-cytometry based method coupled with a simple mathematical theory as an alternative to conventional imaging-based approaches.

      Strengths:

      The approach they develop is simple to understand, and its use with flow-cytometry measurements is clearly explained. Understanding how the fluctuations in cytoplasm partitioning vary for different kinds of cells is particularly interesting.

      Weaknesses:

      The theory only considers fluctuations due to cellular division events. Fluctuations in cellular components are largely affected by various intrinsic and extrinsic sources of noise, and only under particular conditions does partitioning noise become the dominant source of noise. In the revised version of the manuscript, they argue that in their setup, noise due to production and degradation processes is negligible, but noise due to extrinsic sources such as those stemming from cell-cycle length variability may still be important. To investigate the robustness of their modelling approach to such noise, they simulated cells following a sizer-like division strategy, a scenario that maximizes the coupling between fluctuations in cell-division time and partitioning noise. They find that estimates remain within the pre-established experimental error margin.

      Comments on previous version:

      The authors have addressed all of my comments.

    1. Reviewer #1 (Public review):

      Summary:

      In this study, participants completed two different tasks: a perceptual choice task in which they compared the sizes of pairs of items, and a value-different task in which they identified the higher value option among pairs of items. The two tasks involved the same stimuli. Based on previous fMRI research, the authors sought to determine whether the superior frontal sulcus (SFS) is involved in both perceptual and value-based decisions or just one or the other. Initial fMRI analyses were devised to isolate brain regions that were activated for both types of choices and also regions that were unique to each. Transcranial magnetic stimulation was applied to the SFS in between fMRI sessions and it was found to lead to a significant decrease in accuracy and RT on the perceptual choice task but only a decrease in RT on the value-different task. Hierarchical drift diffusion modelling of the data indicated that the TMS had led to a lowering of decision boundaries in the perceptual task and a lowering of non-decision times on the value-based task. Additional analyses show that SFS covaries with model-derived estimates of cumulative evidence, and that this relationship is weakened by TMS.

      Strengths:

      The paper has many strengths, including the rigorous multi-pronged approach of causal manipulation, fMRI and computational modelling, which offers a fresh perspective on the neural drivers of decision making. Some additional strengths include the careful paradigm design, which ensured that the two types of tasks were matched for their perceptual content while orthogonalizing trial-to-trial variations in choice difficulty. The paper also lays out a number of specific hypotheses at the outset regarding the behavioural outcomes that are tied to decision model parameters and well justified.

      Weaknesses:

      In my previous comments (1.3.1 and 1.3.2) I noted that key results could potentially be explained by cTBS leading to faster perceptual decision making in both the perceptual and value-based tasks. The authors responded that if this were the case then we would expect either a reduction in NDT in both tasks or a reduction in decision boundaries in both tasks (whereas they observed a lowering of boundaries in the perceptual task and a shortening of NDT in the value task). I disagree with this statement. First, it is important to note that the perceptual decision that must be completed before the value-based choice process can even be initiated (i.e. the identification of the two stimuli) is no less trivial than that involved in the perceptual choice task (comparison of stimulus size). Given that the perceptual choice must be completed before the value comparison can begin, it would be expected that the model would capture any variations in RT due to the perceptual choice in the NDT parameter and not, as the authors suggest, in the bound or drift rate parameters, since they are designed to account for the strength and final quantity of value evidence specifically. If, in fact, cTBS causes a general lowering of decision boundaries for perceptual decisions (and hence speeding of RTs) then it would be predicted that this would manifest as a shorter NDT in the value task model, which is what the authors see.
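
      The distinction between the two parameters can be illustrated with a toy simulation. This is a minimal sketch of a generic two-boundary drift diffusion process, not the hierarchical model fitted in the manuscript; the function name and all parameter values (drift, boundary separation, NDT) are arbitrary assumptions chosen for illustration.

```python
import random


def simulate_ddm(drift, boundary, ndt, n_trials=2000, dt=0.005, noise=1.0, seed=1):
    """Simulate a two-boundary drift diffusion process.

    Evidence starts midway between the bounds at 0 and `boundary`.
    Returns (mean_rt, accuracy); reaching the upper bound counts as correct.
    """
    rng = random.Random(seed)
    step_sd = noise * dt ** 0.5
    total_rt, n_correct = 0.0, 0
    for _ in range(n_trials):
        x, t = boundary / 2.0, 0.0
        while 0.0 < x < boundary:
            x += drift * dt + rng.gauss(0.0, step_sd)
            t += dt
        total_rt += t + ndt  # observed RT = decision time + non-decision time
        n_correct += x >= boundary
    return total_rt / n_trials, n_correct / n_trials


baseline = simulate_ddm(drift=1.0, boundary=2.0, ndt=0.4)
lower_bound = simulate_ddm(drift=1.0, boundary=1.0, ndt=0.4)
shorter_ndt = simulate_ddm(drift=1.0, boundary=2.0, ndt=0.3)

for label, (rt, acc) in [("baseline", baseline),
                         ("lower boundary", lower_bound),
                         ("shorter NDT", shorter_ndt)]:
    print(f"{label:>15}: mean RT = {rt:.3f} s, accuracy = {acc:.3f}")
```

      In this sketch a lower boundary shortens RTs and reduces accuracy, whereas a shorter NDT shifts RTs without touching accuracy. A speeding of an upstream perceptual stage that is not part of the modelled value accumulation would therefore be absorbed by the NDT term of the value-task model.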

    2. Reviewer #2 (Public review):

      Summary:

      The authors set out to test whether a TMS-induced reduction in excitability of the left Superior Frontal Sulcus influenced evidence integration in perceptual and value-based decisions. They directly compared behaviour (including fits to a computational decision process model) and fMRI pre and post TMS in one of each type of decision-making task. Their goal was to test domain-specific theories of the prefrontal cortex by examining whether the proposed role of the SFS in evidence integration was selective for perceptual but not value-based evidence.

      Strengths:

      The paper presents multiple credible sources of evidence for the role of the left SFS in perceptual decision making, finding similar mechanisms to prior literature and a nuanced discussion of where they diverge from prior findings. The value-based and perceptual decision-making tasks were carefully matched in terms of stimulus display and motor response, making their comparison credible.

      Weaknesses:

      -I was confused about the model specification in terms of the relationship between evidence level and drift rate. The methods (and e.g. supplementary figure 3) specify a linear relationship between evidence level and drift rate, suggesting, unless I misunderstood, that only a single drift rate parameter (kappa) is fit. However, the drift rate parameter estimates in the supplementary tables (and response to reviewers) do not scale linearly with evidence level.

      -The fit quality for the value-based decision task is not as good as that for the PDM, and this would be worth commenting on in the paper.

    1. Reviewer #1 (Public review):

      The manuscript by Yin and colleagues addresses a long-standing question in the field of cortical morphogenesis, regarding factors that determine differential cortical folding across species and individuals with cortical malformations. The authors present work based on a computational model of cortical folding evaluated alongside a physical model that makes use of gel swelling to investigate the role of a two-layer model for cortical morphogenesis. The study assesses these models against empirically derived cortical surfaces based on MRI data from ferret, macaque monkey, and human brains.

      The manuscript is clearly written and presented, and the experimental work (physical gel modeling as well as numerical simulations) and analyses (subsequent morphometric evaluations) are conducted at the highest methodological standards. It constitutes an exemplary use of interdisciplinary approaches for addressing the question of cortical morphogenesis by bringing together well-tuned computational modeling with physical gel models. In addition, the comparative approaches used in this paper establish a foundation for broad-ranging future lines of work that investigate the impact of perturbations or abnormalities during cortical development.

      The cross-species approach taken in this study is a major strength of the work. However, correspondence across the two methodologies did not appear to be equally consistent in predicting brain folding across all three species. The results presented in Figure 4 (and Figures S3 & S4) show broad correspondence in shape index and major sulci landmarks across all three species. Nevertheless, the results presented for the human brain lack the same degree of clear correspondence for the gel model results as observed in the macaque and ferret. While this study clearly establishes a strong foundation for comparative cortical anatomy across species and the impact of perturbations on individual morphogenesis, further work that fine-tunes physical modeling of complex morphologies, such as that of the human cortex, may help to further understand the factors that determine cortical functionalization and pathologies.

    2. Reviewer #2 (Public review):

      This manuscript explores the mechanisms underlying cerebral cortical folding using a combination of physical modelling, computational simulations, and geometric morphometrics. The authors extend their prior work on human brain development (Tallinen et al., 2014; 2016) to a comparative framework involving three mammalian species: ferrets (Carnivora), macaques (Old World monkeys), and humans (Hominoidea). By integrating swelling gel experiments with mathematical differential growth models, they simulate sulcification instability and recapitulate key features of brain folding across species. The authors make commendable use of publicly available datasets to construct 3D models of fetal and neonatal brain surfaces: fetal macaque (ref. [26]), newborn ferret (ref. [11]), and fetal human (ref. [22]).

      Using a combination of physical models and numerical simulations, the authors compare the resulting folding morphologies to real brain surfaces using morphometric analysis. Their results show qualitative and quantitative concordance with observed cortical folding patterns, supporting the view that differential tangential growth of the cortex relative to the subcortical substrate is sufficient to account for much of the diversity in cortical folding. This is a very important point in our field, and can be used in the teaching of medical students.

      Brain folding remains a topic of ongoing debate. While some regard it as a critical specialization linked to higher cognitive function, others consider it an epiphenomenon of expansion and constrained geometry. This divergence was evident in discussions during the Strüngmann Forum on cortical development (Silver et al., 2019). Though folding abnormalities are reliable indicators of disrupted neurodevelopmental processes (e.g., neurogenesis, migration), their relationship to functional architecture remains unclear. Recent evidence suggests that the absolute number of neurons varies significantly with position (sulcus versus gyrus), with potential implications for local processing capacity (e.g., https://doi.org/10.1002/cne.25626). The field is thus in need of comparative, mechanistic studies like the present one.

      This paper offers an elegant and timely contribution by combining gel-based morphogenesis, numerical modelling, and morphometric analysis to examine cortical folding across species. The experimental design - constructing two-layer PDMS models from 3D MRI data and immersing them in organic solvents to induce differential swelling - is well-established in prior literature. The authors further complement this with a continuum mechanics model simulating folding as a result of differential growth, as well as a comparative analysis of surface morphologies derived from in vivo, in vitro, and in silico brains.

      Conclusion:

      This is a well-executed and creative study that integrates diverse methodologies to address a longstanding question in developmental neurobiology. While a few aspects (such as regional folding peculiarities, sensitivity to initial conditions, and available human data) could be further elaborated, they do not detract from the overall quality and novelty of the work. I enthusiastically support this paper and believe that it will be of broad interest to the neuroscience, biomechanics, and developmental biology communities.

      [Editor's note: The reviewers were satisfied with the authors' response. The eLife Assessment was slightly updated to reflect the authors' response.]

    1. Reviewer #2 (Public review):

      Summary:

      This study aims to show how structural and functional brain organization develops during childhood and adolescence using two large neuroimaging datasets. It addresses whether core principles of brain organization are stable across development, how they change over time, and how these changes relate to cognition and psychopathology. The study finds that brain organization is established early and remains stable but undergoes gradual refinement, particularly in higher-order networks. Structural-functional coupling is linked to better working memory but shows no clear relationship with psychopathology.

      Comments on revisions:

      Follow-up: I would like to thank the authors for their thoughtful and comprehensive revisions. The additional analyses addressing developmental differences in structure-function coupling between CALM and NKI are valuable and clearly strengthen the manuscript. I particularly appreciate the inclusion of the neurotypical subgroup within CALM to disentangle neurotypicality from potential site-related effects, as well as the expanded discussion of these findings in the context of individual variability and equifinality.

      Regarding my earlier comment on the use of COMBAT, I realize that "exclusion" may have been a poor choice of wording. What I meant was that harmonization procedures like COMBAT can, in some cases, weaken extremes or reduce variability by shrinking values toward the mean, rather than literally excluding participants from the analysis. Nevertheless, I appreciate the authors' careful consideration of this point and their additional analysis examining sample coverage following motion-based exclusions.

      Overall, I am satisfied with the revisions, and I believe the manuscript has been substantially improved.

    1. Reviewer #1 (Public Review):

      The manuscript by Verma et al. is a simple and concise assessment of the in-cell motility parameters of cytoplasmic dynein. Although numerous studies have focused on understanding the mechanism by which dynein is activated using a complement of in vitro methodologies, an assessment of dynein motility in cells has been lacking. It has been unclear whether dynein exhibits high processivity within the crowded and complicated environment of the cell. For example, does cargo-bound dynein exhibit short, non-processive motility (as has been recently suggested; Tirumala et al., 2022 bioRxiv)? Does cargo-bound dynein move against opposing forces generated by cargo-bound kinesins? Do cargoes exhibit bidirectional switching due to stochastic activation of kinesins and dyneins? The current work addresses these questions quite simply by observing and quantitating the motility of natively tagged dynein in HeLa cells.

    2. Reviewer #2 (Public Review):

      Verma et al. provide a short technical report showing that endogenously tagged dynein and dynactin molecules localize to growing microtubule plus-ends and also move processively along microtubules in cells. The data are convincing, and the imaging and movies very nicely demonstrate their claims. I don't have any large technical concerns about the work. It is perhaps not surprising that dynein-dynactin complexes behave this way in cells due to other reports on the topic, but the current data are among some of the nicest direct demonstrations of this phenomenon. It may be somewhat controversial since a separate group has reported that dynein does not move processively in mammalian cells (https://www.biorxiv.org/content/10.1101/2021.04.05.438428v3).

    3. Reviewer #3 (Public Review):

      In this manuscript, Verma et al. set out to visualize cytoplasmic dynein in living cells and describe its behaviour. They first generated heterozygous CRISPR-Cas9 knock-ins of DHC1 and the p50 subunit of dynactin and used spinning disk confocal microscopy and TIRF microscopy to visualize these EGFP-tagged molecules. They describe robust localization and movement of DHC and p50 at the plus tips of MTs, which was abrogated using SiR tubulin to visualize the pool of DHC and p50 on the MTs. These DHC and p50 puncta on the MTs showed similar, highly processive movement on MTs. Based on comparison to inducible EGFP-tagged kinesin-1 intensity in Drosophila S2 cells, the authors concluded that the DHC and p50 puncta visualized represented 1 DHC-EGFP dimer + 1 untagged DHC dimer and 1 p50-EGFP + 3 untagged p50 molecules.

    1. Reviewer #1 (Public review):

      This manuscript by Yang et al. presents a potentially novel mechanism by which Plscr1 defends against influenza virus infection. Using a global knockout (KO) and a tissue-specific overexpression mouse model, the authors demonstrate that Plscr1-KO mice exhibit increased susceptibility and inflammation following IAV infection. In contrast, overexpression of Plscr1 in ciliated epithelial cells protects mice from infection. Through transcriptomic analysis in mice and mechanistic studies in cell culture models, the authors reveal that Plscr1 transcriptionally upregulates Ifnlr1 expression and physically interacts with this receptor on the plasma membrane, thereby enhancing IFN-λ-mediated viral clearance.

      Overall, it's a well-performed study; however, causality between Plscr1 and Ifnlr1 expression needs to be more firmly established. This is because two recent studies of PLSCR1 KO cells infected with different viruses found no major differences in gene expression levels compared with their WT controls (Xu et al. Nature, 2023; LePen et al. PLoS Biol, 2024). There were also defects in the expression of other cytokines (type I and II IFNs plus TNF-alpha), so a clear explanation of why Ifnlr1 was chosen should also be given.

      While Plscr1 has long been recognized as a cell-intrinsic antiviral restriction factor, few studies have explored its broader physiological role. This study thus provides interesting insights into a specific function of Plscr1 in IAV-permissive airway epithelial cells and its contribution to whole-body antiviral immunity.

      Comments on revisions:

      Most of the requested changes and experiments have been done. One very informative experiment is the expression of Plscr1 in Ifnlr1-KO cells to determine if it still inhibits IAV infection. The authors have indicated that this experiment is currently being pursued by crossing mice to introduce Plscr1 expression into ciliated epithelial cells on an Ifnlr1 KO background. It will show if there are Ifnlr1-independent anti-flu activities that still require Plscr1.

    1. Reviewer #1 (Public review):

      Here the authors discuss mechanisms of ligand binding and conformational changes in GlnBP (a small E. coli periplasmic binding protein, which binds and carries L-glutamine to the inner membrane ATP-binding cassette (ABC) transporter). The authors have distinguished records in this area and have published seminal works. They include experimentalists and computational scientists. Accordingly, they provide a comprehensive, high-quality experimental and computational work.

      They observe that apo- and holo-GlnBP do not generate detectable exchange between open and (semi-)closed conformations on timescales between 100 ns and 10 ms. Notably, the ligand binding and conformational changes in GlnBP that they observe are highly correlated. Their analysis of the results indicates a dominant induced-fit mechanism, where the ligand binds GlnBP prior to conformational rearrangements. They then suggest that an approach resembling the one they undertook can be applied to other protein systems to examine how conformational changes and ligand binding are coupled.

      They argue that the intuitive model, in which ligand binding triggers a functionally relevant conformational change, was challenged by structural experiments and MD simulations revealing the existence of unliganded closed or semi-closed states and their dynamic exchange with open unbound conformations. They discuss the alternative mechanisms that were proposed, along with their merits and difficulties, and conclude that the findings were controversial, which, they suggest, is due to insufficient experimental evidence to distinguish between them. From their own results, they determine that a conformational selection mechanism is incompatible with the data, whereas induced fit is compatible. They thus propose induced fit as the dominant pathway for GlnBP, further supported by the notion that the open conformation is much more likely to bind substrate than the closed one based on steric arguments.

      The paper, which clearly embodies extensive, careful, and high-quality work, makes use of a range of experimental approaches, including isothermal titration calorimetry, single-molecule Förster resonance energy transfer, and surface-plasmon resonance spectroscopy. The problem the authors undertake is of fundamental importance.

    2. Reviewer #2 (Public review):

      The authors provide convincing data from a whole set of different binding kinetic and thermodynamic experiments to explore whether glutamine binding protein binds glutamine via an induced fit or a conformational selection process.

      Weaknesses:

      The single-molecule TIRF-smFRET data appear to include spots that may represent more than one molecule, which raises the general issue of how rigorously traces were selected for single photobleaching events.

    1. Reviewer #1 (Public review):

      Summary:

      This study focuses on the bacterial metabolite TMA, generated from dietary choline. These authors and others have previously generated foundational knowledge about the TMA metabolite TMAO, and its role in metabolic disease. This study extends those findings to test whether TMAO's precursor, TMA, and its receptor TAAR5 are also involved and necessary for some of these metabolic phenotypes. They find that mice lacking the host TMA receptor (Taar5-/-) have altered circadian rhythms in gene expression, metabolic hormones, gut microbiome composition, and olfactory and innate behavior. In parallel, mice lacking bacterial TMA production or host TMA oxidation have altered circadian rhythms.

      Strengths:

      These authors use state-of-the-art bacterial and murine genetics to dissect the roles of TMA, TMAO, and their receptor in various metabolic outcomes (primarily measuring plasma and tissue cytokine/gene expression). They also follow a unique and unexpected behavioral/olfactory phenotype. Statistics are impeccable.