  1. Oct 2024
    1. eLife Assessment

      This useful study reports that the exogenous expression of the microRNA miR-195 can partially compensate in early B cell development for the loss of EBF1, one of the key transcription factors in B cells. While this finding will be of interest to those studying lymphocyte development, the evidence, particularly with regard to the molecular mechanisms that underpin the effect of miR-195, is currently incomplete.

    2. Reviewer #1 (Public review):

      Summary:

      Here, the authors propose a novel role for miR-195, a microRNA that has been shown to bind mRNA targets and enhance their degradation in the regulation of cell processes: allowing the emergence of CD19+ cells from progenitors in which Ebf1, a critical B-cell transcription factor, has been genetically removed.

      Strengths:

      That overexpression of miR-195 can allow the emergence of CD19+ cells lacking Ebf1 is somewhat novel.

      Their data support, to a degree, the emergence of a transcriptional network that may bypass the absence of Ebf1, including the FOXO1 transcription factor, but these data are neither strong nor definitive.

      Weaknesses:

      It is unclear whether this observation is in fact physiological. When the authors analyse a knockout model of miR-195, there is not much of a change in the B-cell phenotype. Their findings may therefore be an artefact of an overexpression system.

      The authors have provided insufficient data to allow a thorough appraisal of the step-wise molecular changes that could account for their observed phenotype.

    3. Reviewer #2 (Public review):

      Summary:

      The authors investigate miRNA miR-195 in the context of B-cell development. They demonstrate that ectopic expression of miR-195 in hematopoietic progenitor cells can, to a considerable extent, override the consequences of deletion of Ebf1, a central B-lineage defining transcription factor, in vitro and upon short-term transplantation into immunodeficient mice in vivo. In addition, the authors demonstrate that the reverse experiment, genetic deletion of miR-195, has virtually no effect on B-cell development. Mechanistically, the authors identify Foxo1 phosphorylation as one pathway partially contributing to the rescue effect of miR-195. An additional analysis of epigenetics by ATACseq adds potential additional factors that might also contribute to the effect of ectopic expression of miR-195.

      Strengths:

      The authors employ a robust assay system, Ebf1-KO HPC, to test for B-lineage promoting factors. The manuscript overall takes on an interesting perspective rarely employed for the analysis of miRNA by overexpressing the miRNA of interest. Ideally, this approach may reveal, if not the physiological function of this miRNA, the role of distinct pathways in developmental processes.

      Weaknesses:

      At the same time, this approach constitutes a major weakness: it does not reveal information on the physiological role of miR-195. In fact, the authors themselves demonstrate in their KO approach that miR-195 has virtually no role in B-cell development, as was already demonstrated in 2020 by Hutter and colleagues. While the authors cite this paper, unfortunately, they do so in a different context, hence omitting that their findings are not original.

      Conceptually, the authors stress that a predominant function of miRNA (in contrast to transcription factors, as the authors suggest) lies in fine-tuning. However, there appears to be a misconception. Misregulation of fine-tuning of gene expression may result in substantial biological effects, especially in developmental processes. The authors want to highlight that miR-195 is somewhat of an exception in that regard, but this is clearly not the case. In addition to miR-150, as referenced by the authors, the miR-17-92 and miR-221/222 families also play a significant role in B-cell development, their absence resulting in stage-specific developmental blocks, and other miRNAs, such as miR-155, miR-142, miR-181, and miR-223, are critical regulators of leukocyte development and function. Thus, while in many instances a single miRNA moderately affects gene expression at the level of an individual target, quite frequently targets converge in common pathways, hence controlling critical biological processes.

      The paper has some methodological weaknesses as well: For the most part, it lacks thorough statistical analysis, and only representative FACS plots are provided. Many bar graphs are based on heavy normalization, making the t-tests employed inapplicable. No details are provided regarding the statistical analysis of microarrays. Generation of the miR-195-KO mice is insufficiently described and no validation of deletion is provided. Important controls are missing as well, the most important one being a direct rescue of Ebf1-KO cells by re-expression of Ebf1. This control is critical to quantify the extent of override of Ebf1-deficiency elicited by miR-195 and should essentially be included in all experiments. A quantitative comparison is essential to support the authors' main conclusion highlighted in the title of the manuscript. As the manuscript currently stands, only negative controls are provided, which, given the profound role of Ebf1, are insufficient, because many experiments, such as assessment of V(D)J recombination, IgM surface expression, or class-switch recombination, are completely negative in controls. In addition, the authors should also perform long-term reconstitution experiments. While it is somewhat surprising that the authors obtained splenic IgM+ B cells after just 10 days, these experiments would certainly be much more informative after longer periods of time. "Classical" mixed bone marrow chimeras generated from a combination of B-cell defective (such as mb1/mb1) bone marrow and reconstituted Ebf1-KO progenitors would permit much more refined analyses.

      With regard to mechanism, the authors show that the Foxo1 phosphorylation pathway accounts for the rescue of CD19 expression, but not for other factors, as mentioned in the discussion. The authors then resort to epigenetics analysis, but their rationale remains somewhat vague. It remains unclear how miR-195 is linked to epigenetic changes.

    4. Reviewer #3 (Public review):

      Summary:

      In this study, Miyatake et al. present the interesting finding that ectopic expression of miR-195 in EBF1-deficient hematopoietic progenitor cells can partially rescue their developmental block and allow the cells to progress to a B220+CD19+ stage. Notably, this is accompanied by an upregulation of B-cell-specific genes and, correspondingly, a downregulation of T, myeloid, and NK lineage-related genes, suggesting that miR-195 expression is at least in part equivalent to EBF1 activity in orchestrating the complex gene regulatory network underlying B cell development. Strengthening this point, ATAC sequencing of miR-195-expressing EBF1-deficient B220+CD19+ cells and a comparison of these data to public datasets of EBF1-deficient and -proficient cells suggest that miR-195 indirectly regulates gene expression and chromatin accessibility of some, but not all, regions regulated by EBF1.

      Mechanistically, the authors identify a subset of potential target genes of miR-195 involved in MAPK and PI3K signalling. Dampening of these pathways has previously been demonstrated to activate FOXO1, a key transcription factor for early B cells downstream of EBF1. Accordingly, the authors hypothesize that miR-195 exerts its function through FOXO1. Supporting this claim, exogenous FOXO1 expression also promotes the development of EBF1-deficient cells to the B220+CD19+ stage and thus recapitulates the miR-195 phenotype.

      Strengths:

      The strength of the presented study is the detailed assessment of the altered chromatin accessibility in response to ectopic miR-195 expression. This provides insight into how miR-195 impacts the gene regulatory network that governs B-cell development and allows the formation of mechanistic hypotheses.

      Weaknesses:

      The key weakness of this study is that its findings are based on the artificial and ectopic expression of a miRNA out of its normal context, which in my opinion strongly limits the biological relevance of the presented work.

      While the authors performed qPCRs for miR-195 on different B cell populations and show that its relative expression peaks in early B cells, it remains unclear whether the absolute miR-195 expression is sufficiently high to have any meaningful biological activity. In fact, other miRNA expression data from immune cells (e.g. DOI 10.1182/blood-2010-10-316034 and DOI 10.1016/j.immuni.2010.05.009) suggest that miR-195 is only weakly, if at all, expressed in the hematopoietic system.

      The authors support their finding with a CRISPR-derived miR-195 knockout mouse model, which displays mild but significant differences in the hematopoietic stem cell compartment and in B cell development. However, they fail to acknowledge and discuss a lymphocyte-specific miR-195 knockout mouse that does not show any B cell defects in the bone marrow or spleen and thus contradicts the authors' findings (DOI 10.1111/febs.15493). Of note, B-1 B cells in particular have been shown to be elevated upon loss of miR-15-16-1 and/or miR-15b-16-2, which contradicts the data presented here for loss of the family member miR-195.

      A second weakness is that some claims by the authors appear overstated or at least not fully backed up by the presented data. In particular, the findings that miR-195-expressing cells can undergo VDJ recombination, express the pre-BCR/BCR, and class switch need to be strengthened. It would be beneficial to include additional controls in these experiments, e.g. a RAG-deficient mouse as a reference/negative control for the ddPCR and the surface IgM staining, and cells deficient in class switching for the IgG1 flow cytometric staining.

      Moreover, the manuscript would be strengthened by a more thorough investigation of the hypothesis that miR-195 promotes the stabilization and activity of FOXO1, e.g. by comparing the authors' ATACseq data to the FOXO1 signature.

    1. eLife Assessment

      This important study presents a new method for generating cell-type restricted knockouts in zebrafish and reports several interesting applications of this method to study pigmentation and melanomagenesis. The evidence supporting the conclusions is convincing, with rigorous characterization of several knockout mutations that provide a proof of principle. The work will be of broad interest to cell, skin, and cancer biologists.

    2. Reviewer #1 (Public review):

      Summary:

      Perlee et al. sought to generate a zebrafish line where CRISPR-based gene editing is exclusively limited to the melanocyte lineage, allowing assessment of cell-type restricted gene knockouts. To achieve this, they knocked in Cas9 to the endogenous mitfa locus, as mitfa is a master regulator of melanocyte development. The authors use multiple candidate genes - albino, sox10, tuba1a, ptena/ptenb, tp53 - to demonstrate that their system induces lineage-restricted gene editing. This method allows researchers to bypass embryonic lethal and non-cell autonomous phenotypes emerging from whole body knockout (sox10, tuba1a), drive directed phenotypes, such as depigmentation (albino), and induce lineage-specific tumors, such as melanomas (ptena/ptenb, tp53, when accompanied by expression of BRAFV600E). While the genetic approaches are solid, the argued increase in efficiency of this model compared to current tools was untested and therefore could not be assessed. Furthermore, the mechanistic explanations proposed to underlie their phenotypes are mostly unfounded, as discussed further in the Weaknesses section. Despite these concerns, there is still a clear use for this genetic methodology and its implementation will be of value to many in vivo researchers.

      Strengths:

      The strongest component of this manuscript is the genetic control offered by the mitfa:Cas9 system and the ability to make stable, lineage-specific knockouts in zebrafish. This is exemplified by the studies of tuba1a, where the authors nicely show non-cell autonomous mechanisms have obfuscated the role of this gene in melanocyte development. In addition, the mitfa:Cas9 system is elegantly straightforward and can be easily implemented in many labs. Mostly, the figures are clean, controls are appropriate, and phenotypes are reproducible. The invented method is a welcomed addition to the arsenal of genetic tools used in zebrafish.

      Weaknesses:

      The major weaknesses of the manuscript include the overly bold descriptions of the value of the model and the superficial mechanistic explanations for each biological vignette.

      The authors argue that a major advantage of this system is its high efficiency. However, no direct comparison is made with other tools that achieve the same genetic control, such as MAZERATI. This is a missed opportunity to provide researchers the ability to evaluate these two similar genetic approaches. In addition, Fig.1 shows that not all melanocytes express Cas9. This is a major caveat that goes unaddressed. It is of paramount importance to understand the percentage of mitfa+ cells that express Cas9. The histology shown is unclear and too zoomed out to support any insightful conclusions, especially in Fig.S1. It would also be beneficial to see data regarding Cas9 expression in adult melanocytes, which are distinct from embryonic melanocytes in zebrafish. Moreover, this system still requires the injection of a plasmid encoding gRNAs of interest, which will yield mosaicism. A prime example of this discrepancy is in Fig.6, where sox10 is clearly still present in "sox10 KO" tumors.

      The authors argue that their model allows rapid manipulation of melanocyte gene expression. Enthusiasm for the speed of this model is diminished by minimal phenotypes in the F0, as exemplified in Fig.2. Although the authors say >90% of fish have loss of pigmentation, this is misleading as the phenotype is a very weak, partial loss. Only in the F1 generation do robust phenotypes emerge, which takes >6 months to generate. How this is more efficient than other tools that currently exist is unclear and should be discussed in more detail.

      In Figure 3, the authors find that melanocyte-specific knockout of sox10 leads to only a 25% reduction in melanocytes in the F1 generation. This is in contradiction to prior literature cited describing sox10 as indispensable for melanocyte development. In addition, the authors argue that sox10 is required for melanocyte regeneration. This claim is not accurate, as >50% of melanocytes killed upon neocuproine treatment can regenerate. This data would indicate that sox10 is required for only a subset of melanocytes to develop (Fig.3C) and for only a subset to regenerate (Fig.3G). This is an interesting finding that is not discussed or interrogated further.

      Tumor induction by this model is weak, as indicated by the tumor curves in Figs.5,6. This might be because these fish are mitfa heterozygous. Whereas the avoidance of mitfa overexpression driven by other models including MAZERATI is a benefit of this system, the effect of mitfa heterozygosity on tumor incidence was untested. This is an essential question unaddressed in the manuscript.

      In Fig.6, the authors recapitulate previous findings with their model, showing sox10 KO inhibits tumor onset. The tumors that do develop are argued to be highly invasive, have mesenchymal morphology, and undergo phenotypic switching from sox10 to sox9 expression. The data presented do not sufficiently support these claims. The histology is not readily suggestive of invasive, mesenchymal melanomas. Sox10 is still present in many cells and sox9 expression is only found in a small subset (<20%). Whether sox10-null cells are the ones expressing sox9 is untested. If sox9-mediated phenotypic switching is the major driver of these tumors, the authors would need to knock out sox9 and sox10 simultaneously and test whether these "rare" types of tumors still emerge. Additional histological and genetic evaluation is required to support the conclusions presented in Fig.6. It feels like a missed opportunity that the authors did not attempt to study genes of unknown contribution to melanoma with their system.

      Overall, this manuscript introduces a solid method to the arsenal of zebrafish genetic tools but falls short of justifying itself as a more efficient and robust approach than what currently exists. The mechanisms provided to explain observed phenotypes are tenuous. Nonetheless, the mitfa:Cas9 approach will certainly be of value to many in vivo biologists and lays the foundation to generate similar methods using other tissue-specific regulators and other Cas proteins.

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript describes a genetic tool utilizing mutant mitfa-Cas9 expressing zebrafish to knockout genes to analyze their function in melanocytes in a range of assays from developmental biology to tumorigenesis. Overall, the data are convincing and the authors cover potential caveats from their model that might impact its utility for future work.

      Strengths:

      The authors do an excellent job of characterizing several gene deletions that show the specificity and applicability of the genetic mitfa-Cas9 zebrafish to studying melanocytes.

      Weaknesses:

      Variability across animals is not fully analyzed.

    4. Reviewer #3 (Public review):

      Summary:

      Perlee et al. present a method for generating cell-type restricted knockouts in zebrafish, focusing on melanocytes. For this method, the authors knock-in a Cas9 encoding sequence into the mitfa locus. This mitfaCas9 line has restricted Cas9 expression, allowing the authors to generate melanocyte-specific knockouts rapidly by follow-up injection of sgRNA expressing transposon vectors.

      The paper presents some interesting vignettes to illustrate the utility of their approach. These include 1) a derivation of albino mutant fish as a demonstration of the method's efficiency, 2) an interrogation and novel description of tuba1a as a potential non-autonomous contributor to melanocyte dispersion, and 3) the generation of sox10-deficient melanoma tumors that show "escape" of sox10 loss through upregulation of sox9. The latter two examples highlight the usefulness of cell-type targeted knockouts (body-wide sox10 and tuba1a loss elicit developmental defects). Additionally, the tumor models involve highly multiplexed sgRNAs for tumor initiation, which is nicely facilitated by the stable Cas9.

      Strengths:

      The approach is clever and could prove very useful for studying melanocytes and other cell types. As the authors hint at in their discussion, this approach would become even more powerful with the generation of other Cas9-restricted lineages so a single sgRNA construct can be screened across many lineages rapidly (or many sgRNA and fish lines screened combinatorially).

      The biological findings used to demonstrate the power of the approach are interesting in their own right. If it proves true, tuba1a's non-autonomous effects on melanosome dispersion are striking, and this example demonstrates very nicely how one could use Perlee et al.'s approach to search for other non-autonomous mechanisms systematically. Similarly, the observation of the sox9 escape mechanism with sox10 loss is a beautiful demonstration of the relevance of SOX10/SOX9's reciprocal regulation in vivo. This system would be a very nice model for further interrogating mechanisms/interventions surrounding Sox10 in melanoma.

      Finally, the figure presentation is very nice. This work involves complex genetic approaches including multiple fish generations and multiplexed construct injections. The vector diagrams and breeding schemes in the paper make everything very clear/"grok-able," and the paper was enjoyable to read.

      Weaknesses:

      The mitfa-driven GFP on their sgRNA-expressing cassette is elegant, but it makes one wonder why the endogenous knock-in is necessary. It would strengthen the motivation of the work if the authors could detail the potential advantages and disadvantages of their system compared to expressing Cas9 with a lineage-specific promoter from a transposon in their introduction or discussion.

      Related to the above - is mitfa haplosufficient? If the mitfaCas9/+ fish have any notable phenotypes, it would be worth noting for others interested in using this approach to study melanoma and pigmentation.

      A core weakness (and also potential strength) of the system is that introduced edits will always be non-clonal (Fig 2H/I). The activity of individual sgRNAs should always be validated in the absence of any noticeable phenotype to interpret a negative result. Additionally, caution should be taken when interpreting results from rare events involving positive outgrowth (like tumorigenesis) to account for the fact that many cells in the population might not have biallelic null alleles (i.e., 100% of the gene product removed).

      Along those lines: in my opinion, the tuba1a results are the most provocative finding in the paper, but they lack key validation. With respect to cutting activity, the Alt-R and transgenic sgRNA expression approaches are not directly comparable. Since there is no phenotype in the melanocyte-specific tuba1a knockouts, the authors must confirm high knockout efficiency with this set of reagents before claiming that there is a non-autonomous phenotype. This can be achieved with GFP+ sorting and NGS, as they performed with their albino melanocytes.

      The whole-body tuba1a knockout phenotype is expected to be pleiotropic, and this expectation might mask off-target effects. Controls for knockout specificity should be included. For instance, confidence in the claims would greatly increase if the dispersed melanosome phenotype could be recovered with guide-resistant tuba1a re-expression and if melanocyte-restricted tuba1a re-expression failed to rescue. As a less definitive but adequate alternative, the authors could also test if another guide or a morpholino against tuba1a phenocopies the described Alt-R edited fish.

      I have similar questions about the sox10 escapers, but these suggestions are less critical for supporting the authors' claims (especially given the nice staining). Are the sox10 tumors relatively clonal with respect to sox10 mutations? And are the sox10 tumor mutations mostly biallelic frameshifts or potential missense mutations/single mutations that might not completely remove activity? I am particularly curious as SOX10 doesn't seem to be completely absent (and is still very high in some nuclei) in the immunohistochemistry.

    1. Author response:

      The following is the authors’ response to the current reviews.

      We thank the reviewers and editor for their positive assessment of our work. For the Version of Record, we have made small revisions addressing the remaining concerns of reviewer #3. We have also reformatted the supplementary material to conform to eLife’s style.

      While the manuscript was under review, we discussed our work with Bill Bialek, who suggested clarifying the effect of cell rearrangements on genetic patterns. Using the tracked cell trajectories we found that the highly coordinated intercalations in the germ band preserve the relative AP positions of cells. We have added an Appendix subsection (Appendix 1.5) explaining this finding and highlighting its relevance in a short paragraph added to the discussion.

      Reviewer #2

      Main comment from 1st review:

      Weaknesses:

      The modeling is interesting, with the integration of tension through tension triangulation around vertices and thus integrating force inference directly in the vertex model. However, the authors are not using it to test their hypothesis and support their analysis at the tissue level. Thus, although interesting, the analysis at the tissue level stays mainly descriptive.

      Comments on the revised version:

      My main concern was that the authors did not use the analysis of mutant contexts such as Snail and Twist to confirm their predictions. They made a series of modifications, clarifying their conclusions. In particular, they now include an analysis of the Snail mutant and show that isogonal deformations in the ventro-lateral regions are absent when the external pulling force of the VF is abolished, supporting the idea that isogonal strain could be used as an indicator of external forces (Fig7 and S6).

      They further discuss their results in the context of what was published regarding the mutant backgrounds (fog, torso-like, scab, corkscrew, ksr) where midgut invagination is disrupted and where the germ band buckles, and propose that this supports the importance of internal versus external forces driving GBE.

      Overall, these modifications, in addition to clarifications in the text, clearly strengthen the manuscript.

      We thank the reviewer for assessing our manuscript again and are happy to hear that they find the added data on the snail mutant convincing and that our revised manuscript is stronger.

      Reviewer #3

      In their article "The Geometric Basis of Epithelial Convergent Extension", Brauns and colleagues present a physical analysis of Drosophila axis extension that couples in toto imaging of cell contours (previously published dataset), force inference, and theory. They seek to disentangle the respective contributions of active vs passive T1 transitions in the convergent extension of the lateral ectoderm (or germband) of the fly embryo.

      The revision made by the authors has greatly improved their work, which was already very interesting, in particular the use of force inference throughout intercalation events to identify geometric signatures of active vs passive T1s, and the tension/isogonal decomposition. The new analysis of the Snail mutant adds a lot to the paper and makes their findings on the criteria for T1s very convincing.

      Regarding the tissue-scale issues raised during the first round of review: although I do not find the new arguments fully convincing (see below), the authors did put a lot of effort into discussing the role of the adjacent posterior midgut (PMG) in extension, which is already great. That will certainly provide interested readers with enough material and references to dive into that question.

      We appreciate the referee’s positive assessment of our manuscript and their careful reading and constructive feedback. In particular, we are happy to hear that the referee finds our added data on the snail mutant very convincing and finds that the extended discussion on the role of the PMG is helpful. We address the remaining concerns in our detailed response below.

      I still have some issues with the authors' interpretation of the role of the PMG, and of what actually drives the extension. Although it is clear that T1 events in the germ band are driven by active local tension anisotropy (which the authors show but was already well-established), this does not show that the tissue extension itself is powered by these active T1s. Their analysis of "fence" movies from Collinet et al 2015 (Tor mutants and Eve RNAi) is not fully convincing. Indeed, as the authors point out themselves, there is no flow in Tor mutant embryos, even though tension anisotropy is preserved. They argue that in Tor embryos the absence of PMG movement leaves no room for the germband to extend properly, thus impeding the flow. That suggests that the PMG acts as a barrier in Tor mutants - What is it attached to, then?

      We thank the referee for pointing out this omission: The PMG is attached to the vitelline membrane in the scab domain (Munster et al. Nature 2019) and is also obstructed from moving by more anterior-lying tissue (amnioserosa). It therefore acts as an obstacle for GBE if it fails to invaginate (e.g. in a Tor embryo). We have clarified this in the discussion of the Tor mutants.

      The authors also argue that the posterior flow is reduced in "fenced" Eve RNAi embryos (which have less/no tension anisotropy), to justify their claim that it is the anisotropy that drives extension. However, previous data, including some of the authors' (Irvine and Wieschaus, 1994 - Fig 8), show that the first, rapid phase of germband extension is left completely unaffected in Eve mutants (that lack active tension anisotropy). Although intercalation in Eve mutants is not quantified in that reference, this was later done by others, showing that it is strongly reduced.

      The quantification of GBE in Irvine and Wieschaus 1994 was based on the position of the PMG from bright field imaging, making it hard to distinguish the contributions of ventral furrow, PMG, and germ band, particularly during the early phase of GBE where all these processes happen simultaneously. More detailed quantifications based on PIV analysis of in toto light-sheet imaging show significantly reduced tissue flow in eve mutants after the completion of ventral furrow invagination (Lefebvre et al., eLife 2023). That the initial fast flow is driven by ventral furrow invagination, not by the PMG, is apparent from twist/snail embryos, where the initial phase is significantly slower (Lefebvre et al., eLife 2023, Gustafson et al., Nat Comms 2022). We have added these references to the re-analysis and discussion of the Collinet et al 2015 experiments.

      Similarly, the Cyto-D phenotype from Clement et al 2017, in which intercalation is strongly reduced, also displays normal extension.

      We agree that a careful quantification of tissue flow in Cyto-D-treated embryos would be interesting. Whether they show normal extension is not clear from the Clement et al. 2017 paper, as no quantification of total tissue flow is performed and no statements regarding extension are made there.

      Reviewer #3 (Recommendations For The Authors):

      • A lot of typos / grammar mistakes / repetitions are still found here and there in the paper. Authors should plan a careful re-reading prior to final publication.

      We have carefully checked the manuscript and fixed the typos and grammar mistakes.

      • I failed to point to a very relevant reference in the previous round of review, which I think the authors should cite and comment on: a review by Guirao & Bellaiche on the mechanics of intercalation in the fly germband, which notably discusses the passive/active and stress-relaxing/stress-generating nature of T1s (Guirao and Bellaiche, Current Opinion in Cell Biology 2017), in particular figures 1 and 2.

      We thank the referee for pointing us to this relevant reference which we now cite in the introduction.

      • Any new arguments/discussion the authors see fit to include in the paper to comment on the Eve/Tor phenotypes. As far as I am concerned, I am not fully convinced at the moment (see review), but I think the paper has other great qualities and findings, and now (since the first round of review) sufficiently discusses that particular matter. I leave it up to the authors how much (more) they want to delve into this in their final version!

      We have added clarifications and references to the discussion of the Eve/Tor phenotypes.


      The following is the authors’ response to the original reviews.

      Public Review:

      Joint Public Review:

      Summary:

      Brauns et al. work to decipher the respective roles of active versus passive contributions to cell shape changes during germ band elongation. Using a novel quantification tool of local tension, their results suggest that epithelial convergent extension results from internal forces.

      Reading this summary, and the eLife assessment, we realized that we failed to clearly communicate important aspects of our findings in the first version of our manuscript. We therefore decided to largely restructure and rewrite the abstract and introduction to emphasize that:

      ● Our analysis method identifies active vs passive contributions to cell and tissue shape changes during epithelial convergent extension

      ● In the context of Drosophila germ band extension, this analysis provides evidence for a major role for internal driving forces rather than external pulling force from neighboring tissue regions (posterior midgut), thus settling a question that has been debated due to apparently conflicting evidence from different experiments.

      ● Our findings have important implications for local, bottom-up self-organization vs top-down genetic control of tissue behaviors during morphogenesis.

      Strengths:

      The approach developed here, tension isogonal decomposition, is original and the authors made the demonstration that we can extract comprehensive data on tissue mechanics from this type of analysis.

      They present an elegant diagram that quantifies how active and passive forces interact to drive cell intercalations.

      The model qualitatively recapitulates the features of passive and active intercalation for a T1 event.

      Regions of high isogonal strains are consistent with the proximity of known active regions.

      We think this statement is somewhat ambiguous and does not summarize our findings precisely. A more precise statement would be that high isogonal strain identifies regions of passive deformation, which is caused by adjacent active regions.

      They define a parameter (the LTC parameter) which encompasses the geometry of the tension triangles and allows the authors to define a criterion for T1s to occur.

      The data are clearly presented, going from cellular scale to tissue scale, and integrating modeling approaches to complement the thoughtful description of tension patterns.

      Weaknesses:

      The modeling is interesting, with the integration of tension through tension triangulation around vertices and thus integrating force inference directly in the vertex model. However, the authors are not using it to test their hypothesis and support their analysis at the tissue level. Thus, although interesting, the analysis at the tissue level stays mainly descriptive.

      We fully agree that a full tissue scale model is crucial to support the claims about tissue scale self-organization we make in the discussion. However, the full analysis of such a model is beyond the scope of the present manuscript. We have therefore split off that analysis into a companion manuscript (Claussen et al. 2023). In this paper, we show that the key results of the tissue-scale analysis of the Drosophila embryo, in particular the order-to-disorder transition associated with slowdown of tissue flow, are reproduced and rationalized by our model.

      We now refer more closely to this companion paper to point the reader to the results presented there.

      Major points:

      (1) The authors mention that from their analysis, they can predict what is the tension threshold required for intercalations in different conditions and predict that in Snail and Twist mutants the T1 tension threshold would be around √2. Since movies of these mutants are most probably available, it would be nice to confirm these predictions.

      This is an excellent suggestion. We have included an analysis of a recording of a Snail mutant, which is presented in the new Figures 4 and S6. As predicted, we find that isogonal deformations in the ventro-lateral regions are absent when the external pulling force of the VF is abolished. Further, in the absence of isogonal deformation, T1 transitions indeed occur at a critical tension of approx. √2, as predicted by our model. Both of these results provide important experimental evidence for our model and for isogonal strain as a reliable indicator of external forces.
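      The value √2 can be motivated by a short geometric sketch. The setup here is an illustrative assumption, not a restatement of the manuscript's Eq. (1): a symmetric quartet whose four outer interfaces carry tension 1 and whose central interface carries relative tension T, with no isogonal deformation, so that each cell vertex sits at the circumcenter of its tension triangle. Under these assumptions the central interface shrinks to zero length when the two circumcenters coincide, which happens when the angle opposite the shared tension edge reaches 90° in both triangles.

```latex
% Assumed setup: two congruent isosceles tension triangles with sides (1, 1, T)
% glued along the edge of length T; \theta is the angle opposite that edge.
\begin{align}
  T^2 &= 1^2 + 1^2 - 2\cos\theta && \text{(law of cosines)} \\
  \theta_{\mathrm{crit}} &= \tfrac{\pi}{2} && \text{(both circumcenters lie on the shared edge)} \\
  \Rightarrow\quad T_{\mathrm{crit}} &= \sqrt{2}
\end{align}
```

      Any isogonal stretching or compression of the quartet would shift this threshold, which is consistent with the response above that the √2 value applies in the absence of isogonal deformation.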

      (2) While the formalism is very elegant and convincing, and also convincingly allows making sense of the data presented in the paper, it is not all that clear whether the claims are compatible with previous experimental observations. In particular, it has been reported in different papers (including Collinet et al NCB 2015, Clement et al Curr Biol 2017) that affecting the initial Myosin polarity or the rate of T1s does not affect tissue-scale convergent extension. Analysis/discussion of the Tor phenotype (no extension with myosin anisotropy) and the Eve/Runt phenotype (extension without Myosin anisotropy), which seem in contradiction with an extension mostly driven by myosin anisotropy.

      We are happy to read that the referees find our approach elegant and convincing. The referees correctly point out that we have failed to clearly communicate how our findings connect to the existing literature on Drosophila GBE. Indeed, the conflicting results reported in the literature on what drives GBE – internal forces (myosin anisotropy) or external forces (pulling by the posterior midgut) – were a motivation for our study. We have extensively rewritten the introduction, results section (“Isogonal strain identifies regions of passive tissue deformation”), and discussion (“Internal and external contributions to germ band extension”) in response to the referee’s request.

      In brief, distinguishing active internal vs passive external driving of tissue flow has been a fundamental open question in the literature on morphogenesis. Our tension-isogonal decomposition now provides a way to answer this question on the cell scale, by identifying regions of passive deformation due to external forces. As we now explain more clearly, our analysis shows that germ band extension is predominantly driven by internal tension dynamics, and not pulling forces from the posterior midgut.

      We put this cell-scale evidence into the context of previous experimental observations on the tissue scale, namely genetic mutants (fog, torso-like, scab, corkscrew, ksr) in which posterior midgut invagination is disrupted (Muenster et al. 2019, Smits et al. 2023). In these mutants, the germ band buckles, forming ectopic folds, or twists into a corkscrew shape as it extends, pointing towards a buckling instability characteristic of internally driven extensile flows.

      To address the apparently conflicting evidence from Collinet et al. 2015, we carried out a quantitative re-analysis of the data presented in that reference (see new SI section 3 and Fig. S11). The results support the conclusion that the majority of GBE flow is driven internally, thus resolving the apparent conflict.

      Lastly, as far as we understand, Clement et al. 2017 appears to be compatible with our picture of active T1 transitions. Clement et al. report that the actin cortex, when loaded by external forces, behaves visco-elastically with a relaxation time of the order of minutes, in line with our model for emerging interfaces post T1.

      We again thank the referees for prompting us to address these important issues and believe that including their discussion has significantly strengthened our manuscript.

      Recommendations for the authors:

      Minor points:

      - Fig 2 : authors should state in the main text at which scale the inverse problem is solved. (Intercalating quartet, if I understood correctly from the methods) ? and they should explain and justify their choice (why not computing the inverse at a larger scale).

      We have rephrased the first sentence of the section “Cell scale analysis” to clarify that we use local tension inference. This local inference is informative about the relative tension of one interface to its four neighbors. The focus on this local level is justified because we are interested in local cell behaviors, namely rearrangements. Tension inference is also most robust on the local level, since this is where force balance, the underlying physical determinant of the link between mechanics and geometry, resides. In global tension inference, spurious large scale gradients can appear when small deviations from local force balance accumulate over large distances. We have added a paragraph in SI Sec. 1.4 to explain these points.

      -Fig 2 : how should one interpret that tension after passive intercalation (amnioserosa) is higher than before. On fig 2E, tension has not converged yet on the plot, what happens after 20 minutes ?

      Recall that the inferred tension is the total tension on an interface. While on contracting interfaces, the majority of this tension will be actively generated by myosin motors, on extending interfaces there is also a contribution carried by passive crosslinkers. The passive tension can be effectively viewed as viscous dissipation on the elongating interface as crosslinkers turn over (Clement et al. 2017). Note that this passive tension is explicitly accounted for in the model presented in Fig. 5. Notably, it is crucial for the T1 process to resolve in a new extending junction. In the amnioserosa, the tension post T1 remains elevated because the amnioserosa is continually stretched by the convergence of the germ band. The tension hence does not necessarily converge back to 1. However, our estimates for the tension after 20 mins post T1 are very noisy because most of the T1s happen relatively late in the movie (past the 25 min mark) and therefore there are only a few T1s where we can track the post-T1 dynamics for more than 20 mins.

      We have added a brief explanation of the high post-T1 tension at the end of the section entitled “Relative tension dynamics distinguishes active and passive intercalations”. Further, we have moved up the section describing the minimal model right after the analysis of the relative tension during intercalations. We believe that this helps the reader better understand these findings before moving on to the tension-isogonal decomposition which generalizes them to the tissue scale.

      Page 7-8 / Figure 3: It is unclear how the decomposition into 1) physical shape 2) tension shape 3) isogonal shape works exactly. A more detailed explanation and more clear illustration of what a quartet is and its labels could help.

      We have added a more detailed explanation in the main text. See our response to the longer question regarding this point below.

      -What exactly defines the boundary curve in figure 3E? How is it computed?

      We have added a sentence in the caption for Fig. 3E explaining that the boundary curve is found by solving Eq. (1) with l set to zero for the case of a symmetric quartet. We have also added a brief explanation immediately below Eq. (1) pointing out that this equation defines the T1 threshold in the space of local tensions T_i in terms of the isogonal length l_iso.

      -The authors should consider incorporating some details described in the SI file to the main text to clarify some points, as long as the accessible style of the manuscript can be kept. The points mentioned below may also be clarified in the SI doc. The specific points that could be elaborated are: Page 7-8 / Figure 3: It is unclear how the decomposition into 1) physical shape 2) tension shape 3) isogonal shape works exactly. A more detailed explanation and more clear illustration of what a quartet is and its labels could help. The mapping to Maxwell-Cremona space is fine, but which subset is the quartet? For a set of 4 cells with two shared vertices and a junction, aren't there 5 different tension vectors? Are we talking two closed force triangles? Separately, how do you exactly decompose the deformation (of 4 full cell shapes or a subset?) into isogonal and non-isogonal parts? What is the least squares fit done over - is this system underdetermined? Is this statistically averaged or computed per quartet and then averaged?

      We thank the referees for pointing us to unclear passages in our presentation. We hope that our revisions have resolved the referee’s questions. As described above, we have clarified the tension-isogonal decomposition in the main text. We have also revised the corresponding SI section (1.5) to address the above questions. A sketch of the quartet with labels is found in SI Fig. S7A which we now refer to explicitly in the main text.

      We always consider force-balance configurations, i.e. closed force triangles. Therefore in the “kite” formed by two adjacent tension triangles, only three tension vectors are independent.

      The decomposition of deformation is performed as follows: For each of the four cells, the center of mass c_i is calculated. Next, tension inference is performed to find the two tension triangles with tension vectors T_ij. Now there are three independent centroidal vectors c_j - c_i and three corresponding independent tension vectors T_ij. We define the isogonal deformation tensor I_quartet as the tensor that maps the centroidal vectors to the tension vectors. In general this is not possible exactly, because I_quartet has only three independent components, but there are six equations.
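      The overdetermined fit described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three independent components are taken to be those of a symmetric 2×2 tensor, the mismatch is minimized in the least-squares sense, and any rotation convention between tension and centroidal vectors is ignored. The function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, b):
    """Solve the 3x3 linear system m @ x = b with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("degenerate configuration: normal equations are singular")
    solution = []
    for col in range(3):
        # Replace one column of m by the right-hand side b.
        replaced = [[b[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det3(replaced) / d)
    return solution


def fit_symmetric_tensor(centroidal, tension):
    """Least-squares symmetric 2x2 tensor S such that S @ c_k ~ t_k.

    centroidal, tension: sequences of three (x, y) pairs.
    Unknowns are (s_xx, s_xy, s_yy); each vector pair contributes two
    equations, giving six equations for three unknowns.
    """
    rows, rhs = [], []
    for (cx, cy), (tx, ty) in zip(centroidal, tension):
        rows.append((cx, cy, 0.0))   # s_xx*cx + s_xy*cy = tx
        rhs.append(tx)
        rows.append((0.0, cx, cy))   # s_xy*cx + s_yy*cy = ty
        rhs.append(ty)
    # Normal equations (A^T A) p = A^T b.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(3)]
    s_xx, s_xy, s_yy = solve3(ata, atb)
    return [[s_xx, s_xy], [s_xy, s_yy]]
```

      When the tension vectors are related to the centroidal vectors by an exactly symmetric tensor, the fit recovers it; otherwise the residual measures the part of the mismatch that a symmetric tensor cannot capture.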

      The plots in Fig. 3C, C’ are obtained by performing this decomposition for each intercalating quartet individually. The data is then aligned in time and ensemble averages are calculated for each timepoint.

      For tissue-scale analysis in Fig. 6, the decomposition is performed for individual vertices (i.e. the corresponding centroidal and tension triangles) and then averaged locally to find the isogonal strain fields shown in Fig. 6B, B’.

      - Line 468: "Therefore, tissue-scale anisotropy of active tension is central to drive and orient convergent-extension flow [10, 57, 59, 60]." Authors almost never mention the contribution of the PMG to tissue extension. Yet it is known to be crucial (convergent extension in Tor mutants is very much affected). Please discuss this point further.

      The referees raise an important point: as discussed in our response to major point (2), we now explicitly discuss the role of internal (active tension) and external (PMG pulling) forces during germ band extension. Please see our response to major point (2) for the changes we made to the manuscript to address this.

      In particular, we now explain that in mutants where PMG invagination is impaired (fog, torso-like, torso, scab, corkscrew), the germ band buckles out of plane or extends in a twisted, corkscrew fashion (Smits et al. 2023). This shows that the germ band generates extensile forces largely internally. In torso mutants, the now stationary PMG acts as a barrier which blocks germ band extension; the germ band buckles as a response.

      The role of PMG invagination hence lies not in creating pulling forces to extend the germ band, but rather in “making room” to allow for its orderly extension. As shown by the genetic mutants just discussed, the synchronization of PMG invagination and GBE is crucial for successful gastrulation.

      -Typos:

      Line 74: how are intercalations are

      Line 84: vertices vertices

      Line 233: very differently

      Line 236: are can

      Line 390: energy which is the isogonal mode must

      Line 1585: reveals show

      Line 603: area

      Line 618: in terms of on the

      We have fixed these typos.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable study revisits the effects of substitution model selection on phylogenetics by comparing reversible and non-reversible DNA substitution models. The authors provide evidence that 1) non time-reversible models sometimes perform better than general time-reversible models when inferring phylogenetic trees out of simulated viral genome sequence data sets, and that 2) non time-reversible models can fit the real data better than the reversible substitution models commonly used in phylogenetics, a finding consistent with previous work. However, the methods are incomplete in supporting the main conclusion of the manuscript, that is that non time-reversible models should be incorporated in the model selection process for these data sets.

      The non-reversible models should be incorporated in the model selection process not because they perform significantly better, but because they do not perform worse than the reversible models and because the true biochemical processes of nucleotide substitution support non-reversibility.

      Reviewer #1 (Public Review):

      The study by Sianga-Mete et al revisits the effects of substitution model selection on phylogenetics by comparing reversible and non-reversible DNA substitution models. This topic is not new, previous works already showed that non-reversible, and also covarion, substitution models can fit the real data better than the reversible substitution models commonly used in phylogenetics. In this regard, the results of the present study are not surprising. Specific comments are shown below.

      True.

      Major comments

      It is well known that non-reversible models can fit the real data better than the commonly used reversible substitution models, see for example,

      https://academic.oup.com/sysbio/article/71/5/1110/6525257

      https://onlinelibrary.wiley.com/doi/10.1111/jeb.14147?af=R

      The manuscript indicates that the results (better fitting of non-reversible models compared to reversible models) are surprising but I do not think so, I think the results would be surprising if the reversible models provide a better fitting.

      I think the introduction of the manuscript should be increased with more information about non-reversible models and the diverse previous studies that already evaluated them. Also I think the manuscript should indicate that the results are not surprising, or more clearly justify why they are surprising.

      The surprise in the findings is that NREV12 performs better than NREV6 for double-stranded DNA viruses, as NREV6 was expected to perform better given the biochemical processes discussed in the introduction.

      In the introduction and/or discussion I missed a discussion about the recent works on the influence of substitution model selection on phylogenetic tree reconstruction. Some works indicated that substitution model selection is not necessary for phylogenetic tree reconstruction, https://academic.oup.com/mbe/article/37/7/2110/5810088 https://www.nature.com/articles/s41467-019-08822-w https://academic.oup.com/mbe/article/35/9/2307/5040133

      While others indicated that substitution model selection is recommended for phylogenetic tree reconstruction, https://www.sciencedirect.com/science/article/pii/S0378111923001774 https://academic.oup.com/sysbio/article/53/2/278/1690801 https://academic.oup.com/mbe/article/33/1/255/2579471

      The results of the present study seem to support this second view. I think this study could be improved by providing a discussion about this aspect, including the specific contribution of this study to that.

      In our conclusion we have stated that: The lack of available data regarding the proportions of viral life cycles during which genomes exist in single and double stranded states makes it difficult to rationally predict the situations where the use of models such as GTR, NREV6 and NREV12 might be most justified: particularly in light of the poor overall performance of NREV6 and GTR relative to NREV12 with respect to describing mutational processes in viral genome sequence datasets. We therefore recommend case-by-case assessments of NREV12 vs NREV6 vs GTR model fit when deciding whether it is appropriate to consider the application of non-reversible models for phylogenetic inference and/or phylogenetic model-based analyses such as those intended to test for evidence of natural selection or the existence of molecular clocks.

      The real data was downloaded from Los Alamos HIV database. I am wondering if there were any criterion for selecting the sequences or if just all the sequences of the database for every studied virus category were analysed. Also, was any quality filter applied? How gaps and ambiguous nucleotides were considered? Notice that these aspects could affect the fitting of the models with the data.

      We selected a varying number of sequences from the database for each virus type studied. We performed quality filtering by re-aligning the sequences for each virus type using the software AliView.

      How the non-reversible model and the data are compared considering the non-reversible substitution process? In particular, given an input MSA, how to know if the nucleotide substitution goes from state x to state y or from state y to state x in the real data if there is not a reference (i.e., wild type) sequence? All the sequences are mutants and one may not have a reference to identify the direction of the mutation, which is required for the non-reversible model. Maybe one could consider that the most abundant state is the wild type state but that may not be the case in reality. I think this is a main problem for the practical application of non-reversible substitution models in phylogenetics.

      True.

      Reviewer #2 (Public Review):

      The authors evaluate whether non time reversible models fit better data presenting strand-specific substitution biases than time reversible models. Specifically, the authors consider what they call NREV6 and NREV12 as candidate non time-reversible models. On the one hand, they show that AIC tends to select NREV12 more often than GTR on real virus data sets. On the other hand, they show using simulated data that NREV12 leads to inferred trees that are closer to the true generating tree when the data incorporates a certain degree of non time-reversibility. Based on these two experimental results, the authors conclude that "We show that non-reversible models such as NREV12 should be evaluated during the model selection phase of phylogenetic analyses involving viral genomic sequences". This is a valuable finding, and I agree that this is potentially good practice. However, I miss an experiment that links the two findings to support the conclusion: in particular, an experiment that solves the following question: does the best-fit model also lead to better tree topologies?

      Because NREV12 leads to inferred trees that are closer to the true generating tree than those inferred under GTR, this shows that the best-fit model (NREV12 in this case) leads to better tree topologies.

      On simulated data, the significance of the difference between GTR and NREV12 inferences is evaluated using a paired t test. I miss a rationale or a reference to support that a paired t test is suitable to measure the significance of the differences of the wRF distance. Also, the results show that on average NREV12 performs better than GTR, but a pairwise comparison would be more informative: for how many sequence alignments does NREV12 perform better than GTR?

      We used the paired t-test because it is the most widely used test for comparing mean values between two matched samples where the differences between pairs are normally distributed, and the wRF distances meet these conditions.

      The paired t-test contains the pairwise comparison, and the side-by-side boxplots show the pairwise wRF comparisons.
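      Both quantities discussed here, the paired t statistic and the per-alignment count the reviewer asks for, can be computed from the same paired distances. The sketch below is illustrative only: the function name is hypothetical, the inputs stand for wRF distances to the true tree under two models, and no values from the manuscript are used.

```python
import math


def paired_summary(dist_a, dist_b):
    """Summarize paired distances for models A and B.

    Returns (t_statistic, degrees_of_freedom, wins_a), where wins_a counts
    the pairs in which model A is strictly closer to the true tree
    (smaller distance) than model B.
    """
    if len(dist_a) != len(dist_b) or len(dist_a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(dist_a, dist_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Unbiased sample variance of the paired differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        raise ValueError("all paired differences are identical")
    t_stat = mean / math.sqrt(var / n)
    wins_a = sum(1 for d in diffs if d < 0)
    return t_stat, n - 1, wins_a
```

      The t statistic summarizes the mean difference, whereas the win count answers the separate question of how many alignments favor one model, which a significant mean difference does not by itself reveal.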

      Reviewer #1 (Recommendations For The Authors):

      Minor comments

      The reversible and non-reversible models used in this study assume that all the sites evolve under the same substitution matrix, which can be unrealistic. This aspect could be mentioned.

      Done.

      The manuscript indicates that "a phylogenetic tree was inferred from an alignment of real sequences (Avian Leukosis virus) with an average sequence identity (API) of ~90%.". I was wondering under which substitution model that phylogenetic tree reconstruction was performed? could the use of that model bias posterior results in terms of favoring results based on such a model?

      We have stated on page ….. that the GTR+G model was used to reconstruct the tree. The use of the GTR+G model could indeed bias the posterior results, as we have stated on page ….

      I was wondering which specific R function was used to calculate the weighted Robinson-Foulds metric. I think this should be included in the manuscript.

      We stated that we used the weighted Robinson-Foulds metric (wRF; implemented in the R phangorn package (Schliep, 2011)).

      Despite a minority, several datasets fitted better with a reversible model than with a non-reversible model. I think that should be clearly indicated.

      In addition, in my opinion the AIC does not enough penalizes the number of parameters of the models and favors the non-reversible models over the reversible models, but this is only my opinion based on the definition of AIC and it is not supported. Thus, I think the comparison between phylogenetic trees reconstructed under different substitution models was a good idea (but see also my second major comment).

      Noted.

      When comparing phylogenetic trees I was wondering if one should consider the effect of the estimation method and quality of the studied data? For example, should bootstrap values be estimated for all the ancestral nodes and only ancestral nodes with high support be evaluated in the comparison among trees?

      Yes, the estimation method and the quality of the studied data should be considered. This does not matter when using RF, but it does for weighted RF (wRF). When building the trees using RAxML, only nodes with high support are added to the tree.

      In Figure 3, I do not see (by eye) significant differences among the models. I see in the legend that the statistical evaluation was based on a t test but I am not much convinced. Maybe it is only my view. Exactly, which pairs of datasets are evaluated with the t test? Next, I would expect that the influence of the substitution model on the phylogenetic tree reconstruction is higher at large levels of nucleotide diversity because with more substitution events there is more information to see the effects of the model. However, the t test seems to show that differences are only at low levels of nucleotide diversity (and large DNR), what could be the cause of this?

      The paired t-tests compare the wRF distances of the inferred true tree from the trees simulated using the GTR model versus the wRF distances of the inferred true tree from the trees simulated using the NREV12 model.

      The reason why the influence of the NREV12 model on the reconstructed tree is not significantly higher at large levels of nucleotide diversity could be that, at a certain level, the DNR values are simply unrealistic.

      Can the user perform substitution model selection (i.e., AIC) among reversible and non-reversible substitution models with IQTREE? If yes, then doing that should be the recommendation from this study, correct?

      But, can DNR be estimated from a real dataset? DNR seems to be the key factor (Figure 3) for the phylogenetic analysis under a proper model.

      Substitution model selection can be performed among reversible and non-reversible models using both HyPhy and IQTREE. We have recommended that model tests be done as a first step before tree building. Estimating DNR from real datasets requires a substitution rate matrix of a non-reversible model.

      The manuscript has many text errors (including typos and incorrect citations). For example, many citations in page 20 show "Error! Reference source not found.". I think authors should double check the manuscript before submitting. Also, some text is not formally written. For example, "G represents gamma-distributed rates", rates of what? The text should be clear for readers that are not familiar with the topic (i.e., G represents gamma-distributed substitution rates among sites). In general, I recommend a detailed revision of the whole text of the manuscript.

      Done.

      Reviewer #2 (Recommendations For The Authors):

      The authors reference Baele et al., 2010 for describing NREV6 and NREV12. I suggest using the same name used in the referenced paper: GNR-SYM and GNR respectively. Although I do not think there is a standard name for these models, I would use a previously used one.

      We have built studies based on the names NREV6 and NREV12. We would like to keep the naming as standard for our studies.

      GTR and NREV12 models are already described in many other papers. I do not see the need to include such an extensive description. Also, a reference should be included to the discrete Gamma rate categories [1]

      We included the extensive description to help readers who are not very familiar with these models understand them better, since we have given the models our own names, which differ from those used in other papers.

      We have added referencing for the discrete gamma rate as recommended. (Yang, 1994)

      To evaluate the exhaustiveness and correctness of the results, I would recommend publishing as supplementary material the simulated data sets or the scripts for generating the data set, the scripts or command lines for the analysis, and the versions of the software used (e.g., IQTREE). Also, to strongly support the main conclusion of the manuscript, I suggest adding to the simulations section results the RF-distances of the best-fit selected model under AIC, AICc, and BIC as well.

      We can go ahead and submit all the needed datasets. The simulated data RF-Distances results are available and will be submitted. We cannot however add them to the main document as this will create very long data tables.

      In some instances, it is mentioned that the selection criterion used is AIC, while in others, AIC-c is referenced. Even in the table captions, both terms are mixed. It should be made clearer which criterion is being employed, as AIC is not suitable for addressing the overparameterization of evolutionary models, given that it does not account for the sample size. A previous pre-print of this article [2] does not mention AIC-c, but also explicitly includes the formulas for AIC that do not take the sample size into account, and reports the same results as this manuscript, which indicates that AIC and not AIC-c was used here. This should be clarified. It is recommended to use AIC-c instead of AIC, especially if the sample size to model parameters ratio is low [3]. Two things should be pointed out here: some authors consider tree branch lengths as model free parameters and others do not. In this paper it is not specified how the model parameters are counted. AIC tends to select more parameterized models than AIC-c, and overparameterization can lead to different tree inferences, as evidenced in Hoff et al., 2016. Therefore, it is expected that NREV12 is more frequently selected than NREV6 and GTR.
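      The distinction the reviewer draws between the two criteria can be made concrete with the standard formulas, AIC = 2k − 2 ln L and AICc = AIC + 2k(k+1)/(n − k − 1). The sketch below is for illustration only: the function names are hypothetical, and how k is counted (for example, whether branch lengths are included) and what counts as the sample size n are left to the caller, since the text above notes that these choices are not specified.

```python
def aic(log_likelihood, k):
    """Akaike information criterion for a model with k free parameters."""
    return 2 * k - 2 * log_likelihood


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for sample size n.

    The correction term grows with k and shrinks as n increases, so it
    penalizes parameter-rich models more strongly than plain AIC.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)


def correction(k, n):
    """Extra penalty that AICc adds on top of AIC."""
    return aicc(0.0, k, n) - aic(0.0, k)
```

      For a fixed n, `correction(k, n)` increases with k, so a model with more free parameters needs a larger likelihood gain to be preferred under AICc than under AIC.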

      In my opinion, a pairwise comparison between GTR and NREV12 performance is of great interest here, and the whiskers plots are not useful. Scatterplots would display the results better.

      Boxplots are meant to offer a simplified view of the results, as the paired t-tests do all of the comparisons. We shall provide the scatterplots as supplementary information so that readers can get fully detailed plots, as recommended.

      Some references are missing

      Missing references added

    1. eLife Assessment

      This is a valuable manuscript analyzing single-cell RNA-sequencing data from the mouse vomeronasal organ. Convincing evidence in this manuscript allows the authors to identify and verify the differential expression of genes that distinguish apical and basal vomeronasal neurons. The authors also show that Gnao1 neurons exhibit enriched expression of ER-related genes, which they verify with in situ hybridizations and immunostaining and also explore via electron microscopy.

    2. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public review): 

      Devakinandan et al. present a revised version of their manuscript. Their scRNA-seq data is a valuable resource to the community, and they further validate their findings via in situ hybridizations and electron microscopy. Overall, they have addressed my major concerns. I only have two minor comments. 

      (1) The authors note in Figure 4I and K that because the number of C2 V2Rs or H2-Mv receptors increased while the normalized expression of Gnao1 remained constant (and likewise for V1Rs and Gnai2 in Figure 4-S4C), their results are unlikely to be capturing doublets. I'm not sure that this is the case. If the authors added together two V2R cells, the total count of every gene might double, but the normalized expression of Gnao1 would remain the same. To address this concern, the authors should also show the raw counts for Gnao1 as well as the total number of UMIs for these cells.
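
      The concern about normalization can be made concrete with a small sketch. It assumes a Seurat-style log-normalization (each count divided by the cell's total counts, multiplied by a scale factor of 10,000, then log1p-transformed); the gene counts are hypothetical, and the "doublet" is the idealized case described above in which every count simply doubles.

```python
import math


def log_normalize(counts: dict, scale_factor: float = 10_000.0) -> dict:
    """Per-cell log-normalization: log1p(count / total_counts * scale_factor)."""
    total = sum(counts.values())
    return {gene: math.log1p(c / total * scale_factor) for gene, c in counts.items()}


# Hypothetical raw UMI counts for one cell ("other" lumps all remaining genes).
single_cell = {"Gnao1": 30, "Vmn2r1": 12, "Vmn2r2": 8, "other": 950}

# Idealized doublet of two identical cells: every raw count doubles.
doublet = {gene: 2 * c for gene, c in single_cell.items()}

norm_single = log_normalize(single_cell)
norm_doublet = log_normalize(doublet)

# Normalized Gnao1 is unchanged, but raw Gnao1 and total UMIs both double.
print(norm_single["Gnao1"], norm_doublet["Gnao1"])
print(sum(single_cell.values()), sum(doublet.values()))
```

      Because the normalization divides by the cell's own total, it cannot distinguish these two cases, whereas raw Gnao1 counts and total UMIs can.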

      In Figure 4I, 4K and Figure 4-figure supplement 4C, we plotted on the Y-axis the sum of normalized counts of all V1R/V2R/H2-Mv genes expressed in each cell, along with the normalized expression value of Gnao1/Gnai2. Both the VR/H2-Mv and Gnao1/Gnai2 values are normalized, with normalization based on LogNormalize (described in the methods). We show here plots of total expression calculated from the raw counts corresponding to the same figures. Raw counts of VRs/H2-Mv and Gnao1/Gnai2 are plotted separately due to the difference in scale. The overall trend matches that of the normalized counts, with minor fluctuations in Gnao1/Gnai2.

      Author response image 1.

      As mentioned in our response to the version-1 reviews and in our manuscript, doublets are generally a random combination of two cells, and the probability that a combinatorial pattern is due to a doublet is proportional to the abundance of cells expressing those genes. It is possible that some of the family-C V2R combinations represented by 2 cells are doublets because of their widespread expression. The frequency of the combinatorial expression patterns that we observed for family ABD V2Rs or V1Rs (supplementary tables 7, 8), above a set threshold of 2 cells, is an indication of co-expression and is unlikely to arise from random doublets. For instance, 134 cells express two V1Rs, of which 44 cells express Vmn1r85+Vmn1r86, 21 cells express Vmn1r184+Vmn1r185, 13 express Vmn1r56+Vmn1r57, and 6 express Vmn1r168+Vmn1r177. Some of the co-expression combinations we reported were also identified and verified experimentally in Lee et al., 2019 and Hills et al., 2024.
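
      The abundance argument can be sketched numerically. The function below is a simplified illustration, not the analysis used in the manuscript: it assumes that doublets form by independent random pairing of cells, and all fractions and doublet counts are hypothetical.

```python
def expected_random_doublets(n_doublets: int, frac_a: float, frac_b: float) -> float:
    """Expected number of doublets pairing one A-expressing cell with one
    B-expressing cell (A != B) under independent random pairing.

    The factor of 2 accounts for the two possible orders (A,B) and (B,A).
    """
    return n_doublets * 2 * frac_a * frac_b


# Hypothetical numbers for illustration only.
n_doublets = 200

# Two receptors each expressed in 1% of cells: a rare-rare pairing.
rare_pair = expected_random_doublets(n_doublets, 0.01, 0.01)

# Two genes each expressed in 30% of cells: an abundant-abundant pairing.
abundant_pair = expected_random_doublets(n_doublets, 0.30, 0.30)

print(rare_pair, abundant_pair)
```

      Under these assumptions a specific pairing of two rarely expressed receptors is expected in far less than one doublet, so observing the same rare pairing in many cells is difficult to attribute to chance, while pairings of widely expressed genes are much more likely to include doublets.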

      The co-expression of multiple family-C2 V2Rs (Vmn2r2-Vmn2r7) along with ABD V2Rs per cell, as shown in our data, has been demonstrated experimentally in earlier studies.

      (2) As requested, the authors have now added a colorbar to the pseudocolored images in Figure 7. However, this colorbar still doesn't have any units. Can the authors add some units, or clarify in the methods how the raw data relates to the colors (e.g. is it mapped linearly, on a log scale, with gamma or other adjustments, etc.)? Moreover, it's also unclear what the dots in the backgrounds of plots like Figure 7E mean. Are they pixels? Showing the individual lines, the average for each animal, or omitting them entirely, might make more sense.

      We used the Fire LUT with a linear scale in Fiji/ImageJ to assign the scale to the pseudo-colored images in Figure 7. We will include this description in our methods and thank the reviewer for pointing it out. The dots in the background are described in the Figure 7 legend as fluorescence intensity values normalized to a 0-1 scale and color coded for each antibody. The trendline was fitted to these values.
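
      For readers unfamiliar with these two operations, a minimal sketch follows. It assumes min-max scaling as the way to reach a 0-1 range and a 256-entry lookup table indexed linearly; neither detail is confirmed by the response above, and the function names and intensity values are hypothetical.

```python
def normalize_01(values: list) -> list:
    """Min-max scale a list of intensities to the 0-1 range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat profile carries no contrast; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def linear_lut_index(value01: float, n_colors: int = 256) -> int:
    """Map a 0-1 intensity linearly onto an index into an n_colors lookup table."""
    clipped = min(1.0, max(0.0, value01))
    return round(clipped * (n_colors - 1))


# Hypothetical raw fluorescence intensities along a line profile.
raw = [10.0, 20.0, 30.0]
scaled = normalize_01(raw)
indices = [linear_lut_index(v) for v in (0.0, 1.0)]
print(scaled, indices)
```

      A linear mapping of this kind means equal steps in intensity correspond to equal steps along the colorbar, in contrast to a log or gamma-adjusted mapping.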

      Reviewer #2 (Public review): 

      Summary: 

      The study focuses on the vomeronasal organ, the peripheral chemosensory organ of the accessory olfactory system, by employing single-cell transcriptomics. The authors analyzed the mouse vomeronasal organ, identifying diverse cell types through their unique gene expression patterns. Developmental gene expression analysis revealed that two classes of sensory neurons diverge in their maturation from common progenitors, marked by specific transient and persistent transcription factors. A comparative study between major neuronal subtypes, which differ in their G-protein sensory receptor families and G-protein subunits (Gnai2 and Gnao1, respectively), highlighted a higher expression of endoplasmic reticulum (ER) associated genes in Gnao1 neurons. Moreover, distinct differences in ER content and ultrastructure suggest some intriguing roles of ER in Gnao1-positive vomeronasal neurons. This work is likely to provide useful data for the community and is conceptually novel with the unique role of ER in a subset of vomeronasal neurons. This reviewer has some minor concerns and some suggestions to improve the manuscript.

      Strengths: 

      (1) The study identified diverse cell types based on unique gene expression patterns, using single-cell transcriptomics.

      (2) The analysis suggests that two classes of sensory neurons diverge during maturation from common progenitors, characterized by specific transient and persistent transcription factors.

      (3) A comparative study highlighted differences in Gnai2- and Gnao1-positive sensory neurons. 

      (4) Higher expression of endoplasmic reticulum (ER) associated genes in Gnao1 neurons. 

      (5) Distinct differences in ER content and ultrastructure suggest unique roles of ER in Gnao1-positive vomeronasal neurons. 

      (6) The research provides conceptually novel findings on the unique role of ER in a subset of vomeronasal neurons, offering valuable insights to the community.

      Reviewer #3 (Public review): 

      Summary: 

      In this manuscript, Devakinandan and colleagues have undertaken a thorough characterization of the cell types of the mouse vomeronasal organ, focusing on the vomeronasal sensory neurons (VSNs). VSNs are known to arise from a common pool of progenitors that differentiate into two distinct populations characterized by the expression of either the G protein subunit Gnao1 or Gnai2. Using single-cell RNA sequencing followed by unsupervised clustering of the transcriptome data, the authors identified three Gnai2+ VSN subtypes and a single Gnao1+ VSN type. To study VSN developmental trajectories, Devakinandan and colleagues took advantage of the constant renewal of the neuronal VSN pool, which allowed them to harvest all maturation states. All neurons were re-clustered and a pseudotime analysis was performed. The analysis revealed the emergence of two pools of Gap43+ clusters from a common lineage, which differentiate into many subclusters of mature Gnao1+ and Gnai2+ VSNs. By comparing the transcriptomes of these two pools of immature VSNs, the authors identified a number of differentially expressed transcription factors in addition to known markers. Next, by comparing the transcriptomes of mature Gnao1+ and Gnai2+ VSNs, the authors report an enrichment of ER-related genes in Gnao1+ VSNs. Using electron microscopy, they found that this enrichment was associated with specific ER morphology in Gnao1+ neurons. Finally, the authors characterized chemosensory receptor expression and co-expression (as well as H2-Mv proteins) in mature VSNs, which recapitulated known patterns. 

      Strengths: 

      The data presented here provide new and interesting perspectives on the distinguishing features between Gnao1+ and Gnai2+ VSNs. These features include newly identified markers, such as transcription factors, as well as an unsuspected ER-related peculiarity in Gnao1+ neurons, consisting in a hypertrophic ER and an enrichment in ER-related genes. In addition, the authors provide a comprehensive picture of specific co-expression patterns of V2R chemoreceptors and H2-Mv genes. 

      Importantly, the authors provide a browser (scVNOexplorer) for anyone to explore the data, including gene expression and co-expression, number and proportion of cells, with a variety of graphical tools (violin plots, feature plots, dot plots, ...). 


      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Devakinandan and colleagues present a manuscript analyzing single-cell RNA-sequencing data from the mouse vomeronasal organ. The main advances in this manuscript are to identify and verify the differential expression of genes that distinguish apical and basal vomeronasal neurons. The authors also identify the enriched expression of ER-related genes in Gnao1 neurons, which they verify with in situ hybridizations and immunostaining, and also explore via electron microscopy. Finally, the results of this manuscript are presented in an online R shiny app. Overall, these data are a useful resource to the community. I have a few concerns about the manuscript, which I've listed below.

      General Concerns: 

      (1) The authors mention that they were unable to identify the cells in cluster 13. This cluster looks similar to the "secretory VSN" subtype described in a recent preprint from C. Ron Yu's lab (10.1101/2024.02.22.581574). The authors could try comparing or integrating their data with this dataset (or that in Katreddi et al. 2022) to see if this is a common cell type across datasets (or arises from a specific type of cell doublets). In situ hybridizations for some of the marker genes for this cluster could also highlight where in the VNO these cells reside. 

      Cluster 13 (Obp2a+) cells identified in our study have gene expression markers similar to those of the “putative secretory” cells mentioned in Hills et al. By the time that manuscript became publicly available, ours had already been communicated. We have now performed RNA-ISH for Obp2a, the topmost marker identified for this cluster, and found it to be expressed in cells from glandular tissue on the non-sensory side. Some of the other markers associated with this cluster, such as Obp2b and Lcn3, belong to the lipocalin family of proteins. Hence, in our estimate, these markers collectively represent non-sensory glandular tissue. We have added the Obp2a RNA-ISH to Figure 2-figure supplement-1A and to the results section of our revised manuscript. Cluster 13 also contains cells expressing Vmn1r37, which is typically expressed in neuronal cells. However, we do not see Obp2a mRNA in the sensory epithelium. It is possible that cluster 13 comprises a heterogeneous mixture of cells, some of which are clearly non-sensory cells from glandular tissue, co-clustered with other cell types; it is also possible that Obp2a is expressed in neurons below the detection level of our assay, which will require further experiments to resolve. We do not have sufficient grounds to confidently assign this cluster as a neuronal cell type; hence, we excluded it from the downstream analysis of neurons.

      We used the data from Hills et al. to compare the co-expression characteristics of V2Rs, which we have added as Figure 3-figure supplement 3.

      (2) I found the UMAPs for the neurons somewhat difficult to interpret. Unlike Katreddi et al. 2022 or Hills et al. 2024, it's tricky to follow the developmental trajectories of the cells in the UMAP space. Perhaps the authors could try re-embedding the data using gene sets that don't include the receptors? It would also be interesting to see if the neuron clusters still cluster by receptor-type even when the receptors are excluded from the gene sets used for clustering. Plots relating the original clusters to the neuronal clusters, or dot plots showing marker gene expression for the neuronal clusters might both be useful. For example, right now it's difficult to interpret clusters like n8-13. 

      a) We have revised the UMAP in Figure 3A, and labeled mature, immature, progenitor neurons so that it is easier to follow the developmental trajectory. 

      b) In our revised text, we have explicitly drawn an equivalence between the neuronal clusters from Figure 1 and the re-clustered neurons in subsequent figures (Figures 3 and 4 in the revised submission). For the developmental analysis, we merged the mature Gnao1 and Gnai2 neuronal subclusters into two major clusters that are equivalent to the original neuronal clusters in Figure 1. As UMAP is an arbitrary representation of cells, we also show the expression of markers for the major neuronal cell types in Figure 1C and Figure 3-figure supplement 1B, which helps in making the connection.

      c) The purpose of re-clustering at higher resolution was to identify sub-populations within Gnao1 and Gnai2 neurons. It was useful for making sense of mature Gnao1 neurons, where family-C Vmn2r and H2-Mv expression maps onto distinct subclusters. Along with the neuronal subclusters in revised Figure 3-figure supplement-1, we include a dot plot of gene expression markers.

      d) In Figure 3-figure supplement-2, we show a comparison of neuronal clusters with and without VRs. Exclusion of VRs did not substantially alter mature neuron dichotomy into Gnao1/Gnai2. Only Gnao1 subclusters n1/n3 whose organization is dependent on family-C Vmn2r expression were affected, as well as redistribution of subcluster n8 from Gnai2 neurons. VR expression does not seem to be the primary determinant of VSN cluster identity.

      Reviewer #2 (Public Review): 

      Summary: 

      The study focuses on the vomeronasal organ, the peripheral chemosensory organ of the accessory olfactory system, by employing single-cell transcriptomics. The authors analyzed the mouse vomeronasal organ, identifying diverse cell types through their unique gene expression patterns. Developmental gene expression analysis revealed that two classes of sensory neurons diverge in their maturation from common progenitors, marked by specific transient and persistent transcription factors. A comparative study between major neuronal subtypes, which differ in their G-protein sensory receptor families and G-protein subunits (Gnai2 and Gnao1, respectively), highlighted a higher expression of endoplasmic reticulum (ER) associated genes in Gnao1 neurons. Moreover, distinct differences in ER content and ultrastructure suggest some intriguing roles of ER in Gnao1-positive vomeronasal neurons. This work is likely to provide useful data for the community and is conceptually novel with the unique role of ER in a subset of vomeronasal neurons. This reviewer has some minor concerns and some suggestions to improve the manuscript.

      Strengths: 

      (1) The study identified diverse cell types based on unique gene expression patterns, using single-cell transcriptomics.

      (2) The analysis suggests that two classes of sensory neurons diverge during maturation from common progenitors, characterized by specific transient and persistent transcription factors. 

      (3) A comparative study highlighted differences in Gnai2- and Gnao1-positive sensory neurons. 

      (4) Higher expression of endoplasmic reticulum (ER) associated genes in Gnao1 neurons. 

      (5) Distinct differences in ER content and ultrastructure suggest unique roles of ER in Gnao1-positive vomeronasal neurons. 

      (6) The research provides conceptually novel findings on the unique role of ER in a subset of vomeronasal neurons, offering valuable insights to the community.

      Weaknesses: 

      (1) The connection between observations from scRNA-seq and EM is unclear.

      (2) The lack of quantification for the ER phenotype is a concern. 

      We have extensively quantified the ER phenotype as shown in Figure 7, Figure 7-figure supplement-1 in our revised version. We would like to point out that the connection between scRNA-seq and EM was made due to our observations in the same figures, that levels of a number of ER luminal and ER membrane proteins were higher in Gnao1 compared to Gnai2 neurons. This led us to hypothesize a differential ER content or ultrastructure, which was verified by EM.

      Reviewer #3 (Public Review): 

      Summary: 

      In this manuscript, Devakinandan and colleagues have undertaken a thorough characterization of the cell types of the mouse vomeronasal organ, focusing on the vomeronasal sensory neurons (VSNs). VSNs are known to arise from a common pool of progenitors that differentiate into two distinct populations characterized by the expression of either the G protein subunit Gnao1 or Gnai2. Using single-cell RNA sequencing followed by unsupervised clustering of the transcriptome data, the authors identified three Gnai2+ VSN subtypes and a single Gnao1+ VSN type. To study VSN developmental trajectories, Devakinandan and colleagues took advantage of the constant renewal of the neuronal VSN pool, which allowed them to harvest all maturation states. All neurons were re-clustered and a pseudotime analysis was performed. The analysis revealed the emergence of two pools of Gap43+ clusters from a common lineage, which differentiate into many subclusters of mature Gnao1+ and Gnai2+ VSNs. By comparing the transcriptomes of these two pools of immature VSNs, the authors identified a number of differentially expressed transcription factors in addition to known markers. Next, by comparing the transcriptomes of mature Gnao1+ and Gnai2+ VSNs, the authors report the enrichment of ER-related genes in Gnao1+ VSNs. Using electron microscopy, they found that this enrichment was associated with specific ER morphology in Gnao1+ neurons. Finally, the authors characterized chemosensory receptor expression and coexpression (as well as H2-Mv proteins) in mature VSNs, which recapitulated known patterns. 

      Strengths: 

      The data presented here provide new and interesting perspectives on the distinguishing features between Gnao1+ and Gnai2+ VSNs. These features include newly identified markers, such as transcription factors, as well as an unsuspected ER-related peculiarity in Gnao1+ neurons, consisting of a hypertrophic ER and an enrichment in ER-related genes. In addition, the authors provide a comprehensive picture of specific co-expression patterns of V2R chemoreceptors and H2-Mv genes. 

      Importantly, the authors provide a browser (scVNOexplorer) for anyone to explore the data, including gene expression and co-expression, number and proportion of cells, with a variety of graphical tools (violin plots, feature plots, dot plots, ...). 

      Weaknesses: 

      The study still requires refined analyses of the data and rigorous quantification to support the main claims. 

      The method description for filtering and clustering single-cell RNA-sequencing data is incomplete. The Seurat package has many available pipelines for single-cell RNA-seq analysis, with a significant impact on the output data. How did the authors pre-process and normalize the data? Was the pipeline used with default settings? What batch correction method was applied to the data to mitigate possible sampling or technical effects? Moreover, the authors do not describe how cell and gene filtering was performed. The data in Figure 7-Supplement 3 show that one-sixth of the V1Rs do not express any chemoreceptor, while over a hundred cells express more than one chemoreceptor. Do these cells have unusually high or low numbers of genes or counts? To exclude the possibility of a technical artifact in these observations, the authors should describe how they dealt with putative doublet cells or debris. Surprisingly, some clusters are characterized by the expression of specific chemoreceptors (VRs). Have these been used for clustering? If so, clustering should be repeated after excluding these receptors. 

      The identification of the VSN types should be consistent across the different analyses and validated. The data presented in Figure 1 lists four mature VSN types, whereas the re-clustering of neurons presented in Figure 3 leads to a different subdivision. At present, it remains unclear whether these clusters reflect the biology of the system or are due to over-clustering of the data, and therefore correspond to either noise or arbitrary splitting of continua. Clusters should be merged if they do not correspond to discrete categories of cells, and correspondence should be established between the different clustering analyses. To validate the detected clusters as cell types, markers characteristic of each of these populations can be evaluated by ISH or IHC. 

      There is a lack of quantification of imaging data, which provides little support for the ER-related main claim. Quantification of co-expression and statistics on labeling intensity or coverage would greatly strengthen the conclusions and the title of the paper.

      a) scRNA-seq data analysis methods: Our revised submission has expanded the methods section with details of the parameters, filtering criteria and software used.

      b) Inclusion/exclusion of VRs: Figure 3-Figure supplement-2 of our revised submission shows a comparison of neuronal sub-clusters with and without VRs. Overall sub-cluster identities were not affected by VR exclusion, except for Gnao1 sub-clusters n1/n3, which are governed by family-C Vmn2r1/Vmn2r2, and the redistribution of Gnai2 cluster n8. The minimal effect of VRs on Gnai2 sub-clustering is also confirmed by the lack of V1Rs in the dot plot showing markers of neuronal clusters.

      c) Neuronal clusters and potential over-clustering: we pooled neuronal cells from Figure 1 and re-clustered them to identify sub-populations within Gnao1 and Gnai2 neurons. Several of the neuronal sub-clusters we identified, including progenitors, immature neurons and mature neurons, are validated by previous studies with well-known markers. Amongst the mature neurons, the biological basis of the four Gnao1 neuron sub-clusters (n1-n4) is discussed in our co-expression section (Figure 4A-E), and these are also validated by previous experimental studies. These Gnao1 clusters are organized according to the expression of family-C V2Rs (Vmn2r1 or Vmn2r2) as well as H2-Mv genes. Within the Gnai2 sub-clusters, n12 and n13 exclusively express markers that distinguish them from n8-n11, which we have described in our revised version. However, n8-n11 do not have definitive markers, and determining whether these sub-clusters are part of a continuum or are over-clustered will require further extensive experiments and analysis. We prefer to show all sub-clusters, including the Gnai2 sub-clusters, in Figure 3-Figure supplement-1, along with a dot plot of sub-cluster gene expression, so that these data are available for future experiments and analysis. We share the concern that some Gnai2 sub-clusters may not have an obvious biological basis at this time. Hence, in our revised submission, we have merged the mature Gnao1 and mature Gnai2 sub-clusters for the developmental analysis shown in Figure 3A.

      d) Quantification of the ER phenotype: In our revised submission, we provide extensive quantification of the ER phenotype in Figure 7, Figure 7-figure supplement-1.

      e) We think that the cells expressing zero as well as two V1Rs are real and cannot be attributed to debris or doublets for the following reasons:

      i) Cells expressing no V1Rs are not necessarily debris, because they express other neuronal markers at the same level as cells that express one or two V1Rs. For instance, the Gnai2 expression level is the same across cells expressing 0, 1 or 2 V1Rs, which we have included in Figure 4-figure supplement 4-C of our revised submission. The higher expression threshold used in our analysis may have somewhat increased the proportion of cells with zero V1Rs. Similarly, Gnao1 levels across cells expressing multiple V2Rs and H2-Mv genes per cell stay the same, indicating that these are unlikely to be doublets (Figure 4 I-K). The frequency of each co-expression combination (Supplementary Tables 7 and 8) is itself an indication of whether it is represented by single cells or is an artifact.

      ii) Cells co-expressing V1R genes: We listed the frequency of cells co-expressing V1R gene combinations in Supplementary Table 8. Among the 134 cells that express two V1Rs, 44 cells express Vmn1r85+Vmn1r86, 21 express Vmn1r184+Vmn1r185, 13 express Vmn1r56+Vmn1r57, 6 express Vmn1r168+Vmn1r177, and so on. Doublets are generally a random combination of two cells. Here, each specific co-expression combination is represented by multiple cells, which is highly unlikely to occur by random chance. Some of the co-expression combinations we reported were also identified and verified experimentally in Lee et al., 2019 and Hills et al., 2024.

      Recommendations for the authors:

      Reviewing Editor (Recommendations for the Authors): 

      The editor had a query about the analysis of FPRs, which are a third family of sensory receptors in the rodent VNO. 

      In our study, FPRs were found to be expressed in subsets of Gnai2 and Gnao1 neurons as well as in non-neuronal cells. These can be easily searched in www.scvnoexplorer.com. For instance, Fpr1 and Fpr2 are expressed in immune cell clusters 2, 6, 8 and 10, whereas Fpr-3 is expressed in Gnao1 subcluster n1. Consistent with earlier reports (10.1073/pnas.0904464106, 10.1038/nature08029), expression of Fpr-rs3, Fpr-rs4, Fpr-rs6 and Fpr-rs7 is restricted to Gnai2 neurons, of which Fpr-rs3 and Fpr-rs4 are limited to Tmbim1+ Gnai2 neurons.

      Reviewer #1 (Recommendations For The Authors):

      (1) The reference to "genders" on page 3 should be changed to "sexes". 

      We have modified the text.   

      (2) Did the authors identify any Ascl1+ GBCs in their data? 

      Ascl1+ GBCs were identified and are now marked in Figure 3-figure supplement 1B of our revised version.

      (3) The plots in Figures 1B and 2B say they're depicting gene "Expression", but it looks like the gene expression was z-scored. If so, the authors should describe how the expression was scaled. 

      We have modified the legend title to ‘scaled expression’ and described the basis of scaling in the methods section of our revised version. 
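
      As an illustration of what such scaling typically involves, the sketch below z-scores one gene's expression across cells (subtract the mean, divide by the sample standard deviation). This is an assumption about the general approach the reviewer refers to, not a description of the exact procedure in the revised methods; the expression values are hypothetical.

```python
from statistics import mean, stdev


def z_score(values: list) -> list:
    """Center one gene's expression across cells and scale to unit variance."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation; needs at least two values
    if sd == 0:
        # A gene with constant expression has no variance to scale by.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Hypothetical normalized expression of one gene in three cells.
expression = [1.0, 2.0, 3.0]
scaled_expression = z_score(expression)
print(scaled_expression)
```

      After this transformation, the plotted values indicate how far each cell lies from the gene's mean in units of standard deviation rather than the absolute expression level, which is why the legend label matters.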

      (4) The main text mentions Figure 2C, but maybe this refers to the right part of Figure 2B?

      Panel 2C was mistakenly not marked in the figure. We have now marked it in revised Figure 2.    

      (5) The authors should attempt to describe the other branch points in the trajectory shown in Figure 3A. If they don't seem biologically plausible, then the authors might want to reconsider using Slingshot for their analyses.

      We do not seek to claim additional branch points within mature Gnao1 or Gnai2 neurons from our analysis. Whether there exist additional branch points leading to subcategories within mature neurons, requires extensive experimental investigation. Hence, in our revised submission, we have merged mature Gnai2 / Gnao1 subclusters for pseudotime developmental analysis and to keep our analysis focused on the single branch point at immature neurons.    

      (6) The most significantly enriched gene in Figure 3B in immature Gnao1+ neurons is Cnpy1, which is also an ER protein. It could also be interesting to look at its expression or speculate on its function in immature neurons. 

      Multiple ER genes were found to be enriched in Gnao1 neurons. We would not be comfortable speculating on the function of individual genes, without a proper study, which is beyond the scope of this manuscript.      

      (7) For figures with pseudo-colored expressions, it would be useful to have color bars. I'm also not sure the pseudocolors are necessary; presenting the data in grayscale or a single color like green might also be sufficient. 

      We used pseudocolor in the IHC images of ER proteins because there is wide variation in the fluorescence signal intensity along the apical-basal axis for various proteins. In some cases, grayscale images could give the false impression that there is no signal in apical Gnai2 neurons, whereas pseudocolor shows a low fluorescence level in these neurons. We have added an intensity scale bar to the figures in our revised version.

      (8) For in situ images with two colors it would be more colorblind-friendly to use green and magenta rather than green and red.

      Since no single color palette can help readers with different types of colorblindness, we decided to rely on users’ operating systems, which can render images in a color palette suited to their type of colorblindness. We believe this is a better option, as described here: https://markusmeister.com/2021/07/26/figure-design-for-colorblindreaders-is-outdated/

      (9) The heatmap in Figure 7E would likely look more accurate without interpolation/aliasing/smoothing. 

      We have not performed smoothing on any of the heatmaps. We have noticed that heatmaps sometimes take time to load in software (such as Adobe Acrobat), giving the impression of smoothing. Changing the zoom level or reopening the file may fix this.

      (10) Rather than just citing the literature on the unfolded protein response in the MOE, it could be useful to cite work on the ATF5 expression and the UPR in the VNO (e.g. 10.1101/239830v1 or 10.12688/f1000research.13659.1).

      We have cited and commented on the ATF5 VNO expression in our discussion. 

      (11) I might try to condense the discussion. Additionally, in the discussion, the section on receptor co-expression comes before that on the VNO ER, so I might consider reorganizing the figures and results to present all of the scRNA-seq analyses (including the receptor co-expression figure) first before the figures on the ER. 

      We welcome this suggestion and have reorganized figures and results such that the scRNA-seq analysis flow is maintained before ER results.   

      Reviewer #2 (Recommendations For The Authors): 

      (1) Upregulation of ER-related mRNAs and expanded ER lumen in Gnao1-positive neurons is interesting, but the connection between these observations is unclear. The authors can strengthen the link by adding immunohistochemistry of representative ER proteins to test if the upregulation of mRNAs related to ER results in increased levels of these proteins in the ER of these neurons.

      Connection between scRNA-seq and EM was made due to our observations that levels of a number of ER luminal and membrane proteins were higher in Gnao1 compared to Gnai2 neurons (Figure 7, Figure 7-figure supplement-1 in our revised submission). This led us to hypothesize a differential ER content or ultrastructure, which was verified by EM. We have also addressed the question of whether upregulation of mRNAs related to ER proteins results in their increased levels (Figure 7-figure supplement-2). In some cases, for example Hspa5 (Bip), mRNA as well as protein levels are upregulated in Gnao1 neurons (see Figure 3A volcano plot, Figure 5-figure supplement-1 RNA-ISH, Figure 7-figure supplement-1 comparison of mRNA levels, Figure 7F immunofluorescence). However, there are other genes in the same figures, for which mRNA levels are not upregulated, yet protein levels are higher in Gnao1 neurons. As mentioned in our text and discussion, upregulated mRNA levels as well as post-transcriptional mechanisms are both likely to play a role in upregulating ER protein levels in Gnao1 neurons.       

      (2) In Figure 3, the authors seemed to exclude cluster 13 from Figure 1 in the pseudotime analysis without justification.

      Cluster 13 has markers such as Obp2a, Obp2b and Lcn3. We confirmed via RNA-ISH (Figure 2-figure supplement-1A in our revised submission) that Obp2a maps to cells from glandular tissue on the non-sensory side. Cluster 13 also contains cells expressing Vmn1r37, which is typically expressed in neuronal cells. However, we do not see Obp2a mRNA in the sensory epithelium. It is possible that cluster 13 comprises a heterogeneous mixture of cells, some of which are non-sensory glandular cells, co-clustered with other cell types; it is also possible that Obp2a is expressed in neurons below the detection level of our assay. Further experiments will be required to distinguish between these possibilities. We do not have sufficient grounds to confidently assign this cluster as a neuronal cell type; hence, it was excluded from the downstream analysis of neurons.

      (3) In Figure 3, the line appears to suggest that Gnao1-positive cells can be progenitors of Gnai2-positive cells. Please clarify. 

      We thank the reviewer for pointing this out. We did not seek to give the impression that Gnao1 cells can be progenitors of Gnai2 cells. This may be due to the placement of dots in the trajectory leading to misinterpretation and the UMAP itself. We have modified the pseudotime trajectory in our revised version to make it more intuitive. 

      (4) Figure 3: Please label pseudotime lineage cluster identities. 

      Cluster identities are now labeled in Figure 3A pseudotime lineage as well as in Figure 3-figure supplement-1 dot plot.     

      (5) Figure 4: Please label the genes used for in situ hybridization in the volcano plot. 

      Genes used for RNA-ISH are labeled (bold font) in the volcano plot in Figure 5A.  

      (6) Figure 4: Please clarify which genes shown in the in situ hybridization figures correspond to which GO terms. 

      We have added supplementary table-10 containing gene ontology terms associated with genes for which RNA-ISH was performed. 

      (7) The EM shown in Figure 5 makes this work unique and intriguing. However, the lack of quantification for the ER phenotype is a concern. For example, does the ER area of a given cell correlate with the relative position of the cells along the apical-basal axis of the vomeronasal organ? What about the ER morphology in the progenitor cells? 

      We show here a quantification of the ER area from the low magnification EM image shown in Figure 8A. The ER area increases towards the basal side of the cross-section. However, this quantification is complicated by the following factors: a) processing for EM results in some shrinkage of the tissue, and b) Gnao1 neurons follow an invaginating pattern in cross-sections. For these reasons, some Gnao1 neurons could come very close to, and at times lie adjacent to, Gnai2 neurons in the EM cross-section. Due to a lack of contrast, it is harder to identify the ER within the cell at low magnification, especially in the apical zone. The plot shown here does indicate that the ER area of a cell roughly correlates with its position along the apical-basal axis. In our revised submission, we have quantified the fluorescence intensities of various ER proteins along the apical-basal axis from confocal images (Figure 7, Figure 7-figure supplement-1).

      Author response image 2.

      ROIs (yellow) are manually drawn in the sensory epithelium, wherever possible to identify ER without ambiguity. Area and centroid of ROI are calculated and x coordinates of centroid of each ROI are used to position ER area along the apical-basal axis as shown in the plot below.
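
      The area-and-centroid step described in this caption can be sketched in a few lines. This is a minimal illustration and not the authors' code: it assumes each manually drawn ROI is available as an ordered list of (x, y) vertices, and the function name `polygon_area_centroid` and the example coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area_centroid(vertices: List[Point]) -> Tuple[float, Point]:
    """Return (area, (cx, cy)) of a simple polygon using the shoelace formula.

    Vertices must be listed in order around the outline (either direction).
    """
    n = len(vertices)
    twice_signed_area = 0.0
    cx_sum = 0.0
    cy_sum = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the outline
        cross = x0 * y1 - x1 * y0
        twice_signed_area += cross
        cx_sum += (x0 + x1) * cross
        cy_sum += (y0 + y1) * cross
    if twice_signed_area == 0:
        raise ValueError("degenerate ROI with zero area")
    # Centroid = sum / (6 * signed area) = sum / (3 * twice_signed_area)
    cx = cx_sum / (3.0 * twice_signed_area)
    cy = cy_sum / (3.0 * twice_signed_area)
    return abs(twice_signed_area) / 2.0, (cx, cy)


# Hypothetical rectangular ROI, in pixel units.
roi = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
area, (centroid_x, centroid_y) = polygon_area_centroid(roi)
```

      Under these assumptions, plotting `area` against `centroid_x` for every ROI would give an area-versus-position plot of the kind the caption describes.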

      Establishing ER ultrastructure in progenitor or immature cells, as well as unambiguous quantification of ER area in mature neurons, requires identification of these cells in cross-sections using fluorescent molecular markers, followed by correlative light and electron microscopy (CLEM). This procedure is technically challenging and beyond the scope of our manuscript.

      Reviewer #3 (Recommendations For The Authors): 

      (1) The main claim is about ER differences between Gnao1+ and Gnai2+ VSN. The ISH, IHC, and EM microscopy images are not quantified and, therefore, poorly support this main claim.

      In our revised submission, we provide extensive quantification of the ER phenotype in Figure 7 and Figure 7-figure supplement-1. Quantification of ER area from EM images is challenging and is described above in our response to reviewer #2, recommendation 7.

      (2) The annotation of VSN subclusters should be more rigorous, consistent throughout the paper (VSN clusters are inconsistent between Figure 1 and Figure 3, and the multiplication of subclusters in Figure 3 is not discussed), and verified (using ISH or IHC) that they reflect discrete, actual cell types. The authors should provide a list of differentiating marker genes for the clusters in Figure 3. At present, it remains unclear whether these clusters are the result of over-clustering of cells (and therefore represent either noise or arbitrary splits of continua) or whether they reflect the biology of the system. Subsequent characterization of these curated VSN subtypes (as done in Figure 4) would add value to the study.

      We pooled neuronal cells from Figure 1 and re-clustered them at higher resolution to identify subtypes. Several of the neuronal sub-clusters we identified, including progenitors, immature neurons and mature neurons, are validated by previous studies with well-known markers. Amongst the mature neurons, the biological basis of the four Gnao1 neuron sub-clusters (n1-n4) is discussed in our analysis, and these are also validated by previous experimental studies. These Gnao1 clusters are organized according to the expression of family-C V2Rs (Vmn2r1 or Vmn2r2) as well as H2-Mv genes. Within the Gnai2 sub-clusters, n12 and n13 exclusively express markers that distinguish them from n8-n11, which we have described in our revised version. However, Gnai2 n8-n11 do not have definitive markers, and determining whether these sub-clusters are part of a continuum or over-clustered will require further extensive experiments and analysis. We prefer to show all sub-clusters, including Gnai2 sub-clusters, in Figure 3-figure supplement-1, along with a dot plot of sub-cluster gene expression, so that these data are available for future experiments and analysis. We share the concern that some Gnai2 sub-clusters may not have an obvious biological basis at this time. Hence, in our revised submission, we have merged mature Gnao1 and mature Gnai2 sub-clusters for the developmental analysis shown in Figure 3A.

      (3) Some clusters are characterized by the expression of specific chemoreceptors (VRs). Have these been used for clustering? If so, clustering should be repeated after excluding these receptors.

      Figure 3-Figure supplement-2 of our revised submission shows a comparison of neuron clusters with and without VRs. We also describe in the results, specific clusters that are affected by exclusion of VRs.  

      (4) Given the title and the data, the paper should be structured around its main claim (i.e. differential ER environment between VSN types). For example, Figure 7, which deals with the characterization of receptor expression and co-expression in VSNs, is sandwiched between the validation of ER substructure (Figure 6) and the timing of coexpression of ER chaperone genes (Figure 8). The data presented in Figure 7 would fit better if used as a validation of the dataset prior to the investigation presented in the current Figure 4. In addition, we suggest that expression and co-expression diagnostics should be used to filter cells for subsequent analyses.

      We appreciate this suggestion and have reorganized the figures in our revised version. Our subsequent analysis showing enrichment of ER-related genes at the RNA and protein levels covers all Gnao1 neurons and is not restricted to a specific subset. This is reflected in the ISH and IHC of ER genes. 

      (5) Figure 7-Supplement 3 suggests the presence of co-expressed V1Rs in VSNs. It is unclear from the data presented whether these co-expressing cells are artifactual cell doublets and should be removed from the analysis or whether the expression of the co-expressed receptors reflects a reality. To better address this observation, one may want to see the expression levels of the individual co-expressed V1Rs in Figure 7-Supplement 3 rather than the sum of V1R expression. I am also concerned about the unusually high frequency of "empty" neurons (i.e. without expressed VRs). Could these be debris? 

      We think that the cells expressing zero as well as two V1Rs are real and cannot be attributed to debris or doublets for the following reasons:

      i) Cells expressing no V1Rs are not necessarily debris because they express other neuronal markers at the same level as cells that express one or two V1Rs. For instance, the Gnai2 expression level is the same across cells expressing 0, 1, or 2 V1Rs, which we have included in Figure 4-figure supplement 4C of our revised submission. The higher expression threshold values used in our analysis may have somewhat increased the proportion of cells with zero V1Rs. Similarly, Gnao1 levels stay the same across cells expressing multiple V2Rs and H2-Mv genes per cell, indicating that these are unlikely to be doublets (Figure 4 I-K). As doublets are formed randomly, the frequency of each co-expression combination (Supplementary Tables 7 and 8) is itself an indication of whether it is represented by a single cell or an artifact.

      ii) Cells co-expressing V1R genes: All cells used for co-expression analysis were filtered via an expression threshold (Figure 4-figure supplement 1D), which eliminates cells with low counts of V1R expression. We listed the frequency of cells co-expressing V1R gene combinations in Supplementary table - 8. Among 134 cells that express two V1Rs, 44 cells express Vmn1r85+Vmn1r86, 21 express Vmn1r184+Vmn1r185, 13 express Vmn1r56+Vmn1r57, 6 express Vmn1r168+Vmn1r177, and so on. Doublets generally are a random combination of two cells. Here, each specific co-expression combination represents multiple cells and is highly unlikely by random chance.

      iii) Some of the co-expression combinations we reported were identified earlier and verified experimentally in Lee et al., 2019 using FACS based single collection in 96-well plates following the cellseq-2 protocol with very low chance of doublets, and Hills et al., 2024.

      (6) The authors use either dot plots or scatter plots to show gene expression in cell clusters. It looks nice, but it is very difficult to deduce population levels of expression from these plots. Could we see the distribution of gene expression across clusters using more quantitative visualizations such as violin or box plots?

      Dot plots are mainly used in our manuscript to show markers of cell clusters in Figure 1, Figure 2 and Figure 3-figure supplement 1. We would like to show at least 5 marker genes for each cluster that are important to identify the cell type. Using violin or box plots for this would either make the panel extremely large and overwhelming, especially with 16 clusters in Figure 1 and 13 clusters in Figure 3-figure supplement 1, or make the individual violins or boxes too small to interpret. Hence, for the sake of simplicity, we used dot plots to give our readers a bird's-eye view of gene expression differences across clusters. Scatter plots were used when we wanted to compare the expression levels of genes between male and female samples, or to show the expression of two genes (VRs) simultaneously in a single cell. This cannot be achieved with violin or box plots. However, we have made our dataset available at scvnoexplorer.com to explore the expression patterns across cell clusters with different visualization options, including violin or box plots.

      (7) To investigate whether sex might bias clustering, the authors calculated the Pearson coefficient of gene expression between sexes for each cluster. Given the high coefficient observed across all clusters (although no threshold is used), the authors conclude that there was no bias. While the overall effect may show a strong similarity in gene expression in each cluster between the sexes, this overlooks all the genes that are significantly differentially expressed. It would be worth investigating and discussing these differences. Relatedly, what batch correction method was applied to the data (to mitigate any possible sampling or technical effect)?

      We chose the Pearson coefficient as a representative parameter to show that there is no bias. In addition, we have performed differential expression analysis for each cluster, and the results are in supplementary table-1. Except for known sexually dimorphic genes, no genes are significantly differentially expressed (adjusted p-values are greater than 0.05). This was also shown by earlier studies using bulk RNAseq (doi.org/10.1371/journal.pgen.1004593, doi.org/10.1186/s12864-017-4364-4). We used depth normalization to integrate samples and described this in the methods section of our revised version.
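
      The distinction at stake in this exchange, between a cluster-wide correlation and per-gene differences, can be illustrated with a small sketch. The data below are synthetic and the numbers are arbitrary assumptions chosen only for illustration; nothing here reproduces the authors' analysis or results.

```python
import numpy as np

# Synthetic log-expression profile for one cluster: 1000 hypothetical genes.
n_genes = 1000
male = np.linspace(0.0, 10.0, n_genes)

# The female profile is identical except for 5 genes shifted by 3 log units,
# standing in for a handful of sexually dimorphic genes (an assumption).
female = male.copy()
female[:5] += 3.0

# Cluster-wide Pearson correlation between the two profiles.
pearson_r = float(np.corrcoef(male, female)[0, 1])

# Per-gene view: count genes whose difference exceeds 1 log unit.
n_shifted = int(np.sum(np.abs(female - male) > 1.0))
```

      In this toy setting the correlation stays very close to 1 even though five genes differ strongly, which is why a per-gene differential expression table, as the authors report in their supplementary table, complements the correlation.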

      (8) We found the method description to be incomplete for the single-cell RNA sequencing analyses. The method section should include a detailed explanation of the code used by the authors to analyze the data. The Seurat package has many available pipelines for single-cell RNA-seq analysis, which have a major impact on the output data. It is therefore imperative to describe which of these pipelines were used and whether the pipeline was run with default settings. 

      Our revised submission has expanded the methods section with details of the parameters, filtering criteria and software used.

    3. Reviewer #1 (Public review):

      Devakinandan et al. present a revised version of their manuscript. Their scRNA-seq data is a valuable resource to the community, and they further validate their findings via in situ hybridizations and electron microscopy. Overall, they have addressed my major concerns. I only have two minor comments.

      (1) The authors note in Figure 4I, and K that because the number of C2 V2Rs or H2-Mv receptors increased while the normalized expression of Gnao1 remained constant (and likewise for V1Rs and Gnai2 in Figure 4-S4C) that their results are unlikely to be capturing doublets. I'm not sure that this is the case. If the authors added together two V2R cells the total count of every gene might double, but the normalized expression of Gnao1 would remain the same. To address this concern, the authors should also show the raw counts for Gnao1 as well as the total number of UMIs for these cells.
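
      The arithmetic behind this concern can be made concrete with a toy example. The counts below are invented for illustration, and the `normalize` helper is a hypothetical stand-in for depth normalization (counts scaled to a fixed total per cell); it is not taken from the authors' pipeline.

```python
from typing import Dict


def normalize(counts: Dict[str, int], scale: int = 10_000) -> Dict[str, float]:
    """Scale raw counts so that each cell sums to `scale` (depth normalization)."""
    total = sum(counts.values())
    return {gene: count * scale / total for gene, count in counts.items()}


# Two hypothetical Gnao1 cells with different receptors and sequencing depth,
# each with 3% of its counts coming from Gnao1.
cell_a = {"Gnao1": 30, "Vmn2r1": 50, "other": 920}
cell_b = {"Gnao1": 60, "Vmn2r2": 100, "other": 1840}

# A doublet is modeled here as the gene-wise sum of the two cells.
doublet = dict(cell_a)
for gene, count in cell_b.items():
    doublet[gene] = doublet.get(gene, 0) + count

norm_a = normalize(cell_a)["Gnao1"]
norm_doublet = normalize(doublet)["Gnao1"]
raw_total_a = sum(cell_a.values())
raw_total_doublet = sum(doublet.values())
```

      In this sketch the normalized Gnao1 value is the same for the single cell and the doublet, while the raw Gnao1 count and the total count both increase. This is why raw counts and total UMIs are the more informative diagnostics here.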

      (2) As requested, the authors have now added a colorbar to the pseudocolored images in Figure 7. However, this colorbar still doesn't have any units. Can the authors add some units, or clarify in the methods how the raw data relates to the colors (e.g. is it mapped linearly, at a logscale, with gamma or other adjustments, etc.)? Moreover, it's also unclear what the dots in the backgrounds of plots like Figure 7E mean. Are they pixels? Showing the individual lines, the average for each animal, or omitting them entirely, might make more sense.

    4. Reviewer #2 (Public review):

      Summary:

      The study focuses on the vomeronasal organ, the peripheral chemosensory organ of the accessory olfactory system, by employing single-cell transcriptomics. The author analyzed the mouse vomeronasal organ, identifying diverse cell types through their unique gene expression patterns. Developmental gene expression analysis revealed that two classes of sensory neurons diverge in their maturation from common progenitors, marked by specific transient and persistent transcription factors. A comparative study between major neuronal subtypes, which differ in their G-protein sensory receptor families and G-protein subunits (Gnai2 and Gnao1, respectively), highlighted a higher expression of endoplasmic reticulum (ER) associated genes in Gnao1 neurons. Moreover, distinct differences in ER content and ultrastructure suggest some intriguing roles of ER in Gnao1-positive vomeronasal neurons. This work is likely to provide useful data for the community and is conceptually novel with the unique role of ER in a subset of vomeronasal neurons.

      Strengths:

      (1) The study identified diverse cell types based on unique gene expression patterns, using single-cell transcriptomics.

      (2) The analysis suggests that two classes of sensory neurons diverge during maturation from common progenitors, characterized by specific transient and persistent transcription factors.

      (3) A comparative study highlighted differences in Gnai2- and Gnao1-positive sensory neurons.

      (4) Higher expression of endoplasmic reticulum (ER) associated genes in Gnao1 neurons.

      (5) Distinct differences in ER content and ultrastructure suggest unique roles of ER in Gnao1-positive vomeronasal neurons.

      (6) The research provides a conceptually novel view of the unique role of ER in a subset of vomeronasal neurons, offering valuable insights to the community.

      Comments on latest version:

      In the revised manuscript, the authors have thoroughly addressed all of this reviewer's concerns.

    5. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Devakinandan and colleagues have undertaken a thorough characterization of the cell types of the mouse vomeronasal organ, focusing on the vomeronasal sensory neurons (VSNs). VSNs are known to arise from a common pool of progenitors that differentiate into two distinct populations characterized by the expression of either the G protein subunit Gnao1 or Gnai2. Using single-cell RNA sequencing followed by unsupervised clustering of the transcriptome data, the authors identified three Gnai2+ VSN subtypes and a single Gnao1+ VSN type. To study VSN developmental trajectories, Devakinandan and colleagues took advantage of the constant renewal of the neuronal VSN pool, which allowed them to harvest all maturation states. All neurons were re-clustered and a pseudotime analysis was performed. The analysis revealed the emergence of two pools of Gap43+ clusters from a common lineage, which differentiate into many subclusters of mature Gnao1+ and Gnai2+ VSNs. By comparing the transcriptomes of these two pools of immature VSNs, the authors identified a number of differentially expressed transcription factors in addition to known markers. Next, by comparing the transcriptomes of mature Gnao1+ and Gnai2+ VSNs, the authors report an enrichment of ER-related genes in Gnao1+ VSNs. Using electron microscopy, they found that this enrichment was associated with specific ER morphology in Gnao1+ neurons. Finally, the authors characterized chemosensory receptor expression and co-expression (as well as H2-Mv proteins) in mature VSNs, which recapitulated known patterns.

      Strengths:

      The data presented here provide new and interesting perspectives on the distinguishing features between Gnao1+ and Gnai2+ VSNs. These features include newly identified markers, such as transcription factors, as well as an unsuspected ER-related peculiarity in Gnao1+ neurons, consisting in a hypertrophic ER and an enrichment in ER-related genes. In addition, the authors provide a comprehensive picture of specific co-expression patterns of V2R chemoreceptors and H2-Mv genes.

      Importantly, the authors provide a browser (scVNOexplorer) for anyone to explore the data, including gene expression and co-expression, number and proportion of cells, with a variety of graphical tools (violin plots, feature plots, dot plots, ...).

    1. eLife Assessment

      This observational study from the UK Biobank provides an important investigation into the associations between menopausal hormone therapy and brain health in a large, population-based cohort of females in the UK. A solid model of brain aging using an open source algorithm is used. While some modest adverse brain health characteristics were associated with current mHT use and older age at last use, the findings do not support a general neuroprotective effect of mHT nor severe adverse effects on the female brain. This work addresses a topic that is of grave importance since menopausal hormone therapy and its effect on the brain should be better understood in order to provide individualized effective medical support to women going through menopause.

    2. Reviewer #1 (Public review):

      Summary:

      This study takes a detailed approach to understanding the effect of menopausal hormone therapy (MHT) in the brain aging of females. Neuroimaging data from the UK Biobank is used to explore brain aging and shows an unexpected effect of current MHT use and poorer brain health outcomes relative to never users. There is considerable debate about the benefits of MHT and estrogens in particular for brain health, and this analysis illustrates that the effects are certainly not straightforward and require greater consideration.

      Strengths:

      (1) The detailed approach to obtaining important information about MHT use from primary care records. Prior studies have suggested that factors such as estrogen/progestin type, route of administration, duration, and timing of use relative to menopause onset can contribute to whether MHT benefits brain health.

      (2) Consideration of type of menopause (spontaneous, or surgical) in the analysis, as well as sensitivity diagnoses to rule out the effect being driven by those with clinical conditions.

      (3) The incorporation of the brain age estimate along with hippocampal volume to address brain health.

      (4) The complex data are also well explained and interpretations are reasonable.

      (5) Limitations of the UK Biobank data are acknowledged.

      Weaknesses:

      (1) Lifestyle factors are listed and the authors acknowledge group differences (at least between current users and never users of MHT). I was not able to find these analyses showing these differences.

      (2) The distribution of women who were not menopausal was unequal across groups, and while the authors acknowledge this, one wonders to what extent this explains the observed findings.

      (3) While the interpretations are reasonable, and relevant theories (healthy cell & critical window) are mentioned, the discussion is missing a more zoomed-out perspective of the findings. While I appreciate wanting to limit speculation, the reader is left having to synthesize a lot of complex details on their own. A particularly difficult finding to reconcile is under what conditions these women benefit from MHT and when do they not (and why that may be).

    3. Reviewer #2 (Public review):

      Summary:

      In this observational study, Barth et al. investigated the association between menopausal hormone therapy and brain health in middle- to older-aged women from the UK Biobank. The study evaluated detailed MHT data (never, current, or past user), duration of mHT use (age first/last used), history of hysterectomy with or without bilateral oophorectomy, APOE ε4 genotype, and brain characteristics in a large, population-based sample. The researchers found that current mHT use (compared to never-users), but not past use, was associated with a modest increase in gray and white matter brain age gap (GM and WM BAG) and a decrease in hippocampal volumes. No significant association was found between the age of mHT initiation and brain measures among mHT users. Longer duration of use and older age at last MHT use post-menopause were associated with higher GM and WM BAG, larger WMH volumes, and smaller hippocampal volumes. In a sub-sample, after adjusting for multiple comparisons, no significant associations were found between detailed mHT variables (formulations, route of administration, dosage) and brain measures. The association between mHT variables and brain measures was not influenced by APOE ε4 allele carrier status. Women with a history of hysterectomy with or without bilateral oophorectomy had lower GM BAG compared to those without such a history. Overall, these observational data suggest that the association between mHT use and brain health in women may vary depending on the duration of use and surgical history.

      Strengths:

      The study has several strengths, including a large, population-based sample of women in the UK, and comprehensive details of demographic variables such as menopausal status, history of oophorectomy/hysterectomy, genetic risk factors for Alzheimer's disease (APOE ε4 status), age at mHT initiation, age at last use, duration of mHT, and brain imaging data (hippocampus and WMH volume).

      In a sub-sample, the study accessed detailed mHT prescription data (formulations, route of administration, dosage, duration), allowing the researchers to study how these variables were associated with brain health outcomes. This level of detail is generally missing in observational studies investigating the association of mHT use with brain health.

      Weaknesses:

      While the study has many strengths, it also has some weaknesses. As highlighted in an editorial by Kantarci & Manson (2023), women with symptoms such as subjective cognitive problems, sleep disturbances, and elevated vasomotor symptoms combined with sleep disturbances tend to seek mHT more frequently than those without these symptoms. The authors of this study have also indicated that the need for mHT, which might be associated with these symptoms, may be an indicator of preexisting neurological changes, potentially reflecting worse brain health scores, including higher BAG, lower hippocampal volume, and/or higher WMH. However, the study could not report how many current users had these symptoms. Women with vasomotor symptoms who are using mHT are more likely to stay longer in the healthcare system than those without these symptoms and with no history of MHT use. The authors noted that the UK Biobank lacks detailed information on menopausal symptoms and perimenopausal staging, limiting the study's ability to understand how these variables influence outcomes.

      Earlier observational studies have reported conflicting results regarding the association between mHT use and the risk of dementia and brain health. Contrary to some observational studies, three randomized trials (WHI, KEEPS, ELITE) (Espeland et al 2013, Gleason et al 2015; Henderson et al 2016) demonstrated neither beneficial nor harmful effects of mHT (with varying doses and formulations) when initiated closer to menopause (<5 years). While strong efforts were made to run proper statistical analyses to investigate the association between mHT use and brain health, these results reflect mainly associations, but not causal relationships as also stated by the authors.

      Furthermore, observational studies have intrinsic limitations, such as a lack of control over switching mHT doses and formulations, a lack of laboratory measures to confirm mHT use, and reliance on self-reported data, which may not always be reliable. The authors caution that these findings should not guide individual-level decisions regarding the benefits versus risks of mHT use. However, the study raises new questions that should be addressed by randomized clinical trials to investigate the varying effects of MHT on brain health and dementia risk.

    4. Reviewer #3 (Public review):

      In this study Barth et al. present results of detailed analyses of the relationships between menopausal hormone therapy (MHT), APOE ε4 genotype, and measures of anatomical brain age in women in the UK Biobank. While past studies have investigated the links between some of these variables (including works by the authors themselves), this new study adds more detailed MHT variables, surgical status, and additional brain aging measures. The UK biobank sample is large, but it is a population cohort and many of the MHT measures are self-reported (as the authors point out). However, the authors present a solid analysis of the available information which shows associations between MHT user status, length of MHT use, as well as surgical status with brain age. However, as the authors themselves state, the results do not unequivocally support the neuroprotective or adverse effect of MHT on the brain. I think this work strengthens the case for the need of better-designed longitudinal studies investigating the effect of MHT on the brain in the peri/post-menopausal stage.

      Strengths:

      The authors addressed the statistical analyses rigorously. For example, multiple testing corrections, outlier removal, and sensitivity analysis were performed carefully. Ample background information is provided in the introduction allowing even individuals not familiar with the field to understand the motivation behind the work. The discussion section also does a great job of addressing open questions and limitations. Very detailed results of all statistical tests are provided either in the main text or in the supplementary information.

      Weaknesses:

      For me, the biggest weakness was the presentation of the results. As many variables are involved and past studies have investigated several of these questions, it would have helped to better clarify the analysis and questions that are addressed by this study in particular and what sets this work apart from past studies. The information is present in the manuscript but better organization might have helped. For example, a figure depicting the key questions near the beginning of the manuscript would have been very helpful for me. The Tables also contain a lot of information but I wonder if there might be a way to capture the most relevant information more succinctly (either in Table format or in a figure) for the main text.

      Another concern I had was the linear models investigating the effects of these MHT variables on the brain age gap. The authors have included "age" as one of the parameters in this analysis. I wonder if adding a quadratic age factor (age²) in the model might have improved the fit since many brain phenotypes tend to show quadratic brain age effects in the 40 to 80-year age range.
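
      The suggestion about a quadratic age term can be sketched with ordinary least squares. Everything below is an illustrative assumption: the outcome is synthetic and noise-free, the variable names are hypothetical, and the sketch does not reproduce the models or covariates used in the study.

```python
import numpy as np

# Synthetic ages spanning the 40 to 80-year range mentioned above.
age = np.linspace(40.0, 80.0, 41)
age_c = age - age.mean()  # centering keeps age and age^2 less collinear

# Hypothetical outcome with a purely quadratic dependence on age.
outcome = 0.5 + 0.01 * age_c**2


def residual_sum_of_squares(design: np.ndarray, y: np.ndarray) -> float:
    """Fit y ~ design by least squares and return the residual sum of squares."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return float(residuals @ residuals)


ones = np.ones_like(age_c)
linear_design = np.column_stack([ones, age_c])               # intercept + age
quadratic_design = np.column_stack([ones, age_c, age_c**2])  # + age^2

rss_linear = residual_sum_of_squares(linear_design, outcome)
rss_quadratic = residual_sum_of_squares(quadratic_design, outcome)
```

      With real data, the two fits would normally be compared using a criterion that penalizes the extra parameter, rather than raw residuals alone.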

    1. eLife Assessment

      This is an important study, supported by solid data, that suggests a model for diet selection in C. elegans. The significance is that while C. elegans has long been known to be attracted to bacterial volatiles, what specific bacterial volatiles may signify to C. elegans is largely unknown. This study also provides evidence for a possible odorant/GPCR pairing.

    2. Reviewer #1 (Public review):

      Summary:

      Siddiqui et al., investigate the question of how bacterial metabolism contributes to the attraction of C. elegans to specific bacteria. They show that C. elegans prefers three bacterial species when cultured in a leucine-enriched environment. These bacterial species release more isoamyl alcohol, a known C. elegans attractant, when cultured with leucine supplement than without leucine supplement. The study shows correlative evidence that isoamyl alcohol is produced from leucine by the Ehrlich pathway. In addition, they show that SRD-12 is likely a receptor for isoamyl alcohol because a null mutant of this receptor exhibits lower chemotaxis to isoamyl alcohol and lower preference for leucine-enriched bacteria.

      Strengths:

      (1) This study takes a creative approach to examine the question of what specific volatile chemicals released by bacteria may signify to C. elegans by examining both bacterial metabolism and C. elegans preference behavior. Although C. elegans has long been known to be attracted to bacterial metabolites, this study may be one of the first to examine the role of a specific bacterial metabolic pathway in mediating attraction.

      (2) A strength of the paper is the identification of SRD-12 as a likely receptor for isoamyl alcohol. The ligands for very few olfactory receptors have been identified in C. elegans and so this is a significant addition to the field. The srd-12 null mutant strain will likely be a useful reagent for many labs examining olfactory and foraging behaviors.

      Weaknesses:

      (1) The authors write that leucine metabolism via the Ehrlich pathway is required for the production of isoamyl alcohol by three bacteria (CEent1, JUb66, BIGb0170), but their evidence for this is correlation and not causation. They write that the gene ilvE is a bacterial homolog of the first gene in the yeast Ehrlich pathway (it would be good to include a citation for this) and that the gene is present in these three bacterial strains. In addition, they show that this gene, ilvE, is upregulated in CEent1 bacteria upon exposure to leucine. To show causation, they need to knock out ilvE in one of these strains and show that the bacteria no longer have increased isoamyl alcohol production when cultured with leucine and are no longer attractive to C. elegans.

      (2) The authors examine three bacterial strains for which C. elegans showed increased preference when grown with leucine supplementation vs. without leucine supplementation. However, there also appears to be a strong preference for another strain, JUb0393, when grown with leucine (Figure 1B). It would be good to include statistics and criteria for selecting the three strains.

      (3) Although the behavioral evidence that the srd-12 gene encodes a receptor for isoamyl alcohol is compelling, it does not meet the standard for showing that it is an olfactory receptor in C. elegans. To show it is indeed a likely receptor, one or more of the following should be done:

      (a) Calcium imaging of AWC neurons in response to isoamyl alcohol in the receptor mutant, with the expectation that the response would be reduced or abolished in the mutant compared to wild type.

      (b) A "receptor swap" experiment in which the SRD-12 receptor is expressed in the AWB repulsive neuron in the SRD-12 receptor mutant background, with the expectation that C. elegans will now be repelled by isoamyl alcohol in chemotaxis assays (experiment from Sengupta et al., 1996 odr-10 paper).

      (4) The authors conclude that C. elegans cannot detect leucine in chemotaxis assays. It is important to add the method for how the leucine chemotaxis assay was done in order to interpret these results. Because leucine is not volatile, if leucine is put on the plates immediately before the worms are added (as in a traditional odor chemotaxis assay), there is no leucine gradient for the worm to detect. It would be good to put leucine on the plate several hours before worms are introduced so that a leucine gradient can form for the worms to detect (for example, see Wakabayashi et al., 2009).

      (5) The name of the bacterial preference assay, "odor-only assay", is misleading. In the assay, C. elegans is exposed to both volatile chemicals (odors) and non-volatile chemicals because the bacteria are grown on the assay plate for 12 hours before the worms are introduced to the assay plate. In that time, the bacteria are likely releasing non-volatile metabolites into the plate, which may affect the worm's preference. A true odor-only assay would have the bacteria on the lid and the worms on the plate.

      (6) The findings of the study should be discussed more in the context of prior literature. For example, AWC neurons have been previously shown to be involved in bacterial preference (Harris et al., 2014; Worthy et al., 2018). In addition, CeMbio bacterial strains (the strains examined in this study) have been previously shown to release isoamyl alcohol (Chai et al. 2024).

    3. Reviewer #2 (Public review):

      Summary:

      Siddiqui et al. show that C. elegans prefers certain bacterial strains that have been supplemented with the essential amino acid (EAA) leucine. They convincingly show that some leucine-enriched bacteria increase their production of isoamyl alcohol (IAA). IAA is an attractive odorant that is sensed by the AWC neurons. The authors identify a receptor, SRD-12, that is expressed in the AWC chemosensory neurons and is required for chemotaxis to IAA. The authors propose that IAA is a predominant olfactory cue that determines diet preference in C. elegans. Since leucine is an EAA, the authors propose that IAA sensing provides the worm with a proxy mechanism to identify EAA-rich diets.

      Strengths:

      The authors propose IAA as a predominant olfactory cue that determines diet preference in C. elegans, providing a molecular mechanism underlying diet selection. They show that wild isolates of C. elegans have a strong chemotactic response to IAA, indicating that IAA is an ecologically relevant odor for the worm. The paper is well written, and the presented data are convincing and well organized. This is an interesting paper that connects chemotactic response with bacterially produced odors and thus provides an understanding of how animals adapt their foraging behavior through the perception of molecules that may indicate nutritional value.

      Weaknesses:

      Major:

      While I do like the way the authors frame C. elegans IAA sensing as a mechanism to identify leucine (EAA) rich diets, it is not fully clear whether bacterial IAA production is a proxy for bacterial leucine levels.

      (1) Can the authors measure leucine (or other EAA) content of the different CeMbio strains? This would substantiate the premise in the way they frame this in the introduction. While the authors convincingly show that leucine supplementation induces IAA production in some strains, it is not clear if there are lower leucine levels in the non-preferred strains.

      (2) It is not clear whether the non-preferred bacteria in Figure 1A and 1B have the ability to produce IAA. To substantiate the claim that C. elegans prefers CEent1, JUb66, and BIGb0170 due to their ability to generate IAA from leucine, it would help to measure IAA levels in non-preferred bacteria (+ and - leucine supplementation). If the authors have these data it would be good to include them.

      (3) The authors would strengthen their claim if they could show that deletion or silencing ilvE enzyme reduces IAA levels and eliminates the increased preference upon leucine supplementation.

      (4) While the three preferred bacteria possess the ilvE gene, it is not clear whether this enzyme is present in the other non-preferred bacterial strains. As far as I know, the CeMbio strains have been sequenced so it should be easy to determine if the non-preferred bacteria possess the capacity to make IAA. Does the expression of ilvE in e.g. E. coli increase its preference index or are the other genes in the biosynthesis pathway missing?

      (5) It is strongly implied that leucine-rich diets are beneficial to the worm. Do the authors have data to show the effect of leucine supplementation on C. elegans healthspan, lifespan or brood size?

      Other comments:

      Page 6. Figure 2c. While the authors' conclusions are correct based on the AWC experiments, it would be good at this stage to include the possibility that odors that are enriched in the absence of leucine may be aversive.

      Page 6. IAA increases 1.2-4 fold upon leucine supplementation. If the authors perform a chemotaxis assay with just IAA at 1.2-4 fold differences, do they get the shift in preference index seen with the bacteria? That is, is the difference in IAA concentration sufficient to explain the shift in bacterial PI upon leucine supplementation? Other attractants such as acetoin and isobutanol go up in -Leu conditions.

      Page 14-15. The authors identify a putative IAA receptor based on expression studies. I compliment the authors for isolating two CRISPR deletion alleles. They show that the srd-12 mutants have obvious defects in IAA chemotaxis. Very few ligand-odorant receptor combinations have been identified, so this is an important discovery. CeNGEN data indicate that srd-12 is expressed in a limited set of neurons. Did the authors generate a reporter to show the expression of srd-12? This is a simple experiment that would add to the characterization of the SRD-12 receptor. Rescue experiments would be nice even though the authors have independent alleles. To truly claim that SRD-12 is the receptor for IAA and activates the AWC neurons would require GCaMP experiments in the AWC neuron or a heterologous expression system. I understand that GCaMP imaging might not be part of the regular arsenal of the lab, but it would be a great addition (even in collaboration with one of the many labs that do this regularly). Comparing AWC activity using GCaMP in response to IAA-producing bacteria with high leucine levels in both wild-type and SRD-12-deficient backgrounds would further support their narrative. I leave that to the authors.

      Minor:

      Page 4. "These results suggested that worms can forage for diets enriched in specific EAA, leucine...." More precise at this stage would be to state "These results indicated that worms can forage for diets supplemented with specific EAA...".

      Page 5."these findings suggested that worms not only rely on odors to choose between two bacteria but also to find leucine enriched bacteria" This statement is not clear to me and doesn't follow the data in Fig. S2. Preferred diets in odorant assays are the IAA producing strains.

      Page 5. Figure S2A provides nice and useful data that can be part of the main Figure 1.

    4. Reviewer #3 (Public review):

      Summary:

      The authors first tested whether EAA supplementation increases olfactory preference for bacterial food for a variety of bacterial strains. Of the EAAs, they found only leucine supplementation increased olfactory preference (within a bacterial strain), and only for 3 of the bacterial strains tested. Leucine itself was not found to be intrinsically attractive.

      They determined that leucine supplementation increases isoamyl alcohol (IAA) production in the 3 preferred bacterial strains. They identify the biochemical pathway that catabolizes leucine to IAA, showing that a required enzyme for this pathway is upregulated upon supplementation.

      Consistent with earlier studies, they find that AWC olfactory neuron is primarily responsible for increased preference for IAA-producing bacteria.

      They tested volatile compounds produced by the bacteria and identified by GC/MS, and found several to be attractive, most of which require AWC for the full effect. Adaptation assays were used to show that odorant levels produced by bacterial lawns were sufficient to induce olfactory adaptation, and adaptation to IAA reduced chemotaxis to leucine-supplemented lawns. They then showed that IAA attractiveness is conserved across wild strains, while other compounds are more variable, suggesting IAA is a principal foraging cue.

      Finally, using the CeNGEN database, they developed a list of candidate IAA receptors. Using behavioral tests, they show that mutation of srd-12 greatly impairs IAA chemotaxis without affecting locomotion or attraction to another AWC-sensed odor, PEA.

      Comments

      This study will be of great interest in the field of C. elegans behavior, chemical senses and chemical ecology, and understanding of the sensory biology of foraging.

      Strengths:

      The identification of a receptor for IAA is an excellent finding. The combination of microbial metabolic chemistry and the use of natural bacteria and nematode strains makes an extremely compelling case for the ecological and adaptive relevance of the findings.

      Weaknesses:

      AWC receives synaptic input from other chemosensory neurons, and thus could potentially mediate navigation behaviors to compounds detected in whole or in part by those neurons. Language concluding detection by AWC should be moderated (e.g. p9 "worms sense an extensive repertoire...predominantly using AWC") unless it has been demonstrated.

      srd-12 is not exclusively expressed in AWC. Normally, cell-specific rescue or knockdown would be used to demonstrate function in a specific cell. The authors should provide such a demonstration or explain why they are confident srd-12 acts in AWC.

      A comparison of AWC's physiological responses between WT and srd-12, preferably in an unc-13 background, would be nice. Even further, the expression of srd-12 in a different neuron type and showing that it confers responsiveness to IAA (in this case, inhibition) would be very convincing.

    1. eLife assessment

      This important study advances our understanding of the role energy metabolism, specifically anaerobic glycolysis, plays during development. Convincing genetic and pharmacological evidence demonstrates that glycolytic flux is not only necessary during retinogenesis but also controls the rate of retinal progenitor cell proliferation and photoreceptor maturation. Interesting evidence suggests potential downstream roles for intracellular pH and Wnt/β-catenin signaling; however, more direct evidence is needed to show they are the key mediators of glycolysis. This work is expected to stimulate broad interest and possible future studies investigating the link between metabolism and development in other tissue systems.

    2. Reviewer #1 (Public review):

      Summary:

      This paper seeks to understand the upstream regulation and downstream effectors of glycolysis in retinal progenitor cells, using mouse retinal explants as the main model system. The paper presents evidence that high glycolysis in retinal progenitor cells is required for their proliferation and timely differentiation into photoreceptors. Retinal glycolysis increases after the deletion of Pten. The authors suggest that high glycolysis controls cell proliferation and differentiation by promoting intracellular alkalinization, beta-catenin acetylation and stabilization, and consequent activation of the canonical Wnt pathway.

      Strengths:

      (1) The experiments showing that PFKFB3 overexpression is sufficient to increase the proliferation of retinal progenitors (which are already highly dividing cells) and photoreceptor differentiation are striking and the result is unanticipated. It suggests that glycolytic flux is normally limiting for proliferation in embryos.

      (2) Likewise the result that an increase in pH from 7.4 to 8.0 is sufficient to increase proliferation implies that pH regulation may have instructive roles in setting the tempo of retinal development and embryonic cell proliferation. Similarly, the results show that acetate supplementation increases proliferation (I think this result should be moved to the main figures).

      Weaknesses:

      (1) Epistatic experiments to test if changes in pH mediate the effects of glycolysis on photoreceptor differentiation, or if Wnt activation is the main downstream effector of glycolysis in controlling differentiation are not presented.

      (2) It is likely that metabolism changes ex vivo vs in vivo, and therefore stable isotope tracing experiments in the explants may not reflect in vivo metabolism.

      (3) The retina at P0 is composed of both progenitors and differentiated cells. It is not clear if the results of the RNA-seq and metabolic analysis reflect changes in the metabolism of progenitors, or of mature cells, or changes in cell type composition rather than direct metabolic changes in a specific cell type.

      (4) The biochemical links between elevated glycolysis and pH and beta-catenin stability are unclear. White et al found that higher pH decreased beta-catenin stability (JCB 217: 3965) in contrast to the results here. Oginuma et al found that inhibition of glycolysis or beta-catenin acetylation does not affect beta-catenin stability (Nature 584:98), again in contrast to these results. Another paper showed that acidification inhibits Wnt signaling by promoting the expression of a transcriptional repressor and not via beta-catenin stability (Cell Discovery 4:37). There are also additional papers showing increased pH can promote cell proliferation via other mechanisms (e.g. Nat Metab 2:1212). It is possible that there is organ-specificity in these signaling pathways however some clarification of these divergent results is warranted.

      (5) The gene expression analysis is not completely convincing. E.g. the expression of additional glycolytic genes should be shown in Figure 1. It is not clear why Hk1 and Pgk1 are specifically shown, and conclusions about changes in glycolysis are difficult to draw from the expression of these two genes. The increase in glycolytic gene expression in the Pten-deficient retina is generally small.

      (6) Is it possible that glycolytic inhibition with 2DG slows down the development and production of most newly differentiated cells rather than specifically affecting photoreceptor differentiation?

      (7) Are the prematurely-born cells caused by PFKFB3 overexpression photoreceptors as assessed by morphology or markers (in addition to position)?

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Hanna et al. addresses the question of energy metabolism in the retina, a neuronal tissue with an inordinately high energy demand. Paradoxically, the retina appears to employ to a large extent glycolysis to satisfy its energetic needs, even though glycolysis is far less efficient than oxidative phosphorylation (OXPHOS). The focus of the present study is on the early development of the retina and the retinal progenitor cells (RPCs) that proliferate and differentiate to form the seven main classes of retinal neurons. The authors use different genetic and pharmacological manipulations to drive the metabolism of RPCs or the retina towards higher or lower glycolytic activity. The results obtained suggest that increased glycolytic activity in early retinal development produces a more rapid differentiation of RPCs, resulting in a more rapid maturation of photoreceptors and photoreceptor segment growth. The study is significant in that it shows how metabolic activity can determine cell fate decisions in retinal neurons.

      Strengths:

      This study provides important findings that are highly relevant to the understanding of how early metabolism governs the development of the retina. The outcomes of this study could also be relevant for human diseases that affect early retinal development, including retinopathy of prematurity, where increased oxygenation likely causes a disturbance of energy metabolism.

      Weaknesses:

      The restriction to only relatively early developmental time points makes it difficult to assess the consequences of the different manipulations on the (more) mature retina. Notably, it is conceivable that early developmental manipulations, while producing relevant effects in the young post-natal retina, may "even out" and may no longer be visible in the mature, adult retina.

    4. Reviewer #3 (Public review):

      Summary:

      This study examines the metabolic regulation of progenitor proliferation and differentiation in the developing retina. The authors observe dynamic changes in glycolytic gene expression in retinal progenitors and use various strategies to test the role of glycolysis. They find that elevated glycolysis in Pten-cKO retinas results in alteration of RPC fate, while inhibition of glycolysis has converse effects. They specifically test the role of elevated glycolysis using dominant active cytoPFKB3, which demonstrates the selective effects of elevated glycolysis on progenitor proliferation and rod differentiation. They then show that elevated glycolysis modulates both pHi and Wnt signaling, and provide evidence that these pathways impact proliferation and differentiation of progenitors, particularly affecting rod photoreceptor differentiation.

      Strengths:

      This is a compelling and rigorous study that provides an important advance in our understanding of metabolic regulation of retina development, addressing a major gap in knowledge. A key strength is that the study utilizes multiple genetic and pharmacological approaches to address how both increased or decreased glycolytic flux affect retinal progenitor proliferation and differentiation. They discover elevated Wnt signaling pathway genes in Pten cKO retina, revealing a potential link between glycolysis and Wnt pathway activation. Altogether the study is comprehensive and adds to the growing body of evidence that regulation of glycolysis plays a key role in tissue development.

      Weaknesses:

      (1) Following the expression of cytoPFKB3, which results in increased glycolytic flux, BrdU labeling was performed at E12.5 and increased labeled cells were detected in the outer nuclear layer. However, whether these are cones or rods is not established. The rest of the analysis is focused on the precocious maturation of rhodopsin-labeled outer segments, and the major conclusions emphasize rod photoreceptor differentiation. Therefore it is unclear whether there is an effect on cone differentiation for either Pten cKO or cytoPFKB3 transgenic retina. It is also not established whether rods are born precociously. Presumably, this would be best detected by BrdU labeling at later embryonic stages.

      (2) The authors find that there is upregulation of multiple Wnt pathway components in Pten cKO retina. They further show that inhibiting Wnt signaling phenocopies the effects of reducing glycolysis. However, they do not test whether pharmacological inhibition of Wnt signaling reverses the effects of high glycolytic activity in Pten cKO retinas. Thus the argument that Wnt is a key downstream effector pathway regulating rod photoreceptor differentiation is weak.

      (3) The use of sodium acetate to force protein acetylation is quite non-specific and will have effects beyond beta-catenin acetylation (which the authors acknowledge). Thus it is a stretch to state that "forced activation of beta-catenin acetylation" mimics the impact of Pten loss/high glycolytic activity in RPCs since the effects could be due to acetylation of other proteins.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This paper seeks to understand the upstream regulation and downstream effectors of glycolysis in retinal progenitor cells, using mouse retinal explants as the main model system. The paper presents evidence that high glycolysis in retinal progenitor cells is required for their proliferation and timely differentiation into photoreceptors. Retinal glycolysis increases after the deletion of Pten. The authors suggest that high glycolysis controls cell proliferation and differentiation by promoting intracellular alkalinization, beta-catenin acetylation and stabilization, and consequent activation of the canonical Wnt pathway.

      Strengths:

      (1) The experiments showing that PFKFB3 overexpression is sufficient to increase the proliferation of retinal progenitors (which are already highly dividing cells) and photoreceptor differentiation are striking and the result is unanticipated. It suggests that glycolytic flux is normally limiting for proliferation in embryos.

      In our BrdU birthdating experiment, we showed that PFKB3 expression drives the precocious differentiation of retinal progenitor cells (RPCs) into photoreceptors. However, we did not determine if there is an associated change in the number of dividing RPCs. To examine the proliferative status of PFKB3-overexpressing RPCs, we will perform short-term BrdU labeling to measure the number of RPCs in S-phase of the cell cycle. Additionally, we will count the number of RPCs expressing pHH3, a mitotic marker, and Ki67, a marker of cycling cells in all cell cycle phases.

      (2) Likewise the result that an increase in pH from 7.4 to 8.0 is sufficient to increase proliferation implies that pH regulation may have instructive roles in setting the tempo of retinal development and embryonic cell proliferation. Similarly, the results show that acetate supplementation increases proliferation (I think this result should be moved to the main figures).

      We thank the reviewer for these positive comments on our work. We will move the acetate data to the main figure as requested.

      Weaknesses:

      (1) Epistatic experiments to test if changes in pH mediate the effects of glycolysis on photoreceptor differentiation, or if Wnt activation is the main downstream effector of glycolysis in controlling differentiation are not presented.

      Traditionally, epistasis is tested using double knock-out (DKO) studies with null mutant alleles. If two genes operate in the same pathway, the downstream phenotype prevails, whereas phenotypic worsening is observed if two genes act in parallel pathways. Our data suggests the following order of events: Pten↓ → glycolysis↑ → intracellular pH↑ → Wnt signaling↑ → photoreceptor differentiation. In this model, Wnt signaling is the downstream-most effector. To test our epistatic model, we will assess RPC proliferation and the differentiation of Crx+ photoreceptor precursors with the following assays:

      (1) To confirm that Wnt signaling acts downstream of Pten, we will generate DKOs of Pten and Ctnnb1, a downstream effector of Wnt signaling. We know that fewer photoreceptors are generated in single Pten-cKO and Ctnnb1-cKO retinas, with a disruption of the outer nuclear layer only in Ctnnb1-cKOs. If Pten and Wnt act in the same pathway, Pten;Ctnnb1 DKOs will resemble single Ctnnb1-cKOs.

      (2) While epistasis is traditionally examined using genetic mutants, we will perform proxy experiments using pharmacological agents. To test whether Wnt activation acts downstream of a pH increase, we will activate Wnt signaling with recombinant Wnt3a at high and low pH. While low pH inhibits photoreceptor differentiation, if Wnt signaling is downstream, it should promote differentiation even at low pH. Conversely, we will alter pH in the presence of a Wnt inhibitor, FH535, which should block the positive effects of high pH on photoreceptor differentiation.

      (3) To test whether Wnt activation acts downstream of glycolysis to increase photoreceptor differentiation, we will apply recombinant Wnt3a to retinal explants while simultaneously inhibiting glycolysis with 2DG. While 2DG inhibits photoreceptor differentiation, if Wnt signaling is downstream, it should still be able to promote differentiation.

      (4) To test whether pharmacological inhibition of Wnt signaling reverses the effects of high glycolytic activity in Pten cKO retinas, we will treat wild-type and Pten-cKO retinas with the Wnt inhibitor FH535 and/or the glycolytic inhibitor 2DG.
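
      The interpretive rule stated above (a double mutant that resembles one single mutant suggests a shared pathway, whereas a worsened phenotype suggests parallel pathways) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name, the tolerance, and the phenotype scores (normalized so that wild type = 1.0 and lower values are more severe) are hypothetical and are not data or analysis code from the study.

```python
def classify_double_mutant(single_a: float, single_b: float,
                           double: float, tol: float = 0.1) -> str:
    """Interpret a double-mutant phenotype score against two single mutants.

    Scores are hypothetical, normalized to wild type = 1.0; lower = more severe.
    `tol` is an arbitrary threshold for calling two scores "the same".
    """
    def close(x: float, y: float) -> bool:
        return abs(x - y) <= tol

    # If the two singles already look alike, the double cannot order them.
    if close(single_a, single_b) and close(double, single_a):
        return "uninformative: single mutants are indistinguishable"
    # Clearly worse than either single: consistent with parallel pathways.
    if double < min(single_a, single_b) - tol:
        return "enhanced: consistent with parallel pathways"
    # Double resembles one single: that gene's phenotype prevails.
    if close(double, single_b):
        return "same pathway: phenotype of gene B prevails"
    if close(double, single_a):
        return "same pathway: phenotype of gene A prevails"
    return "unclear: intermediate phenotype"


# Hypothetical scores only: gene A mildly affected, gene B strongly affected.
print(classify_double_mutant(single_a=0.7, single_b=0.4, double=0.4))
print(classify_double_mutant(single_a=0.7, single_b=0.4, double=0.2))
```

      In the proposed Pten;Ctnnb1 experiment, the first call corresponds to the predicted outcome in which the DKO resembles the single Ctnnb1-cKO; a real analysis would compare replicate measurements statistically rather than apply a fixed tolerance.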

      (2) It is likely that metabolism changes ex vivo vs in vivo, and therefore stable isotope tracing experiments in the explants may not reflect in vivo metabolism.

      We agree with the reviewer that metabolism likely changes ex vivo compared to in vivo. However, we did not perform stable isotope tracing experiments to directly examine glycolytic flux in this study. While outside the scope of the current study, this type of analysis is an important future direction that we will bring up in the discussion.

      (3) The retina at P0 is composed of both progenitors and differentiated cells. It is not clear if the results of the RNA-seq and metabolic analysis reflect changes in the metabolism of progenitors, or of mature cells, or changes in cell type composition rather than direct metabolic changes in a specific cell type.

      We mined a scRNA-seq dataset to show that Pgk1, a rate-limiting enzyme for glycolysis, is specifically elevated in early-stage RPCs versus later stage. We have since analysed additional glycolytic pathway genes, and observed a similar enrichment of Pfkl, Eno1 and Slc16a3 transcripts in early RPCs, while other genes were equally expressed in both early and late RPCs.

      To functionally demonstrate that there are differences in glycolysis between early and late RPCs, we will use CD133 to sort RPCs at E15 (early) and P0 (late). We will perform qPCR on sorted cells to validate the transcriptional differences in glycolytic gene expression. Additionally, we will perform two proxy measures of glycolysis: 1) We will measure lactate levels in sorted RPCs at both stages, and 2) We will use a Seahorse assay and assess ECAR in sorted RPCs at both stages.

      (4) The biochemical links between elevated glycolysis and pH and beta-catenin stability are unclear. White et al found that higher pH decreased beta-catenin stability (JCB 217: 3965) in contrast to the results here. Oginuma et al found that inhibition of glycolysis or beta-catenin acetylation does not affect beta-catenin stability (Nature 584:98), again in contrast to these results. Another paper showed that acidification inhibits Wnt signaling by promoting the expression of a transcriptional repressor and not via beta-catenin stability (Cell Discovery 4:37). There are also additional papers showing increased pH can promote cell proliferation via other mechanisms (e.g. Nat Metab 2:1212). It is possible that there is organ-specificity in these signaling pathways however some clarification of these divergent results is warranted.

      The pleiotropic actions of Wnt signaling on cell proliferation and differentiation are well known, even shifting from pro-proliferative to anti-proliferative depending on tissue or cell type. It is thus not surprising that different studies found unique effects of pH and glycolysis on β-catenin modifications and the activation of downstream signaling. Thus, as suggested by the reviewer, the difference between our data and other studies could be attributed to tissue and organism. In our revision, we will more fully assess our findings in the context of published studies, as recommended by the reviewer.

      To summarize our data, in the developing retina, we found that non-phosphorylated β-catenin protein levels increase in Pten-cKO retinas in vivo, while conversely, non-phosphorylated β-catenin protein levels decrease upon 2DG treatment and at low pH 6.5 in vitro.

      The Oginuma et al. 2020 (Nature 584: 98-101) study was performed on the chick tailbud and investigated lineage decisions by neuromesodermal progenitors in the presomitic mesoderm. In this context, WNT activity, glycolysis and pHi all decline in tandem, complementary to our findings. However, Oginuma et al. found that while phosphorylated and non-phosphorylated β-catenin levels do not vary, K49 β-catenin acetylation is reduced at low pHi. In their system, K49 β-catenin acetylation is associated with a switch in cell fate choice from neural to mesodermal in the chick tailbud. We will now assess this modification.

      Hauck et al. 2021 (Cell Death & Differentiation 28:1398-1417) found that by mutating Pkm, a rate-limiting glycolytic enzyme, β-catenin can more efficiently shuttle to the nucleus to activate Wnt-signaling and promote cardiomyocyte proliferation. This study highlights the importance of examining β-catenin protein levels in both cytoplasmic and nuclear fractions. They also examined transcriptional targets of Wnt signaling, such as Axin2, Ccnd1, Myc, Sox2 and Tnnt3, which we will also now assess.

      In a separate study in cancer cells, high pH leads to increased expression of Ccnd1, a β-catenin target gene, and promotes proliferation (Koch et al. 2020. Nat Metab. 2:1212-1222). These findings are consistent with our demonstration that β-catenin levels are stabilized at pH 8, and RPC proliferation is enhanced. A separate study by Melnik et al 2018 (Cell Discovery 4:37) performed in cancer cells found that acidification induced by metformin indirectly suppresses Wnt signaling by activating the DDIT3 transcriptional repressor, consistent with our data showing low pH suppresses β-catenin stability. Melnik et al also used Mcl inhibitors, as we did in our study, and showed that this treatment blocked Wnt signaling. While we did not look at the impact of CNCn on Wnt signaling, we did see a decline in proliferation, as expected if Wnt levels are low. The relationship between CNCn and Wnt activity will now be assessed.

      The one study that fits less well is from Czowski and White (BioRxiv), where they found that higher pH levels decrease β-catenin levels in the cytoplasm, nucleus and junctional complexes in MDCK cells. In this study, the authors altered pH using inhibitors for a sodium-proton exchanger and a sodium bicarbonate transporter. The Oginuma paper instead used the ionophores nigericin and valinomycin to equilibrate intracellular pHi to media pH, which we will now incorporate into our study.

      In summary, to more comprehensively examine the link between Pten loss, glycolytic activity, pHi and Wnt signaling, we will examine levels of phosphorylated, non-phosphorylated and K49 acetylated β-catenin after each manipulation (i.e., Pten loss, pH manipulations, CNCn treatment, glycolysis inhibition, acetate treatments). For pH manipulations, we will use nigericin and valinomycin to equilibrate pH. These studies will be performed on cytoplasmic and nuclear fractions from CD133+ MACS-enriched RPCs, to add cell type and stage specificity to our study. We will also use qPCR to examine Wnt signaling genes, such as Axin2, Ccnd1, Myc, Sox2 and Tnnt3.

      (5) The gene expression analysis is not completely convincing. E.g. the expression of additional glycolytic genes should be shown in Figure 1. It is not clear why Hk1 and Pgk1 are specifically shown, and conclusions about changes in glycolysis are difficult to draw from the expression of these two genes. The increase in glycolytic gene expression in the Pten-deficient retina is generally small.

      See response to point 3.

      (6) Is it possible that glycolytic inhibition with 2DG slows down the development and production of most newly differentiated cells rather than specifically affecting photoreceptor differentiation?

      We thank the reviewer for this excellent suggestion. We will examine the impact of 2DG on the differentiation of other retinal cell types, including bipolar and amacrine cells and Muller glia. For technical reasons, we will exclude ganglion cells, which die in culture and cannot be examined in explants, and horizontal cells, which are a rare cell type and hence difficult to quantify accurately.

      (7) Are the prematurely-born cells caused by PFKFB3 overexpression photoreceptors as assessed by morphology or markers (in addition to position)?

      We will immunostain treated retinas with additional cell-type specific markers to examine rod and cone photoreceptor numbers and morphologies.

      Reviewer #2 (Public review):

      Summary:

      The manuscript by Hanna et al. addresses the question of energy metabolism in the retina, a neuronal tissue with an inordinately high energy demand. Paradoxically, the retina appears to rely to a large extent on glycolysis to satisfy its energetic needs, even though glycolysis is far less efficient than oxidative phosphorylation (OXPHOS). The focus of the present study is on the early development of the retina and the retinal progenitor cells (RPCs) that proliferate and differentiate to form the seven main classes of retinal neurons. The authors use different genetic and pharmacological manipulations to drive the metabolism of RPCs or the retina towards higher or lower glycolytic activity. The results obtained suggest that increased glycolytic activity in early retinal development produces a more rapid differentiation of RPCs, resulting in a more rapid maturation of photoreceptors and photoreceptor segment growth. The study is significant in that it shows how metabolic activity can determine cell fate decisions in retinal neurons.

      Strengths:

      This study provides important findings that are highly relevant to the understanding of how early metabolism governs the development of the retina. The outcomes of this study could also be relevant for human diseases that affect early retinal development, including retinopathy of prematurity, where increased oxygenation likely causes a disturbance of energy metabolism.

      We thank the reviewer for these positive comments on our study.

      Weaknesses:

      The restriction to only relatively early developmental time points makes it difficult to assess the consequences of the different manipulations on the (more) mature retina. Notably, it is conceivable that early developmental manipulations, while producing relevant effects in the young post-natal retina, may "even out" and may no longer be visible in the mature, adult retina.

      While we agree that it would be interesting to observe the long-term consequences of our manipulations, we are limited by our retinal explant model, which can at best be cultured for 2 weeks in vitro. Additional limitations include the lack of photoreceptor outer segment development in our in vitro model. However, we can perform more extensive analyses of our genetic models in vivo (i.e., Pten-cKO, cyto-PFKFB3-GOF, Ctnnb1-cKO). For these lines, we will focus on more in-depth analyses of photoreceptor differentiation and outer segment maturation using additional markers and one later stage of development.

      Reviewer #3 (Public review):

      Summary:

      This study examines the metabolic regulation of progenitor proliferation and differentiation in the developing retina. The authors observe dynamic changes in glycolytic gene expression in retinal progenitors and use various strategies to test the role of glycolysis. They find that elevated glycolysis in Pten-cKO retinas results in alteration of RPC fate, while inhibition of glycolysis has converse effects. They specifically test the role of elevated glycolysis using dominant active cytoPFKFB3, which demonstrates the selective effects of elevated glycolysis on progenitor proliferation and rod differentiation. They then show that elevated glycolysis modulates both pHi and Wnt signaling, and provide evidence that these pathways impact proliferation and differentiation of progenitors, particularly affecting rod photoreceptor differentiation.

      Strengths:

      This is a compelling and rigorous study that provides an important advance in our understanding of metabolic regulation of retina development, addressing a major gap in knowledge. A key strength is that the study utilizes multiple genetic and pharmacological approaches to address how both increased and decreased glycolytic flux affect retinal progenitor proliferation and differentiation. The authors discover elevated Wnt signaling pathway genes in Pten cKO retina, revealing a potential link between glycolysis and Wnt pathway activation. Altogether, the study is comprehensive and adds to the growing body of evidence that regulation of glycolysis plays a key role in tissue development.

      We thank the reviewer for these positive comments on our study.

      Weaknesses:

      (1) Following the expression of cytoPFKFB3, which results in increased glycolytic flux, BrdU labeling was performed at E12.5 and increased labeled cells were detected in the outer nuclear layer. However, whether these are cones or rods is not established. The rest of the analysis is focused on the precocious maturation of rhodopsin-labeled outer segments, and the major conclusions emphasize rod photoreceptor differentiation. Therefore, it is unclear whether there is an effect on cone differentiation for either Pten cKO or cytoPFKFB3 transgenic retina. It is also not established whether rods are born precociously. Presumably, this would be best detected by BrdU labeling at later embryonic stages.

      We agree with the reviewer that we should expand our study to also examine cone differentiation and outer segment maturation, which we will now do by adding additional markers to our study.

      (2) The authors find that there is upregulation of multiple Wnt pathway components in Pten cKO retina. They further show that inhibiting Wnt signaling phenocopies the effects of reducing glycolysis. However, they do not test whether pharmacological inhibition of Wnt signaling reverses the effects of high glycolytic activity in Pten cKO retinas. Thus the argument that Wnt is a key downstream effector pathway regulating rod photoreceptor differentiation is weak.

      See Reviewer 1, point 1

      (3) The use of sodium acetate to force protein acetylation is quite non-specific and will have effects beyond beta-catenin acetylation (which the authors acknowledge). Thus it is a stretch to state that "forced activation of beta-catenin acetylation" mimics the impact of Pten loss/high glycolytic activity in RPCs since the effects could be due to acetylation of other proteins.

      As outlined in our response to Reviewer #1, point 4, we will now assess K49 β-catenin acetylation levels, as conducted by Oginuma et al. This analysis will allow us to determine whether β-catenin acetylation is altered with manipulations of Pten, glycolysis, pH or acetate treatments.

    1. eLife Assessment

      This important study asks whether motor neurons within the vestibulo-ocular circuit of zebrafish are required to determine the identity, connectivity, and function of upstream premotor neurons. The authors provide compelling and comprehensive genetic, anatomical and behavioral evidence that the answer is, "No!". This work will be of general interest to developmental neurobiologists and will motivate future studies of whether motor neurons are dispensable for assembly of other sensorimotor neural circuits.

    2. Reviewer #1 (Public review):

      Summary:

      This study has as its goal to determine how the structure and function of the circuit that stabilizes gaze in the larval zebrafish depends on the presence of the output cells, the motor neurons. A major model of neural circuit development posits that the wiring of neurons is instructed by their postsynaptic cells, transmitting signals retrogradely on which cells to contact and, by extension, where to project their axons. Goldblatt et al. remove the motor neurons from the circuit by generating null mutants for the phox2a gene. The study then shows that, in this mutant that lacks the isl1-labelled extraocular motor neurons, the central projection neurons have 1) largely normal responses to vestibular input; 2) normal gross morphology; 3) minimally changed transcriptional profiles. From this, the authors conclude that the wiring of the circuit is not instructed by the output neurons, refuting the major model.

      Strengths:

      I found the manuscript to be exceptionally well-written and presented, with clear and concise writing and effective figures that highlight key concepts. The topic of neural circuit wiring is central to neuroscience, and the paper's findings will interest researchers across the field, and especially those focused on motor systems.

      The experiments conducted are clever and of a very high standard, and I liked the systematic progression of methods to assess the different potential effects of removing phox2a on circuit structure and function. Analyses (including statistics) are comprehensive and appropriate and show the authors are meticulous and balanced in most of the conclusions that they draw. Overall, the findings are interesting and should leave little doubt about the paper's main conclusions.

      Weaknesses:

      All conclusions are supported by the data, and the characterisation of the effects of the main manipulation in the study, removing phox2a to take out the extra-ocular motor neurons, is extensive. I cannot see weaknesses that affect the conclusions in this manuscript.

      The study raises interesting questions that could be addressed in future work, which would further explain how the projection neurons develop. While the cells that would have been extraocular motor neurons are still there in phox2a mutants, they can no longer be called motor neurons as they lack expression of vachta and isl1. It would therefore be interesting to see what effect an alternative manipulation, e.g., the physical removal of the motor neurons using laser ablation, would have. Furthermore, the motor neurons are dispensable for the projection neurons' wiring, but the projection neurons innervate several other cell types that could affect their development. A future project could determine the precise contribution of each postsynaptic population to the projection neurons' development.

    3. Reviewer #2 (Public review):

      Summary:

      This study was designed to test the hypothesis that motor neurons play a causal role in circuit assembly of the vestibulo-ocular reflex circuit, which is based on the retrograde model proposed by Hans Straka. This circuit consists of peripheral sensory neurons, central projection neurons, and motor neurons. The authors hypothesize that loss of extraocular motor neurons, through CRISPR/Cas9 mutagenesis of the phox2a gene, will disrupt sensory selectivity in presynaptic projection neurons if the retrograde model is correct.

      Account of the major strengths and weaknesses of the methods and results:

      The work presented is impressive in both breadth and depth, including the experimental paradigms. Overall, the main results were that the loss of function paradigm to eliminate extraocular motor neurons did not 1) alter the normal functional connections between peripheral sensory neurons and central projection neurons, 2) affect the position of central projection neurons in the sensorimotor circuit, or 3) significantly alter the transcriptional profiles of central projection neurons. Together, these results strongly indicate that retrograde signals from motor neurons are not required for the development of the sensorimotor architecture of the vestibulo-ocular circuit.

      Appraisal of whether the authors achieved their aims, and whether the results support their conclusions:

      The results of this study showed that extraocular motor neurons were not required for central projection neuron specification in the vestibulo-ocular circuit, which countered the prevailing retrograde hypothesis proposed for circuit assembly. A concern is that the results presented may be limited to this specific circuit and may not be generalizable to other circuit assemblies, even to other sensorimotor circuits.

      Discussion of the likely impact of the work on the field, and the utility of the methods and data to the community:

      As mentioned above, this study provides valuable new insights into the developmental organization of the vestibulo-ocular circuit. However, different circuits likely utilize various mechanisms, extrinsic or intrinsic (or both), to establish proper functional connectivity. So, although the results shown here begin to explain the developmental organization of the vestibulo-ocular circuit, whether they generalize to other circuits is debatable. At a minimum, this study provides a starting point for the examination of the patterning of connections in this and other sensorimotor circuits.

    4. Reviewer #3 (Public review):

      In this manuscript by Goldblatt et al. the authors study the development of a well-known sensorimotor system, the vestibulo-ocular reflex circuit, using Danio rerio as a model. The authors address whether motor neurons within this circuit are required to determine the identity, upstream connectivity and function of their presynaptic partners, central projection neurons. They approach this by generating a CRISPR-mediated knockout line for the transcription factor phox2a, which specifies the fate of extraocular muscle motor neurons. After showing that phox2a knockout ablates these motor neurons, the authors show that functionally, morphologically, and transcriptionally, projection neurons develop relatively normally.

      Overall, the authors present a convincing argument for the dispensability of motor neurons in the wiring of this circuit, although their claims about the generalizability of their findings to other sensorimotor circuits should be tempered. The study is comprehensive and employs multiple methods to examine the function, connectivity and identity of projection neurons.

      Comments on the revised version:

      The authors have addressed all my previous concerns.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Summary

      This study has as its goal to determine how the structure and function of the circuit that stabilizes gaze in the larval zebrafish depends on the presence of the output cells, the motor neurons. A major model of neural circuit development posits that the wiring of neurons is instructed by their postsynaptic cells, transmitting signals retrogradely on which cells to contact and, by extension, where to project their axons. Goldblatt et al. remove the motor neurons from the circuit by generating null mutants for the phox2a gene. The study then shows that, in this mutant that lacks the isl1-labelled extraocular motor neurons, the central projection neurons have 1) largely normal responses to vestibular input; 2) normal gross morphology; 3) minimally changed transcriptional profiles. From this, the authors conclude that the wiring of the circuit is not instructed by the output neurons, refuting the major model.

      Strengths

      I found the manuscript to be exceptionally well-written and presented, with clear and concise writing and effective figures that highlight key concepts. The topic of neural circuit wiring is central to neuroscience, and the paper's findings will interest researchers across the field, and especially those focused on motor systems. 

      The experiments conducted are clever and of a very high standard, and I liked the systematic progression of methods to assess the different potential effects of removing phox2a on circuit structure and function. Analyses (including statistics) are comprehensive and appropriate and show the authors are meticulous and balanced in most of the conclusions that they draw. Overall, the findings are interesting, and with a few tweaks, should leave little doubt about the paper's main conclusions. 

      We are grateful for the Reviewer’s enthusiasm for our manuscript and recognition of the advance to the vestibular and motor systems fields. We particularly appreciate their suggestions for experiments to improve the characterization of our phox2a mutant line. We hope the Reviewer finds the results of the added experiment adequately address the points they raise. 

      Weaknesses/Recommendations

      (1) The main point is the incomplete characterisation of the effects of removing phox2a on the extra-ocular motor neurons. Are these cells no longer there, or are they there but no longer labelled by isl1:GFP? If they are indeed removed, might they have developed early on, and subsequently lost? These questions matter as the central focus of the manuscript is whether the presence of these cells influences the connectivity and function of their presynaptic projection neurons. Therefore, for the main conclusions to be fully supported by the data, the authors would need to test whether 1) the motor neurons that otherwise would have been labelled by the isl1:GFP line are physically no longer there; 2) that this removal (if, indeed, it is that) is developmental. If these experiments are not feasible, then the text should be adjusted to take this into account. 

      Show (e.g., with DAPI or some other staining) whether there are still cells where you would have expected to see nIII/nIV extraocular motor neurons. If this is done in a developmental timeline both main "concerns" are addressed in one go. If this doesn't work for some reason, then I'd suggest adjusting the discussion section to note this caveat. I realise it is commonplace in zebrafish and rodent papers to equate the two, but it should also be considered that the isl1:GFP does not report which cells are isl1+ 100% faithfully. 

      We thank the Reviewer for their suggestion. We’ve included the results of this experiment in (new) Supplemental Figure 1 and have updated the Results accordingly (text lines 69-72). 

      Briefly: We performed fluorescent in situ hybridization for vachta, a marker for cholinergic motor neurons, when nIII/nIV differentiation is complete at 2 dpf and prior to synaptogenesis with both their pre- and postsynaptic partners. We included a DAPI stain. We find that while loss of phox2a does not physically remove neurons from the region that contains nIII/nIV motor neurons, neurons in this region no longer express vachta. The presence of neurons at an early stage (2 dpf) that have lost expression of both a transcription factor (isl1) and motor neuron marker (vachta) supports our contention that, while cells are there, they should not be considered motor neurons.

      While the reviewer did not suggest it directly, we note that there is a more laborious way to determine “what happens to cells that would have been phox2a+ but no longer express phox2a?” Specifically, one could target a reporter transgene to the endogenous phox2a locus on the phox2a mutant background. Regrettably, generating such a knock-in reporter is difficult and success is far from assured.

      Previously (Greaney et al. 2017, DOI: 10.1002/cne.24042), we compared expression patterns in nIV to those observed after retro-orbital dye fills. We never saw neurons labeled by dye that were not also GFP+. However, it was not possible to perform a similar analysis for nIII, so we acknowledge the limits of the isl1:GFP reporter.

      (2) A further point to address is the context of the manipulation. If the phox2a removal does indeed take out the extra-ocular motor neurons, what percentage of postsynaptic neurons to the projection neurons are still present?

      In other words, how does the postsynaptic nMLF output relate to the motor neurons? If, for instance, the nMLF (which, as the authors state, is likely still innervated by the projection neurons) is the main output of the projection neurons, then this would affect the interpretation of the results.

      Is there quantitative information on the projection neuron outputs to address the second point (i.e., how much of the projection neurons' output is the extra-ocular motor neurons)? If not, it should be discussed how this could affect the conclusions. 

      Qualitatively, projection neurons form more robust arbors to the nMLF than to their nIII/nIV partners (see: Schoppik et al. 2017, DOI: 10.1523/JNEUROSCI.1711-17.2017). We expect this is proportional to the size of each downstream target.

      The Reviewer makes an interesting point here. These projection neurons innervate several downstream nuclei that could potentially influence their development; we’ve considered this in the Discussion based on existing literature and in the context of our own findings. A precise dissection of each target population’s contribution would be interesting and important for larger questions about neural circuits for balance (see Sugioka et al. 2023, DOI: 10.1038/s41467-023-36682-y). However, we feel this analysis is outside our study’s scope, given that our aim here was to evaluate a standing hypothesis restricted to the contribution of nIII/nIV motor neurons.

      Less important, but still useful: 

      - Figure 4C/D: I found these panels difficult to interpret. Perhaps split them up so each panel does a little less heavy lifting? Do the main panels in C show all axons? Where are the "two remaining nIII/nIV neurons" in D? 

      We’ve split the panels in 4C as suggested and adjusted the caption text in 4D to clarify the “remaining neurons” were simply not eliminated following phox2a knockout. We presume they are instead phox2b+. 4C shows all axons labeled by our transgenic line that follow the medial longitudinal fasciculus.  

      Extremely minor: 

      - line 28: "tantamount" --> "paramount"? 

      - some figure legends say DeltaFF, instead of DeltaF/F 

      - line 192: "the any" 

      These have been corrected; we thank the reviewer for their attention to detail. 

      Reviewer 2:

      Summary

      This study was designed to test the hypothesis that motor neurons play a causal role in circuit assembly of the vestibulo-ocular reflex circuit, which is based on the retrograde model proposed by Hans Straka. This circuit consists of peripheral sensory neurons, central projection neurons, and motor neurons. The authors hypothesize that loss of extraocular motor neurons, through CRISPR/Cas9 mutagenesis of the phox2a gene, will disrupt sensory selectivity in presynaptic projection neurons if the retrograde model is correct. 

      Strengths

      The work presented is impressive in both breadth and depth, including the experimental paradigms. Overall, the main results were that the loss of function paradigm to eliminate extraocular motor neurons did not 1) alter the normal functional connections between peripheral sensory neurons and central projection neurons, 2) affect the position of central projection neurons in the sensorimotor circuit, or 3) significantly alter the transcriptional profiles of central projection neurons. Together, these results strongly indicate that retrograde signals from motor neurons are not required for the development of the sensorimotor architecture of the vestibulo-ocular circuit. 

      We are grateful for the excellent summary of our manuscript and support for our aim, which was indeed to evaluate Hans Straka’s model for the development of the vestibulo-ocular reflex circuit.  

      Appraisal of whether the authors achieved their aims, and whether the results support their conclusions

      The results of this study showed that extraocular motor neurons were not required for central projection neuron specification in the vestibulo-ocular circuit, which countered the prevailing retrograde hypothesis proposed for circuit assembly. A concern is that the results presented may be limited to this specific circuit and may not be generalizable to other circuit assemblies, even to other sensorimotor circuits.

      Impact

      As mentioned above, this study provides valuable new insights into the developmental organization of the vestibulo-ocular circuit. However, different circuits likely utilize various mechanisms, extrinsic or intrinsic (or both), to establish proper functional connectivity. So, the results shown here, although they begin to explain the developmental organization of the vestibulo-ocular circuit, are not likely to be generalizable to other circuits; though this remains to be seen. At a minimum, this study provides a starting point for the examination of patterning of connections in this and other sensorimotor circuits.

      Weaknesses/Recommendations

      A concern is that the results presented may be limited to this specific circuit and may not be generalizable to other circuit assemblies, even to other sensorimotor circuits. However, different circuits likely utilize various mechanisms, extrinsic or intrinsic (or both), to establish proper functional connectivity. So, the results shown here, although they begin to explain the developmental organization of the vestibulo-ocular circuit, are not likely to be generalizable to other circuits; though this remains to be seen.

      We agree with the Reviewer that — of course — a diverse array of developmental mechanisms shape sensorimotor circuit architecture. However, prior findings in the spinal cord (Wang & Scott 2000, Sürmeli et al. 2011, Bikoff et al. 2016, Sweeney et al. 2018, Shin et al. 2020) support our primary conclusion that motor neurons are dispensable for specification of premotor partners. The Recommendation ends with “though this remains to be seen.” We infer that the Reviewer does not have a counterexample at hand for a circuit where motor neurons determine the fate of their partners. Therefore, the preponderance of evidence argues that our work is likely to generalize to other circuits. However, we acknowledge the limitations of our work and we have tempered any claims to generality in the text.

      Lines 156-57: The authors should consider and discuss explicitly the potential of compensatory mechanisms in the CRISPR/Cas9 mutants that may permit synaptogenesis of the projection neurons even though their MN partners are absent.

      We agree with the Reviewer that careful consideration of compensation is merited when using mutants. There are two synapses that the comment might refer to: those between projection neurons and motor neurons, and those between sensory afferents and projection neurons. Projection neurons fail to form any synapses at the region that would contain their motor neuron (nIII/nIV) partners (see Fig. 4C), so there is no question of compensation there. Figure 1B shows that there is no phox2a expression in sensory or central projection neurons. Consequently, even if there were a gene that perfectly compensated for the loss of phox2a, it wouldn’t be active in sensory or central projection neurons. We therefore do not believe that compensatory expression of other genes plays any role here.

      Line 162: Is this an accurate global statement or should it be restricted to the evidence provided in this report?

      We’ve clarified this line, which summarizes findings described in previous results sections of this report.

      Reviewer 3:

      Summary

      In this manuscript by Goldblatt et al. the authors study the development of a well-known sensorimotor system, the vestibulo-ocular reflex circuit, using Danio rerio as a model. The authors address whether motor neurons within this circuit are required to determine the identity, upstream connectivity and function of their presynaptic partners, central projection neurons. They approach this by generating a CRISPR-mediated knockout line for the transcription factor phox2a, which specifies the fate of extraocular muscle motor neurons. After showing that phox2a knockout ablates these motor neurons, the authors show that functionally, morphologically, and transcriptionally, projection neurons develop relatively normally.

      Overall, the authors present a convincing argument for the dispensability of motor neurons in the wiring of this circuit, although their claims about the generalizability of their findings to other sensorimotor circuits should be tempered. The study is comprehensive and employs multiple methods to examine the function, connectivity and identity of projection neurons.

      We appreciate the Reviewer’s support for our manuscript and have implemented their thoughtful suggestions on how to improve the clarity and presentation of our conclusions. We acknowledge the shared consideration with Reviewer 2 as to the generalizability of our findings, and have tempered the language in our revision. 

      In the introduction the authors set up the controversy on whether or not motor neurons play an instructive role in determining "pre-motor fate". This statement is somewhat generic and a bit misleading as it is generally accepted that many aspects of interneuron identity are motor neuron-independent. The authors might want to expand on these studies and better define what they mean by "fate", as it is not clear whether the studies they are citing in support of this hypothesis actually make that claim. 

      We appreciate the Reviewer’s attention to this important consideration. We agree that there are numerous, and often ambiguous, ways to define cell fate. We’ve modified our manuscript to read “…for and against an instructive role in establishing connectivity” (line 19) to reflect that connectivity is the most pertinent readout of cell fate in (most) studies cited there, as well as in our model system (lines 25-26: “Subtype fate, anatomical connectivity, and function are inextricably linked: directionally-tuned sensory neurons innervate nose-up/nose-down subtypes of projection neurons, which in turn innervate specific motor neurons…”). We’ve expanded on the prior studies mentioned above in relevant sections of the Results and Discussion.

      Although it appears unchanged from their images, the authors do not explicitly quantitate the number of total projection neurons in phox2a knockouts. 

      We have added this quantification (text lines 95-96); the number of projection neurons per hemisphere is unchanged in control and mutant larvae.

      For figures 2C and 3C, please report the proportion of neurons in each animal, either showing individual data points here or in a separate supplementary figure; and please perform and report the results of an appropriate statistical test. 

      Generally, we agree that per-animal sampling can provide important metrics. We’ve added a line in the appropriate Methods section with the mean and standard deviation of the number of neurons sampled per animal for each genotype (lines 408-410). However, our extensive prior work using this transgenic line (Goldblatt et al. 2023, DOI: 10.1016/j.cub.2023.02.048) argued that a per-animal breakdown can be misleading. Due to occasional technical aberrations, variation in transgenic line expression, and limitations of our registration method, we cannot sample 100% of the projection nucleus (~50 neurons/hemisphere) in each animal. Likewise, the topography of the nucleus in WT animals, both for up/down subtypes (Fig. 2) and impulse responsive/unresponsive neurons (Fig. 3), means that subtypes may not be proportionally sampled on a per-animal basis. While such problems would likely resolve if we took data from ~50-75 animals for each condition, at a throughput of ~2 animals/day and 1-2 experimental days/week on shared instrumentation, the throughput simply isn’t there. We therefore believe the data are best represented as an aggregate.
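
      To put rough numbers on the throughput constraint described above, the arithmetic can be sketched as follows. This is a back-of-the-envelope illustration only: the helper name `weeks_needed` is hypothetical, and the sketch assumes every imaging session yields usable data, which is optimistic.

```python
import math

def weeks_needed(animals, animals_per_day, days_per_week):
    """Weeks of imaging required to collect `animals` larvae for one condition.

    Assumes every session succeeds; real yields would be lower.
    """
    return math.ceil(animals / (animals_per_day * days_per_week))

# Figures quoted in the response: ~50-75 animals per condition,
# ~2 animals/day, 1-2 experimental days per week.
for animals in (50, 75):
    best = weeks_needed(animals, animals_per_day=2, days_per_week=2)
    worst = weeks_needed(animals, animals_per_day=2, days_per_week=1)
    print(f"{animals} animals: {best} to {worst} weeks per condition")
```

      Under these assumptions, a single condition would take roughly three months to most of a year of instrument time, before accounting for failed preparations or additional genotypes.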

      In the topographical mapping of calcium responses (figures 2D, E and 3D), the authors say they see no differences but this is hard to appreciate based on the 3D plotting of the data. Quantitating the strength of the responses across the 3-axes shown individually and including statistical analyses would help make this point, especially since the plots look somewhat qualitatively different. 

      We have added a supplemental table (new Table 2) with statistical comparisons of projection neuron topography (both to tonic and impulse stimuli) across genotypes for additional clarification. Quantitatively, we find that differences in projection neuron position (max observed: approx. 5 microns) are within the limits of our expected error in registering neurons across larvae to a standardized framework, given the small size of the nucleus (approx. 40 microns in each spatial axis) and each individual neuron (approx. 5 microns in diameter).

      The transcriptional analysis is very interesting, however, it is not clear why it was performed at 72 hpf, while functional experiments were performed at 5 days. Is it possible that early aspects of projection neuron identity are preserved, while motor neuron-dependent changes occur later? The authors should better justify and discuss their choice of timepoint. 

      As suggested, we have updated the manuscript to justify the choice of timepoint (text lines 176-177). We agree with the Reviewer that choosing the “right” timepoint for transcriptional analysis is key. The comment underscores the challenges in balancing the amount of time past neurogenesis (24-54 hpf) when potential fate markers could change, with the timecourse of synaptogenesis (2-4 dpf) and functional maturation (5 dpf). We hypothesized that selecting an intermediate timepoint (72 hpf, during peak synaptogenesis), would enable the highest resolution of both fate markers expressed at the end of neurogenesis (54 hpf) and wiring specificity molecules. We point the Reviewer to recent studies in comparable systems that proposed subtype diversity is most resolvable during synaptogenesis as further justification (see: Ozel et al. 2022, DOI: 10.1038/s41586-020-2879-3 and Li et al. 2017, DOI: 10.1016/j.cell.2017.10.019). However, we acknowledge that the ideal experiment would have been a transcriptional timecourse that would have directly addressed the question. 

      The inclusion of heterozygotes as controls is problematic, given that the authors show there are notable differences between phox2a+/+ and phox2a+/- animals; pooling these two genotypes could potentially flatten differences between controls and phox2a-/-. 

      We agree that this is an important limitation on our interpretations and have noted this more explicitly in the appropriate Results section (line 204).  

      Projection neurons appear to be topographically organized and this organization is maintained in the absence of motor neurons. Are there specific genes that delineate ventral and dorsal projection neurons? If so, the authors should look at those as candidate genes as they might be selectively involved in connectivity. Showing that generic synaptic markers (Figure 4E) are maintained in the entire population is not convincing evidence that these neurons would choose the correct synaptic partners.

      We agree with the Reviewer that Figure 4E is limited and that the most convincing molecular probe would be against a subtype-specific marker gene, ideally the one(s) that establish subtype-specific connectivity. To date, few such markers have been identified in any system, and, to the best of our knowledge, no reported markers differentiate dorsal (nose-up) from ventral (nose-down) projection neurons. We are currently evaluating candidates, but will not include that data here until the relevant genes are established as veridical subtype markers with defined roles in subtype fate specification and connectivity.

    1. eLife Assessment

      The findings of this valuable manuscript advance our understanding of the significance of Bestrophin isoform 4 (BEST4) in suppressing colorectal cancer (CRC) progression. The authors used appropriate and validated methodology, such as the knockout of BEST4 using CRISPR/Cas9 in CRC cells, to provide a solid foundation for elucidating the potential link between BEST4 and CRC progression.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, the authors describe the participation of the Hes4-BEST4-Twist axis in controlling the process of epithelial-mesenchymal transition (EMT) and the advancement of colorectal cancers (CRC). They assert that this axis diminishes the EMT capabilities of CRC cells through a variety of molecular mechanisms. Additionally, they propose that reduced BEST4 expression within tumor cells might serve as an indicator of an adverse prognosis for individuals with CRC.

      The revised manuscript and figures still need further improvement because some of the authors' claims are difficult to understand from a scientific perspective.

      Strengths:

      • Exploring the correlation between the Hes4-BEST4-Twist1 axis, EMT, and the advancement of CRC is a novel perspective and gives readers a fresh standpoint.

      • The potential role of BEST4 in EMT through the Hes4-BEST4-Twist1 axis, rather than through its channel function, is also a novel perspective.

      • The whole transcriptome sequence analysis (Figure 5) showing low expression of BEST4 in CRC samples will be of broad interest to cancer specialists as well as cell biologists although further corroborative data is essential to strengthen these findings (See Weaknesses).

      Weaknesses:

      • The authors employed three kinds of CRC cell lines, but not untransformed cells such as intestinal epithelial organoids, which are commonly used in recent research. Since all the data from in vitro and in vivo experiments are generated from CRC cell lines with forced expression of proteins of interest, the authors' claim may not reflect a common biological process.

      • Most experiments were performed to show changes in EMT markers, but not EMT itself.

      • The in vivo and in vitro data supporting the whole transcriptome sequence analysis (Figure 5) are mostly insufficient. Since BEST4 is a marker of a subset of terminally differentiated colonocytes, its lower expression in CRC compared to adjacent normal tissue could be within the range of common expectations.

      • Some experiments do not appear to have a direct relevance to their claims.

      Major comments:

      • The authors employed three kinds of CRC cell lines, but not untransformed cells such as intestinal epithelial organoids, which are commonly used in recent research. Please include this limitation of the study in the discussion section with other possible limitations.

      • Some experiments do not appear to have direct relevance to their claims. Figure 1A-1F and 2E-2H relate to cell proliferation or viability of CRC cell lines, but not to EMT. The focus of this study should be on EMT, but the summary sentence for Figure 1 (Line 118-119) says "inhibitory effects of BEST4 on CRC development". This sentence, along with some others (such as Line 262-263), seems to deviate from that focus. Cancer development and EMT are distinct biological processes, so please revise the manuscript with this in mind.

      • The context around Line 194-197, "Additionally, the knockdown of endogenous BEST4 in Hes4-expressing HCT116 cells substantially decreased Flag-Hes4 coprecipitation from the nuclear protein lysates, while Myc-Twist1 expression remained constant, as determined by co-IP with antibodies to Flag or My (Figure 4E; Figure 4-figure supplement 1C)." is difficult to follow.

      • The in vivo and in vitro data supporting the whole transcriptome sequence analysis (Figure 5) are mostly insufficient. Since BEST4 is a marker of a subset of terminally differentiated colonocytes, its lower expression in CRC compared to adjacent normal tissue could be within the range of common expectations.

      • As Reviewer #2 mentioned, the quality of some figures is quite suboptimal. This is not due to their pixel size, but rather to other factors, such as inconsistent aspect ratios. Improvement of the overall quality is needed. Figure legends also need improvement.

      • The formatting of genes and proteins is inconsistent. Please correct it according to the general formatting guidelines.

    3. Reviewer #2 (Public review):

      Summary:

      Using in vitro and in vivo approaches, the authors first demonstrate that BEST4 inhibits intestinal tumor cell growth and reduces their metastatic potential, possibly via downstream regulation of TWIST1.

      They further show that HES4 positively upregulates BEST4 expression, with direct interaction with BEST4 promoter region and protein. The authors further expand on this with results showing that negative regulation of TWIST1 by HES4 requires BEST4 protein, with BEST4 required for TWIST1 association with HES4. Reduction of BEST4 expression was shown in CRC tumor samples, with correlation of BEST4 mRNA levels with different clinicopathological factors such as sex, tumor stage and lymph node metastasis, suggesting a tumor-suppressive role of BEST4 for intestinal cancer.

      Strengths:

      • Good quality western blot data

      • Multiple approaches were used to validate the findings

      • Logical experimental progression for readability

      • Human patient data / In vivo murine model / Multiple cell lines were used, which supports translatability/reproducibility of findings

      Weaknesses:

      • Figure quality should still be improved

      • The discussion should still be improved

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors describe the participation of the Hes4-BEST4-Twist axis in controlling the process of epithelial-mesenchymal transition (EMT) and the advancement of colorectal cancers (CRC). They assert that this axis diminishes the EMT capabilities of CRC cells through a variety of molecular mechanisms. Additionally, they propose that reduced BEST4 expression within tumor cells might serve as an indicator of an adverse prognosis for individuals with CRC.

      Strengths:

      • Exploring the correlation between the Hes4-BEST4-Twist axis, EMT, and the advancement of CRC is a novel perspective and gives readers a fresh standpoint.

      • The whole transcriptome sequence analysis (Figure 5) showing low expression of BEST4 in CRC samples will be of broad interest to cancer specialists as well as cell biologists although further corroborative data is essential to strengthen these findings (See Weaknesses).

      Weaknesses:

      (1) The authors employed three kinds of CRC cell lines, but not untransformed cells such as intestinal epithelial organoids which are commonly used in recent research.

      We sincerely thank the reviewer for raising this issue. We acknowledge the potential of intestinal epithelial organoids as a valuable model for this study and will consider establishing this system in future research; however, doing so falls outside the scope of our current work.

      (2) The authors use three different human CRC cell lines with a lack of consistency in the selection of them. Please clarify 1) how these lines are different from each other, 2) why they pick up one or two of them for each experiment. To be more convincing, at least two lines should be employed for each in vitro experiment.

      We apologize for any confusion. We used the HCT116 and Caco2 cell lines to investigate the effects of BEST4 overexpression on the biological functions of CRC cells and its involvement in EMT. We selected HCT116, a human CRC cell line, because it expresses relatively low levels of BEST4 compared with other CRC cell lines. Caco2, in contrast, is a human colon adenocarcinoma cell line that closely resembles differentiated intestinal epithelial cells and exhibits microvilli structures. Given that BEST4 serves as a marker for intestinal epithelial cells, we chose these two cell lines to investigate the in vitro effects of BEST4 overexpression on the proliferation, clonality, invasion, and migration of colon cancer cells and on the expression of downstream EMT-related markers. Similarly, we selected the HCT-15 cell line, derived from human CRC, for BEST4 knockout because it expresses comparatively high levels of BEST4 among CRC cell lines. Because we used CRISPR/Cas9 gene editing to knock out BEST4, rather than shRNA to downregulate its expression, we limited the knockout experiments to a single cell line.

      (3) The authors demonstrated associations between BEST4 and cell proliferation/ viability as well as migration/invasion, utilizing CRC cell lines, but it should be noted that these findings do not indicate a tumor-suppressive role of BEST4 as mentioned in line 120. Furthermore, while the authors propose that "BEST4 functions as a tumor suppressor in CRC" in line 50, there seems no supporting data to suggest BEST4 as a tumor suppressor gene.

      We apologize for these inaccurate expressions, and we have made the necessary modifications to the corresponding parts in the manuscript.

      (4) The HES4-BEST4-Twist1 axis likely plays a significant role in CRC progression via EMT but not CRC initiation. Some sentences could lead to a misunderstanding that the axis is important for CRC initiation.

      We apologize for these inaccurate expressions, and we have made the necessary modifications to the corresponding parts in the manuscript.

      (5) The authors mostly focus on the relationship of the HES4-BEST4-Twist1 axis with EMT, but their claims sometimes appear to deviate from this focus.

      We apologize for confusing the reviewer. The objectives of our study are as follows: (1) to establish the role of BEST4 in CRC growth both in vitro and in vivo; (2) to determine the underlying molecular mechanisms by which BEST4 interacts with Hes4 and Twist1, thereby regulating EMT; and (3) to investigate the correlation between BEST4 expression and prognosis of CRC. We have made the necessary modifications to the corresponding parts in the manuscript.

      (6) Some experiments do not appear to have a direct relevance to their claims. For example, the analysis using the xenograft model in Figure 2E-J is not optimal for analyzing EMT. The authors should analyze metastatic or invasive properties of the transplanted tumors if they intend to provide some supporting evidence for their claims.

      We sincerely thank the reviewer for raising this issue. During EMT, epithelial cells adopt a spindle-shaped, fibroblast-like morphology and acquire mesenchymal characteristics, which confer invasive and migratory abilities; expression switches from epithelial E-cadherin and ZO-1 to mesenchymal vimentin (Dongre and Weinberg, 2019). The whole process is regulated by transcription factors of the Snail family and Twist1 (Dongre and Weinberg, 2019). We used the xenograft model with overexpressed BEST4 to analyze lysates of tumor tissue, which revealed that BEST4 upregulated E-cadherin and downregulated vimentin and Twist1 (Figure 2I). These findings indicate that BEST4 inhibits EMT in vivo. Deletion of BEST4 may enable these cells to acquire invasive and migratory abilities, leading to metastasis in vivo. We therefore evaluated the effect of BEST4 on metastatic potential in a CRC liver metastasis model by intrasplenically injecting HCT-15 cells with BEST4 knockout (BEST4gRNA), wild-type control (gRNA), or knockout with rescue (BEST4-Rescued) into BALB/c nude mice. We observed a twofold increase in liver metastatic nodules in the absence of BEST4 compared to the control group (Fig. 2J-L). Although further in vivo experiments are required for confirmation, our research suggests a potential role for BEST4 in counteracting EMT induction in vivo.

      (7) In Figure 4H, ZO-1 and E-cad expression looks unchanged in the BEST4 KD.

      We sincerely thank the reviewer for raising this issue. We have implemented the necessary modifications to the corresponding sections in the manuscript and performed a comprehensive quantification of all Western blot data to test for statistically significant differences, including the data presented in the supplementary file.

      (8) The in vivo and in vitro data supporting the whole transcriptome sequence analysis (Figure 5) is mostly insufficient. Including the following experiments will substantiate their claims: 1) BEST4 and HES4 immunostaining of human surgical tissue samples, 2) qPCR data of HES4, Twist1, Vimentin, etc. as shown in Figure 5C, 5D.

      We sincerely thank the reviewer for raising this issue.

      (1) Due to the substandard quality of the BEST4 antibody, we opted to evaluate the clinical significance of BEST4 in CRC by assessing mRNA results instead of protein levels using immunohistochemistry (IHC). After testing multiple antibodies for western blotting, only one (1:800; LsBio, LS-C31133) accurately indicated BEST4 protein expression while still exhibiting some non-specific bands. Consequently, we decided to transfect a HA-tagged BEST4 plasmid into the CRC cell line and used HA as a marker for BEST4 expression. Unfortunately, none of the antibodies employed for IHC were suitable as they failed to accurately distinguish between positive or negative staining for BEST4 and showed significant non-specific staining (data not shown). The challenge in detecting BEST4 protein in colorectal cancer tissues may be attributed to its low expression levels. Our findings are consistent with previous reports from the Human Protein Atlas database (https://www.proteinatlas.org/ENSG00000142959-BEST4/pathology), which also did not detect any BEST4 protein expression in colorectal cancer tissues through IHC analysis.

      (2) The qPCR data of E-cadherin, Twist1, and Vimentin mRNA expression in CRC tissue have already been published in other studies (Christou et al., 2017; Lazarova and Bordonaro, 2016; Zhu et al., 2015). These studies found that E-cadherin is downregulated, while Twist1 and Vimentin are upregulated, in CRC tissue compared to the adjacent normal tissues. The analysis of mRNA expression data obtained from colorectal cancer samples and normal samples in the publicly available databases TCGA and GTEx also revealed a significant downregulation of Hes4 expression in colorectal cancer tissues, which will be our next research objective.

      (9) Some statements are inconsistent probably due to grammatical errors. (For example, some High/low may be reversed in lines 234-244.)

      We apologize for these mistakes. We have made corrections to this section in the manuscript.

      Reviewer #2 (Public Review):

      Summary:

      Using in vitro and in vivo approaches, the authors first demonstrate that BEST4 inhibits intestinal tumor cell growth and reduces their metastatic potential, possibly via downstream regulation of TWIST1.

      They further show that HES4 positively upregulates BEST4 expression, with direct interaction with BEST4 promoter region and protein. The authors further expand on this with results showing that negative regulation of TWIST1 by HES4 requires BEST4 protein, with BEST4 required for TWIST1 association with HES4. Reduction of BEST4 expression was shown in CRC tumor samples, with correlation of BEST4 mRNA levels with different clinicopathological factors such as sex, tumor stage, and lymph node metastasis, suggesting a tumor-suppressive role of BEST4 for intestinal cancer.

      Strengths:

      • Good quality western blot data.

      • Multiple approaches were used to validate the findings.

      • Logical experimental progression for readability.

      • Human patient data / In vivo murine model / Multiple cell lines were used, which supports translatability / reproducibility of findings.

      We sincerely thank Reviewer #2 for constructive feedback on this work.

      Weaknesses:

      (1) Interpretation of figures and data (unsubstantiated conclusions).

      We apologize for this confusing presentation. We have made corrections to this section in the manuscript.

      (2) Figure quality.

      We apologize for the poor quality of the figures. The figure resolution was drastically reduced when the manuscript was converted to PDF on the publisher's website. The figures have been re-uploaded, and we have confirmed that each image has a resolution exceeding 300 dpi.

      (3) Figure legends lack information.

      We sincerely thank the reviewer for raising this issue. We have provided detailed figure legends, including supplementary figure legends, on pages 36-43 of the manuscript. We have rechecked this section and made improvements and additions.

      (4) Lacking/shallow discussion.

      We apologize for our shallow discussion. We have supplemented and improved some parts of the discussion.

      (5) Requires more information for reproducibility regarding materials and methods.

      We sincerely thank the reviewer for raising this issue. We have provided detailed information for reproducibility regarding materials and methods on pages 18-29 and 43-47 of the manuscript. We have rechecked this section and made improvements and additions.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      We sincerely thank Reviewer #1 for constructive feedback on this work.

      Minor comments:

      (1) Line 73: "Variant 4" is not precise. The term "variant" should mean mutation in the gene or different transcription.

      We apologize for using an inaccurate expression. We have now changed Variant 4 to Bestrophin 4.

      (2) Line 78. Is it correct that BEST4+ cells coexist with Hes4+ cells?

      According to a previous study published in Nature (Parikh et al., 2019), BEST4+ cells originate from the absorptive lineage and express the transcription factor Hes4. Additionally, we observed the nuclear co-localization of BEST4 and Hes4 in HCT116 cells through immunofluorescence staining (Figure 3E).

      (3) Line 85. The reason "Best4 may be associated with Twist1" is unclear.

      We apologize for the lack of clarity in our previous statement. In a recent analysis utilizing single cell RNA-sequencing, it was discovered that a subset of mature colonocytes expresses BEST4 (Parikh et al., 2019). Additionally, this subset coexists with hairy/enhancer of split 4 positive (Hes4+) cells (Parikh et al., 2019). Previous research has demonstrated the antagonistic role of Hes4 in regulating Twist1 through protein-protein interaction, which governs the differentiation of bone marrow stromal/stem cell lineage (Cakouros et al., 2015). Based on these findings, we speculate that there may be an interactive regulation between BEST4/Hes4/Twist1, potentially influencing the process of cell polarity during epithelial-mesenchymal transition in colorectal cancer. We have made corrections to this section in the manuscript.

      (4) Line 87. Grammatical error (Establishing the role BEST4).

      We apologize for the grammatical error of this section. We have rectified the issue in the manuscript.

      (5) Please clarify the reason the authors do not show any data of BEST4-overexpressing Caco2 cells in Figure 2?

      We apologize for our negligence in not adding these data to Figure 2. They have now been fully supplemented.

      (6) In line 145, the authors did not show any tumorigenic properties.

      We apologize for this confusing presentation. We have made corrections to this section in the manuscript.

      (7) Figure 3 shows 1) HES4 regulates BEST4 promoter activity, and 2) HES4 and BEST4 colocalized in nuclei, but these are very different biological processes. Please clarify how these two relate to each other.

      Trajectory analysis identified colonocytes expressing the basic helix-loop-helix (bHLH) transcription factor Hairy/enhancer of split 4 (Hes4+) within the BEST4-expressing colonic epithelial lineage (BEST4+). Although these are very different biological processes, the recent identification of a heterogeneous BEST4+ and Hes4+ subgroup in a human colonic epithelial lineage (Parikh et al., 2019) led us to consider their potential role in regulating CRC progression. We first observed an upregulation of both endogenous BEST4 mRNA and protein levels in Hes4-overexpressing cells compared to the control transfectant, indicating that Hes4 is a potential upstream activator regulating BEST4 function. We then confirmed that Hes4 interacted with BEST4 and bound directly to its upstream promoter at the region of 1470-1569 bp, enhancing its promoter activity, as analysed by Co-IP, dual-luciferase assay and ChIP-qPCR, respectively. In addition, the two proteins co-localized in the nucleus, as shown by immunofluorescence staining after the transient co-transfection of Hes4 and BEST4 into HCT116, indicating that BEST4 interacts with Hes4 at both transcriptional and translational levels (Figure 3; Figure 3-supplemental figure 1).

      (8) In line 182-185, please clarify the reason BEST4 mediates the inhibition of the Twist 1 promoter activity by Hes4.

      Because a step in the commitment of human bone marrow stromal/stem cells to lineage-specific development by Hes4 is mediated by Twist1 downregulation (Cakouros et al., 2015), and because we observed evidence of direct interaction between BEST4 and Hes4 in HCT116, it is plausible that they could act through Twist1 to regulate EMT. In the present study, we found that Twist1 colocalized with BEST4 in the nucleus, and their interaction destabilized Twist1 and significantly inhibited EMT induction. Hes4 caused the same effect; however, it required intermediation through BEST4. Although the mechanistic details of their interrelationship remain to be elucidated, the present study identified the Hes4-BEST4-Twist1 axis as governing the development of CRC, at least partially by counteracting EMT induction.

      (9) In line 205, please rephrase "BEST4-mediated upstream Hes4" to be clearer.

      We apologize for this confusing presentation. We have made corrections to this section in the manuscript.

      Reviewer #2 (Recommendations For The Authors):

      We sincerely thank Reviewer #2 for constructive feedback on this work.

      Major Comments:

      (1) The general quality of the figures requires improvement (text in some figures is illegible, and the resolution of the images is low) with more proofreading of the text for clarity. In addition, the resolution of the histology in Fig 2K does not allow a proper evaluation of the data.

      We apologize for the poor quality of the figures. The figure resolution was drastically reduced when the manuscript was converted to PDF on the publisher's website. The figures have been re-uploaded, and we have confirmed that each image has a resolution exceeding 300 dpi. In addition, Figure 2K has been further enhanced and enlarged.

      (2) While the authors show that the HES4/BEST4 complex interacts with the TWIST1 protein, they do not expand on the mechanisms underpinning the post-translational or transcriptional regulation of TWIST1. We would like the authors to prove or further speculate on the mechanisms behind this regulation in the discussion.

      Our present study showed that BEST4 inhibited EMT in conjunction with downregulation of Twist1 in both HCT116 and Caco2 CRC cell lines. A previous study has shown an antagonistic role of Hes4 in regulating Twist1 via protein-protein interaction that controls bone marrow stromal/stem cell lineage differentiation (Cakouros et al., 2015). We speculate that there is an interactive regulation between Hes4, BEST4, and Twist1 through which they impede the changes in cell polarity that occur during EMT in CRC. In the present study, we found that BEST4 mediates the inhibition of Twist1 by Hes4 at both the transcriptional and translational levels. Twist1 colocalized with BEST4 in the nucleus, and their interaction destabilized Twist1 and significantly inhibited EMT induction. Hes4 caused the same effect; however, it required intermediation through BEST4. The present study identified the Hes4-BEST4-Twist1 axis as governing the development of CRC, at least partially by counteracting EMT induction. We agree that further studies are needed to elucidate the mechanistic details of their interrelationship, but these are beyond the scope of the current work.

      (3) The authors need to show or argue why TWIST1 is necessary for the phenotypes observed, e.g. metastasis/proliferation.

      We apologize for the lack of clarity on this point. During EMT, epithelial cells adopt a spindle-shaped, fibroblast-like morphology and acquire mesenchymal characteristics, which confer invasive and migratory abilities; expression switches from epithelial E-cadherin and ZO-1 to mesenchymal vimentin (Dongre and Weinberg, 2019). In CRC diagnosed at advanced stages, EMT may occur as the cancer metastasizes to distal organs (Pastushenko and Blanpain, 2019; Sunlin Yong, 2021; Yeung and Yang, 2017; Zhang et al., 2021). The whole process is regulated by transcription factors of the Snail family and Twist1 (Dongre and Weinberg, 2019). Twist1 (a basic helix-loop-helix transcription factor) reprograms EMT by repressing the expression of E-cadherin and ZO-1 (Nagai et al., 2016; Yang et al., 2004) and simultaneously inducing several mesenchymal markers, typically vimentin (Bulzico et al., 2019; Meng et al., 2018; Nagai et al., 2016; Yang et al., 2004), which is a pivotal predictor of CRC progression (Vesuna et al., 2008; Yang et al., 2004; Yusup et al., 2017; Zhu et al., 2015). Overexpression of Twist1 significantly enhances the migration and invasion capabilities of colorectal cancer cells; furthermore, it is closely associated with metastasis and poor prognosis in patients with colorectal cancer (Yusup et al., 2017; Zhu et al., 2015). We have supplemented and improved these parts of the introduction and discussion.

      (4) The authors sufficiently prove that HES4/BEST4 regulates TWIST1 downregulation, however, we believe the findings are not enough to show *direct* regulation (refer also to line 273). At least rephrasing the conclusions would be adequate, also while referring to the working model depicted in Fig. 5G.

      We apologize for this inaccurate presentation. Although the interaction may not be direct, our co-immunoprecipitation (co-IP) results demonstrated nuclear colocalization of Twist1 and BEST4 (Figure 4D; Figure 4-supplemental figure 1A). Furthermore, their interaction destabilized Twist1 and significantly inhibited the induction of EMT. We have made corrections to this section in the manuscript.

      (5) The discussion is very short and not satisfactory; is BEST4 an evolutionary conserved protein (besides the channel region)? Any speculation on which domain(s) is(are) important for the interaction with HES4 and TWIST1? How do the findings in the current study compare with recent, potentially contradicting data indicating a pro-tumorigenic function of BEST4 for CRC, including its upregulation (and not downregulation) in malignant intestinal tissues, and activation of PI3K/AKT signaling (PMID: 35058597)?

      We apologize for our shallow discussion. We have supplemented and improved some parts of the discussion. The bestrophins are a highly conserved family of integral membrane proteins initially discovered in Caenorhabditis elegans(Sonnhammer and Durbin, 1997). Homologous sequences can be found across animals, fungi, and prokaryotes, while they are absent in protozoans or plants(Hagen et al., 2005). Conservation is primarily observed within the N-terminal 350–400 amino acids, featuring an invariant motif arginine-phenylalanine-proline (RFP) with unknown functional properties (Milenkovic et al., 2008). Mutations in this region can lead to the development of vitelliform macular dystrophy. However, the C-terminus is a potential site for protein modification and function(Marmorstein et al., 2002; Miller et al., 2019). There is currently no further literature research on the functional roles of different domains of BEST4. Although the crucial domain for the interaction with HES4 and TWIST1 is yet to be determined, requiring further investigation for clarification, our findings demonstrate that Hes4 directly binds to the upstream promoter region of BEST4 at 1470-1569 bp, thereby enhancing its promoter activity. These results provide valuable insights for future research.

      We sincerely thank the reviewer for bringing this publication to our attention. We read the study carefully and would like to point out a few things.

      a) First, that study reported that BEST4 expression is upregulated in clinical CRC samples, which contradicts both our results and those of other published studies. RNA-seq of tissue samples from 95 human individuals representing 27 different tissues was performed to determine the tissue specificity of all protein-coding genes, and the results indicated that the BEST4 gene is predominantly expressed in the colon (Fagerberg et al., 2014). In addition, BEST4 was reported to be exclusively expressed by human absorptive cells and to be induced during human absorptive cell differentiation (Ito et al., 2013). More recently, work from Simmons’s group published in Nature showed, by single-cell profiling of healthy human colonic epithelial cells, that human absorptive colonocytes distinctly express BEST4, and that BEST4 is dysregulated in colorectal cancer patients (Parikh et al., 2019). Furthermore, analysis of RNA-seq expression data from colorectal cancer and normal samples in the publicly available TCGA and GTEx databases also revealed a significant downregulation of BEST4 expression in colorectal cancer tissues, consistent with our findings. The literature above demonstrates a close relationship between BEST4 and the normal function of the human colon, and provides evidence for its loss in colorectal cancer patients.

      b) Their study showed increased BEST4 protein levels in colorectal cancer patients by Western blot. However, the antibody they used is suitable only for IHC-P and not for Western blot (Abcam, ab188823). In our study, we also used Western blotting to detect BEST4 expression in colorectal cancer tissues and adjacent normal tissues. The results revealed decreased BEST4 protein levels in colorectal cancer patients. The antibody we used was specifically designed for WB detection (1:800; LsBio, LS-C31133 https://www.lsbio.com/antibodies/best4-antibody-n-terminus-wb-western-ls-c31133/29602).

      c) Their study reported an upregulation of BEST4 protein levels in colorectal cancer patients using immunohistochemistry (IHC). However, IHC data from publicly available protein expression databases such as the Human Protein Atlas show minimal BEST4 protein in colorectal cancer tissues (https://www.proteinatlas.org/ENSG00000142959-BEST4/pathology), contradicting their findings but aligning with our own observations.

      d) Single-cell genomic analysis has shown that OTOP2 and BEST4 are expressed only in a subset of normal colorectal epithelial cells (Parikh et al., 2019). BEST4 and Otopetrin 2 (OTOP2), which encodes a proton-selective ion channel protein, were reported to be distinctly expressed in normal absorptive colonocytes and to colocalize with each other (Drummond et al., 2017; Ito et al., 2013; Parikh et al., 2019). OTOP2 has recently been shown to inhibit the development of CRC under the regulation of wild-type p53 (Qu et al., 2019), whereas the role of BEST4 in CRC is less well studied; together, these observations point to the potential of BEST4 to inhibit colorectal cancer. The gene set enrichment analysis (GSEA) in their study revealed a significant enrichment of gene signatures associated with oncogenic signaling and metastasis, such as the PI3K/Akt signaling pathway, in patients with higher BEST4 expression compared to those with lower BEST4 expression. However, our GSEA did not show any significant enrichment of the PI3K/Akt signaling pathway in patients with higher BEST4 expression. In addition, in contrast to their findings, our BEST4-overexpressing cell line did not exhibit a significant increase in phosphorylated Akt levels. In conclusion, our findings align with previous literature and public database analyses, providing evidence for the downregulation of BEST4 in colorectal cancer tissues and its potential as an anticancer agent. Discrepancies with other studies may be attributed to differences in experimental models, protocols, preparations, or experimental conditions.

      Minor Comments:

      (1) Western blot data should be quantified.

      We sincerely thank the reviewer for raising this point. We have quantified all the Western blot data and included the results in the supplementary file.

      (2) Errors in labelling figures in the text should be corrected (Line 214 and more).

      We apologize for these mistakes. We have made corrections to this section in the manuscript.

      (3) The authors used the human HES4 gene, which is indicated with the incorrect nomenclature. The gene and protein nomenclature should be correctly used.

      We apologize for these mistakes. We have made corrections to this section in the manuscript.

      (4) Methods and Materials for certain assays should be further clarified; e.g transwell migration/invasion assays (reference to previous publication? transwell inserts used, etc.)

      We sincerely thank the reviewer for raising this issue. We have expanded and updated the respective sections.

      (5) Figure 2K: Quality of histology is insufficient.

      We apologize for the poor quality of the figures. The quality of Figure 2K was further enhanced and expanded.

      (6) Figure 2K: Can the authors speculate on whether there is any increase in proliferation through BEST4-ko in HCT15 cells (with overexpression of BEST4 leading to reduced proliferation) and how this may impact the metastatic assay or engraftment/seeding onto the liver?

      Our in vitro experiments demonstrated that the ablation of BEST4 in HCT-15 cells resulted in increased cell proliferation, clonogenesis, migration and invasion (Figures 1 and Figure 1-supplemental figure 1). These findings suggest that BEST4 knockout may potentially contribute to tumor proliferation in vivo; however, further research is required for confirmation. EMT transforms epithelial cells into cells with a spindle-shaped, fibroblast-like morphology and mesenchymal characteristics, enabling them to acquire invasive and migratory abilities (Dongre and Weinberg, 2019). In CRC diagnosed at advanced stages, EMT may occur as the tumor metastasizes to distal organs (Pastushenko and Blanpain, 2019; Sunlin Yong, 2021; Yeung and Yang, 2017; Zhang et al., 2021). Our study demonstrated that BEST4 inhibits EMT in colorectal cancer (CRC) both in vitro and in vivo. Conversely, ablation of BEST4 promotes EMT by upregulating the expression of EMT-related genes, thereby facilitating the metastasis of colorectal cancer cells to the liver.

      (7) Figure 2L: Authors should indicate in the figure that the BEST4-rescued is at 0 (and not blank).

      We sincerely thank the reviewer for raising this issue. We have made corrections to this section in the manuscript.

      (8) Figure 3B: Authors should introduce the usage of the new LS174T cell line in the text.

      We sincerely thank the reviewer for raising this issue. The human colorectal cancer cell line LS174T was selected for Hes4 knockdown because it expresses more Hes4 than other CRC cell lines. We have made corrections to this section in the manuscript.

      (9) Figure 3F: Why is there less FLAG in the input, compared to the IP?

      We sincerely thank the reviewer for raising this issue. According to the manufacturer's protocols, 20 µg of cell lysate was used for the input and 500 µg for the IP.

      (10) Figure 5F-G: the quality of the figure is not good enough for interpretation.

      Again, we apologize for poor quality of pictures due to manuscript conversion. We have made corrections to this section in the manuscript.

      (11) Table 1: Conclusions made by the authors are wrong (lines 237-239); instead "high BEST4 expression more prevalent in females" and "low BEST4 expression more prevalent among CRC patients with advanced tumor stage". And how are low and high BEST4 expressions defined (the same applies to the data in Fig. 5F)?

      We apologize for these mistakes. We set the cutoff-high and cutoff-low values to 50% to split the patients into high-expression and low-expression cohorts. We have made corrections to this section in the manuscript.
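
      As an illustration of the 50% cutoff described above, the following minimal sketch splits a cohort at the median expression value. The function name, the use of NumPy, and the handling of values exactly equal to the cutoff (assigned to the low group here) are assumptions made for this example and are not taken from the manuscript.

```python
import numpy as np

def split_by_median(expression):
    """Label each sample "high" or "low" relative to the cohort median.

    Hypothetical helper: samples strictly above the 50th percentile are
    "high"; all others, including ties with the cutoff, are "low"
    (the tie rule is an assumption).
    """
    values = np.asarray(expression, dtype=float)
    cutoff = np.percentile(values, 50)  # 50% cutoff = median
    labels = np.where(values > cutoff, "high", "low")
    return labels, cutoff

# Example with made-up expression values for four samples
labels, cutoff = split_by_median([1.0, 2.0, 3.0, 4.0])
```

      With an even number of samples and no ties, this yields two cohorts of equal size.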

      (12) In all Figure legends, there should be an indication of the type of statistical tests that were applied, as well as information on the number of independent experiments that were performed and provided the same results

      We sincerely thank the reviewer for raising this issue. The types of statistical tests applied are indicated in the Statistical analysis section of the Materials and Methods. The number of independent experiments is provided in the figure legends.

      References

      Bulzico, D., Pires, B.R.B., PAS, D.E.F., Neto, L.V., and Abdelhay, E. (2019). "Twist1 Correlates With Epithelial-Mesenchymal Transition Markers Fibronectin and Vimentin in Adrenocortical Tumors". Anticancer research 39, 173-175. 10.21873/anticanres.13094.

      Cakouros, D., Isenmann, S., Hemming, S.E., Menicanin, D., Camp, E., Zannetinno, A.C., and Gronthos, S. (2015). "Novel basic helix-loop-helix transcription factor hes4 antagonizes the function of twist-1 to regulate lineage commitment of bone marrow stromal/stem cells". Stem Cells Dev 24, 1297-1308. 10.1089/scd.2014.0471.

      Christou, N., Perraud, A., Blondy, S., Jauberteau, M.O., Battu, S., and Mathonnet, M. (2017). "E-cadherin: A potential biomarker of colorectal cancer prognosis". Oncol Lett 13, 4571-4576. 10.3892/ol.2017.6063.

      Dongre, A., and Weinberg, R.A. (2019). "New insights into the mechanisms of epithelial-mesenchymal transition and implications for cancer". Nature reviews. Molecular cell biology 20, 69-84. 10.1038/s41580-018-0080-4.

      Drummond, C.G., Bolock, A.M., Ma, C., Luke, C.J., Good, M., and Coyne, C.B. (2017). "Enteroviruses infect human enteroids and induce antiviral signaling in a cell lineage-specific manner". Proceedings of the National Academy of Sciences of the United States of America 114, 1672-1677. 10.1073/pnas.1617363114.

      Fagerberg, L., Hallstrom, B.M., Oksvold, P., Kampf, C., Djureinovic, D., Odeberg, J., Habuka, M., Tahmasebpoor, S., Danielsson, A., Edlund, K., et al. (2014). "Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics". Mol Cell Proteomics 13, 397-406. 10.1074/mcp.M113.035600.

      Hagen, A.R., Barabote, R.D., and Saier, M.H. (2005). "The bestrophin family of anion channels: identification of prokaryotic homologues". Molecular membrane biology 22, 291-302. 10.1080/09687860500129711.

      Ito, G., Okamoto, R., Murano, T., Shimizu, H., Fujii, S., Nakata, T., Mizutani, T., Yui, S., Akiyama-Morio, J., Nemoto, Y., et al. (2013). "Lineage-specific expression of bestrophin-2 and bestrophin-4 in human intestinal epithelial cells". PLoS One 8, e79693. 10.1371/journal.pone.0079693.

      Lazarova, D.L., and Bordonaro, M. (2016). "Vimentin, colon cancer progression and resistance to butyrate and other HDACis". Journal of cellular and molecular medicine 20, 989-993. 10.1111/jcmm.12850.

      Marmorstein, L.Y., McLaughlin, P.J., Stanton, J.B., Yan, L., Crabb, J.W., and Marmorstein, A.D. (2002). "Bestrophin interacts physically and functionally with protein phosphatase 2A". The Journal of biological chemistry 277, 30591-30597. 10.1074/jbc.M204269200.

      Meng, J., Chen, S., Han, J.X., Qian, B., Wang, X.R., Zhong, W.L., Qin, Y., Zhang, H., Gao, W.F., Lei, Y.Y., et al. (2018). "Twist1 Regulates Vimentin through Cul2 Circular RNA to Promote EMT in Hepatocellular Carcinoma". Cancer research 78, 4150-4162. 10.1158/0008-5472.Can-17-3009.

      Milenkovic, V.M., Langmann, T., Schreiber, R., Kunzelmann, K., and Weber, B.H. (2008). "Molecular evolution and functional divergence of the bestrophin protein family". BMC evolutionary biology 8, 72. 10.1186/1471-2148-8-72.

      Miller, A.N., Vaisey, G., and Long, S.B. (2019). "Molecular mechanisms of gating in the calcium-activated chloride channel bestrophin". eLife 8. 10.7554/eLife.43231.

      Nagai, T., Arao, T., Nishio, K., Matsumoto, K., Hagiwara, S., Sakurai, T., Minami, Y., Ida, H., Ueshima, K., Nishida, N., et al. (2016). "Impact of Tight Junction Protein ZO-1 and TWIST Expression on Postoperative Survival of Patients with Hepatocellular Carcinoma". Digestive diseases (Basel, Switzerland) 34, 702-707. 10.1159/000448860.

      Parikh, K., Antanaviciute, A., Fawkner-Corbett, D., Jagielowicz, M., Aulicino, A., Lagerholm, C., Davis, S., Kinchen, J., Chen, H.H., Alham, N.K., et al. (2019). "Colonic epithelial cell diversity in health and inflammatory bowel disease". Nature 567, 49-55. 10.1038/s41586-019-0992-y.

      Pastushenko, I., and Blanpain, C. (2019). "EMT Transition States during Tumor Progression and Metastasis". Trends in cell biology 29, 212-226. 10.1016/j.tcb.2018.12.001.

      Qu, H., Su, Y., Yu, L., Zhao, H., and Xin, C. (2019). "Wild-type p53 regulates OTOP2 transcription through DNA loop alteration of the promoter in colorectal cancer". FEBS open bio 9, 26-34. 10.1002/2211-5463.12554.

      Sonnhammer, E.L., and Durbin, R. (1997). "Analysis of protein domain families in Caenorhabditis elegans". Genomics 46, 200-216. 10.1006/geno.1997.4989.

      Sunlin Yong, Z.W., Tang Yuan, Chuang Cheng, Dan Jiang (2021). "Comparison of MMR protein and Microsatellite Instability Detection in Colorectal Cancer and Its Clinicopathological Features Analysis". Journal of Medical Research 50, 61-66. 10.11969/j.issn.1673-548X.2021.05.015

      Vesuna, F., van Diest, P., Chen, J.H., and Raman, V. (2008). "Twist is a transcriptional repressor of E-cadherin gene expression in breast cancer". Biochem Biophys Res Commun 367, 235-241. 10.1016/j.bbrc.2007.11.151.

      Yang, J., Mani, S.A., Donaher, J.L., Ramaswamy, S., Itzykson, R.A., Come, C., Savagner, P., Gitelman, I., Richardson, A., and Weinberg, R.A. (2004). "Twist, a master regulator of morphogenesis, plays an essential role in tumor metastasis". Cell 117, 927-939. 10.1016/j.cell.2004.06.006.

      Yeung, K.T., and Yang, J. (2017). "Epithelial-mesenchymal transition in tumor metastasis". Molecular oncology 11, 28-39. 10.1002/1878-0261.12017.

      Yusup, A., Huji, B., Fang, C., Wang, F., Dadihan, T., Wang, H.J., and Upur, H. (2017). "Expression of trefoil factors and TWIST1 in colorectal cancer and their correlation with metastatic potential and prognosis". World journal of gastroenterology 23, 110-120. 10.3748/wjg.v23.i1.110.

      Zhang, N., Ng, A.S., Cai, S., Li, Q., Yang, L., and Kerr, D. (2021). "Novel therapeutic strategies: targeting epithelial-mesenchymal transition in colorectal cancer". The Lancet. Oncology 22, e358-e368. 10.1016/s1470-2045(21)00343-0.

      Zhu, D.J., Chen, X.W., Zhang, W.J., Wang, J.Z., Ouyang, M.Z., Zhong, Q., and Liu, C.C. (2015). "Twist1 is a potential prognostic marker for colorectal cancer and associated with chemoresistance". American journal of cancer research 5, 2000-2011.

    1. eLife Assessment

      This study provides a useful set of experiments showing the relative contribution of the noradrenergic system in reversing the sedation induced by midazolam. The evidence supporting the claims of the authors is solid, although specificity issues in the pharmacology and neural-circuit investigations narrow down the strengths of the conclusions. Dealing with these limitations will make the paper attractive to medical biologists working on the neurobiology of anesthesia.

    2. Reviewer #2 (Public Review):

      Summary:

      This article mainly explores the neural circuit mechanism of recovery of consciousness after midazolam administration and shows that the LC-VLPO NEergic neural circuit helps to promote recovery from midazolam, and that this effect is mainly mediated by α1 adrenergic receptors (α1-R).

      Strengths:

      This article uses innovative methods such as optogenetics and fiber optic photometry in the experimental methods section to make the stimulation of neuronal cells more precise and the stimulation intensity more accurate in experimental research. In addition, fiber optic photometry adds confidence to the results of calcium detection in mouse neuronal cells.

      This article explains the results from the entire system down to cells, and then cells gradually unfold to explain the entire mechanism. The entire explanation process is logical and orderly. At the same time, this article conducted a large number of rescue experiments, which greatly increased the credibility of the experimental conclusions.

      Throughout the full text and all conclusions, this article has elucidated the neural circuit mechanism of recovery of consciousness after midazolam administration and successfully verified that the LC-VLPO NEergic neural circuit helps promote the recovery of midazolam.

      The conclusions of this article are crucial to ameliorate the complications of its abuse. It will pinpoint relevant regions involved in midazolam response, provide a perspective to help elucidate the dynamic changes in neural circuits in the brain during altered consciousness, suggest a promising approach towards the goal of timely recovery from midazolam, and open new research avenues.

      At the same time, this article also has important clinical translation significance. The application of clinical drug midazolam and animal experiments have certain guiding significance for subsequent related clinical research.

      Comments on revised version: I have no further questions for this manuscript.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      One major issue arises in Figure 4, the recording of VLPO Ca2+ activity. In Lines 211-215, they stated that they injected AAV2/9-DBH-GCaMP6m into the VLPO, while activating LC NE neurons. As they claimed in line 157, DBH is a specific promoter for NE neurons. This implies an attempt to label NE neurons in the VLPO, which is problematic because NE neurons are not present in the VLPO. This raises concerns about their viral infection strategy since Ca activity was observed in their photometry recording. This means that DBH promoter could randomly label some non-NE neurons. Is DBH promoter widely used? The authors should list references. Additionally, they should quantify the labeling efficiency of both DBH and TH-cre throughout the paper.

      In Figure 5, we found that the VLPO received the noradrenergic projection from LC, indicating the recorded Ca2+ activity may come from the axon fibers corresponding to the projection. Similarly, Gunaydin et al. (2014) demonstrated that fiber photometry can be used to selectively record from neuronal projection.

      We appreciate the reviewer's insightful suggestion to elaborate on the DBH promoter. We have now expanded our discussion to address DBH (pg. 18): “DBH (dopamine-beta-hydroxylase), located in the inner membrane of noradrenergic and adrenergic neurons, is an enzyme that catalyzes the conversion of dopamine to norepinephrine, and therefore plays an important role in noradrenergic neurotransmission. DBH is a marker of noradrenergic neurons. Zhou et al. (2020) confirmed that their probe specifically labeled noradrenergic neurons by immunolabeling for DBH. Recently, the DBH promoter has been used in several studies (e.g., Han et al., 2024; Lian et al., 2023). DBH-Cre mice are widely used to specifically label noradrenergic neurons (e.g., Li et al., 2023; Breton-Provencher et al., 2022; Liu et al., 2024). It is difficult to distinguish the roles of NE and DA neurons when using the TH promoter in the VLPO. Therefore, we used the DBH promoter, which provides more specific labeling. The LC is the main noradrenergic nucleus of the central nervous system. In our study, we injected rAAV-DBH-GCaMP6m-WPRE (Figure 2 and 8) and rAAV-DBH-EGFP-S'miR-30a-shRNA GABAA receptor)-3’-miR30a-WPRES (Figure 9) into the LC. The results showed that the DBH promoter could specifically label noradrenergic neurons in the LC, while non-specific labeling outside the LC was almost absent.”

      As suggested, we have quantified the labeling efficiency of both DBH and TH-cre throughout the revised manuscript (Fig.2D; Fig.3D, N-O; Fig.4E-F, J, L; Fig.5E, L; Fig.6L, S, X; Fig.7G).

      A similar issue arises with chemogenetic activation in Fig. 5 L-R, the authors used TH-cre and DIO-Gq virus to label VLPO neurons. Were they labelling VLPO NE or DA neurons for recording? The authors have to clarify this.

      As previously addressed in response to Comment #1, we agree that it is difficult to distinguish the roles of NE and DA neurons when using the TH promoter in the VLPO. Therefore, we injected the mixture of DBH-Cre-AAV and AAV-EF1a-DIO-hChR2(H134R)-eYFP/AAV-Ef1a-DIO-hM3Dq-mCherry viruses into bilateral LC and AAV-EF1a-DIO-hChR2(H134R)-eYFP/AAV-Ef1a-DIO-hM3Dq-mCherry virus into bilateral VLPO. Moreover, we quantified the labeling efficiency of DBH in the LC to demonstrate that this promoter can specifically label NE neurons (Fig. 5). Importantly, these corrections did not alter our results. Both optogenetic and chemogenetic activation of LC-NE terminals in the VLPO effectively promoted midazolam recovery (Fig. 5G, N).

      Another related question pertains to the specificity of LC NE downstream neurons in the VLPO. For example, do they preferentially modulate GABAergic or glutamatergic neurons?

      Our study primarily aimed to explore the role of the LC-VLPO NEergic neural circuit in modulating midazolam recovery. We acknowledge that our evidence for the role of LC NE downstream neurons in the VLPO, derived from activation of LC-NE terminals and pharmacological intervention in the VLPO (Fig.5, Fig.6, Fig.8, Fig.9), is limited. Accordingly, we now present the VLPO’s role as a promising direction for future research in the limitation section of our revised manuscript: “This study shows that the LC-VLPO NEergic neural circuit plays an important role in modulating midazolam recovery. However, the specificity of LC NE downstream neurons in the VLPO is not addressed in this paper and will be the focus of our next studies. VLPO neurons and their downstream regulatory mechanisms may involve neural systems other than the NE system, and these deeper and more complex mechanisms need to be further investigated.”

      In Figure 1A-D, in the measurement of the dose-dependent effect of midazolam on LORR, was only one batch of testing performed? If more than one batch of mice was used, error bars should be presented in 1B. Also, the rationale for testing TH expression levels after midazolam is not clear. Is a change in TH expression level specifically related to NE activation? If so, they should cite references.

      As recommended, we have added error bars and modified the graph of the LORR rate in the revised manuscript (Fig. 1A-B; Fig. 9G-H).

      We agree that the use of TH as a marker of NE activation is controversial, so in the revised manuscript, we directly determined central norepinephrine content to reflect the change of NE activity after midazolam administration (Fig. 1D).

      Regarding the photometry recording of LC NE neurons during the entire process of midazolam injection in Fig. 2 and Fig. 4, it is unclear what time=0 stands for. If I understand correctly, the authors were comparing spontaneous activity during the four phases. Additionally, they only show traces lasting for 20s in Fig. 2F and Fig. 4L. How did the authors select data for analysis, and what criteria were used? The authors should also quantify the average Ca2+ activity and Ca2+ transient frequency during each stage instead of only quantifying Ca2+ peaks. In line 919, the legend for Figure 2D, they stated that it is the signal at the BLA; were they also recorded from the BLA?

      In this study, we used fiber photometry, a fluorescence-based method for recording changes in calcium. The fluorescence signal is usually divided into segments according to behavior, and each segment is aligned so that the specific behavioral event occurs at time=0. The mean calcium fluorescence signal in the 1.5 s or 1 s time window before the behavioral event is taken as the baseline fluorescence intensity (F0). The difference between the fluorescence intensity during the behavior and F0 is then divided by the difference between F0 and the offset value. The resulting value, ΔF/F0, represents the change in calcium fluorescence intensity when the event occurs. The results of the analysis are commonly represented by two kinds of graphs, namely heat maps and peri-event plots (e.g., Cheng et al., 2022; Gan-Or et al., 2023; Wei et al., 2018). In Fig. 2, the time points of wakefulness, midazolam injection, LORR and RORR in mice were each selected as time=0, while in Fig. 4, RORR in mice was selected as time=0. The 20 s duration of the selected traces was based on the length of a complete Ca2+ signal. We have explained the Ca2+ recording experiment more specifically in the figure legends and methods sections of our revised manuscript.
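
      The ΔF/F0 normalization described above can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function name, its parameters, and the 5 s / 15 s split of the 20 s peri-event window are assumptions chosen for the example.

```python
import numpy as np

def peri_event_dff(trace, fs, event_idx, baseline_s=1.0, offset=0.0,
                   pre_s=5.0, post_s=15.0):
    """Return the peri-event dF/F0 segment for one behavioral event.

    trace      : 1-D array of raw fluorescence samples
    fs         : sampling rate in Hz
    event_idx  : sample index of the event (time = 0)
    baseline_s : length of the pre-event baseline window in seconds
    offset     : recording offset value subtracted from F0 in the denominator
    pre_s, post_s : window before/after the event (assumed 5 s + 15 s = 20 s)
    """
    trace = np.asarray(trace, dtype=float)
    n_base = int(round(baseline_s * fs))
    n_pre = int(round(pre_s * fs))
    n_post = int(round(post_s * fs))
    if event_idx - max(n_base, n_pre) < 0 or event_idx + n_post > trace.size:
        raise ValueError("event is too close to the edge of the recording")

    # F0: mean fluorescence in the window immediately before the event
    f0 = trace[event_idx - n_base:event_idx].mean()
    segment = trace[event_idx - n_pre:event_idx + n_post]

    # dF/F0 = (F - F0) / (F0 - offset)
    return (segment - f0) / (f0 - offset)

# Synthetic example: fluorescence steps from 2.0 to 3.0 at the event
fs = 100
trace = np.full(3000, 2.0)
trace[1500:] = 3.0
dff = peri_event_dff(trace, fs, event_idx=1500, offset=1.0)
```

      Stacking such segments across events gives the matrix that is usually displayed as a heat map, and their mean across events gives the peri-event plot.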

      Regarding the BLA, we sincerely apologize for our carelessness: the signals we recorded were from the LC rather than the BLA. We have carefully checked and corrected similar problems in the revised manuscript.

      Reviewer 2:

      In figure legends, abbreviations in figure should be supplemented as much as possible. For example, "LORR" in Figure 1.

      As suggested, we have supplemented abbreviations in figure as much as possible in the revised manuscript.

      Additional recommendations:

      The main conceptual issue in the paper is the inflation of the conclusion regarding the mechanism of sedation induced by midazolam. The authors did not reveal the full mechanism of this but rather the relative contribution of NE system. Several conclusions in the text should be edited to take into account this starting from the title. I think the following examples are more appropriate: 'NE contribution to rebooting unconsciousness caused by midazolam' or 'NE contribution to reverse the sedation induced by midazolam'.

      As suggested, we have moderated the assertions about the mechanism of sedation induced by midazolam in several conclusions, starting from the title (Lines 1, 125, 150, 169, 202, 237, 482), to present a more measured interpretation in the manuscript.

      Line 178-179, the authors state 'these suggest that intranuclear ... suppresses recovery from midazolam administration'. In fact, this intervention prolonged or postponed recovery from midazolam.

      In our revised manuscript, we have corrected this inappropriate term (Line 178).

      Pharmacology part (page 12) that aimed to pinpoint which NE receptor is implicated would suffer from specificity issues.

      In relation to the specificity issue, the focus on VLPO might be rational but again other areas are most likely involved given the pharmacological actions of midazolam.

      In the revised manuscript, we have discussed those specificity issues of NE receptor and areas involved throughout the midazolam-induced altered consciousness: “In addition, given the pharmacological actions of midazolam, other areas may also be involved. Current studies suggest that the neural network involved in the recovery of consciousness consists of the prefrontal cortex, basal forebrain, brain stem, hypothalamus and thalamus. The role of these regions in midazolam recovery remains to be further investigated. Therefore, we will apply more specific experimental methods to determine the importance of LC-VLPO NEergic neural circuit and related NE receptors in the midazolam recovery, and conduct further studies on other relevant brain neural regions, hoping to more fully elucidate the mechanism of midazolam recovery in the future”.

      Line 274, the authors used 'inhibitory EEG activity'. what does it mean? a description of which rhythm-related power density is affected would be more objective.

      Example of conclusion inflation: in line 477, the word 'contributes' is better than 'mediates' if the specificity issue is taken into account.

      As suggested, we have improved the wording in our revised manuscript (pg. 13-14).

      References

      Gunaydin LA, Grosenick L, Finkelstein JC, et al. Natural neural projection dynamics underlying social behavior. Cell. 2014;157(7):1535-1551. doi:10.1016/j.cell.2014.05.017

      Zhou N, Huo F, Yue Y, Yin C. Specific Fluorescent Probe Based on "Protect-Deprotect" To Visualize the Norepinephrine Signaling Pathway and Drug Intervention Tracers. J Am Chem Soc. 2020;142(41):17751-17755. doi:10.1021/jacs.0c08956

      Han S, Jiang B, Ren J, et al. Impaired Lactate Release in Dorsal CA1 Astrocytes Contributed to Nociceptive Sensitization and Comorbid Memory Deficits in Rodents. Anesthesiology. 2024;140(3):538-557. doi:10.1097/ALN.0000000000004756

      Lian X, Xu Q, Wang Y, et al. Noradrenergic pathway from the locus coeruleus to heart is implicated in modulating SUDEP. iScience. 2023;26(4):106284. Published 2023 Feb 27. doi:10.1016/j.isci.2023.106284

      Li C, Sun T, Zhang Y, et al. A neural circuit for regulating a behavioral switch in response to prolonged uncontrollability in mice. Neuron. 2023;111(17):2727-2741.e7. doi:10.1016/j.neuron.2023.05.023

      Breton-Provencher V, Drummond GT, Feng J, Li Y, Sur M. Spatiotemporal dynamics of noradrenaline during learned behaviour. Nature. 2022;606(7915):732-738. doi:10.1038/s41586-022-04782-2

      Liu Q, Luo X, Liang Z, et al. Coordination between circadian neural circuit and intracellular molecular clock ensures rhythmic activation of adult neural stem cells. Proc Natl Acad Sci U S A. 2024;121(8):e2318030121. doi:10.1073/pnas.2318030121

      Cheng J, Ma X, Li C, et al. Diet-induced inflammation in the anterior paraventricular thalamus induces compulsive sucrose-seeking. Nat Neurosci. 2022;25(8):1009-1013. doi:10.1038/s41593-022-01129-y

      Gan-Or B, London M. Cortical circuits modulate mouse social vocalizations. Sci Adv. 2023;9(39):eade6992. doi:10.1126/sciadv.ade6992

      Wei YC, Wang SR, Jiao ZL, et al. Medial preoptic area in mice is capable of mediating sexually dimorphic behaviors regardless of gender. Nat Commun. 2018;9(1):279. Published 2018 Jan 18. doi:10.1038/s41467-017-02648-0

    1. eLife Assessment

      This fundamental study substantially advances our understanding of sound encoding at synapses between single inner hair cells of the mouse cochlea and spiral ganglion neurons. Dual patch-clamp recordings (a technical tour de force) and careful data analysis provide compelling evidence that the functional heterogeneity of these synapses contributes to the diversity of spontaneous and sound-evoked firing by the neurons. The work will be of broad interest to scientists in the field of auditory neuroscience.

    2. Reviewer #1 (Public review):

      Summary:

      Tobón and Moser reveal a remarkable amount of presynaptic diversity in the fundamental Ca-dependent exocytosis of synaptic vesicles at the afferent fiber bouton synapse onto the pillar or modiolar sides of single inner hair cells of mice. These are landmark findings with profound implications for understanding acoustic signal encoding and presynaptic mechanisms of synaptic diversity at inner hair cell ribbon synapses. The paper will have an immediate and long-lasting impact in the field of auditory neuroscience.

      Main findings: 1) Synaptic delays and jitter of masker responses are significantly shorter (synaptic delay: 1.19 ms) at high SR fibers (pillar) than at low SR fibers (modiolar; 2.57 ms). 2) Masker-evoked EPSCs are significantly larger in high SR than in low SR. 3) Quantal content and RRP size are 14 vesicles in both high and low SR fibers. 4) Depression is faster in high SR synapses, suggesting they have a higher release probability and tighter Ca nanodomain coupling to docked vesicles. 5) Recovery of masker-EPSCs from depletion is similar for high and low SR synapses, although there is a slightly faster rate for low SR synapses that have bigger synaptic ribbons, which is very interesting. 6) High SR synapses had larger and more compact (monophasic) sEPSCs, well suited to trigger spikes rapidly and faithfully. 7) High SR synapses exhibit lower voltage (~sound pressure in vivo) dependent thresholds of exocytosis.

      Great care was taken to use physiological external pH buffers and physiological external Ca concentrations. Paired recordings were also performed at higher temperatures with IHCs at physiological resting membrane potentials and in more mature animals than previously done for paired recordings. This is extremely challenging because it becomes increasingly difficult to visualize bouton terminals when myelination becomes more prominent in the cochlear afferents. In addition, perforated patch recordings were used in the IHC to keep its intracellular milieu intact and thus extend the viability of the IHCs. The experiments are a tour de force and reveal several novel aspects of IHC ribbon synapses. The data set is rich and extensive. The analysis is detailed and compelling.

    3. Reviewer #2 (Public review):

      Summary:

      The study by Jaime-Tobon & Moser is a truly major effort to bridge the gap between classical observations on how auditory neurons respond to sounds and the synaptic basis of these phenomena. The so-called spiral ganglion neurons (SGNs) are the primary auditory neurons connecting the brain with hair cells in the cochlea. They all respond to sounds by increasing their firing rates, but also present multiple heterogeneities. For instance, some present a low threshold to sound intensity, whereas others have a high threshold. This property inversely correlates with the spontaneous rate, i.e., the rate at which each neuron fires in the absence of any acoustic input. These characteristics, along with others, have been studied by many reports over the years. However, the mechanisms that allow the hair cell-SGN synapses to drive these behaviors are not fully understood.

      The level of experimental complexity described in this manuscript is unparalleled, producing data that is hardly found elsewhere. The authors provide strong proof for heterogeneity in transmitter release thresholds at individual synapses and they do so in extremely complex experimental settings. In addition, the authors found other specific differences such as in synaptic latency and max EPSCs. A reasonable effort is put into bridging these observations with those extensively reported in in vivo SGN recordings. Similarities are many and differences are not particularly worrying as experimental conditions cannot be perfectly matched, despite the authors' efforts in minimizing them.

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript by Jaime Tobon and Moser uses patch-clamp electrophysiology in cochlear preparations to probe the pre- and post-synaptic specializations that give rise to diverse activity of spiral ganglion afferent neurons (SGN). The experiments are quite an achievement! They use paired recordings from pre-synaptic cochlear inner hair cells (IHC) that allow precise control of voltage and therefore calcium influx, with post-synaptic recordings from type I SGN boutons directly apposed to the IHC for both presynaptic control of membrane voltage and post-synaptic measurement of synaptic function with great temporal resolution.

      Any of these techniques by themselves are challenging, but the authors do them in pairs, at physiological temperatures, and in hearing animals, all of which combined make these experiments a real tour de force. The data is carefully analyzed and presented, and the results are convincing. In particular, the authors demonstrate that post-synaptic features that contribute to the spontaneous rate (SR) of predominantly monophasic post-synaptic currents (PSCs), shorter EPSC latency, and higher PSC rates are directly paired with pre-synaptic features such as a lower IHC voltage activation and tighter calcium channel coupling for release to give a higher probability of release and subsequent increase in synaptic depression. Importantly, IHCs paired with Low and High SR afferent fibers had the same total calcium currents, indicating that the same IHC can connect to both low and high SR fibers. These fibers also followed expected organizational patterns, with high SR fibers primarily contacting the pillar IHC face and low SR fibers primarily contacting the modiolar face. The authors also use in vivo-like stimulation paradigms to show different RRP and release dynamics that are similar to results from SGN in vivo recordings. Overall, this work systematically examines many features giving rise to specializations and diversity of SGN neurons.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Tobón and Moser reveal a remarkable amount of presynaptic diversity in the fundamental Ca-dependent exocytosis of synaptic vesicles at the afferent fiber bouton synapse onto the pillar or modiolar sides of single inner hair cells of mice. These are landmark findings with profound implications for understanding acoustic signal encoding and presynaptic mechanisms of synaptic diversity at inner hair cell ribbon synapses. The paper will have an immediate and long-lasting impact in the field of auditory neuroscience.

      Main findings: 1) Synaptic delays and jitter of masker responses are significantly shorter (synaptic delay: 1.19 ms) at high SR fibers (pillar) than at low SR fibers (modiolar; 2.57 ms). 2) Masker-evoked EPSCs are significantly larger in high SR than in low SR. 3) Quantal content and RRP size are 14 vesicles in both high and low SR fibers. 4) Depression is faster in high SR synapses, suggesting they have a higher release probability and tighter Ca nanodomain coupling to docked vesicles. 5) Recovery of masker-EPSCs from depletion is similar for high and low SR synapses, although there is a slightly faster rate for low SR synapses that have bigger synaptic ribbons, which is very interesting. 6) High SR synapses had larger and more compact (monophasic) sEPSCs, well suited to trigger spikes rapidly and faithfully. 7) High SR synapses exhibit lower voltage (~sound pressure in vivo) dependent thresholds of exocytosis.

      Strengths:

      Great care was taken to use physiological external pH buffers and physiological external Ca concentrations. Paired recordings were also performed at higher temperatures with IHCs at physiological resting membrane potentials and in more mature animals than previously done for paired recordings. This is extremely challenging because it becomes increasingly difficult to visualize bouton terminals when myelination becomes more prominent in the cochlear afferents.

      In addition, perforated patch recordings were used in the IHC to keep its intracellular milieu intact and thus extend the viability of the IHCs. The experiments are a tour de force and reveal several novel aspects of IHC ribbon synapses. The data set is rich and extensive. The analysis is detailed and compelling.

      We would like to thank the reviewer for the appreciation of our work and the comments that helped us to improve our manuscript. We detail our responses to the comments below.

      Weaknesses:

      (1) Materials and Methods: Please provide whole-cell Rs (series resistance) and Cm (membrane capacitance) average +/- S.E.M. (or SD) values for IHC and afferent fiber bouton recordings. The Cm values for afferents have been estimated to be about 0.1 pF (Glowatzki and Fuchs, 2002) and it would be interesting to know if there are differences in these numbers for high and low SR afferents. Is it possible to estimate Cm from the capacitive transient time constant? Minimal electronic filtering would be required for that to work, so I realize the authors may not have this data and I also realize that the long cable of the afferents does not allow accurate Cm measurements, but some first-order estimate would be very interesting to report, if possible.

      In response to the reviewer’s comment, we have now added the estimates of series resistance and membrane capacitance for IHC and bouton recordings in Materials and Methods and in Figure 1 – figure supplement 1. Our estimate for bouton Cm is on average 1.7 ± 0.09 pF, a value that compares well to the literature. For example, Glowatzki and Fuchs (2002) provided estimates ranging from 0.5 to 2 pF for recordings from afferent inner hair cell synapses in rats that showed a capacitance transient. In our own prior work on afferent inner hair cell synapses of pre-hearing mice, we found estimates of 2.6 ± 0.5 pF (Chapochnikov et al., 2014) and 1.9 ± 0.2 pF (Takago et al., 2019). Keen and Hudspeth (2006) reported capacitances of 1–4 pF for afferent terminals in the bullfrog amphibian papilla. There was no difference in bouton Cm between high SR (1.78 ± 0.19 pF) and low SR synapses (1.68 ± 0.11 pF; p = 0.6575, unpaired t test).

      (2) Page 20, 26 and Figure 4: With regard to synaptic delays at auditory hair cell synapses: please see extensive studies done in Figure 11 of Chen and von Gersdorff (JNeurosci., 2019); this showed that synaptic delays are 1.26 ms in adult bullfrog auditory hair cells at 31 °C, which is very similar to the High SR fibers (1.19 ms; Fig. 4B and page 20). During ongoing depolarizations (e.g. during a sustained sine wave) the synaptic delay can be reduced to just 0.72 ms for probe EPSCs, which is a more usual number for mature fast synapses. This paper should, thus, be cited and briefly discussed in the Discussion. So a significant shortening of delay occurs for the probe response and this is also observed in young rat IHC synapses (see Goutman and Glowatzki, 2011).

      We thank the reviewer for this comment. We have analysed the synaptic delay of the probe response and included it in Figure 4 – figure supplement 1. Contrary to the findings from Goutman and Glowatzki (2011) and Chen and von Gersdorff (2019), we did not observe a shortening of the synaptic delay for the probe response compared to the masker response. This difference might arise from the duration of the masker stimulus and/or the IHC holding potential. Synaptic facilitation in hair cells seems to occur only when the RRP is not depleted by the first stimulus (Cho et al., 2011). Our 100 ms masker depolarization from a holding potential of -58 mV effectively depleted the synapse RRP (Figure 4D), while both studies mentioned above used relatively short depolarizations (2 ms in rat and 20 ms in bullfrog) from a holding potential around -90 mV, which most likely didn’t deplete the RRP. Indeed, when using partially RRP-depleting stimuli of 10 ms, Goutman (2011) observed longer synaptic latencies and smaller responses to the second stimulus. We have included this discussion in the last paragraph of the results section.

      Additionally, we would also like to note that we referred to the important work on frog hair cell synapses in the manuscript, yet aimed to focus on relating synaptic heterogeneity of mammalian inner hair cell synapses to the functional diversity of type I spiral ganglion neurons that unlike the frog afferents show little branching of their peripheral neurites (in only ~15% of the neurons). We think it will be very interesting to study the aspect of presynaptic heterogeneity in the bullfrog amphibian papilla, but assume that the converging input of several active zones onto a single afferent might provide a different encoding scheme than in the mammalian cochlea.

      (3) Gaussian-like (and/or multi-peak) EPSC amplitude distributions were obtained in more mature rat IHCs by Grant et al. (see their Figure 4G; JNeurosci. 2010; postnatal day 19-21). The putative single quanta peak was at 50 pA and the main peak was at 375 pA. The large mean suggests a low CV (probably < 0.4). However, Fig. 2F shows a mean of about 100 pA and CV = 0.7 for spontaneous EPSCs. This major difference deserves some more discussion. I suppose that one possible explanation may be that the current paper holds the IHC membrane potential fixed at -58 mV, whereas Grant et al. (2010) did not control the IHC membrane potential and spontaneous fluctuations in the Vm may have depolarized the IHC, thus producing larger evoked EPSCs that are triggered by Ca channel openings. Some discussion that compares these differences and possible explanations would be quite useful for the readers.

      We understand the reviewer’s concern. We have now included the amplitude distribution of sEPSCs recorded from 12 boutons without patch-clamping the IHC (Figure 2–figure supplement 1, panel A). The rest of the recording conditions (i.e., artificial perilymph-like solution, physiological temperature and age) were identical to the conditions used for the paired recordings. Both the range of spontaneous rate (0 up to 16.33 sEPSC/s) and the amplitude distribution (peak at -40 pA and CV of 0.66) were comparable to the values we obtained when clamping the IHC resting potential at -58 mV. In addition, for two of our pairs, we established the bouton recording first, measured the spontaneous release, then established the perforated patch-clamp of the IHC and measured the spontaneous release again with IHC held at -58 mV. For pair #l300321_1, the SR before clamping the IHC was 0.0125 sEPSC/s, with a maximal AmpsEPSC of -110 pA (avg. -52 pA). The SR while holding the IHC at -58 mV was 0.36 sEPSCs/s, with a maximal AmpsEPSC of -140 pA (avg. -46 pA). For pair #l200522_2, the SR changed from 0.07 sEPSC/s to 0. The maximal AmpsEPSC before clamping the IHC was -70 pA (avg. -31 pA). Overall, our data recorded without controlling the IHC argues against the resting potential of -58 mV as a major source of differences in EPSC rate and amplitudes compared to previous studies.

      Nonetheless, it is important to note that the experimental conditions used in our study differ from previous reports in several aspects. Our extracellular solution contains the physiological pH buffer bicarbonate instead of the fast buffer HEPES, as well as TEA and Cs+ for proper isolation of the Ca2+ currents. Both pH and potassium channel blockers can alter the excitability of the cell and, consequently, the spontaneous and evoked release. For instance, despite maintaining a similar extracellular pH (7.3 to 7.4), the choice of bicarbonate or HEPES for the extracellular solution can differently influence the regulation of the intracellular pH of the cell (Michl et al., 2019). Indeed, the activity of ion channels and receptors (e.g., AMPAR), and the resting potential can change depending on the extracellular buffer used (Hare and Owen, 1998; Vincent et al., 2019; Cho and von Gersdorff, 2014; and review Sinning and Hübner, 2013). Additionally, the animal model and the age range could be a source of difference. In rats, the EPSC amplitude distribution seems to change with maturation but not with K+ stimulation (Grant et al., 2010) or voltage depolarizations (Goutman and Glowatzki, 2007). This, however, does not seem to be the case for afferent boutons recorded from mice. In resting conditions (i.e. 5.8 mM extracellular K+), average EPSC amplitudes are around -100 to -150 pA for both prehearing (Chapochnikov et al., 2014) and hearing mice (Niwa et al., 2021 and the present study). Upon stimulation (40 mM K+ or voltage depolarizations), the mean EPSC amplitude does not change in prehearing mice (Jing et al., 2013; Takago et al., 2019), but it significantly increases in hearing mice (Niwa et al., 2021 and the present study). In p20 and p30 mice, the mean EPSC amplitude was predominantly below -100 pA at rest and only increased above -100 pA after stimulation with 40 mM K+ (Niwa et al., 2021). Similarly, our reported avg. AmpsEPSC is below -150 pA, while the evoked EPSCs reached average amplitudes above -200 pA (Figure 1 – figure supplement 1, panel F and Figure 4 – figure supplement 1, panel F).

      We have included the aforementioned points in the discussion under the section “Diversity of spontaneous release and their topographical segregation”.
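
      The two summary statistics compared in this exchange, the spontaneous rate (sEPSCs per second) and the coefficient of variation (CV) of the sEPSC amplitudes, can be sketched as follows. This is only an illustration, not the authors' analysis code: it assumes that the CV is the sample standard deviation divided by the mean of the absolute amplitudes, and the function name `summarize_sepscs` and the example amplitudes are hypothetical.

```python
import statistics


def summarize_sepscs(amplitudes_pa, duration_s):
    """Return (rate in events/s, CV of absolute amplitudes).

    Assumption: CV = sample SD / mean of |amplitude|. The CV is None when
    fewer than two events are available or the mean amplitude is zero.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    n_events = len(amplitudes_pa)
    rate = n_events / duration_s
    if n_events < 2:
        return rate, None
    # EPSCs are inward currents (negative values), so work with magnitudes.
    magnitudes = [abs(a) for a in amplitudes_pa]
    mean_amp = statistics.mean(magnitudes)
    if mean_amp == 0:
        return rate, None
    cv = statistics.stdev(magnitudes) / mean_amp
    return rate, cv


# Hypothetical example: three events detected in a 10 s recording stretch.
example_amplitudes = [-50.0, -100.0, -150.0]
rate, cv = summarize_sepscs(example_amplitudes, duration_s=10.0)
print(f"rate = {rate:.2f} sEPSC/s, CV = {cv:.2f}")
```

      With so few events the CV is poorly constrained, which is one reason why amplitude distributions pooled over many events or boutons are usually reported.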

      Reviewer #2 (Public Review):

      Summary:

      The study by Jaime-Tobon & Moser is a truly major effort to bridge the gap between classical observations on how auditory neurons respond to sounds and the synaptic basis of these phenomena. The so-called spiral ganglion neurons (SGNs) are the primary auditory neurons connecting the brain with hair cells in the cochlea. They all respond to sounds by increasing their firing rates, but also present multiple heterogeneities. For instance, some present a low threshold to sound intensity, whereas others have a high threshold. This property inversely correlates with the spontaneous rate, i.e., the rate at which each neuron fires in the absence of any acoustic input. These characteristics, along with others, have been studied by many reports over the years. However, the mechanisms that allow the hair cell-SGN synapses to drive these behaviors are not fully understood.

      Strengths:

      The level of experimental complexity described in this manuscript is unparalleled, producing data that is hardly found elsewhere. The authors provide strong proof for heterogeneity in transmitter release thresholds at individual synapses and they do so in extremely complex experimental settings. In addition, the authors found other specific differences such as in synaptic latency and max EPSCs. A reasonable effort is put into bridging these observations with those extensively reported in in vivo SGN recordings. Similarities are many and differences are not particularly worrying as experimental conditions cannot be perfectly matched, despite the authors' efforts in minimizing them.

      We would like to thank the reviewer for the appreciation of our work and the comments that helped us to improve our manuscript. We detail our responses to the comments below.

      Weaknesses:

      Some concern arises in relation to mismatches with previous reports of IHC-SGN synapse function. EPSCs at these synapses present a peculiar distribution of amplitudes, shapes, and rates. These characteristics are well-established and some do not seem to be paralleled in this study. Here, amplitude distributions are drastically shifted to smaller values, and rates of events are very low, all compared with previous evidence. The reasons for these discrepancies are unclear. The rate at which spontaneous EPSCs appear is an especially sensitive matter. A great part of the conclusions relies on the definition of which of the SGNs (or, one should say, synapses) belong to the low end and which to the high end in the spectrum of spontaneous rates. The data presented by the authors seem a bit off and the criteria used to classify recordings are not well justified. The authors should clarify the origin of these differences since they do not seem to come from obvious reasons such as animal ages, recording techniques, mouse strain, or even species.

      We understand the reviewer’s concern. We have now included the amplitude distribution of sEPSCs recorded from 12 boutons without patch-clamping the IHC (Figure 2–figure supplement 1, panel A). The rest of the recording conditions (i.e., artificial perilymph-like solution, physiological temperature and age) were identical to the conditions used for the paired recordings. Both the range of spontaneous rate (0 up to 16.33 sEPSC/s) and the amplitude distribution (peak at -40 pA and CV of 0.66) were comparable to the values we obtained when clamping the IHC resting potential at -58 mV. In addition, for two of our pairs, we established the bouton recording first, measured the spontaneous release, then established the perforated patch-clamp of the IHC and measured the spontaneous release again with IHC held at -58 mV. For pair #l300321_1, the SR before clamping the IHC was 0.0125 sEPSC/s, with a maximal AmpsEPSC of -110 pA (avg. -52 pA). The SR while holding the IHC at -58 mV was 0.36 sEPSCs/s, with a maximal AmpsEPSC of -140 pA (avg. -46 pA). For pair #l200522_2, the SR changed from 0.07 sEPSC/s to 0. The maximal AmpsEPSC before clamping the IHC was -70 pA (avg. -31 pA). Overall, our data recorded without controlling the IHC argues against the resting potential of -58 mV as a major source of differences in EPSC rate and amplitudes compared to previous studies.

      Additionally, as noted in the section “Diversity of spontaneous release and their topographical segregation”, our SR values also agree with the range of 0.1 – 16.42 spikes/s reported by Wu et al. (2016) using loose patch recordings from p15-p17 rats. 90% of the paired recordings (and 60% of the bouton recordings) of our dataset were obtained from mice between p14-p17, where spontaneous activity is still low compared to older age groups (p19-p21: 0 – 44.22 spikes/s; p29-p32: 0.11 – 54.9 spikes/s, Wu et al., 2016; p28: 0 – 47.94 spikes/s, Siebald et al., 2023). There are two additional aspects to consider: i) about 40% of the SGN spikes seem to be generated intrinsically (not activated by an EPSP, ergo an EPSC) at p15-p18 (Wu et al., 2016); and ii) the presence of a spike or EPSC is the sole determinant of a successful recording when the IHC is not stimulated (either by K+ or voltage); thus, these types of experiments undersample fibers with low SR.

      We have included the aforementioned points in the discussion under the section “Diversity of spontaneous release and their topographical segregation”.

      Reviewer #3 (Public Review):

      Summary:

      "Bridging the gap between presynaptic hair cell function and neural sound encoding" by Jaime Tobon and Moser uses patch-clamp electrophysiology in cochlear preparations to probe the pre- and post-synaptic specializations that give rise to the diverse activity of spiral ganglion afferent neurons (SGN). The experiments are quite an achievement! They use paired recordings from pre-synaptic cochlear inner hair cells (IHC) that allow precise control of voltage and therefore calcium influx, with post-synaptic recordings from type I SGN boutons directly apposed to the IHC for both presynaptic control of membrane voltage and post-synaptic measurement of synaptic function with great temporal resolution.

      Strengths

      Any of these techniques by themselves are challenging, but the authors do them in pairs, at physiological temperatures, and in hearing animals, all of which combined make these experiments a real tour de force. The data is carefully analyzed and presented, and the results are convincing. In particular, the authors demonstrate that post-synaptic features that contribute to the spontaneous rate (SR) of predominantly monophasic post-synaptic currents (PSCs), shorter EPSC latency, and higher PSC rates are directly paired with pre-synaptic features such as a lower IHC voltage activation and tighter calcium channel coupling for release to give a higher probability of release and subsequent increase in synaptic depression. Importantly, IHCs paired with Low and High SR afferent fibers had the same total calcium currents, indicating that the same IHC can connect to both low and high SR fibers. These fibers also followed expected organizational patterns, with high SR fibers primarily contacting the pillar IHC face and low SR fibers primarily contacting the modiolar face. The authors also use in vivo-like stimulation paradigms to show different RRP and release dynamics that are similar to results from SGN in vivo recordings. Overall, this work systematically examines many features giving rise to specializations and diversity of SGN neurons.

      We would like to thank the reviewer for the appreciation of our work and the comments that helped us to improve our manuscript. We detail our responses to the comments below.

      Weaknesses / Comments / edits:

      (1) The careful analysis of calcium coupling and EPSC metrics is especially nice. Can the authors speculate as to why different synapses (likely in the same IHC) would have different calcium cooperativity?

      The finding of different apparent Ca2+ cooperativities among IHC synapses is intriguing. Paired pre- and postsynaptic patch-clamp recordings (this work and Jaime Tobón and Moser, 2023) and single synapse imaging of presynaptic Ca2+ signals and glutamate release (Özçete and Moser, 2021) jointly support this notion. Both methodologies complement each other. Imaging allows assessing the presynaptic Ca2+ of the specific synapse, while in paired recordings release is related to the whole-cell Ca2+ influx. Paired recordings, on the other hand, provide the sensitivity and temporal resolution to assess the initial release rate with short stimuli (2 to 10 ms), which avoids an impact of RRP depletion and ongoing SV replenishment that needs to be considered for the longer stimuli used in imaging (50 ms). Both approaches agree on the finding of tighter coupling of Ca2+ channels and release sites (i.e., lower apparent Ca2+ cooperativity during depolarization within the range of receptor potentials) at pillar synapses. Moreover, the present study took advantage of recording individual release events [which was not achieved by imaging] and further supported the hypothesis that high SR SGNs receive input from active zones with tighter coupling than low SR SGNs. However, our two non-overlapping data sets for paired patch-clamp recordings (this work and Jaime Tobón and Moser, 2023) found a narrower range of apparent Ca2+ cooperativities compared to results from single synapse imaging (Özçete and Moser, 2021). This might reflect the technical differences described above. Future studies, potentially combining paired patch-clamp recordings with imaging of presynaptic Ca2+ signals, will be needed to scrutinize this aspect.

      We think that the different Ca2+ cooperativities reflect subtle differences in the topography of presynaptic Ca2+ channels and vesicular release sites at the specific IHC active zones. The work of Özçete and Moser (2021) indicated that, indeed, apparent Ca2+ cooperativities differ among active zones even within the same inner hair cell. Synaptic heterogeneity within one individual cell can expand its coding capacity. In the case of IHCs, differences in the Ca2+ dependence of synaptic release, in addition to the heterogeneous voltage dependence, appear to diversify the response properties (i.e., synaptic vesicle release probability) of individual synapses to the same stimulus. This is particularly important for sound intensity and temporal coding.

      We have included the aforementioned points in the discussion under the section “Candidate mechanisms distinguishing evoked release at low and high SR synapses”.

      (2) At the bottom of page 6 it would be helpful to mention earlier how many pillar vs modiolar fibers were recorded from, otherwise the skewness of SRs (figure 2H) could be thought to be due to predominantly recording from modiolar fibers. As is, it reads a bit like a cliff-hanger.

      Done!

      (3) The contrasts for some of the data could be used to point out that while significant differences occur between low and high SR fibers, some of these differences are no longer apparent when comparing modiolar vs pillar fibers (e.g. by contrasting Figure 2C and 2K). This can indicate that indeed there are differences between the fiber activity, but that the activity likely exists in a gradient across the hair cell faces. Pointing this out at the top of page 10 (end of the first paragraph) would be helpful; it would make the seemingly contradictory voltage dependence data easier to understand on first read (voltage-dependence of release is significantly different between different SR fibers (figure 3) but is not significantly different between fibers on different HC faces (figure S3)).

      Done!

      (4) It should be acknowledged that although the use of post-hearing animals here (P14-23) ensures that SGN have begun to develop more mature activity patterns (Grant et al 2010), the features of the synapses and SGN activity may not be completely mature (Wu et al 2016 PMID: 27733610). Could this explain some of the 'challenges' (authors' section title) detailed on page 28, first full paragraph?

      Done!

      (5) In the discussion on page 24, the authors compare their recorded SR of EPSCs to measured values in vivo, which are higher. Could this indicate that in vivo, the resting membrane potential of IHCs is more depolarized than is currently used for in vitro cochlear experiments?

      That is indeed one possible explanation among others. We have expanded the discussion about the factors that could affect the SR in ex vivo experiments.

      (6) The results showing lower calcium cooperativity of high SR fibers are powerful, but do the authors have an explanation for why the calcium cooperativity of < 2 is different from that (m = 3-4) observed in other manuscripts?

      We assume this question to potentially result from a misunderstanding. Using membrane capacitance measurements and Ca2+ uncaging, Beutner et al. (2001) reported a high intrinsic Ca2+ cooperativity of inner hair cell exocytosis (m = 4-5). Based on this data, it has been proposed that the binding of 4-5 Ca2+ ions is required to trigger the fusion of a synaptic vesicle in IHCs. However, given the shortcoming of Ca2+ uncaging, we and others aimed to further study this aspect using alternative methods. By varying the current of single Ca2+ channels in apical IHCs of hearing mice, several studies reported a high apparent Ca2+ cooperativity (m = 3-5) that is thought to reflect the high intrinsic cooperativity (Brandt et al., 2005; Wong et al., 2014; Özçete and Moser, 2021; Jaime Tobón and Moser, 2023).

      On the other hand, the apparent Ca2+ cooperativity observed upon changing the number of open Ca2+ channels would also reflect the active zone topography (i.e., number and distance of Ca2+ channels to the vesicular release site). In the present study, we used different depolarizations within the range of receptor potentials and found a low apparent Ca2+ cooperativity (m < 2) in 93% of the studied synapses. Other studies in apical IHCs from hearing mice used similar and alternative methods to change the number of open Ca2+ channels and also estimated an apparent cooperativity of < 2 (Brandt et al., 2005; Johnson et al., 2005; Johnson et al., 2007; Wong et al., 2014; Özçete and Moser, 2021; Jaime Tobón and Moser, 2023). The fact that these estimates are smaller than those seen upon changes in single Ca2+ current has been taken to indicate that SV release is governed by one or few Ca2+ channels in nanometer proximity (Ca2+ nanodomain-like control of SV exocytosis), building on classical synapse work (Augustine et al., 1991). 

      In contrast, comparable recordings from mouse IHCs before the onset of hearing (Wong et al., 2014) revealed more similar apparent Ca2+ cooperativities (m ~3) for both changes in the number of open Ca2+ channels and changes in single Ca2+ channel current. This suggests that IHCs before the onset of hearing employ a Ca2+ microdomain-like control of SV exocytosis in which release is governed by the combined activity of several Ca2+ channels in >100 nm distance to the release site. A Ca2+ microdomain-like control of SV exocytosis was also reported for basocochlear IHCs (Johnson et al., 2017).
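
      To make the quantity under discussion concrete: the apparent Ca2+ cooperativity m is commonly taken as the exponent of a power-law relation between release and Ca2+ influx, release = a · (Q_Ca)^m. The sketch below estimates m by linear regression in log-log space. It is a minimal illustration under that assumption, not the fitting procedure used in the manuscript; the function name `apparent_cooperativity` and the synthetic numbers are hypothetical.

```python
import math


def apparent_cooperativity(ca_charge, release):
    """Fit release = a * ca_charge**m and return (m, a).

    The fit is an ordinary least-squares line through
    (log ca_charge, log release); the slope is the exponent m.
    """
    if len(ca_charge) != len(release) or len(ca_charge) < 2:
        raise ValueError("need two equally long sequences with >= 2 points")
    if any(q <= 0 for q in ca_charge) or any(r <= 0 for r in release):
        raise ValueError("all values must be positive for a log-log fit")
    xs = [math.log(q) for q in ca_charge]
    ys = [math.log(r) for r in release]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("ca_charge values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    a = math.exp(mean_y - m * mean_x)
    return m, a


# Synthetic, noise-free examples (arbitrary units, not measured data).
q_ca = [0.5, 1.0, 2.0, 4.0, 8.0]
nanodomain_like = [3.0 * q ** 1.4 for q in q_ca]   # m < 2
microdomain_like = [0.2 * q ** 3.5 for q in q_ca]  # m around 3-4
m_nano, a_nano = apparent_cooperativity(q_ca, nanodomain_like)
m_micro, a_micro = apparent_cooperativity(q_ca, microdomain_like)
print(f"m (nanodomain-like) = {m_nano:.2f}, m (microdomain-like) = {m_micro:.2f}")
```

      In this picture, whether m comes out low or high depends on how the Ca2+ influx was varied (number of open channels versus single-channel current), which is the distinction drawn in the response above.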

      Recommendations for the authors:

      As explained in the public reviews of Reviewers 1 and 2, some mismatches between the data presented here and previous reports from the literature have been identified. It is recommended that you discuss those mismatches, perhaps in relation to the choice of patch-clamping the hair cells at -58 mV.

      We have addressed this point thoroughly in the revised MS. Please see our response to the public review.

      Reviewer #1 (Recommendations For The Authors):

      Minor suggestions and corrections:

      (1) Figures 3 and 4 show beautiful data with paired recordings. Figure 3 shows 10 ms pulses, whereas Fig. 4 shows 100 ms depolarizing pulses. The example in Fig. 3A shows asynchronous release after Ca channel closure, whereas Fig. 4 does not show this so prominently. Was there quite a bit of variability in the asynchronous release from different cell pairs, or was this correlated with pulse duration?

      The asynchronous release is also present after 100 ms depolarizing pulses (please see the updated panel A of Figure 4). However, we have not analysed asynchronous release and think that this would be beyond the scope of the current MS. For clarity, we have added dashed lines in the EPSC traces of Figs. 3 and 4 to indicate the on and off-set of the depolarization.

      (2) Differences in apex and basal IHC ribbon synapse nanodomain to microdomain Ca channel coupling to exocytosis-sensor have been reported also for gerbil IHCs (see Johnson et al., JNeurosci., 2017). This may be worth mentioning since it is another indication of major synaptic diversity in the mammalian cochlea, this time from low- to high-frequency-located IHCs.

      Done

      (3) Page 22: change "hight SR" to "high SR".

      Done

      (4) Page 27: change "addess" to "addressed".

      Done

      Reviewer #2 (Recommendations For The Authors):

      Major points:

      (1) As indicated in methods, recording stretches of 5-10 seconds were used to determine the SR of a given SGN. This seems too short for a reasonable estimate of the SR in these neurons. Also, the reported SRs for these mature mice are not only much lower than those measured in in-vivo SGN extracellular recordings but also compared to those reported in ex-vivo rat recordings. Why this discrepancy? The authors decided to estimate SR by voltage-clamping IHCs at a fixed value of -58 mV, which they take from Johnson, 2015. I wonder if it is not more reasonable to use a range of IHC holdings and measure SR at those, instead of using a single one. It is hard to visualize a very strong argument for using strictly -58 mV. In addition, mapping out a range of holding potentials could provide additional information on IHC resting membrane potential in physiological conditions.

      Related to this point, considering that SR values found in the ex-vivo preparation are much lower than those described in in-vivo situations, is it fair to use the same 1 sp/s criteria, as in Taberner & Liberman, to segregate low and high? Shouldn't this value be adjusted to the overall lower SR? This criterion is naturally critical for the consequent evaluation of other SGN properties.

      Finally, on this same problem of IHC Vh, does -58 mV estimate include the 19 mV liquid junction potential? How does it compare with the activation threshold of calcium influx at modiolar vs pillar synapses (see imaging studies)?

      We had proactively discussed the challenges of relating ex vivo and in vivo data in the preprint provided for review. While we consider the outcome of our study helpful for better understanding the relation of afferent synaptic heterogeneity and diverse firing properties of SGNs, we do not claim that the assumptions based on literature (such as on the physiological resting potential) represent ground truth.

      When carefully revising the MS, we have expanded on the discussion to address the points raised here, particularly regarding the lower SR and sEPSC amplitudes. As this and the other reviewer commented in the public review, these experiments were hard to achieve, and we consider repeating them with a range of IHC holding potentials (then not only for spontaneous rate of transmission, but also for in-depth characterization of evoked release) to be beyond the scope of the present study.

      We do appreciate the suggestion to adjust the distinction between low and high SR given the overall lower rates. However, we would like to refrain from it, as i) we consider it quite arbitrary to define another criterion and ii) we would like to avoid any apparent cherry-picking bias.

      Finally, yes, the -58 mV holding potential is corrected for the liquid junction potential. Our average IHC whole-cell Vhalf ICa (-38.86 mV for high SR and -37.60 mV for low SR) compares well with previous reports of average whole-cell Vhalf ICa (-35.44 mV) and average synaptic Vhalf Rhod-FF (-41.15 mV) (Özçete and Moser, 2021). Additionally, our Vhalf QEPSC distribution (ranging from -53.97 to -31.72 mV) also compares well with the Vhalf iGluSnFR distribution (ranging from -45.25 to -29.86 mV) obtained by imaging of synaptic glutamate release (Özçete and Moser, 2021).

      2) EPSCs amplitude distributions in Figure 2 seem very different from those reported before by Grant et al., 2010 and Niwa et al., 2021 (even Chapochnikov et al., 2014; although not sure if the animal ages match). The average amplitudes of EPSCs reported here, for either pillar or modiolar SGNs, seem way smaller than those reported previously. The authors should provide a convincing explanation for this critical deviation from the consensual results.

      Please refer to our response to the public review (point #3).

      3) Rise time analysis in Fig. 2 supp 1. The actual values seem too long, again, compared to reported values. Also, what would these differences between modiolar and pillar represent?

      Previous reports on mouse, rat, turtle and bullfrog focused mainly on the rise times (or time to peak) of monophasic EPSCs: about 0.39 ms (p8-p11 mouse; Chapochnikov et al., 2014, Takago et al., 2019), 0.33-0.58 ms (p7-p14 rat; Yi et al., 2010, Grant et al., 2010, Glowatzki and Fuchs, 2002), 0.17-0.29 ms (p15-p21 rat; Chapochnikov et al., 2014, Huang and Moser, 2018, Grant et al., 2010), 0.1-0.2 ms (turtle auditory papilla; Schnee et al., 2013) and 0.15-0.2 ms (bullfrog 31 °C and 22 °C; Li et al., 2009, Chen and von Gersdorff, 2019). Regarding multiphasic EPSCs, some studies have reported rise times (or times to peak) of about 1.5 ms (p8-p11 mouse; Takago et al., 2019), 1.1 ms (p8-p11 rat; Grant et al., 2010) and 0.6-0.8 ms (p15-p21 rats; Huang and Moser, 2018, Chapochnikov et al., 2014, Grant et al., 2010). When we factor in the waveform of the sEPSCs, our rise times are comparable to the literature:

      Author response table 1.

      Thus, IHC synapses with higher SR and predominantly located at the pillar side appear to have sEPSCs with faster rise times regardless of their waveform. This might be a consequence of the fusion kinetics of the synaptic vesicles which are tightly influenced by the Ca2+ influx (Huang and Moser, 2018). Additionally, differences in the composition and density of the postsynaptic AMPA receptors could play a role in the rise time of the EPSC (Rubio et al., 2017). 

      4) One of the most impressive observations of the in-vivo SGN physiology is the difference in sound threshold among specific fibers. This can vary over tens of dB of sound pressure levels.

      The representation of this phenomenon when using an ex-vivo preparation is not obvious. Overall, it has been reported that IHC Vm is a good proxy for stimulus intensity. Consequently, the authors reported an 'IHC Vm threshold' at the start of SGN synaptic activity for each recording. This can be found in Figure 3 Eii, where values vary between -65 to -30 mV. This is already an important finding. However, the representative traces on panel A only diverge by 5 mV. It would be very interesting to the reader to have represented in the figure recordings that can better illustrate this wide range of values.

      We agree with the reviewer regarding the impressive difference in the sound thresholds recorded in vivo. To better illustrate our findings, we have chosen a different representative trace for the high SR synapse.

      5) On the masker-probe experiments it would be interesting to look at the synaptic delay of the probe pulses. Are they different between high and low SR synapses?

      We have now included the results of the synaptic delay of the probe response (Figure 4–supplementary figure 1). Although the difference is not statistically significant, the eEPSC probe latency is on average shorter for high SR than for low SR synapses.

      Reviewer #3 (Recommendations For The Authors):

      (1) The terms monophasic and compact are used interchangeably. This is fine, but perhaps compact could be defined earlier, otherwise, readers may think that 'compact' means 'short' (as is sometimes euphemistically used to describe short people), which then makes phrasing such as the figure legend for figure 2 a bit confusing. This could be included at first use in a figure as well, in figure 1B where the two types of EPSCs are first shown.

      Done. We now explain the term and preferentially use "monophasic".

      (2) Check for mention of figure panels in the results text - for example, there is no mention in the results text of figure 2A, 2I,

      Done

      (3) The locations of some of the statistics are inconsistent. This is fine if the authors have a reason for including the stats where they did, but in some cases, the stats are duplicated (for example figure 2J, 2K, 2L, the stats are in both the figure legend and the results text, then check throughout).

      Done

      (4) The color coding in figure 4 is confusing in panel A - does orange still mean a high SR fiber here? The legend indicates that orange is for EPSCs, but does not specify charge. It could be helpful to show both a high and low SR response, both for EPSCs and for charge. 

      Thanks for pointing us to this aspect: we have carefully revised the figure and figure legend for clarity. We also included an exemplary response of a low SR synapse in the figure.

    1. Reviewer #1 (Public review):

      Summary:

      The circuit mechanism underlying the formation of grid cell activity and the organization of grid cells in the medial entorhinal cortex (MEC) is still unclear. To understand the mechanism, the current study investigated synaptic interactions between stellate cells (SC) and PV+ interneurons (IN) in layer 2 of the MEC by combining optogenetic activations and paired patch-clamp recordings. The results convincingly demonstrated highly structured interactions between these neurons: specific and direct excitatory-inhibitory interactions existed at the scale of grid cell phase clusters, and indirect interactions occurred at the scale of grid modules.

      Strengths:

      Overall, the manuscript is very well written, the approaches used are clever, and the data were thoroughly analyzed. The study conveyed important information towards understanding the circuit mechanism that shapes grid cell activity. It is important not only for the field of MEC and grid cells, but also for broader fields of continuous attractor network and neural circuit.

      Weaknesses:

      The study largely relies on the fact that ramp-like wide field optogenetic stimulation and focal optogenetic activation both drove asynchronous action potentials in SCs, and therefore, if a pair of PV+ INs exhibited correlated activity, they should receive common inputs. While the asynchronization of action potentials during ramp-like wide field optogenetics was shown in Figure 2 Figure Supplement 1, the asynchronization during focal optogenetic activation was not confirmed in the current experimental setting. More data and statistical analysis in this aspect would strengthen the foundation of this study.

    2. Reviewer #3 (Public review):

      Summary:

      This paper presents convincing data from technically demanding dual whole cell patch recordings of stellate cells in medial entorhinal cortex slice preparations during optogenetic stimulation of PV+ interneurons. The authors show that the patterns of postsynaptic activation are consistent with dual-recorded cells close to each other receiving shared inhibitory input and sending excitatory connections back to the same PV neurons, supporting a circuitry in which clusters of stellate cells and PV+IN interact with each other with much weaker interactions between clusters. These data are important to our understanding of the dynamics of functional cell responses in the entorhinal cortex. The experiments and analysis are quite complex and would benefit from some revisions to enhance clarity.

      Strengths:

      These are technically demanding experiments, but the authors show quite convincing differences in the correlated response of cell pairs that are close to each other in contrast to an absence of correlation in other cell pairs at a range of relative distances. This supports their main point of demonstrating anatomical clusters of cells receiving shared inhibitory input.

      Weaknesses:

      The overall technique is complex, but the authors have made every effort to present this in a clear manner. In addition, due to this being a slice preparation they cannot directly relate the inhibitory interactions to the functional properties of grid cells which was possible in a complementary approach using 2-photon in vivo imaging by Heys, Rangarajan and Dombeck, 2014.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      Overall, the manuscript is very well written, the approaches used are clever, and the data were thoroughly analyzed. The study conveyed important information for understanding the circuit mechanism that shapes grid cell activity. It is important not only for the field of MEC and grid cells, but also for broader fields of continuous attractor networks and neural circuits.

      We appreciate the positive comments.

      (1) The study largely relies on the fact that ramp-like wide-field optogenetic stimulation and focal optogenetic activation both drove asynchronous action potentials in SCs, and therefore, if a pair of PV+ INs exhibited correlated activity, they should receive common inputs. However, it is unclear what criteria/thresholds were used to determine the level of activity asynchronization, and under these criteria, what percentage of cells actually showed synchronized or less asynchronized activity. A notable percentage of synchronized or less asynchronized SCs could complicate the results, i.e., PV+ INs with correlated activity could receive inputs from different SCs (different inputs), which had synchronized activity. More detailed information/statistics about the asynchronization of SC activity is necessary for interpreting the results.

      The percentage of SCs that show synchronised activity during ramping optogenetic activation is zero. To make this clear we've added new quantification to the analyses of simultaneously activated SCs in Figure 2, Figure Supplement 1. This includes confidence intervals for the correlograms and statistical comparisons of the correlograms to shuffled data from each pair of neurons. We also validate our statistical analysis strategy by showing that it successfully identifies autocorrelation peaks for the same cells.

      Synchronisation during focal optogenetic activation is also expected to be zero. We did not commit resources to experiments to directly test this for focal stimulation because we had already tested the possibility with ramping stimuli discussed above, and because the established biophysics of local SC circuits is such that synchronised activity during selective activation of SCs is unlikely. In particular, because direct excitatory connections between SCs are either rare or absent (Fuchs et al. 2016; Couey et al. 2013; Pastoll et al. 2013; Winterer et al. 2017), and when detected have small amplitude (Winterer et al. 2017), no mechanism exists that could drive synchronisation. The absence of coordination in responses to ramping stimuli quantified above is consistent with this conclusion.

      (2) The hypothesis about the "direct excitatory-inhibitory" synaptic interactions is made based on the GABAzine experiments in Figure 4. In the Figure 8 diagram, the direct interaction is illustrated between PV+ INs and SCs. However, the evidence supporting this "direct interaction" between these two cell types is missing. Is it possible that pyramidal cells are also involved in this interaction? Some pieces of evidence or discussions are necessary to further support the "direct interaction".

      We were insufficiently clear in our previous attempts to ground these interpretations in the context of previous work. The hypothesis about "direct excitatory-inhibitory" interactions wasn't made solely on the basis of Figure 4, but from multiple previous studies that directly demonstrate these interactions (e.g. Fuchs et al. 2016; Couey et al. 2013; Pastoll et al. 2013). Similarly, the diagram in Figure 8 doesn't only reflect the conclusions of the present study but integrates work from these and other previous studies.

      A possible role for pyramidal cells in coordination would require that they can be driven to fire action potentials by input from SCs. However, SCs appear not to connect to pyramidal cells (0/126 tested connections in Winterer et al. 2017). Thus, this possibility is inconsistent with the previously published data.

      To make these points clearer we have added additional discussion and citations to the results (p 5), discussion (p 11) and legend to Figure 8.

      Reviewer #2 (Public Review):

      In this study, Huang et al. employed optogenetic stimulation alongside paired whole-cell recordings in genetically defined neuron populations of the medial entorhinal cortex to examine the spatial distribution of synaptic inputs and the functional-anatomical structure of the MEC. They specifically studied the spatial distribution of synaptic inputs from parvalbumin-expressing interneurons to pairs of excitatory stellate cells. Additionally, they explored the spatial distribution of synaptic inputs to pairs of PV INs. Their results indicate that both pairs of SCs and PV INs generally receive common input when their relative somata are within 200-300 µm of each other. The research is intriguing, with controlled and systematic methodologies. There are interesting takeaways based on the implications of this work for grid cell network organization in MEC.

      We appreciate the positive comments.

      (1) Results indicate that in brain slices, nearby cells typically share a higher degree of common input. However, some proximate cells lack this shared input. The authors interpret these findings as: "Many cells in close proximity don't seem to share common input, as illustrated in Figures 3, 5, and 7. This implies that these cells might belong to separate networks or exist in distinct regions of the connectivity space within the same network.".

      Every slice orientation could have potentially shared inputs from an orthogonal direction that are unavoidably eliminated. For instance, in a horizontal section, shared inputs to two SCs might be situated either dorsally or ventrally from the horizontal cut, and thus removed during slicing. Given the synaptic connection distributions observed within each intact orientation, and considering these distributions appear symmetrically in both horizontal and sagittal sections, the authors should be equipped to estimate the potential number of inputs absent due to sectioning in the orthogonal direction. How might this estimate influence the findings, especially those indicating that many close neurons don't have shared inputs?

      We appreciate the suggestion. However, systematically generating estimates that account in full for the relative position of the postsynaptic neurons, for variation in the organisation of their dendritic fields and for unknowns such as the location and number of synaptic contacts made quickly leads to a large potential parameter space, while not advancing our understanding beyond qualitative assessment of the raw data.

      Given this, we make the following comments:

      'We note that the absence of correlated inputs in one slice plane does not rule out the possibility that the same cell pair receives common inputs in a different plane, as these inputs would most likely not be activated if the cell bodies of the presynaptic neuron were removed by slicing.' (p10) and:

      'The incompleteness may in part result from loss of some inputs by tissue slicing. However, the fact that axons were well preserved and typically extended beyond the range of functional correlations, while many cell pairs that did not receive correlated input were relatively close to one another and had overlapping dendritic fields, argues against tissue slicing being a major contributor to incompleteness.' (p10).

      (2) The study examines correlations during various light-intensity phases of the ramp stimuli. One wonders if the spatial distribution of shared (or correlated) versus independent inputs differs when juxtaposing the initial light stimulation phase, which begins to trigger spiking, against subsequent phases. This differentiation might be particularly pertinent to the PV to SC measurements. Here, the initial phase of stimulation, as depicted in Figure 7, reveals a relatively sparse temporal frequency of IPSCs. This might not represent the physiological conditions under which high-firing INs function.

      While the authors seem to have addressed parts of this concern in their focal stim experiments by examining correlations during both high and low light intensities, they could potentially extract this metric from data acquired in their ramp conditions. This would be especially valuable for PV to SC measurements, given the absence of corresponding focal stimulation experiments.

      As the reviewer's comments recognise, the consistent results with focal stimulation already provide direct experimental validation to our ramp stimulation approach. We appreciate the suggestion for further analysis, but as we understand it this analysis would be hard to interpret. First, variation between pairs in the activity at different phases of the light ramp will be confounded by slice to slice differences in the level of ChR2 expression, e.g. in Figure 2, Figure Supplement 1 within slice variability is low, whereas between slice variation is relatively high. This is because in slices with relatively low expression spike onset is relatively late, while in slices with relatively high expression spike onset is early in the ramp and later in the ramp neurons experience depolarising block. Second, the onset of changes in cross-correlation coefficients and lag variation is typically abrupt. This makes it challenging to assign windows to onset phases or to interpret the resulting data.

      (3) Re results from Figure 2: Please fully describe the model in the methods section. Generally, I like using a modeling approach to explore the impact of convergent synaptic input to PVs from SCs that could effectively validate the experimental approach and enhance the interpretability of the experimental stim/recording outcomes. However, as currently detailed in the manuscript, the model description is inadequate for assessing the robustness of the simulation outcomes. If the IN model is simply integrate-and-fire with minimal biophysical attributes, then the findings shown in Fig 2F might be trivial. Conversely, if the model offers a more biophysically accurate representation (e.g., with conductance-based synaptic inputs, synapses appropriately dispersed across the model IN dendritic tree, and standard PV IN voltage-gated membrane conductances), then the model's results could serve as a meaningful method to both validate and interpret the experiments.

      We have expanded the description of the modelling given in the methods including clearer motivation and justification (p 15). Two points are helpful to consider:

      First, the goal of the model is to assess the feasibility of the correlation based approach given the synaptic current responses recorded at the soma. We now make this clearer by stating that:

      'The goal of our simulations was to assess if analysis of cross-correlations between currents recorded from pairs of neurons could be used to establish whether they receive shared input from the same pre-synaptic neuron. While this should be obvious if neurons exclusively receive shared input, we wanted to establish whether shared input is detectable when each neuron also receives independent inputs of similar frequency and amplitude to the shared input.' (p 15).

      The suggestion that the results in Figure 2F are trivial doesn't make sense to us. Indeed, it strikes us as non-trivial that with this approach shared input from a single common presynaptic neuron is not detectable, but input from two or more is.
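      The idea of detecting shared input from correlated somatic currents can be illustrated with a toy simulation. The following is a minimal sketch, not the model used in the manuscript. It assumes Bernoulli event trains per time bin, an exponential synaptic kernel, equal amplitudes for all inputs, and the zero-lag Pearson coefficient as a single point of the cross-correlogram. All function names and parameter values are hypothetical.

```python
import math
import random


def event_train(rng, n_bins, p_event):
    """Binary event train: each bin contains an event with probability p_event."""
    return [1.0 if rng.random() < p_event else 0.0 for _ in range(n_bins)]


def synaptic_current(events, tau_bins=5.0, amplitude=1.0):
    """Convolve events with an exponentially decaying kernel (recursive form)."""
    decay = math.exp(-1.0 / tau_bins)
    current, trace = 0.0, []
    for e in events:
        current = current * decay + amplitude * e
        trace.append(current)
    return trace


def pearson(a, b):
    """Zero-lag Pearson correlation coefficient of two equal-length traces."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def simulate_pair(n_shared, n_independent, n_bins=20000, p_event=0.01, seed=0):
    """Correlation between the summed synaptic currents of two model cells.

    Both cells receive the same n_shared presynaptic trains; each cell also
    receives its own n_independent trains with identical statistics.
    """
    if n_independent < 1:
        raise ValueError("each cell needs at least one independent input")
    rng = random.Random(seed)
    shared = [event_train(rng, n_bins, p_event) for _ in range(n_shared)]

    def cell_current():
        own = [event_train(rng, n_bins, p_event) for _ in range(n_independent)]
        summed = [sum(column) for column in zip(*(shared + own))]
        return synaptic_current(summed)

    return pearson(cell_current(), cell_current())


r_shared = simulate_pair(n_shared=5, n_independent=5)
r_independent = simulate_pair(n_shared=0, n_independent=5)
```

      In this toy setting the correlation grows with the fraction of input that is shared. How many shared presynaptic neurons are needed for reliable detection depends on event rates, amplitudes and recording noise, which are not modelled here.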

      Second, because we are simulating a somatic voltage-clamp experiment, the details of the neuronal time constants, voltage-gated channels or other integrative mechanisms that the reviewer suggests may be important here are not actually relevant to the interpretation. To appreciate this, consider the membrane equation:

      When the membrane is clamped at a fixed potential, there is no capacitance current, while voltage-dependent ionic currents and the resting ionic current are constant. In this case the only time-varying current is the synaptic current. Thus, adding more details would not make the model more 'meaningful' as these details would be redundant and the results will be the same as simply considering convolution of the synaptic conductances. We have made this rationale clearer in the revised methods (p 15).
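      For reference, a standard single-compartment form of the membrane equation can be written as below. The symbols and sign convention (outward membrane currents positive, clamp current injected into the cell) are assumptions made for illustration and are not taken from the manuscript.

```latex
% Single-compartment membrane equation (assumed notation)
C_m \frac{dV}{dt} = I_{\mathrm{clamp}}(t) - I_{\mathrm{ion}}(V,t) - I_{\mathrm{rest}}(V) - I_{\mathrm{syn}}(t)

% Ideal voltage clamp: V = V_{\mathrm{hold}}, so dV/dt = 0 and
I_{\mathrm{clamp}}(t) = \underbrace{I_{\mathrm{ion}}(V_{\mathrm{hold}}) + I_{\mathrm{rest}}(V_{\mathrm{hold}})}_{\text{constant}} + I_{\mathrm{syn}}(t),
\qquad
I_{\mathrm{syn}}(t) = g_{\mathrm{syn}}(t)\,\bigl(V_{\mathrm{hold}} - E_{\mathrm{syn}}\bigr)
```

      Under these assumptions the recorded clamp current differs from the synaptic current only by a constant offset, and the synaptic current is a scaled copy of the synaptic conductance time course.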

      Reviewer #3 (Public Review):

      These are technically demanding experiments, but the authors show quite convincing differences in the correlated response of cell pairs that are close to each other in contrast to an absence of correlation in other cell pairs at a range of relative distances. This supports their main point of demonstrating anatomical clusters of cells receiving shared inhibitory input.

      We appreciate the positive comments.

      The overall technique is complex and the presentation could be more clear about the techniques and analysis.

      Thanks. We've added additional explanation to the methods section to try to improve clarity (p 15-16).

      In addition, due to this being a slice preparation they cannot directly relate the inhibitory interactions to the functional properties of grid cells which was possible in the 2-photon in vivo imaging experiment by Heys and Dombeck, 2014.

      We agree the two approaches are complementary. The Heys and Dombeck study could only reveal correlations in functional activity, which could have many possible synaptic mechanisms, whereas our results address synaptic organisation but the representational roles of the specific neurons we recorded from are unclear. We have highlighted these current limitations and strategies to address them in the final paragraph of the discussion (p 11).

    1. eLife Assessment

      This manuscript tackles a significant problem in addiction science: how interdependent are measures of "addiction-like" behavioral phenotypes? The manuscript provides compelling evidence that, under these experimental conditions, escalation of intake, punishment-resistant responding, and progressive ratio break points reflect a single underlying construct rather than reflect distinct unrelated measures. The exceptionally large sample size and incorporation of multiple behavioral endpoints add strength to this paper, and make it an important resource for the field.

    2. Reviewer #1 (Public review):

      Summary:

      Guglielmo et al. characterized addiction-like behaviors in more than 500 outbred heterogeneous stock (HS) rats using extended access to cocaine self-administration (6 h/daily) and analyzed individual differences in escalation of intake, progressive-ratio (PR) responding, continued use despite adverse consequences (contingent foot shocks), and irritability-like behavior during withdrawal. By principal component analysis, they found that escalation of intake, progressive ratio responding, and continued use despite adverse consequences loaded onto the same factor, whereas irritability-like behaviors loaded onto a separate factor. Characterization of rats in four categories of resilient, mild, moderate, and severe addiction-like phenotypes showed that females had higher addiction-like behaviors, particularly due to a lower number of resilient individuals, than males. The authors suggest that escalation of intake, continued use despite adverse consequences, and progressive ratio responding are highly correlated measures of the same psychological construct and that a significant proportion of males, but not females, may be resilient to addiction-like behaviors. The amount of work in this study is impressive, and the results are interesting.

      Strengths: Large dataset. Males and females included.

    3. Reviewer #2 (Public review):

      Summary:

      In this paper by de Guglielmo and colleagues, the authors were interested in analyzing addiction-like behaviors using a very large number of heterogeneous outbred rats in order to determine the relationships among these behaviors. The paper used both males and females on the order of hundreds of rats, allowing for detailed and complex statistical analyses of the behaviors. The rats underwent cocaine self-administration, first via 2-hour access and then via 6-hour access. The rats also underwent a test of punishment resistance in which footshocks were administered a portion of the times a lever was pressed. The authors also conducted a progressive ratio test to determine the break point for "giving up" pressing the lever and a bottle-brush test to determine the rats' "irritability". Ultimately, principal component analysis revealed that escalation of intake during 6-hour access, punishment resistance, and breakpoint all loaded onto the same principal component. Moreover, the authors also identified a subgroup of "resilient" rats that qualitatively differed from the "vulnerable" rats and also identified sex differences in their work.

      Strengths:

      The use of heterogeneous rats and the use of so many rats are major strengths for this paper. Moreover, the statistical analyses are particular strengths as they enabled the identification of the three measures as likely reflecting a single underlying construct. The behavioral methods themselves are also strong, as the authors used behavioral measures commonly used in the field that will enable comparison with the field at large. In general, the results support the conclusions and provide a wealth of data to the field. The addition of effect sizes is also a strength, as this provides critical information to other researchers.

      Additionally, the changes made to the manuscript are another strength, as the authors clearly took the reviewers' points seriously and made strong efforts to incorporate the reviewers' ideas.

      The manuscript also uses both males and females and provides a good analysis of how findings differed by sex as well as how large the effect sizes were for those differences.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      (1) Adding page numbers would have helped the reviewers. 

      We apologize for the oversight and have added page numbers for the revision.

      (2) Page 2, second paragraph: please do not generalize. Also, this sentence is confusing: "the addiction neuroscience field has moved from recognizing that "compulsive drug seeking/use" and "continued seeking/use despite negative consequences" are two distinct aspects of addiction to defining the former nearly exclusively by the latter in animal models." 

      We acknowledge that the sentence in question may have been unclear. We have revised the introduction to avoid generalizations and improve clarity to read:

      “Recently, the preclinical addiction field has moved from recognizing compulsive drug seeking/use and continued seeking/use despite negative consequences as two distinct aspects of addiction, to examining compulsive-like behavior nearly exclusively by models of continued seeking/use despite negative consequences.”

      In the revised introduction, we have focused on the specific aims and findings of our study, emphasizing the use of a large, genetically diverse sample and an extended drug access paradigm to better model addiction-like behaviors. We have also clarified the relationship between the different measures of addiction-like behavior and the potential role of sex differences in resilience to these behaviors.

      (3) Again here, please do not generalize: "While these three behaviors capture different aspects of addiction-like behaviors, a pervasive view in the field is that the only way to identify an individual with an addiction phenotype is to measure continued drug use despite adverse consequences." This is not unanimous in the addiction field. Same on page 21.

      We have revised the sentence to avoid generalizing and to acknowledge that this perspective is held by some researchers, rather than presenting it as a pervasive view. We have also included relevant citations to support this point. This sentence now reads:

      “These measures are thought to capture different aspects of addiction-like behaviors. Some researchers argue that continued drug use despite adverse consequences is the most critical measure for identifying an addiction phenotype, as it reflects the compulsive nature of drug use (Deroche-Gamonet et al., 2004; Vanderschuren and Everitt, 2004)”

      (4) This sentence needs citations: "A key argument in favor of this hypothesis is that responding despite adverse consequences is sometimes uncorrelated to drug taking/seeking." 

      We have added references (Chen et al., 2013; Domi et al., 2021; Giuliano et al., 2019; Li et al., 2021; Siciliano et al., 2019; Timme et al., 2022; Belin et al., 2008; Pelloux et al., 2007) that provide evidence for this assertion. These studies demonstrate that individual differences in responding despite adverse consequences can be dissociated from drug intake and seeking behaviors, suggesting that they may measure distinct aspects of addiction-like behaviors.

      (5) Page 4: what is "an advanced model?" (also on page 22). Change "characterization" to "characterized." Delete "as much as possible" in "as much genetic diversity as possible." 

      These have been addressed.

      (6) Page 7, statistical analysis: PCA needs to be explained further. Was the PCA varimax rotated, normalized, eigenvalues, etc. Was this used to find "latent variables?" (PCA versus factor analysis) 

      It was a principal component analysis (PCA), deriving components that are a linear combination of the original variables, with the following coefficients for the first two components, which were added in the results for PC1:

      Author response table 1.

      The PCA was performed in R with prcomp in the stats package, using centering and scaling, which was added in the methods section. No orthogonal loadings rotation (varimax) was used. The eigenvalues of the PCs are 1.9, 1.0, 0.7, 0.4 and explain variance as shown in the scree plot:

      Author response image 1.
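
      The eigenvalues quoted above can be converted into the proportion of variance that each component explains, which is what a scree plot displays. The following is a minimal sketch of that arithmetic only, not the authors' R code; the function names are illustrative assumptions, and the only inputs taken from the response are the four reported eigenvalues. Because the variables were centered and scaled, the eigenvalues of four variables are expected to sum to about 4.

```python
# Eigenvalues as reported in the author response (PC1 to PC4).
reported_eigenvalues = [1.9, 1.0, 0.7, 0.4]


def explained_variance_ratio(eigenvalues):
    """Return each component's share of the total variance."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


def cumulative(ratios):
    """Return the running total of the per-component shares."""
    running, out = 0.0, []
    for ratio in ratios:
        running += ratio
        out.append(running)
    return out


ratios = explained_variance_ratio(reported_eigenvalues)
for i, (share, total) in enumerate(zip(ratios, cumulative(ratios)), start=1):
    print(f"PC{i}: {share:.1%} of variance, {total:.1%} cumulative")
```

      Under these reported values, PC1 accounts for a little under half of the total variance and the first two components together for roughly 72%.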

      (7) Page 9: correct "an indexes." 

      This was corrected to ‘indexes’.

      (8) Figure 1 legend: correct "test at the." 

      Corrected to ‘tested’

      (9) Page 17: rewrite "except for the low addicted one." 

      Done

      (10) Page 19: delete "state-of-the-art." Intravenous self-administration is not new. 

      Done

      (11) Page 20: replace "abuse" with cocaine use disorder. 

      Done

      (12) Page 20: The distinction of qualitative and quantitative differences between males and females is inaccurate given that resilient and vulnerable groups were arbitrarily defined by quantitative differences. 

      This distinction between quantitative and qualitative was removed.

      (13) The discussion about DSM-V criteria is "over the top" and unnecessary. One cannot determine whether rodents took more drugs than intended, made efforts to quit, etc. 

      This discussion was toned down and shortened, as this is not the focus of the manuscript.

      (14) Page 21: The discussion about small n and the test of nondependent rats should also be toned down and it is incomplete. There are several behavioral and pharmacological studies that indicate that different measures may capture, at least to some degree, different aspects of behavior in alcohol and opioid-dependent rodents (e.g., PMID: 28461696; PMID: 25878287; PMID: 36683829). 

      The discussion has been toned down and expanded as suggested by the reviewer.

      Now it reads: 

      “A possible explanation as to why previous studies failed to observe this correlation between escalation, motivation, and aversion-resistance is that most of the previous studies used small sample sizes that may not provide sufficient statistical power to observe this relationship between variables. Another explanation is that previous studies often used animal models with limited access to the drug, where animals exhibit low levels of acute intoxication and very little, if any, signs of drug dependence (George et al., 2022). However, it is important to note that several behavioral and pharmacological studies have indicated that different measures may capture, at least to some degree, different aspects of addiction-like behavior in alcohol and opioid-dependent rodents (Aoun et al., 2018; Barbier et al., 2015; Marchette et al., 2023). While the present results suggest that escalation of drug intake highly predicts drug responding despite adverse consequences in an animal model with long access to cocaine and evidence of drug dependence, further research is needed to determine the extent to which these findings generalize to other drugs of abuse and different stages of the addiction cycle”.

      (15) Several factors should be considered for explaining their PCA findings. The progression of the progressive ratio (too steep, not steep enough), the shock intensity (too low, too high), the contingency of the shock (high or not high enough), the cocaine unit dose, the use of multiple punishment sessions (learning; the first session is likely to reflect the previous session, same for PR) etc, all could affect the outcomes. Not finding differences in one dataset (even large ones) obtained from a particular experimental design does not necessarily mean that these differences do not exist. 

      Thank you for raising this important point about the potential impact of experimental factors on our PCA findings. We now acknowledge in the discussion (pages 22-23) that several factors, such as the progression of the progressive ratio schedule, shock intensity, contingency of the shock, cocaine unit dose, and the use of multiple punishment sessions, could influence the outcomes of our analysis. Now it reads: 

      “It is important to acknowledge that several experimental factors could influence the outcomes of the PCA analysis. These factors include the schedule of reinforcement, the progression of the progressive ratio schedule, the shock intensity, the contingency of the shock, the cocaine unit dose, and the use of multiple punishment sessions (Belin et al., 2008; Deroche-Gamonet et al., 2004; Pelloux et al., 2007). In particular, learning effects may play a role when animals undergo multiple punishment or progressive ratio sessions. An animal's response to punishment or its performance in progressive ratio sessions may change over time as it learns from its previous experiences (Marchant et al., 2013; Vanderschuren et al., 2017). While the present study utilized a large dataset obtained from a particular experimental design, it is essential to acknowledge that not finding differences in one dataset does not necessarily mean that these differences do not exist. Future studies should investigate the impact of these experimental factors, including learning effects, on the relationship between escalation, motivation, and aversion-resistance to further elucidate the underlying constructs of addiction-like behaviors.”

      (16) Related to the above, another reason for all "consummatory variables" to load onto the same factor can be due to the selection of the variables. For example, the inclusion of all ShA and LgA access sessions makes the PCA much less powerful. In fact, these many similar variables would make the PCA less powerful in a large dataset than a much smaller dataset that includes fewer variables in the PCA. The authors should attempt to avoid redundant variables in the PCA (all ShA and all LgA sessions). Perhaps use the average of the last three sessions of each ShA and LgA (or the slope of the escalation curve for LgA), or not even include ShA. They should also attempt PCAs without the irritability test. It is very common to find clusters of variables pertaining to the same tests (i.e., all consummatory variables clustered together, and all irritability measures clustered together in an independent factor).

      For the PCA in figure 4E, only 4 variables were included: the Z-scores for A) escalation (calculated as the average intake of the last three long-access sessions, similar to the average or slope of the escalation curve as suggested by the reviewer), B) motivation (intake under progressive ratio), C) compulsivity (continued responding despite adverse consequences), and D) irritability. This approach aimed to minimize redundancy in the variables and focus on key measures of addiction-like behaviors.

      To further address the reviewer's concern, we performed an additional PCA on the same 377 animals, excluding the irritability index. This PCA included only the escalation, motivation, and compulsivity indices. The results of this analysis (Figure S3A) were consistent with our original findings, with the three variables loading similarly (>+1 standard deviation) onto factor 1, which explained 63.5% of the variance in addiction-like behaviors. This analysis was added as supplementary figure S3A.

      (17) Also related to the above, males and females may behave differently, sometimes in opposite directions, thus "cancel each other out." The authors should take advantage of their huge sample size and do PCAs separately for males and females to learn more about potential sex differences in behavioral constructs. 

      First, we looked at male vs. female differences in the biplot represented in Fig. 4E, which included the irritability index. This analysis showed no sex differences and was added as supplemental figure S3B. 

      Next, we ran the PCA on males (left panel) and females (right panel) separately, which revealed a difference in the relationship between the Compulsivity Index and the other variables. In males, the Compulsivity Index separated from the escalation and motivation indices in the opposite direction relative to PC2 compared to females. Additionally, in males, compulsivity became more positively correlated with irritability, while in females, the relationship was opposite. These interpretations were added to the discussion (page 21), and the results were included in Supplemental Figure S3C-D.

      (18) Figure 3 legend. There is no correlation in the figure. 

      This was intended to summarize that vulnerable animals, defined by a high intake in the last 3 LgA sessions, are also more vulnerable on the other measures, but it was removed to avoid further confusion.

      (19) Page 22: the authors contradict themselves: "The evaluation of different addiction-like behaviors is important as multiple elements of addiction vulnerability were found to be independently heritable (Eid et al., 2019), and likely controlled by distinct genes that remain to be identified." 

      We agree with the reviewer, and we edited the discussion to clarify the relationship between the current findings and the potential for distinct genetic influences on different aspects of addiction vulnerability. The text now reads:

      “The evaluation of different addiction-like behaviors is important, as previous research has suggested that multiple elements of addiction vulnerability may be independently heritable (Eid et al., 2019). While our current findings indicate that escalation, motivation, and compulsivity are highly correlated and load onto a single construct in our model, it is possible that distinct genes contribute to different aspects of addiction vulnerability. The high correlation between these behaviors in our study may reflect common underlying genetic influences, but it does not preclude the existence of additional, unique genetic factors that shape specific aspects of addiction-like behavior. Further research is needed to identify the specific genes that contribute to the overall construct of addiction vulnerability, as well as those that may influence distinct behavioral elements. The behavioral characterization of HS rats in this study provides a foundation for future genome-wide association studies (GWAS) aimed at identifying specific alleles and genes that contribute to vulnerability and resilience to cocaine addiction-like behavior (Chitre et al., 2020).”

      Reviewer #2:

      (1) I strongly suggest the authors include effect sizes. They are likely correct that many studies using rats during self-administration are underpowered, but because it is unlikely that most studies will use over 500 rats, the effect size information would be beneficial for future researchers. That is, if an effect requires 100 rats per group, this would be critical to know. 

      Standardized effect sizes (Cohen's d and 95% confidence intervals) were included for the sex differences, intake group differences, and addiction groups. Moreover, a statement about the number of animals needed to detect significant effects was added in the discussion.
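
      For readers who want to reproduce this kind of effect size on their own data, the sketch below computes Cohen's d with a pooled standard deviation and an approximate 95% confidence interval. It is an illustration under stated assumptions: the response does not say which confidence-interval method was used, so the common large-sample standard error is swapped in here, and the function names and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def cohens_d_ci(d, n_a, n_b, z=1.96):
    """Approximate 95% CI from the large-sample standard error of d.

    Assumption: this normal approximation is one common choice; the
    authors' actual method is not stated in the response.
    """
    se = sqrt((n_a + n_b) / (n_a * n_b) + d ** 2 / (2 * (n_a + n_b)))
    return d - z * se, d + z * se


# Hypothetical intake values for two small groups (not data from the study).
group_a = [2, 4, 6]
group_b = [1, 3, 5]
d = cohens_d(group_a, group_b)
low, high = cohens_d_ci(d, len(group_a), len(group_b))
print(f"d = {d:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

      With groups this small the interval is very wide, which illustrates the reviewer's point that sample size determines which effects a study can detect.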

      (2) I suggest that the authors tone down the portions of the Discussion that appear to be defenses of the extended access model. The data in this paper do not address short vs. long-access in a way that supports that. Moreover, they should acknowledge some of the ways that I noted above in which the short access period seems to be just as predictive as the long-access. It raises the question of whether keeping another group of rats on short access through all 25 days would have led to some of the same outcomes that were observed. 

      This discussion was toned down and shortened, as this is also not the focus of the manuscript (see also response to reviewer 1’s 14th comment).

      We appreciate the reviewer's comment on the potential predictive value of the short access period for addiction-like behaviors. We agree that maintaining a group of rats on short access throughout the 25 days could have provided valuable insights into the development of these behaviors, particularly in light of the individual differences observed in our genetically diverse HS rat population. As we mention also in our response to Reviewer 3 (comment 5), the observed escalation of drug intake during the short access condition in our study may be attributed to the genetic diversity of the HS rat population. To address this important point, we have added a new paragraph in the Discussion section that elaborates on this observation:

      "It is important to note that while our study focused on the differences between resilient and vulnerable rats under long access conditions, the short access period may also be predictive of addiction-like behaviors, particularly in genetically diverse populations. The observed escalation of drug intake during short access in our study is not due to an acquisition issue, as rats start differentiating the active from the inactive levers on the first day of ShA1 (Fig. S1B), with a 3 to 1 ratio between active/inactive pressing by ShA7. Rather, this early escalation may be attributed to the individual differences in drug-taking behavior among the HS rats, highlighting the importance of using genetically diverse animals to capture the full spectrum of individual differences in addiction-like behaviors."

      (3) I suggest the authors explain how the dosing was maintained across the self-administration period. I also suggest that the authors provide figures that show mg/kg of cocaine consumed for each day, rather than just infusions per day. This would be especially helpful for the sex difference claims. 

      To ensure consistent dosing, animals were weighed weekly, and the drug solution concentration was adjusted to body weight rounded to the nearest ten grams. This sentence was added to the methods section. Each infusion is 0.5 mg/kg, so the amount the animals consumed = number of infusions x 0.5 mg/kg. Moreover, a second axis with the dose in mg/kg of cocaine consumed was added to the escalation curves in figures 1B, 2A, 3A, 5A, and S1A.
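
      The conversion described above is simple arithmetic and can be sketched as follows. The 0.5 mg/kg unit dose and the rounding of body weight to the nearest ten grams come from the response; the function names, the half-up tie rule for rounding, and the example numbers are assumptions made for illustration.

```python
DOSE_PER_INFUSION_MG_PER_KG = 0.5  # unit dose stated in the response


def session_dose_mg_per_kg(infusions):
    """Cocaine consumed in a session, in mg/kg."""
    return infusions * DOSE_PER_INFUSION_MG_PER_KG


def round_weight_to_nearest_10g(weight_g):
    """Round body weight to the nearest 10 g (ties round up; an assumption)."""
    return int((weight_g + 5) // 10) * 10


def session_dose_mg(infusions, weight_g):
    """Absolute amount consumed in mg, using the rounded weekly weight."""
    weight_kg = round_weight_to_nearest_10g(weight_g) / 1000
    return session_dose_mg_per_kg(infusions) * weight_kg


# Hypothetical animal: 20 infusions in a session at a measured 347 g.
print(session_dose_mg_per_kg(20))       # dose in mg/kg
print(round_weight_to_nearest_10g(347)) # weight used for dosing, in g
print(session_dose_mg(20, 347))         # absolute dose in mg
```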

      (4) Throughout the paper, and especially the 2nd paragraph of the Introduction, the authors make a number of assertions for which they should provide references. 

      We have carefully reviewed the manuscript and have now included relevant references to ensure that all statements are properly supported by the existing literature.

      (5) Likewise, with the Discussion about sex and gender differences, I suggest a more nuanced and better-cited discussion. Many rodent studies with self-administration have not identified sex differences, though this often gets under-noticed as the titles and abstracts do not mention the lack of effects. The support for gender differences in humans in terms of vulnerability to cocaine use disorder, beyond that men have higher rates, is thin and this section should be modified.

      The section was modified with additional references and linked to the newly introduced effect sizes for sex differences.

      (6) I also suggest the authors change some of the language such as referring to their behavioral measures as "state of the art". Extended access has been around for over two decades.

      This has been adjusted; see also the response to reviewer 1’s comments on pages 4, 19, and 22.

      Reviewer #3:

      Strengths: 

      (1) The number of animals run through this study is particularly impressive and allows for analyses that cannot be done with smaller cohorts. 

      (2) The inclusion of males and females in a study of this size allows for a better understanding of potential sex differences across a range of behavioral domains. 

      (3) Relating these measures to each other is incredibly important. If they are all measuring the same thing this would have important implications for the field. 

      Weaknesses: 

      (1) The authors claim that escalation of intake, increased motivation under progressive ratio, and responding despite negative consequences can all be explained by the same psychological construct, which they conclude is predictive of an addiction-like phenotype. However, previous research has demonstrated that the aforementioned behavioral measures highly correlate with the rate at which animals lever press to receive a reinforcer. For example, animals that have higher baseline rates of behavior will also be less sensitive to punishment and will press more on a PR. In fact, early behavioral pharmacology work from Peter Dews showed that the same is true for drug effects on behavior, where the same drug has less of a behavioral effect when behavior was maintained on a schedule that resulted in higher response rates. This is not ruled out and actually could explain the results in a parsimonious way. This is not highlighted or mentioned in the manuscript.

      Thank you for raising this important point about the potential influence of baseline response rates on the observed correlations between addiction-like behaviors. We agree that individual differences in baseline response rates may contribute to the relationships we observed, and we have added a paragraph to the discussion acknowledging this possibility (see page 22). We now discuss how previous research has shown that animals with higher baseline rates of responding tend to be less sensitive to punishment and exhibit higher levels of responding under progressive ratio schedules, as demonstrated in early behavioral pharmacology work by Dews and others (Dews, 1955; Sanger and Blackman, 1976). While our findings suggest that escalation of intake, motivation, and responding despite negative consequences can be explained by a single psychological construct related to addiction vulnerability, we cannot rule out the influence of baseline response rates. We have highlighted the need for future studies to investigate the relationship between baseline response rates and addiction-like behaviors to further clarify the underlying mechanisms.

      (2) The authors draw major conclusions from data collected using only one dose of cocaine. Can the authors comment on how the dose of cocaine was selected? Although the majority of the animals maintained responding to the drug, one finding of the manuscript claims that roughly 20% of animals were resilient to developing an addiction-like phenotype. The differences observed could simply be a result of selecting too high or too low of a dose per infusion. 

      We selected a dose of 0.5 mg/kg/infusion of cocaine for our study based on our own and others' previous work demonstrating that this dose is commonly used in rat self-administration studies and is effective in producing addiction-like behaviors (de Guglielmo et al. 2017, Kallupi et al. 2022, Kononoff et al. 2018, Sedighim et al. 2021, Ahmed and Koob, 1998; Deroche-Gamonet et al., 2004; Belin et al., 2009). This dose has been shown to maintain stable responding and induce escalation of intake, motivation, and compulsive-like responding in a significant proportion of animals (Ahmed and Koob, 1998; Deroche-Gamonet et al., 2004; Belin et al., 2009).

      (3) In line with the previous comment, rats self-administered cocaine under one schedule of reinforcement and were exposed to only one, mild, foot shock intensity. Although a large number of animals were used, it is difficult to translate these results to understand patterns of drug intake in humans. 

      We appreciate the reviewer's comment on the limitations of using a single schedule of reinforcement and a single foot shock intensity in our study. We acknowledge that these factors may limit the direct translatability of our findings to patterns of drug intake in humans. As mentioned in our response to Reviewer 1 (comment 15), we have now added a paragraph to the discussion (pages 22-23) addressing the potential impact of various experimental factors on our PCA findings. These factors include the schedule of reinforcement, the progression of the progressive ratio schedule, shock intensity, contingency of the shock, cocaine unit dose, and the use of multiple punishment sessions. We acknowledge that the specific parameters used in our study may have influenced the observed individual differences in addiction-like behaviors and that different results might be obtained under different experimental conditions. To further address the current reviewer's concern, we would like to emphasize that our study aimed to investigate individual differences in addiction-like behaviors within a specific experimental context, rather than directly modeling the complex patterns of drug intake in humans. While our findings provide valuable insights into the relationship between different addiction-like behaviors in rats, we agree that additional studies using a range of experimental conditions are needed to fully understand the extent to which these findings translate to human drug use patterns. Future studies could investigate the impact of different schedules of reinforcement, shock intensities, and other experimental parameters on the development and expression of addiction-like behaviors in the HS rat population. Such studies would help to determine the generalizability of our findings and provide a more comprehensive understanding of the factors influencing individual differences in addiction vulnerability.  

      (4) It is unclear how a principal component analysis, which includes irritability-like behavior, was conducted when the total number of animals used for behavior is nearly half the number of animals used for drug-intake behaviors. The authors should expand on the PCA methodology and explain how that is not a problem for the PCA method that is used.

      The PCA (Figure 4E) can only be performed using animals that had data for all measures, including irritability. Since not all animals were tested for irritability-like behaviors, the PCA was performed on the 377 animals that had behavioral measures for all variables. Once irritability was excluded as a measure, the larger animal set could be used (including the animals missing irritability measurements). This was clarified in the text and figure legend, where animal numbers were added.
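
      The selection rule described here is a complete-case filter: an animal enters a given PCA only if it has a value for every variable in that analysis. A minimal sketch, assuming each animal is stored as a dictionary with `None` for a missing measure; the field names and the three example records are hypothetical and are not data from the study.

```python
def complete_cases(records, variables):
    """Keep only the records that have a value for every listed variable."""
    return [
        record
        for record in records
        if all(record.get(name) is not None for name in variables)
    ]


# Hypothetical z-scored records; the second animal was not tested for irritability.
animals = [
    {"escalation": 0.8, "motivation": 0.5, "compulsivity": 1.1, "irritability": 0.2},
    {"escalation": -0.3, "motivation": 0.1, "compulsivity": -0.6, "irritability": None},
    {"escalation": 1.4, "motivation": 0.9, "compulsivity": 0.7, "irritability": -0.4},
]

with_irritability = complete_cases(
    animals, ["escalation", "motivation", "compulsivity", "irritability"]
)
without_irritability = complete_cases(
    animals, ["escalation", "motivation", "compulsivity"]
)
print(len(with_irritability), len(without_irritability))
```

      Dropping irritability from the variable list enlarges the usable sample, which matches the authors' statement that the larger animal set could be used once that measure was excluded.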

      (5) It is surprising that the authors observed an escalation of drug intake during the short access condition (Fig. 1B, 2A, 3A, 5A). Previous literature has demonstrated that animals with short access to cocaine maintain stable and low intake, even when tested daily for weeks. Can the authors comment on this discrepancy? Are these animals still acquiring the task during this period? 

      We were indeed surprised by the fact that some individuals started escalating their intake early on during short access, as most of the literature shows that short access leads to stable intake. However, we have some hypotheses that may explain this phenomenon. It is unlikely that this early escalation is due to an acquisition issue, as rats start differentiating the active from the inactive levers on the first day of ShA1 (new data included as Fig. S1B) and there is a 3 to 1 ratio between active/inactive pressing by ShA7. Three factors are more likely to play a key role in this early escalation. First, it is likely that the early escalation observed in some animals is due to the genetic diversity of the HS rat population used in our study. Most of the literature used Wistar, Sprague Dawley, and Long Evans rats, while the HS rat stock includes 8 different strains as founder parents. Indeed, profound strain differences have been observed in the vulnerability to self-administer cocaine, the maintenance of cocaine self-administration during short access, and the level of escalation of intake (Freeman et al., 2009; Kosten et al., 2007; Perry et al., 2006; Picetti et al., 2010; Valenza et al., 2016). Second, we used a 2 h short access while most studies used 1 h of short access. The level of escalation is proportional to the duration of access, and it is likely that a 1 h access period leads to a ceiling effect preventing detection of individual differences in early escalation. Third, it is likely that reporting and publication bias played a significant role in the lack of reporting of such a phenomenon. When using a low sample size, many laboratories remove outliers during short access to ensure a homogeneous population before being given long access or moving on to a specific experimental condition.
      The combination of using a limited number of strains with limited genetic diversity, a 1 h short access, and reporting bias is likely to have led to the conclusion that escalation of cocaine intake does not occur during short access. The current report using a rat stock with high genetic diversity, a 2 h short access, and no reporting bias conclusively demonstrates that escalation of cocaine intake occurs in some individuals. The discussion has been updated to reflect these points on page 20.

      (6) Although the authors provide PR and foot shock data separated by sex in Supplemental Figure 2, the manuscript would benefit from denoting the number of males and females in each data set shown in Figures 3 and 5. Is there a difference in the proportion of males or females that display a vulnerable phenotype? Given that the authors are interested in investigating sex differences, it would greatly improve the manuscript to disaggregate the resilient/vulnerable data (Figure 3) and degree of vulnerability data (Figure 5) by sex. 

      We have now added the proportion of males and females in each of these subgroups and discussed these results. 

      - For figure 3: when categorizing on intake, there is a greater number of males in the Resilient population than females, as a logical conclusion from the findings in figure 2. The following was added: “From the analysis of sex differences above, we could expect the Resilient group to contain more males. Amongst the resilient animals, there were twice as many males compared to females (N = 122 total with 82 males and 40 females). The amounts in the vulnerable group were almost equal (N = 445 total with 210 males and 235 females).”

      - For figure 5: as the z-scoring of the behavioral measures is performed per sex, these differences are normalized, and all groups contain equal amounts of males and females. The following was added: “As the indices were derived per sex, quantile normalization results in roughly equal number of males and females in each group: 57 females and 71 males in the Low group, 68 females and 60 males in the Mild group, 67 females and 60 males in the Moderate group, and 57 females and 71 males in the Severe group.” To make this clearer, we also elaborated on the calculation of the indices in the methods and results sections. 
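
      The two steps named in this response, standardizing a measure within each sex and then splitting the resulting index into four equal-sized groups, can be sketched as below. This is an illustration under stated assumptions rather than the authors' pipeline: a rank-based quartile split stands in for the normalization they describe, the function names are hypothetical, and the example values are invented.

```python
from statistics import mean, stdev


def zscore_by_sex(values, sexes):
    """Z-score each value against the mean and SD of its own sex."""
    stats = {}
    for sex in set(sexes):
        group = [v for v, s in zip(values, sexes) if s == sex]
        stats[sex] = (mean(group), stdev(group))
    return [(v - stats[s][0]) / stats[s][1] for v, s in zip(values, sexes)]


def quartile_groups(index, labels=("Low", "Mild", "Moderate", "Severe")):
    """Assign each animal to one of four equal-sized groups by rank."""
    order = sorted(range(len(index)), key=lambda i: index[i])
    groups = [None] * len(index)
    for rank, i in enumerate(order):
        groups[i] = labels[rank * len(labels) // len(index)]
    return groups


# Hypothetical raw scores: males respond on a larger scale than females.
scores = [10, 20, 30, 40, 1, 2, 3, 4]
sexes = ["M", "M", "M", "M", "F", "F", "F", "F"]

z = zscore_by_sex(scores, sexes)
groups = quartile_groups(z)
for sex, score, label in zip(sexes, scores, groups):
    print(sex, score, label)
```

      Because each sex is standardized against its own distribution, the lowest-scoring male and the lowest-scoring female land in the same group even though their raw scores differ tenfold, which is why the groups end up with similar numbers of each sex.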

      (7) Consistent with previous reports, the authors demonstrate an increase in irritability-like behavior during withdrawal after cocaine self-administration; however, they make the claim that this variable was orthogonal to drug intake behavior. The discussion claims that the increase in irritability-like behavior was likely due to factors independent of drug intake, such as undergoing surgery, catheter implants, or being tested daily for two months. Individuals with a history of substance use disorder are thought to continue use as a consequence of negative reinforcement. Unwanted behavioral states, such as irritability, can be a driving factor in relapse; therefore, it would perhaps be more translationally relevant to understand the degree to which irritability-like behavior acts as a negative reinforcer rather than correlating this behavior with initial drug-seeking behavior. While this is outside of the scope of the current manuscript, perhaps this is worth noting in the discussion.

      The reviewer raises a good point, and we added a paragraph to the discussion acknowledging the translational relevance of understanding the relationship between irritability and drug-seeking behavior in the context of negative reinforcement and relapse. Now it reads:

      “Despite the lack of correlation between irritability-like behavior and drug intake in our study, it is important to consider the translational relevance of irritability in the context of substance use disorders. In individuals with a history of substance use disorder, negative affective states, such as irritability, are thought to contribute to continued drug use and relapse through negative reinforcement processes (Baker et al., 2004; Koob and Le Moal, 2008). Specifically, the desire to alleviate or escape from these unwanted behavioral states may drive individuals to seek and use drugs, thus perpetuating the cycle of addiction (Baker et al., 2004; Solomon and Corbit, 1974). While our study focused on the relationship between irritability-like behavior and initial drug-seeking behavior, future research should investigate the degree to which irritability acts as a negative reinforcer in the context of drug relapse”.

    1. eLife Assessment

      This valuable study applies transcranial direct current stimulation (tDCS) to the prefrontal cortex of non-human primates during two states: (1) propofol-induced unconsciousness; and (2) wakeful performance of a fixation task. The analysis offers incomplete evidence to indicate that the effect of tDCS on brain dynamics, as recorded with functional magnetic resonance imaging, is contingent on the state of consciousness during which the stimulation is applied. The findings will be of interest to researchers interested in brain stimulation and consciousness.

    2. Reviewer #1 (Public review):

      Summary:

      In this work, the authors apply tDCS to awake and anesthetized macaques to determine the effect of this modality on dynamic connectivity measured by fMRI. The question is to understand the extent to which tDCS can influence conscious or unconscious states. Their target was the PFC. During the conscious states, the animals were executing a fixation task. Unconsciousness was achieved by administering a constant infusion of propofol and a continuous infusion of the muscle relaxant cisatracurium. They observed the animals while awake receiving anodal or cathodal HD-tDCS applied to the PFC. During the cathodal stimulation, they found disruption of functional connectivity patterns, enhanced structure-function correlations, a decrease in Shannon entropy, and a transition towards patterns that were more commonly anatomically based. In contrast, under propofol anesthesia, anodal HD-tDCS stimulation appreciably altered the brain connectivity patterns and decreased the correlation between structure and function. The PFC stimulations altered patterns associated with consciousness as well as those associated with unconsciousness.
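
      For readers unfamiliar with the entropy measure mentioned above, the following is a minimal sketch of how Shannon entropy could be computed from the occurrence rates of a set of brain connectivity patterns. It is an illustration only and not the authors' analysis pipeline; the function name `shannon_entropy` and the example rates are hypothetical.

```python
import math

def shannon_entropy(occurrence_rates):
    """Shannon entropy (in bits) of a set of pattern occurrence rates.

    The rates are normalised to sum to 1; patterns that never occur
    contribute nothing to the entropy.
    """
    total = sum(occurrence_rates)
    probs = [r / total for r in occurrence_rates if r > 0]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical example: four patterns visited equally often versus
# a repertoire dominated by a single pattern.
uniform = shannon_entropy([0.25, 0.25, 0.25, 0.25])
dominated = shannon_entropy([0.85, 0.05, 0.05, 0.05])
```

      Under this definition, a repertoire dominated by one pattern yields a lower entropy than one in which all patterns occur equally often, which is the sense in which a "decrease in Shannon entropy" indicates a less diverse set of visited patterns.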

      Strengths:

      The authors carefully executed a set of very challenging experiments that involved applying tDCS in awake and anesthetized non-human primates while conducting functional imaging.

      Weaknesses:

      The authors show that tDCS can alter functional connectivity measured by fMRI, but they do not make clear what their studies teach the reader about the effects of tDCS on the brain during different states of consciousness. Contrary to what the abstract claims, no important finding is stated. It is also not clear what the work teaches us about how tDCS works, nor is it clear what the "clinical implications for disorders of consciousness" are. The deep anesthesia is akin to being in a state of coma. This was not discussed.

      While the authors have executed a set of technically challenging experiments, it is not clear what they teach us about how tDCS works, normal brain neurophysiology, or brain pathological states such as disorders of consciousness.

    3. Reviewer #2 (Public review):

      General comments:

      The authors investigated the effects of tDCS on brain dynamics in awake and anesthetized monkeys using functional MRI. They claim that cathodal tDCS disrupts the functional connectivity pattern in awake monkeys while anodal tDCS alters brain patterns in anesthetized monkeys. This study offers valuable insight into how brain states can influence the outcomes of noninvasive brain stimulation. However, there are several aspects of the methods and results sections that should be improved to clarify the findings.

      Major comments

      (1) For the anesthetized monkeys, the anode location differs between subjects, with the electrode positioned to stimulate the left DLPFC in monkey R and the right DLPFC in monkey N. The authors mention that this discrepancy does not result in significant differences in the electric field due to the monkeys' small head size. However, this is not correct, as placing the anode on the left hemisphere would result in much lower EF in the right DLPFC compared to placing the anode on the right side. Running an electric field simulation would confirm this. Additionally, the small electrode size suggested by the Easy cap configuration for NHP appears sufficient to focally stimulate the targeted regions. If this interpretation is correct, the authors should provide additional evidence to support their claim, such as a computational simulation of the EF distribution.

      (2) For the anesthetized monkeys, the authors applied 1 mA tDCS first, followed by 2 mA tDCS. A 20-minute stimulation duration of 1 mA tDCS is strong enough to produce after-effects that could influence the brain state during the 2 mA tDCS. This raises some concerns. Previous studies have shown that 1 mA tDCS can generate EF of over 1 V/m in the brain, and the effects of stimulation are sensitive to brain state (e.g., eye closed vs. eye open). How do the authors ensure that there are no after-effects from the 1 mA tDCS? This issue makes it challenging to directly compare the effects of 1 mA and 2 mA stimulation.

      (3) The occurrence rate of a specific structural-functional coupling pattern among random brain regions shows significant effects of tDCS. However, these results seem counterintuitive. It is generally understood that noninvasive brain stimulation tends to modulate functional connectivity rather than structural or structural-functional connectivity. How does the occurrence rate of structural-functional coupling patterns provide a more suitable measure of the effectiveness of tDCS than functional connectivity alone? I would recommend that the authors present the results based on functional connectivity itself. If there is no change in functional connectivity, the relevance of changes in structural-functional coupling might not translate into a meaningful alteration in brain function, making it unclear how significant this finding is without corresponding functional evidence.

      (4) The authors recorded data from only two monkeys, which may limit the investigation of the group effects of tDCS. As the number of scans for the second monkey in each consciousness condition is lower than that in the first monkey, there is a concern that the main effects might primarily reflect the data from a single monkey. I suggest that the authors should analyze the data for each monkey individually to determine if similar trends are observed in both subjects.

      (5) Anodal tDCS was only applied to anesthetized monkeys, which limits the conclusion that the authors are aiming for. It raises questions about the conclusion regarding brain state dependency. To address this, it would be better to include the cathodal tDCS session for anesthetized monkeys. If cathodal tDCS changes the connectivity during anesthesia, it becomes difficult to argue that the effects of cathodal tDCS vary depending on the state of consciousness as discussed in this paper. On the other hand, if cathodal tDCS would not produce any changes, the conclusion would then focus on the relationship between the polarity of tDCS and consciousness. In that case, the authors could maintain their conclusion but might need to refine it to reflect this specific relationship more accurately.

    4. Reviewer #3 (Public review):

      Summary:

      This study used transcranial direct current stimulation administered using small 'high-definition' electrodes to modulate neural activity within the non-human primate prefrontal cortex during both wakefulness and anaesthesia. Functional magnetic resonance imaging (fMRI) was used to assess the neuromodulatory effects of stimulation. The authors report on the modification of brain dynamics during and following anodal and cathodal stimulation during wakefulness and following anodal stimulation at two intensities (1 mA, 2 mA) during anaesthesia. This study provides some possible support that prefrontal direct current stimulation can alter neural activity patterns across wakefulness and sedation in monkeys. However, the reported findings need to be considered carefully against several important methodological limitations.

      Strengths:

      A key strength of this work is the use of fMRI-based methods to track changes in brain activity with good spatial precision. Another strength is the exploration of stimulation effects across wakefulness and sedation, which has the potential to provide novel information on the impact of electrical stimulation across states of consciousness.

      Weaknesses:

      The lack of a sham stimulation condition is a significant limitation, for instance, how can the authors be sure that results were not affected by drowsiness or fatigue as a result of the experimental procedure?

      In the anaesthesia condition, the authors investigated the effects of two intensities of stimulation (1 mA and 2 mA). However, a potential confound here relates to the possibility that the initial 1 mA stimulation block might have caused plasticity-related changes in neural activity that could have interfered with the following 2 mA block due to the lack of a sufficient wash-out period. Hence, I am not sure any findings from the 2 mA block can really be interpreted as completely separate from the initial 1 mA stimulation period, given that they were administered consecutively. Several previous studies have shown that same-day repeated tDCS stimulation blocks can influence the effects of neuromodulation (e.g., Bastani and Jaberzadeh, 2014, Clin Neurophysiol; Monte-Silva et al., J. Neurophysiology).

      The different electrode placement for the two anaesthetised monkeys (i.e., Monkey R: F3/O2 montage, Monkey N: F4/O1 montage) is problematic, as it is likely to have resulted in stimulation over different brain regions. The authors state that "Because of the small size of the monkey's head, we expected that tDCS stimulation with these two symmetrical montages would result in nearly equivalent electric fields across the monkey's head and produce roughly similar effects on brain activity"; however, I am not totally convinced of this, and it really would need E-field models to confirm. It is also more likely that there would in fact be notable differences in the brain regions stimulated as the authors used HD-tDCS electrodes, which are generally more focal.

      Given the very small sample size, I think it is also important to consider the possibility that some results might also be impacted by individual differences in response to stimulation. For instance, in the discussion (page 9, paragraph 2) the authors contrast findings observed in awake animals versus anaesthetised animals. However, different monkeys were examined for these two conditions, and there were only two monkeys in each group (monkeys J and Y for awake experiments [both male], and monkeys R and N [male and female] for the anaesthesia condition). From the human literature, it is well known that there is a considerable amount of inter-individual variability in response to stimulation (e.g., Lopez-Alonso et al., 2014, Brain Stimulation; Chew et al., 2015, Brain Stimulation), therefore I wonder if some of these differences could also possibly result from differences in responsiveness to stimulation between the different monkeys? At the end of the paragraph, the authors also state "Our findings also support the use of tDCS to promote rapid recovery from general anesthesia in humans...and suggest that a single anodal prefrontal stimulation at the end of the anesthesia protocol may be effective." However, I'm not sure if this statement is really backed-up by the results, which failed to report "any behavioural signs of awakening in the animals" (page 7)?

    5. Author response:

      We thank the reviewers for their thoughtful and critical comments. We will revise and improve the manuscript according to the public reviews. In particular, we will:

      (1) provide a broader perspective on the potential clinical implications of our experiments regarding the mechanisms and the treatment of coma and disorders of consciousness. In particular, we will address how the reported increase in dynamical features associated with consciousness, even without behavioral signs, might be relevant to characterize patients with a motor-cognitive dissociation.

      (2) use the term "tDCS" to qualify the technique we used in the paper instead of "HD-tDCS" to avoid any potential confusion. We understand that "HD-tDCS", which we used in our paper to refer to high-density tDCS (small size electrodes), may cause some confusion with high-definition tDCS, which is more commonly used in the literature to designate a 4x1 tDCS montage with smaller high-definition electrodes. We will also provide the full characteristics of the carbon electrodes we used for stimulation.

      (3) clarify the location sites of stimulation and provide structural MRI images with the accurate localization of the stimulating electrodes.

      (4) clarify the fMRI data analyses we performed and provide a schematic illustration of the analysis process.

    1. eLife Assessment

      This valuable study uses an original approach to address the longstanding question of why reaching movements are often biased. The combination of a wide range of experimental conditions and computational models is a strength. However, the modeling assumptions are not well-substantiated, the modeling analysis is insufficient with its focus on fits to average and not individual subject data, and the results are limited to biases in reach direction and do not consider biases in reach extent. Taken together, the evidence supporting the main claims is incomplete.

    2. Reviewer #1 (Public review):

      Wang et al. studied an old, still unresolved problem: Why are reaching movements often biased? Using data from a set of new experiments and from earlier studies, they identified how the bias in reach direction varies with movement direction, and how this depends on factors such as the hand used, the presence of visual feedback, the size and location of the workspace, the visibility of the start position and implicit sensorimotor adaptation. They then examined whether a visual bias, a proprioceptive bias, a bias in the transformation from visual to proprioceptive coordinates and/or biomechanical factors could explain the observed patterns of biases. The authors conclude that biases are best explained by a combination of transformation and visual biases.

      A strength of this study is that it used a wide range of experimental conditions with also a high resolution of movement directions and large numbers of participants, which produced a much more complete picture of the factors determining movement biases than previous studies did. The study used an original, powerful, and elegant method to distinguish between the various possible origins of motor bias, based on the number of peaks in the motor bias plotted as a function of movement direction. The biomechanical explanation of motor biases could not be tested in this way, but this explanation was excluded in a different way using data on implicit sensorimotor adaptation. This was also an elegant method as it allowed the authors to test biomechanical explanations without the need to commit to a certain biomechanical cost function.

      The main weakness of the study is that it rests on the assumption that the number of peaks in the bias function is indicative of the origin of the bias. Specifically, it is assumed that a proprioceptive bias leads to a single peak, a transformation bias to two peaks, and a visual bias to four peaks, but these assumptions are not well substantiated. Especially the assumption that a transformation bias leads to two peaks is questionable. It is motivated by the fact that biases found when participants matched the position of their unseen hand with a visual target are consistent with this pattern. However, it is unclear why that task would measure only the effect of transformation biases, and not also the effects of visual and proprioceptive biases in the sensed target and hand locations. Moreover, it is not explained why a transformation bias would lead to this specific bias pattern in the first place. Also, the assumption that a visual bias leads to four peaks is not well substantiated as one of the papers on which the assumption was based (Yousif et al., 2023) found a similar pattern in a purely proprioceptive task. Another weakness is that the study looked at biases in movement direction only, not at biases in movement extent. The models also predict biases in movement extent, so it is a missed opportunity to take these into account to distinguish between the models.
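
      To make the peak-counting logic discussed above concrete, here is a minimal sketch that idealises each hypothesised bias source as a sinusoid over movement direction and counts its peaks. This is an illustrative assumption only: the sinusoidal form and the names `idealized_bias` and `count_peaks` are hypothetical and are not the models fitted in the manuscript.

```python
import math

def idealized_bias(n_cycles, n_samples=720, amplitude=1.0):
    """Hypothetical bias function: a sinusoid completing n_cycles
    over the full 360 degrees of movement direction."""
    values = []
    for i in range(n_samples):
        direction_deg = 360.0 * i / n_samples
        values.append(amplitude * math.sin(n_cycles * math.radians(direction_deg)))
    return values

def count_peaks(values):
    """Count local maxima on a circular axis (direction wraps at 360 degrees).

    A sample counts as a peak if it exceeds its predecessor and is at least
    as large as its successor, so a two-sample plateau is counted once.
    """
    n = len(values)
    return sum(
        1
        for i in range(n)
        if values[i] > values[i - 1] and values[i] >= values[(i + 1) % n]
    )

# Idealised signatures assumed in the review: one peak (proprioceptive),
# two peaks (transformation), four peaks (visual).
peak_counts = {k: count_peaks(idealized_bias(k)) for k in (1, 2, 4)}
```

      Real bias functions are noisy and need not be sinusoidal, so a count of this kind would only be meaningful after smoothing or model fitting; the sketch merely illustrates why the number of peaks can, in principle, discriminate between candidate sources.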

      Overall, the authors have done a good job mapping out reaching biases in a wide range of conditions, revealing new patterns in one of the most basic tasks, but unambiguously determining the origin of these biases remains difficult, and the evidence for the proposed origins is incomplete. Nevertheless, the study will likely have a substantial impact on the field, as the approach taken is easily applicable to other experimental conditions. As such, the study can spark future research on the origin of reaching biases.

    3. Reviewer #2 (Public review):

      Summary:

      This work examines an important question in the planning and control of reaching movements - where do biases in our reaching movements arise and what might this tell us about the planning process? They compare several different computational models to explain the results from a range of experiments including those within the literature. Overall, they highlight that motor biases are primarily caused by errors in the transformation between eye and hand reference frames. One strength of the paper is the large number of participants studied across many experiments. However, one weakness is that most of the experiments follow a very similar planar reaching design - with slicing movements through targets rather than stopping within a target. Moreover, there are concerns with the models and the model fitting. This work provides valuable insight into the biases that govern reaching movements, but the current support is incomplete.

      Strengths:

      The work uses a large number of participants both with studies in the laboratory which can be controlled well and a huge number of participants via online studies. In addition, they use a large number of reaching directions allowing careful comparison across models. Together these allow a clear comparison between models which is much stronger than would usually be performed.

      Weaknesses:

      Although the topic of the paper is very interesting and potentially important, there are several key issues that currently limit the support for the conclusions. In particular I highlight:

      Almost all studies within the paper use the same basic design: slicing movements through a target with the hand moving on a flat planar surface. First, this means that the authors cannot compare the second component of a bias - the error in the extent of a reach, which is often much larger than the error in reaching direction. Second, there are several studies that have examined biases in three-dimensional reaching movements showing important differences to two-dimensional reaching movements (e.g. Soechting and Flanders 1989). It is unclear how well the authors' computational models could explain the biases that are present in these much more common reaching movements.

      The model fitting section is under-explained and under-detailed currently. This makes it difficult to accurately assess the current model fitting and its strength to support the conclusions. If my understanding of the methods is correct, then I have several concerns. For example, the manuscript states that the transformation bias model is based on studies mapping out the errors that might arise across the whole workspace in 2D. In contrast, the visual bias model appears to be based on a study that presented targets within a circle (but not tested across the whole workspace). If the visual bias had been measured across the workspace (similar to the transformation bias model), would the model and therefore the conclusions be different? There should be other visual bias models theoretically possible that might fit the experimental data better than this one possible model. Such possibilities also exist for the other models.

      Although the authors do mention that the evidence against biomechanical contributions to the bias is fairly weak in the current manuscript, this needs to be further supported. Importantly both proprioceptive models of the bias are purely kinematic and appear to ignore the dynamics completely. One imagines that there is a perceived vector error in Cartesian space whereas the other imagines an error in joint coordinates. These simply result in identical movements which are offset either with a vector or an angle. However, we know that the motor plan is converted into muscle activation patterns which are sent to the muscles, that is, the motor plan is converted into an approximation of joint torques. Joint torques sent to the muscles from a different starting location would not produce an offset in the trajectory as detailed in Figure S1, instead, the movements would curve in complex patterns away from the original plan due to the non-linearity of the musculoskeletal system. In theory, this could also bias some of the other predictions as well. The authors should consider how the biomechanical plant would influence the measured biases.

    4. Reviewer #3 (Public review):

      The authors make use of a large dataset of reaches from several studies run in their lab to try to identify the source of direction-dependent radial reaching errors. While this has been investigated by numerous labs in the past, this is the first study where the sample is large enough to reliably characterize isometries associated with these radial reaches to identify possible sources of errors.

      The sample size is impressive, but the authors should include confidence intervals and ideally, the distribution of responses across individuals along with average performance across targets. It is unclear whether the observed "averaged function" is consistently found across individuals, or if it is mainly driven by a subset of participants exhibiting large deviations for diagonal movements. Providing individual-level data or response distributions would be valuable for assessing the ubiquity of the observed bias patterns and ruling out the possibility that different subgroups are driving the peaks and troughs. It is possible that the Transformation or some other model (see below) could explain the bias function for a substantial portion of participants, while other participants may have different patterns of biases that can be attributable to alternative sources of error.

      The different datasets across different experimental settings/target sets consistently show that people show fewer deviations when making cardinal-directed movements compared to movements made along the diagonal when the start position is visible. This reminds me of a phenomenon referred to as the oblique effect: people show greater accuracy for vertical and horizontal stimuli compared to diagonal ones. While the oblique effect has been shown in visual and haptic perceptual tasks (both in the horizontal and vertical planes), there is some evidence that it applies to movement direction. These systematic reach deviations in the current study thus may reflect this epiphenomenon that applies across modalities. That is, estimating the direction of a visual target from a visual start position may be less accurate, and may be more biased toward the horizontal axis, than for targets that are strictly above, below, left, or right of the visual start position. Other movement biases may stem from poorer estimation of diagonal directions and thus reflect more of a perceptual error than a motor one. This would explain why the bias function appears in both the in-lab and on-line studies although the visual targets are at very different locations (different planes, different distances), since the oblique effects arise independent of plane, distance, or size of the stimuli.

      When the start position is not visible, as in the Vindras study, it is possible that this oblique effect is less pronounced, masked by other sources of error that dominate when looking at 2D reach endpoints made from two separate start positions, rather than only directional errors from a single start position. Or perhaps the participants in the Vindras study are too variable and too few (only 10) to detect this rather small direction-dependent bias.

      A bias in estimating visual direction or visual movement vector is a more realistic and relevant source of error than the proposed visual bias model. The Visual Bias model is based on data from a study by Huttenlocher et al where participants "point" to indicate the remembered location of a small target presented on a large circle. The resulting patterns of errors could therefore be due to localizing a remembered visual target, or due to relative or allocentric cues from the clear contour of the display within which the target was presented, or even movements used to indicate the target. This may explain the observed 4-peak bias function or zig-zag pattern of "averaged" errors, although this pattern may not even exist at the individual level, especially given the small sample size. The visual bias source argument does not seem well-supported, as the data used to derive this pattern likely reflects a combination of other sources of errors or factors that may not be applicable to the current study, where the target is continuously visible and relatively large. Also, any visual bias should be explained in a coordinate system centred on the eye and should vary as a function of the location of visual targets relative to the eyes. Where the visual targets are located relative to the eyes (or at least the head) is not reported.

      The Proprioceptive Bias Model is supposed to reflect errors in the perceived start position. However, in the current study, there is only a single, visible start position, which is not the best design for trying to study the contribution. In fact, my paradigms also use a single, visual start position to minimize the contribution of proprioceptive biases, or at least remove one source of systematic biases. The Vindras study aimed to quantify the effect of start position by using two sets of radial targets from two different, unseen start positions on either side of the body midline. When fitting the 2D reach errors at both the group and individual levels (which showed substantial variability across individuals), the start position predicted most of the 2D errors at the individual level - and substantially more than the target direction. While the authors re-plotted the data to only illustrate angular deviations, they only showed averaged data without confidence intervals across participants. Given the huge variability across their 10 individuals and between the two target sets, it would be more appropriate to plot the performance separately for the two target sets and show confidence intervals (or individual data). Likewise, even the VT model predictions should differ across the two target sets, since the visual-proprioceptive matching errors from the Wang et al study that the model is based on are larger for targets on the left side of the body.

      I am also having trouble fully understanding the V-T model and its associated equations, and whether visual-proprioception matching data is a suitable proxy for estimating the visuomotor transformation. I would be interested to first see the individual distributions of errors and a response to my concerns about the Proprioceptive Bias and Visual Bias models.

    5. Author response:

      We are pleased that the reviewers found our study thought-provoking and appreciate the care they have taken in providing constructive feedback. Focusing on the main issues raised by the reviewers, we provide here a provisional response to the Public Comments and outline our revision plan.

      A) Reviewers 1 and 2 were concerned that our task and analyses were limited by the fact that we only tested the model based on biases in movement direction (angular biases) and did not examine biases in movement extent (radial biases).

      While we think the angular biases provide a sufficient test to compare the set of models presented in the paper, we appreciate that there was a missed opportunity to also look at movement extent.  Looking at predictions concerning both movement direction and extent would provide a stronger basis for model comparison. To this end, we will take a two-step approach:

      (1) Re-analysis of existing datasets from experiments that involve a pointing task (movements terminate at the target position) rather than a shooting task (movements terminate further than the target distance).  We will conduct a model comparison using these data. 

      (2) If we are unable to obtain a suitable dataset or datasets because we cannot access individual data or there are too few participants, we will conduct a new experiment using a pointing task.  We will use these new data to evaluate whether the transformation model can accurately predict biases in both movement direction and extent.

      We will incorporate those new results in our revision.

      B) Reviewer 3 noted that model fitting was based on group average data. They questioned if this was representative across individuals and how well the model would account for individual patterns of reach biases.

      To address this issue, we propose to do the following:

      (1) We will first fit the model to individual data in Exp 1 and assess whether a two-peak function, the signature of the transformation model, is characteristic of most of the fits. We recognize that the results at the individual level may not support the model. This could occur because the model is not correct. Alternatively, the model could be correct but difficult to evaluate at the individual level for several reasons. First, the data set may be underpowered at the individual level. Second, motor biases can be idiosyncratic (e.g., within subject correlation is greater than between subject correlation), a point we noted in the original submission. Third, as observed in previous studies, transformation biases also show considerable individual variability (Wang et al, 2020); as such, even if the model is correct, a two-peaked function may not hold for all individuals.

      (2) If the individual variability is too large to draw meaningful conclusions, we will conduct a new experiment in which we measure motor and proprioceptive biases. Our plan would be to collect a large data set from a limited number of participants.  These data should allow us to evaluate the models on an individual basis, including using each participant’s own transformation/proprioceptive bias function to predict their motor biases.

      C) The reviewers have comments regarding the assumptions and form of the different models. Reviewer 3 questioned the visual bias model presented in the paper, and Reviewers 2 and 3 suggested additional visual bias/ biomechanical models to consider.

      We agree that what we call a visual bias effect is not confined to the visual modality: It is observed when the target is presented visually or proprioceptively, and is manifest in reaching movements, saccades, and key presses used to adjust a dot to match the remembered target (Kosovicheva & Whitney, 2017; Yousif et al. 2023). As such, the bias may reflect a domain-general distortion in the representation of goals within polar space. We refer to this component as a "visual bias" because it is associated with the representation of the visual target in the reaching task.

      We do think the version of the visual bias model in the original submission is reasonable given that the bias pattern has been observed in perceptual tasks with stimuli that were very similar to ours (e.g., Kosovicheva & Whitney, 2017). We have explored other perceptual models in evaluating the motor biases observed in Experiment 1. For example, several models discuss how visual biases may depend on the direction of a moving object or the orientation of an object (Wei & Stocker, 2015; Patten, Mannion & Clifford, 2017). However, these models failed to account for the motor biases observed in our experiments, an unsurprising outcome since the models were not designed to capture biases in perceived location. There are also models of visual biases associated with viewing angle (e.g., based on retina/head position). Since we allow free viewing, these biases are unlikely to make substantive contributions to the biases observed in our reaching tasks.

      Given that some readers are likely to share the reviewers’ concerns on this issue, we will extend our discussion to describe alternative visual models and provide our arguments about why these do not seem relevant/appropriate for our study.

      In terms of biomechanical models, we plan to explore at least one alternative model, the MotorNet Model (https://elifesciences.org/articles/88591). This recently published model combines a six-muscle planar arm model with artificial neural networks (ANNs) to generate a control policy. The model has been used to predict movement curvature in various contexts.  We will focus on its utility to predict biases in reaching to visual targets.

      D) Reviewer 1 had concerns with how we measured the transformation bias. In particular, they asked why the data from Wang et al (2020) are used as an estimate of transformation biases, and not as the joint effects of visual and proprioceptive biases in the sensed target and hand location, respectively.

      We define transformation error as the misalignment between the visual target and the hand position. We quantify this transformation bias by referencing studies that used a matching task in which participants match their unseen hand to a visual target, or vice versa. Errors observed in these tasks are commonly attributed to proprioceptive bias, although they could also reflect a contribution from visual bias. We utilized the same data set to simulate both the transformation bias model and the proprioceptive bias model.

      Although it may seem that we are simply renaming concepts, the concept of transformation error addresses biases that arise during motor planning. For the proprioceptive bias model, the bias only influences the perceived start position but not the goal since proprioception will influence the perceived position of the target before the movement begins. In contrast, the transformation bias model proposes that movements are planned toward a target whose location is biased due to discrepancies between visual and proprioceptive representations.

      The question then arises whether measurements of proprioceptive bias also reflect a transformation bias. We believe that the transformation bias is influenced by proprioceptive feedback, or at the very least, proprioceptive and transformation bias share a common source of error and thus, are highly correlated. We will revise the Introduction and Results sections to more clearly articulate these relationships and assumptions.

      E) Reviewer 3 asked whether the oblique effect in visual perception could account for our motor bias.

      The potential link between the oblique effect and the observed motor bias is an intriguing idea, one that we had not considered. However, after giving this some thought, we see several arguments against the idea that the oblique effect accounts for the pattern of motor biases.

      First, by the oblique effect, variance is greater for diagonal orientations compared to Cartesian orientations. These differences in perceptual variability can explain the bias pattern in visual perception through a Bayesian efficient coding model (Wei & Stocker, 2015). We note that even though participants showed large variability for stimuli at diagonal orientations, the bias for these stimuli was close to zero. As such, we do not think it can explain the motor bias function given the large bias for targets at locations along the diagonal axes.

      Second, the reviewer suggested an "oblique effect" within the motor system, proposing that motor variability is greater for diagonal directions due to increased visual bias. If this hypothesis is correct, a visual bias model should account for the motor bias observed, particularly for diagonal targets. In other words, when estimating the visual bias from a reaching task, a similar bias pattern should emerge in tasks that do not involve movement. However, this prediction is not supported in previous studies. For example, in a position judgment task that is similar to our task but without the reaching response, participants exhibited minimal bias along the diagonals (Kosovicheva & Whitney, 2017).

      Despite our skepticism, we will keep this idea in mind during the revision, investigating variability in movement across the workspace.

    1. eLife Assessment

      This important study shows that sound exposure enhances drug delivery to the cochlea via outer hair cell electromotility acting as a "fluid pump". Although others have proposed that electromotility subserves cochlear amplification, this is the first report to have tested the pumping effect in vivo and considered its possible implications for cochlear homeostasis and drug delivery. The manuscript provides convincing evidence for OHC-based fluid flow within the cochlea.

    2. Reviewer #1 (Public review):

      Summary:

      The authors test the "OHC-fluid-pump" hypothesis by assaying the rates of kainic acid dispersal both in quiet and in cochleae stimulated by sounds of different levels and spectral content. The main result is that sound (and thus, presumably, OHC contractions and expansions) results in faster transport along the duct. OHC involvement is corroborated using salicylate, which yielded results similar to silence. Especially interesting is the fact that some stimuli (e.g. tones) seem to provide better/faster pumping than others (e.g. noise), ostensibly due to the phase profile of the resulting cochlear traveling-wave response.

      Strengths:

      The experiments appear well controlled and the results are novel and interesting. Some elegant cochlear modeling that includes coupling between the organ of Corti and the surrounding fluid as well as advective flow supports the proposed mechanism.

      Weaknesses:

      It's not clear whether the effect size (e.g., the speed of sound-induced pumping relative to silence) is large enough to have important practical applications (e.g., for drug delivery). The authors should comment on the practical requirements and limitations.

      Although helpful so far as it goes, the modeling could be taken much further to help understand some of the more interesting aspects of the data and to obtain testable predictions. In particular, the authors should systematically explore the level effects they find experimentally and determine whether the model can replicate the finding that different sounds produce different results (e.g. noise vs tone).

      The model should also be used to relate the model's flow rates more quantitatively to the properties of the traveling wave (e.g., its phase profile).

      Finally, the model should be used to investigate differences between active and passive OHCs (e.g., simulating the salicylate experiment by disabling the model's OHCs).

      The manuscript would be stronger if the authors discussed ways to test their hypothesis that OHC motility serves a protective effect by pumping fluid. For example, do animals held in quiet after noise exposure (TTS) take longer to recover?

    3. Reviewer #2 (Public review):

      Summary:

      Recent cochlear micromechanical measurements in living animals demonstrated outer hair cell-driven broadband vibration of the reticular lamina that contradicts frequency-selective cochlear amplification. The authors hypothesized that motile outer hair cells can drive cochlear fluid circulation. This hypothesis was tested by observing the effects of acoustic stimuli and salicylate, an outer hair cell motility blocker, on kainic acid-induced changes in the cochlear nucleus activities. It was found that acoustic stimuli can reduce the latency of the kainic acid effect, and a low-frequency tone is more effective than broadband noise. Salicylate reduced the effect of acoustic stimuli on kainic acid-induced changes. The authors also developed a computational model to provide the physical basis for interpreting experimental results. It was concluded that experimental data and simulations coherently indicate that broadband outer hair cell action is for cochlear fluid circulation.

      Strengths:

      The major strengths of this study include its high significance and the combination of electrophysiological recording of the cochlear nucleus responses with computational modeling. Cochlear outer hair cells have been believed to be responsible for the exceptional sensitivity, sharp tuning, and huge dynamic range of mammalian hearing. Recent observation of the broadband reticular lamina vibration contradicts frequency-specific cochlear amplification. Moreover, there is no effective noninvasive approach to deliver the drugs or genes to the cochlea for treating sensorineural hearing loss, one of the most common auditory disorders. These important questions were addressed in this study by observing outer hair cells' roles in the cochlear transport of kainic acid. The well-established electrophysiological method for recording cochlear nucleus responses produced valuable new data, and the purposely developed computational model significantly enhanced the interpretation of the data.

      The authors successfully tested their hypothesis, and both the experimental and modeling results support the conclusion that active outer hair cells can drive cochlear fluid circulation in the living cochlea. Findings from this study will help auditory scientists understand how the outer hair cells contribute to cochlear amplification and normal hearing.

      Weaknesses:

      While the statement "The present study provides new insights into the nonselective outer hair cell action" (in the second paragraph of the Discussion) is well supported by the results, the authors should consider providing a prediction or speculation of how this hair cell action enhances cochlear sensitivity. Such discussion would help the readers better understand the significance of the current work.

    4. Reviewer #3 (Public review):

      Summary:

      This study reveals that sound exposure enhances drug delivery to the cochlea through the non-selective action of outer hair cells. The efficiency of sound-facilitated drug delivery is reduced when outer hair cell motility is inhibited. Additionally, low-frequency tones were found to be more effective than broadband noise for targeting substances to the cochlear apex. Computational model simulations support these findings.

      Strengths:

      The study provides compelling evidence that the broad action of outer hair cells is crucial for cochlear fluid circulation, offering a novel perspective on their function beyond frequency-selective amplification. Furthermore, these results could offer potential strategies for targeting and optimizing drug delivery throughout the cochlear spiral.

      Weaknesses:

      The primary weakness of this paper lies in the surgical procedure used for drug administration through the round window. Opening the cochlea can alter intracochlear pressure and disrupt the traveling wave from sound, a key factor influencing outer hair cell activity. However, the authors do not provide sufficient details on how they managed this issue during surgery. Additionally, the introduction section needs further development to better explain the background and emphasize the significance of the work.

    1. eLife Assessment

      This study presents valuable findings to the field interested in inattentional blindness (IB), reporting that participants indicating no awareness of unexpected stimuli through yes/no questions, still show above-chance sensitivity to specific properties of these stimuli through follow-up forced-choice questions (e.g., its color). The results suggest that this is because participants are conservative and biased to report not noticing in IB. The authors conclude that these results provide evidence for residual perceptual awareness of inattentionally blind stimuli and that therefore these findings cast doubt on the claim that awareness requires attention. Although the samples are large and the analysis protocol novel, the evidence supporting this interpretation is still incomplete, because effect sizes are rather small, the experimental design could be improved and alternative explanations have not been ruled out.

    2. Reviewer #1 (Public review):

      Summary:

      In the abstract and throughout the paper, the authors boldly claim that their evidence, from the largest set of data ever collected on inattentional blindness, supports the views that "inattentionally blind participants can successfully report the location, color, and shape of stimuli they deny noticing", "subjects retain awareness of stimuli they fail to report", and "these data...cast doubt on claims that awareness requires attention." If their results were to support these claims, this study would overturn 25+ years of research on inattentional blindness, resolve the rich vs. sparse debate in consciousness research, and critically challenge the current majority view in cognitive science that attention is necessary for awareness.

      Unfortunately, these extraordinary claims are not supported by extraordinary (or even moderately convincing) evidence. At best, the results support the more modest conclusion: If sub-optimal methods are used to collect retrospective reports, inattentional blindness rates will be overestimated by up to ~8% (details provided below in comment #1). This evidence-based conclusion means that the phenomenon of inattentional blindness is alive and well as it is even robust to experiments that were specifically aimed at falsifying it. Thankfully, improved methods already exist for correcting the ~8% overestimation of IB rates that this study successfully identified.

      Comments:

      (1) In experiment 1, data from 374 subjects were included in the analysis. As shown in figure 2b, 267 subjects reported noticing the critical stimulus and 107 subjects reported not noticing it. This translates to a 29% IB rate, if we were to only consider the "did you notice anything unusual Y/N" question. As reported in the results text (and figure 2c), when asked to report the location of the critical stimulus (left/right), 63.6% of the "non-noticer" group answered correctly. In other words, 68 subjects were correct about the location while 39 subjects were incorrect. Importantly, because the location judgment was a 2-alternative-forced-choice, the assumption was that if 50% (or at least not statistically different than 50%) of the subjects answered the location question correctly, everyone was purely guessing. Therefore, we can estimate that ~39 of the subjects who answered correctly were simply guessing (because 39 guessed incorrectly), leaving 29 subjects from the non-noticer group who may have indeed actually seen the location of the stimulus. If these 29 subjects are moved to the noticer group, the corrected rate of IB for experiment 1 is 21% instead of 29%. In other words, relying only on the "Y/N did you notice anything" question leads to an overestimate of IB rates by 8%. This modest level of inaccuracy in estimating IB rates is insufficient for concluding that "subjects retain awareness of stimuli they fail to report", i.e. that inattentional blindness does not exist.

      In addition, this 8% inaccuracy in IB rates only considers one side of the story. Given the data reported for experiment 1, one can also calculate the number of subjects who answered "yes, I did notice something unusual" but then reported the incorrect location of the critical stimulus. This turned out to be 8 subjects (or 3% of the "noticer" group). Some would argue that it's reasonable to consider these subjects as inattentionally blind, since they couldn't even report where the critical stimulus they apparently noticed was located. If we move these 8 subjects to the non-noticer group, the 8% overestimation of IB rates is reduced to 6%.
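
      The arithmetic in comment (1) can be reproduced with a short sketch. The counts are those quoted above for Experiment 1; the function name and the simple guessing correction (each incorrect 2AFC answer is assumed to be matched by one lucky correct guess) are illustrative assumptions, not the authors' analysis.

```python
def corrected_ib_rate(n_total, n_non_noticers, n_correct_2afc):
    """Estimate the IB rate after removing non-noticers who likely saw the stimulus.

    Hypothetical helper: assumes a two-alternative forced choice in which pure
    guessers are right half of the time.
    """
    n_incorrect = n_non_noticers - n_correct_2afc
    # Under pure guessing, as many subjects guess right as guess wrong, so only
    # the surplus of correct answers is attributed to genuinely seeing the stimulus.
    n_likely_saw = max(n_correct_2afc - n_incorrect, 0)
    raw_rate = n_non_noticers / n_total
    corrected_rate = (n_non_noticers - n_likely_saw) / n_total
    return raw_rate, corrected_rate, n_likely_saw


# Experiment 1 counts as quoted in the review: 374 subjects, 107 non-noticers,
# 68 of whom answered the location question correctly.
raw_rate, corrected_rate, n_likely_saw = corrected_ib_rate(374, 107, 68)
print(f"raw IB rate: {raw_rate:.1%}")
print(f"corrected IB rate: {corrected_rate:.1%}")
print(f"non-noticers reclassified: {n_likely_saw}")
```

      The gap between the two rates is the overestimate discussed above; the same helper could be applied to the other experiments if the per-group counts were reported.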

      The same exercise can and should be carried out on the other 4 experiments, however, the authors do not report the subject numbers for any of the other experiments, i.e., how many subjects answered Y/N to the noticing question and how many in each group correctly answered the stimulus feature question. From the limited data reported (only total subject numbers and d' values), the effect sizes in experiments 2-5 were all smaller than in experiment 1 (d' for the non-noticer group was lower in all of these follow-up experiments), so it can be safely assumed that the ~6-8% overestimation of IB rates was smaller in these other four experiments. In a revision, the authors should consider reporting these subject numbers for all 5 experiments.

      (2) Because classic IB paradigms involve only one critical trial per subject, the authors used a "super subject" approach to estimate sensitivity (d') and response criterion (c) according to signal detection theory (SDT). Some readers may have issues with this super subject approach, but my main concern is with the lack of precision used by the authors when interpreting the results from this super subject analysis.

      Only the super subject had above-chance sensitivity (and it was quite modest, with d' values between 0.07 and 0.51), but the authors over-interpret these results as applying to every subject. The methods and analyses cannot determine if any individual subject could report the features above-chance. Therefore, the following list of quotes should be revised for accuracy or removed from the paper as they are misleading and are not supported by the super subject analysis:

      "Altogether this approach reveals that subjects can report above-chance the features of stimuli (color, shape, and location) that they had claimed not to notice under traditional yes/no questioning" (p.6)

      "In other words, nearly two-thirds of subjects who had just claimed not to have noticed any additional stimulus were then able to correctly report its location." (p.6)

      "Even subjects who answer "no" under traditional questioning can still correctly report various features of the stimulus they just reported not having noticed, suggesting that they were at least partially aware of it after all." (p.8)

      "Why, if subjects could succeed at our forced-response questions, did they claim not to have noticed anything?" (p.8)

      "we found that observers could successfully report a variety of features of unattended stimuli, even when they claimed not to have noticed these stimuli." (p.14)

      "our results point to an alternative (and perhaps more straightforward) explanation: that inattentionally blind subjects consciously perceive these stimuli after all... they show sensitivity to IB stimuli because they can see them." (p.16)

      "In other words, the inattentionally blind can see after all." (p.17)

      (3) In addition to the d' values for the super subject being slightly above zero, the authors attempted an analysis of response bias to further question the existence of IB. By including in some of their experiments critical trials in which no critical stimulus was presented, but asking subjects the standard Y/N IB question anyway, the authors obtained false alarm and correct rejection rates. When these FA/CR rates are taken into account along with hit/miss rates when critical stimuli were presented, the authors could calculate c (response criterion) for the super subject. Here, the authors report that response criteria are biased towards saying "no, I didn't notice anything". However, the validity of applying SDT to classic Y/N IB questioning is questionable.

      For example, with the subject numbers provided in Box 1 (the 2x2 table of hits/misses/FA/CR), one can ask, 'how many subjects would have needed to answer "yes, I noticed something unusual" when nothing was presented on the screen in order to obtain a non-biased criterion estimate, i.e., c = 0?' The answer turns out to be 800 subjects (out of the 2761 total subjects in the stimulus-absent condition), or 29% of subjects in this condition.

      In the context of these IB paradigms, it is difficult to imagine 29% of subjects claiming to have seen something unusual when nothing was presented. Here, it seems that we may have reached the limits of extending SDT to IB paradigms, which are very different than what SDT was designed for. For example, in classic psychophysical paradigms, the subject is asked to report Y/N as to whether they think a threshold-level stimulus was presented on the screen, i.e., to detect a faint signal in the noise. Subjects complete many trials and know in advance that there will often be stimuli presented and the stimuli will be very difficult to see. In those cases, it seems more reasonable to incorrectly answer "yes" 29% of the time, as you are trying to detect something very subtle that is out there in the world of noise. In IB paradigms, the stimuli are intentionally designed to be highly salient (and unusual), such that with a tiny bit of attention they can be easily seen. When no stimulus is presented and subjects are asked about their own noticing (especially of something unusual), it seems highly unlikely that 29% of them would answer "yes", which is the rate of FAs that would be needed to support the null hypothesis here, i.e., of a non-biased criterion. For these reasons, the analysis of response bias in the current context is questionable and the results claiming to demonstrate a biased criterion do not provide convincing evidence against IB.
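
      The criterion argument in comment (3) can be sketched with the standard equal-variance SDT formulas, d' = z(H) - z(FA) and c = -(z(H) + z(FA)) / 2. The Box 1 counts are not reproduced in this review, so the hit rate below is an assumption back-calculated from the reviewer's figure of 800 out of 2761; the 5% false-alarm rate is likewise a made-up value for contrast, and the function names are illustrative.

```python
from statistics import NormalDist


def sdt_measures(hit_rate, fa_rate):
    """Return (d', c) under the equal-variance Gaussian SDT model."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


def false_alarms_for_unbiased_criterion(hit_rate, n_absent):
    """Number of 'yes' responses on stimulus-absent trials that gives c = 0."""
    # c = 0 requires z(FA) = -z(H), which means FA rate = 1 - hit rate.
    return round(n_absent * (1.0 - hit_rate))


n_absent = 2761
# Assumption: hit rate implied by the reviewer's "800 of 2761" figure.
assumed_hit_rate = 1 - 800 / n_absent

n_fa_unbiased = false_alarms_for_unbiased_criterion(assumed_hit_rate, n_absent)
_, c_unbiased = sdt_measures(assumed_hit_rate, n_fa_unbiased / n_absent)

# Hypothetical contrast: a low false-alarm rate yields a positive (conservative) c.
_, c_conservative = sdt_measures(assumed_hit_rate, 0.05)

print(n_fa_unbiased, round(c_unbiased, 6), round(c_conservative, 3))
```

      The sketch makes the reviewer's point concrete: with salient stimuli and a high hit rate, any realistic false-alarm rate will produce a criterion that looks "conservative" under this model.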

      (4) One of the strongest pieces of evidence presented in the entire paper is the single data point in Figure 3e showing that in Experiment 3, even the super subject group that rated their non-noticing as "highly confident" had a d' score significantly above zero. Asking for confidence ratings is certainly an improvement over simple Y/N questions about noticing, and if this result were to hold, it could provide a key challenge to IB. However, this result hinges on a single data point, it was not replicated in any of the other 4 experiments, and it can be explained by methodological limitations. I strongly encourage the authors (and other readers) to follow up on this result, in an in-person experiment, with improved questioning procedures.

      In the current Experiment 3, the authors asked the standard Y/N IB question, and then asked how confident subjects were in their answer. Asking back-to-back questions, the second one with a scale that pertains to the first one (including a tricky inversion, e.g., "yes, I am confident in my answer of no"), may be asking too much of some subjects, especially subjects paying half-attention in online experiments. This procedure is likely to introduce a sizeable degree of measurement error.

      An easy fix in a follow-up study would be to ask subjects to rate their confidence in having noticed something with a single question using an unambiguous scale:

      On the last trial, did you notice anything besides the cross?

      (1) I am highly confident I didn't notice anything else
      (2) I am confident I didn't notice anything else
      (3) I am somewhat confident I didn't notice anything else
      (4) I am unsure whether I noticed anything else
      (5) I am somewhat confident I noticed something else
      (6) I am confident I noticed something else
      (7) I am highly confident I noticed something else

      If we were to re-run this same experiment, in the lab where we can better control the stimuli and the questioning procedure, we would most likely find a d' of zero for subjects who were confident or highly confident (1-2 on the improved scale above) that they didn't notice anything. From there on, the d' values would gradually increase, tracking along with the confidence scale (from 3-7 on the scale). In other words, we would likely find a data pattern similar to that plotted in Figure 3e, but with the first data point on the left moving down to zero d'. In the current online study with the successive (and potentially confusing) retrospective questioning, a handful of subjects could have easily misinterpreted the confidence scale (e.g., inverting the scale) which would lead to a mixture of genuine high-confidence ratings and mistaken ratings, which would result in a super subject d' that falls between zero and the other extreme of the scale (which is exactly what the data in Fig 3e shows).

      One way to check on this potential measurement error using the existing dataset would be to conduct additional analyses that incorporate the confidence ratings from the 2AFC location judgment task. For example, were there any subjects who reported being confident or highly confident that they didn't see anything, but then reported being confident or highly confident in judging the location of the thing they didn't see? If so, how many? In other words, how internally (in)consistent were subjects' confidence ratings across the IB and location questions? Such an analysis could help screen-out subjects who made a mistake on the first question and corrected themselves on the second, as well as subjects who weren't reading the questions carefully enough. As far as I could tell, the confidence rating data from the 2AFC location task were not reported anywhere in the main paper or supplement.
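
      A minimal sketch of the consistency check proposed above, assuming each subject's record holds the yes/no noticing response plus a confidence rating for each of the two questions. The field names, the 0-3 confidence coding, and the threshold are hypothetical; the actual data format is not described in this review.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    noticed: bool              # answer to the yes/no noticing question
    noticing_confidence: int   # hypothetical 0-3 scale, 3 = highly confident
    location_confidence: int   # hypothetical 0-3 scale for the 2AFC judgment


def flag_inconsistent(subjects, threshold=2):
    """Return subjects who confidently denied noticing anything yet were
    also confident in their forced-choice location judgment."""
    return [
        s for s in subjects
        if not s.noticed
        and s.noticing_confidence >= threshold
        and s.location_confidence >= threshold
    ]


# Made-up records for illustration only.
sample = [
    Subject(noticed=False, noticing_confidence=3, location_confidence=3),  # inconsistent
    Subject(noticed=False, noticing_confidence=3, location_confidence=0),  # consistent
    Subject(noticed=True, noticing_confidence=3, location_confidence=3),   # noticer
    Subject(noticed=False, noticing_confidence=1, location_confidence=3),  # unsure non-noticer
]
flagged = flag_inconsistent(sample)
print(f"{len(flagged)} of {len(sample)} subjects gave internally inconsistent ratings")
```

      Reporting the size of the flagged group, and re-running the super subject analysis without it, would indicate how much of the high-confidence d' could reflect misread scales.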

      (5) In most (if not all) IB experiments in the literature, a partial attention and/or full attention trial (or set of trials) is administered after the critical trial. These control trials are very important for validating IB on the critical trial, as they must show that, when attended, the critical stimuli are very easy to see. If a subject cannot detect the critical stimulus on the control trial, one cannot conclude that they were inattentionally blind on the critical trial, e.g., perhaps the stimulus was just too difficult to see (e.g., too weak, too brief, too far in the periphery, too crowded by distractor stimuli, etc.), or perhaps they weren't paying enough attention overall or failed to follow instructions. In the aggregate data, rates of noticing the stimuli should increase substantially from the critical trial to the control trials. If noticing rates are equivalent on the critical and control trials one cannot conclude that attention was manipulated.

      It is puzzling why the authors decided not to include any control trials with partial or full attention in their five experiments, especially given their online data collection procedures where stimulus size, intensity, eccentricity, etc. were uncontrolled and variable across subjects. Including such trials could have actually helped them achieve their goal of challenging the IB hypothesis, e.g., excluding subjects who failed to see the stimulus on the control trials might have reduced the inattentional blindness rates further. This design decision should at least be acknowledged and justified (or noted as a limitation) in a revision of this paper.

      (6) In the discussion section, the authors devote a short paragraph to considering an alternative explanation of their non-zero d' results in their super subject analyses: perhaps the critical stimuli were processed unconsciously and left a trace such that when later forced to guess a feature of the stimuli, subjects were able to draw upon this unconscious trace to guide their 2AFC decision. In the subsequent paragraph, the authors relate these results to above-chance forced-choice guessing in blindsight subjects, but reject the analogy based on claims of parsimony.

      First, the authors dismiss the comparison of IB and blindsight too quickly. In particular, the results from experiment 3, in which some subjects adamantly (confidently) deny seeing the critical stimulus but guess a feature at above-chance levels (at least at the super subject level and assuming the online subjects interpreted and used the confidence scale correctly), seem highly analogous to blindsight. Importantly, the analogy is strengthened if the subjects who were confident in not seeing anything also reported not being confident in their forced-choice judgments, but as mentioned above this data was not reported.

      Second, the authors fail to mention an even more straightforward explanation of these results, which is that ~8% of subjects misinterpreted the "unusual" part of the standard IB question used in experiments 1-3. After all, colored lines and shapes are pretty "usual" for psychology experiments and were present in the distractor stimuli everyone attended to. It seems quite reasonable that some subjects answered this first question, "no, I didn't see anything unusual", but then when told that there was a critical stimulus and asked to judge one of its features, adjusted their response by reconsidering, "oh, ok, if that's the unusual thing you were asking about, of course I saw that extra line flash on the left of the screen". This seems like a more parsimonious alternative compared to either of the two interpretations considered by the authors: (1) IB does not exist, (2) super-subject d' is driven by unconscious processing. Why not also consider: (3) a small percentage of subjects misinterpreted the Y/N question about noticing something unusual. In experiments 4-5, they dropped the term "unusual" but do not analyze whether this made a difference nor do they report enough of the data (subject numbers for the Y/N question and 2AFC) for readers to determine if this helped reduce the ~8% overestimate of IB rates.

      (7) The authors use sub-optimal questioning procedures to challenge the existence of the phenomenon this questioning is intended to demonstrate. A more neutral interpretation of this study is that it is a critique of methods in IB research, not a critique of IB as a manipulation or phenomenon. The authors neglect to mention the dozens of modern IB experiments that have improved upon the simple Y/N IB questioning methods. For example, in Michael Cohen's IB experiments (e.g., Cohen et al., 2011; Cohen et al., 2020; Cohen et al., 2021), he uses a carefully crafted set of probing questions to conservatively ensure that subjects who happened to notice the critical stimuli have every possible opportunity to report seeing them. In other experiments (e.g., Hirschhorn et al., 2024; Pitts et al., 2012), researchers not only ask the Y/N question but then follow this up by presenting examples of the critical stimuli so subjects can see exactly what they are being asked about (recognition-style instead of free recall, which is more sensitive). These follow-up questions include foil stimuli that were never presented (similar to the stimulus-absent trials here), and ask for confidence ratings of all stimuli. Conservative, pre-defined exclusion criteria are employed to improve the accuracy of their IB-rate estimates. In these and other studies, researchers are very cautious about trusting what subjects report seeing, and in all cases, still find substantial IB rates, even to highly salient stimuli. The authors should consider at least mentioning these improved methods, and perhaps consider using some of them in their future experiments.

    3. Reviewer #2 (Public review):

      In this study, Nartker et al. use variations of classic inattentional blindness studies to examine how much observers are conscious of. The key idea is that rather than simply asking observers if they noticed a critical object with one yes/no question, the authors also ask follow-up questions to determine if observers are aware of more than the yes/no questions suggest. Specifically, by having observers make forced choice guesses about the critical object, the authors find that many observers who initially said "no" they did not see the object can still "guess" above chance about the critical object's location, color, etc. Thus, the authors argue that prior claims of inattentional blindness are mistaken and that using such simple methods has led numerous researchers to overestimate how little observers see in the world. To quote the authors themselves, these results imply that "inattentionally blind subjects consciously perceive these stimuli after all... they show sensitivity to IB stimuli because they can see them."

      Before getting to a few issues I have with the paper, I do want to make sure to explicitly compliment the researchers for many aspects of their work. Getting massive amounts of data, using signal detection measures, and the novel use of a "super subject" are all important contributions to the literature that I hope are employed more in the future.

      Main point 1: My primary issue with this work is that I believe the authors are misrepresenting the way people often perform inattentional blindness studies. In effect, the authors are saying, "People do the studies 'incorrectly' and report that people see very little. We perform the studies 'correctly' and report that people see much more than previously thought." But the way previous studies are conducted is not accurately described in this paper. The authors describe previous studies as follows on page 3:

      "Crucially, however, this interpretation of IB and the many implications that follow from it rest on a measure that psychophysics has long recognized to be problematic: simply asking participants whether they noticed anything unusual. In IB studies, awareness of the unexpected stimulus (the novel shape, the parading gorilla, etc.) is retroactively probed with a yes/no question, standardly, "Did you notice anything unusual on the last trial which wasn't there on previous trials?". Any subject who answers "no" is assumed not to have any awareness of the unexpected stimulus."

      If this quote were true, the authors would have a point. Unfortunately, I do not believe it is true. This is simply not how many inattentional blindness studies are run. Some of the most famous studies in the inattentional blindness literature do not simply ask observers a yes/no question (e.g., the invisible gorilla (Simons et al. 1999), the classic door study where the person changes (Simons and Levin, 1998), the study where observers do not notice a fight happening a few feet from them (Chabris et al., 2011)). Instead, these papers consistently ask a series of follow-up questions and even tell the observers what just occurred to confirm that observers did not notice that critical event (e.g., "If I were to tell you we just did XYZ, did you notice that?"). In fact, after a brief search on Google Scholar, I was able to relatively quickly find over a dozen papers that do not just use a yes/no procedure, and instead ask a series of multiple questions to determine if someone is inattentionally blind. In no particular order, here are some of those papers (full disclosure: including my own):

      (1) Most et al. (2005) Psych Review

      (2) Drew et al. (2013) Psych Science

      (3) Drew et al. (2016) Journal of Vision

      (4) Simons et al. (1999) Perception

      (5) Simons and Levin (1998) Perception

      (6) Chabris et al. (2011) iPerception

      (7) Ward & Scholl (2015) Psych Bulletin and Review

      (8) Most et al. (2001) Psych Science

      (9) Todd & Marois (2005) Psych Science

      (10) Fougnie & Marois (2007) Psych Bulletin and Review

      (11) New and German (2015) Evolution and Human Behaviour

      (12) Jackson-Nielsen (2017) Consciousness and cognition

      (13) Mack et al. (2016) Consciousness and cognition

      (14) Devue et al. (2009) Perception

      (15) Memmert (2014) Cognitive Development

      (16) Moore & Egeth (1997) JEP:HPP

      (17) Cohen et al. (2020) Proc Natl Acad Sci

      (18) Cohen et al. (2011) Psych Science

      This is a critical point. The authors' key idea is that when you ask more than just a simple yes/no question, you find that other studies have overestimated the effects of inattentional blindness. But none of the studies listed above only asked simple yes/no questions. Thus, I believe the authors are mis-representing the field. Moreover, many of the studies that do much more than ask a simple yes/no question are cited by the authors themselves! Furthermore, as far as I can tell, the authors believe that if researchers do these extra steps and ask more follow-ups, then the results are valid. But since so many of these prior studies do those extra steps, I am not exactly sure what is being criticized.

      To make sure this point is clear, I'd like to use a paper of mine as an example. In this study (Cohen et al., 2020, Proc Natl Acad Sci USA) we used gaze-contingent virtual reality to examine how much color people see in the world. On the critical trial, the part of the scene they fixated on was in color, but the periphery was entirely in black and white. As soon as the trial ended, we asked participants a series of questions to determine what they noticed. The list of questions included:

      (1) "Did you notice anything strange or different about that last trial?"

      (2) "If I were to tell you that we did something odd on the last trial, would you have a guess as to what we did?"

      (3) "If I were to tell you we did something different in the second half of the last trial, would you have a guess as to what we did?"

      (4) "Did you notice anything different about the colors in the last scene?"

      (5) We then showed observers the previous trial again and drew their attention to the effect and confirmed that they did not notice that previously.

      In a situation like this, when the observers are asked so many questions, do the authors believe that "the inattentionally blind can see after all?" I believe they would not say that and the reason they would not say that is because of the follow-up questions after the initial yes/no question. But since so many previous studies use similar follow-up questions, I do not think you can state that the field is broadly overestimating inattentional blindness. This is why it seems to me to be a bit of a straw-man: most people do not just use the yes/no method.

      Main point 2: Let's imagine for a second that every study did just ask a yes/no question and then stop, so that the criticism the authors are bringing up is valid (even though I believe it is not). I am not entirely sure that above-chance performance on a forced-choice task proves that the inattentionally blind can see after all. Could it just be a form of subliminal priming? Could there be a significant number of participants who basically would say something like, "No I did not see anything, and I feel like I am just guessing, but if you want me to say whether the thing was to the left or right, I will just 100% guess"? I know the literature on priming from things like change and inattentional blindness is a bit unclear, but this seems like maybe what is going on. In fact, maybe the authors are getting some of the best priming from inattentional blindness because of their large sample size, which previous studies do not use.

      I'm curious how the authors would relate their studies to masked priming. In masked priming studies, observers say they did not see the target (like in this study) but still are above chance when forced to guess (like in this study). Do the researchers here think that this is evidence that "masked stimuli are truly seen" even if a participant openly says they are guessing?

      Main point 3: My last question is about how the authors interpret a variety of inattentional blindness findings. Previous work has found that observers fail to notice a gorilla in a CT scan (Drew et al., 2013), a fight occurring right in front of them (Chabris et al., 2011), a plane on a runway that pilots crash into (Haines, 1991), and so forth. In a situation like this, do the authors believe that many participants are truly aware of these items but simply failed to answer a yes/no question correctly? For example, imagine the researchers made participants choose if the gorilla was in the left or right lung and some participants who initially said they did not notice the gorilla were still able to correctly say if it was in the left or right lung. Would the authors claim "that participant actually did see the gorilla in the lung"? I ask because it is difficult to understand what it means to be aware of something as salient as a gorilla in a CT scan, but say "no" you didn't notice it when asked a yes/no question. What does it mean to be aware of such important, ecologically relevant stimuli, but not act in response to them and openly say "no" you did not notice them?

      Overall: I believe there are many aspects of this set of studies that are innovative and I hope the methods will be used more broadly in the literature. However, I believe the authors misrepresent the field and overstate what can be interpreted from their results. While I am sure there are cases where more nuanced questions might reveal inattentional blindness is somewhat overestimated, claims like "the inattentionally blind can see after all" or "Inattentionally blind subjects consciously perceive these stimuli after all" seem to be incorrect (or at least not at all proven by this data).

    4. Reviewer #3 (Public review):

      Summary:

      Authors try to challenge the mainstream scientific as well as popularly held view that Inattentional Blindness (IB) signifies subjects having no conscious awareness of what they report not seeing (after being exposed to unexpected stimuli). They show that even when subjects indicate NOT having seen the unexpected stimulus, they are at above chance level for reporting features such as location, color or movement of these stimuli. Also, they show that 'not seen' responses are in part due to a conservative bias of subjects, i.e. they tend to say no more than yes, regardless of actual visibility. Their conclusion is that IB may not (always) be blindness, but possibly amnesia, uncertainty etc.
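      The distinction drawn above between sensitivity (above-chance report of features) and a conservative response bias (a tendency to say "no") is the standard signal detection decomposition. A minimal sketch, assuming a simple yes/no detection design with stimulus-present and stimulus-absent trials; the function name, the example counts, and the log-linear correction are illustrative assumptions, not the authors' analysis code or data:

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) from yes/no detection counts.

    Assumption: a log-linear correction (add 0.5 to each cell) is applied
    so that rates of exactly 0 or 1 do not produce infinite z-scores.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)            # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # > 0 means a bias toward "no"
    return d_prime, criterion


# Hypothetical counts: few "yes" responses overall, but more on present trials.
d_prime, criterion = sdt_measures(hits=30, misses=70,
                                  false_alarms=5, correct_rejections=95)
print(f"d' = {d_prime:.2f}, c = {criterion:.2f}")
```

      In this framing, a positive d' together with a positive criterion would correspond to the pattern summarized above: some sensitivity to the unexpected stimulus combined with a conservative tendency to report "not seen".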

      Strengths:

      A huge pool of subjects (25,000) is used. They perform several versions of the IB experiments, both with briefly presented stimuli (as in the classic Mack and Rock paradigm) and with prolonged stimuli moving over the screen for 5 seconds (a bit like the famous gorilla version), and all these versions show similar results, pointing in the same direction: above-chance detection of unseen features, as well as a conservative bias towards saying "not seen".

      Weaknesses:

      Results are all significant but effects are not very strong, typically a bit above chance. Also, it is unclear what to compare these effects to, as there are no control experiments showing what performance would have been in a dual-task version where subjects also have to report features, etc., for stimuli that they know will appear in some trials.

      There are quite a few studies showing that during IB, neural processing of visual stimuli continues up to high visual levels. For example, Vandenbroucke et al 2014 doi:10.1162/jocn_a_00530 showed preserved processing of perceptual inference (i.e. seeing a Kanizsa illusion) during IB. Scholte et al 2006 doi: 10.1016/j.brainres.2005.10.051 showed preserved scene segmentation signals during IB. Compared to the strength of these neural signatures, the reported effects may be considered not all that surprising, or even weak.

    1. eLife Assessment

      This valuable study introduces a novel split-belt treadmill learning task to reveal distinct and parallel learning sub-components of gait adaptation: slow and gradual error-based perceptual realignment, and a more deliberate and flexible "stimulus-response" style learning process. While the behavioural results convincingly support the presence of a non-error-based learning process during continuous movements, the computational modelling provides incomplete evidence for establishing the nature of this secondary learning process.

    2. Reviewer #1 (Public review):

      Summary:

      Rossi et al. asked whether gait adaptation is solely a matter of slow perceptual realignment or if it also involves fast/flexible stimulus-response mapping mechanisms. To test this, they conducted a series of split-belt treadmill experiments with ramped perturbations, revealing behavior indicative of a flexible, automatic stimulus-response mapping mechanism.

      Strengths:

      (1) The study includes a perceptual test of leg speed, which correlates with the perceptual realignment component of motor aftereffects. This indicates that there are motor performances that are not accounted for by perceptual re-alignment.

      (2) The study incorporates qualitatively distinct, hypothesis-driven models of adaptation and proposes a new framework that integrates these various mechanisms.

      Weaknesses:

      (1) The study could benefit from considering other alternative models. As the authors noted in their discussion, while the descriptive models explain some patterns of behaviour/aftereffects, they don't currently account for how these mechanisms influence the initial learning process itself.

      a. For example, the pattern of gait asymmetry might differ for perceptual realignment (a smooth, gradual process), structural learning (more erratic, involving hypothesis testing/reasoning to understand the perturbation, see (Tsay et al. 2024) for a recent review on Reasoning), and stimulus-response mapping (possibly through a reinforcement-based trial-and-error approach). If not formally doing a model comparison, the manuscript might benefit from clearly laying out the behavioural predictions for how these different processes shape initial learning.

      b. Related to the above, the authors noted that the absence of difference during initial learning suggests that the differences in Experiment 2 in the ramp-up phase are driven by two distinct processes: structural learning and memory-based processes. If the assumptions about initial learning are not clear, the logic of this conclusion is hard to follow.

      c. The authors could also test a variant of the dual-rate state-space model with two perceptual realignment processes where the constraints on retention and learning rate are relaxed. This model would be a stronger test for two perceptual re-alignment processes: one that is flexible and another that is rigid, without mandating that one be fast learning and fast forgetting, and the other be slow learning and slow forgetting.

      (2) The authors claim that stimulus-response mapping operates outside of explicit/deliberate control. While this could be true, the survey questions may have limitations that could be more clearly acknowledged.

      a. Specifically, asking participants at the end of the experiments to recall their strategies may suffer from memory biases (e.g., participants may be biased by recent events and forget explicit strategies used early in the experiment) and be susceptible to the framing of the questions (e.g., participants not being sure what the experimenter is asking or how to verbalize their own strategy). Moreover, it is not clear what category of explicit strategies one might enact here, which dictates what might be considered "relevant" and "accurate".

      b. The concept of perceptual realignment also suggests that participants are somewhat aware of the treadmill's changing conditions; therefore, as a thought experiment, if the authors had asked participants throughout/during the experiment whether they were trying different strategies, would they predict that some behaviour is under deliberate control?

      (3) The distinction between structural and memory-based differences in the two subgroups was based on the notion that memory-based strategies increase asymmetry. However, an alternative explanation could be that unfamiliar perturbations, due to the ramping up, trigger a surprise signal that leads to greater asymmetry due to reactive corrections to prevent one's fall - not because participants are generalizing from previously learned representations (e.g., (Iturralde & Torres-Oviedo, 2019)).
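      As an illustration of the model variant suggested in point (1c) above, a two-state state-space model can be written so that neither process is forced to be "fast" or "slow". This is a minimal sketch under stated assumptions: the function name, the parameter names (a1, b1, a2, b2), and the example values are hypothetical, and the form follows the commonly used two-state formulation rather than anything fitted in the manuscript.

```python
def simulate_two_state(perturbation, a1, b1, a2, b2):
    """Simulate a two-state state-space model of adaptation.

    Each state i retains a fraction a_i of itself per stride and learns a
    fraction b_i of the current error. No ordering constraint (such as
    a1 < a2 and b1 > b2) is imposed here; that is the "relaxed" variant.
    """
    x1 = x2 = 0.0
    output = []
    for p in perturbation:
        output.append(x1 + x2)      # net adaptation on this stride
        error = p - (x1 + x2)       # error experienced on this stride
        x1 = a1 * x1 + b1 * error   # update of process 1
        x2 = a2 * x2 + b2 * error   # update of process 2
    return output


# Hypothetical schedule: 200 strides of perturbation, then 50 strides of washout.
schedule = [1.0] * 200 + [0.0] * 50
trace = simulate_two_state(schedule, a1=0.6, b1=0.2, a2=0.99, b2=0.02)
print(round(trace[199], 3), round(trace[-1], 3))
```

      In a fitting procedure, leaving the four parameters free (bounded only to keep the system stable) would let the data decide whether the two processes differ in retention, in learning rate, or in both.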

      Further contextualization:

      Recognizing the differences in dependent variables (reaching position vs. leg speed/symmetry in walking), could the Proprioceptive/Perceptual Re-alignment model also apply to gait adaptation (Tsay et al., 2022; Zhang et al., 2024)? Recent reaching studies show a similar link between perception and action during motor adaptation (Tsay et al., 2021) and have proposed a model aligning with the authors' correlations between perception and action. The core signal driving implicit adaptation is the discrepancy between perceived and desired limb position, integrating forward model predictions with proprioceptive/visual feedback.

      References

      Iturralde, P. A., & Torres-Oviedo, G. (2019). Corrective Muscle Activity Reveals Subject-Specific Sensorimotor Recalibration. eNeuro, 6(2). https://doi.org/10.1523/ENEURO.0358-18.2019

      Tsay, Jonathan S., Hyosub E. Kim, Samuel D. McDougle, Jordan A. Taylor, Adrian Haith, Guy Avraham, John W. Krakauer, Anne G. E. Collins, and Richard B. Ivry. 2024. "Fundamental Processes in Sensorimotor Learning: Reasoning, Refinement, and Retrieval." ELife 13 (August). https://doi.org/10.7554/eLife.91839.

      Tsay, Jonathan S., Hyosub E. Kim, Darius E. Parvin, Alissa R. Stover, and Richard B. Ivry. 2021. "Individual Differences in Proprioception Predict the Extent of Implicit Sensorimotor Adaptation." Journal of Neurophysiology, March. https://doi.org/10.1152/jn.00585.2020.

      Tsay, Jonathan S., Hyosub Kim, Adrian M. Haith, and Richard B. Ivry. 2022. "Understanding Implicit Sensorimotor Adaptation as a Process of Proprioceptive Re-Alignment." ELife 11 (August). https://doi.org/10.7554/eLife.76639.

      Zhang, Zhaoran, Huijun Wang, Tianyang Zhang, Zixuan Nie, and Kunlin Wei. 2024. "Perceptual Error Based on Bayesian Cue Combination Drives Implicit Motor Adaptation." ELife. https://doi.org/10.7554/elife.94608.1.

    3. Reviewer #2 (Public review):

      Recent findings in the field of motor learning have pointed to the combined action of multiple mechanisms that potentially contribute to changes in motor output during adaptation. A nearly ubiquitous motor learning process occurs via the trial-by-trial compensation of motor errors, often attributed to cerebellar-dependent updating. This error-based learning process is slow and largely unconscious. Additional learning processes that are rapid (e.g., explicit strategy-based compensation) have been described in discrete movements like goal-directed reaching adaptation. However, the role of rapid motor updating during continuous movements such as walking has been either under-explored or inconsistent with those found during the adaptation of discrete movements. Indeed, previous results have largely discounted the role of explicit strategy-based mechanisms for locomotor learning. In the current manuscript, Rossi et al. provide convincing evidence for a previously unknown rapid updating mechanism for locomotor adaptation. Unlike the now well-studied explicit strategies employed during reaching movements, the authors demonstrate that this stimulus-response mapping process is largely unconscious. The authors show that in approximately half of subjects, the mapping process appears to be memory-based while the remainder of subjects appear to perform structural learning of the task design. The participants that learned using a structural approach had the capability to rapidly generalize to previously unexplored regions of the perturbation space.

      One result that will likely be particularly important to the field of motor learning is the authors' quite convincing correlation between the magnitude of proprioceptive recalibration and the magnitude of error-based updating. This result beautifully parallels results in other motor learning tasks and appears to provide a robust marker for the magnitude of the mapping process (by means of subtracting off the contribution of error-based motor learning). This is a fascinating result with implications for the motor learning field well beyond the current study.

      A major strength of this manuscript is the large sample size across experiments and the extent of replication performed by the authors in multiple control experiments.

      Finally, I commend the authors on extending their original observations via Experiment 2. While it seems that participants use a range of mapping mechanisms (or indeed a combination of multiple mapping mechanisms), future experiments may be able to tease apart why some subjects use memory versus structural mapping. A future ability to push subjects to learn structurally-based mapping rules has the potential to inform rehabilitation strategies.

      Overall, the manuscript is well written, the results are clear, and the data and analyses are convincing. The manuscript's weaknesses are minor, mostly related to the presentation of the results and modeling.

      Weaknesses:

      The overall weaknesses in the manuscript are minor and can likely be addressed with textual changes.

      (1) A key aspect of the experimental design is the speed of the "ramp down" following the adaptation period. If the ramp-down is too slow, then no after-effects would be expected even in the alternative recalibration-only/error-based only hypothesis. How did the authors determine the appropriate rate of ramp-down? Do alternative choices of ramp-down rates result in step length asymmetry measures that are consistent with the mapping hypothesis?

      (2) Overall, the modeling as presented in Figure 3 (Equation 1-3) is a bit convoluted. To my mind, it would be far more useful if the authors reworked Equations 1-3 and Figure 3 (with potential changes to Figure 2) so that the motor output (u) is related to the stride rather than the magnitude of the perturbation. There should be an equation relating the forward model recalibration (i.e., Equation 1) to the fraction of the motor error on a given stride, something akin to u(k+1) = r * (u(k) - p(k)). This formulation is easier to understand and commonplace in other motor learning tasks (and likely what the authors actually fit given the Smith & Shadmehr citation and the derivations in the Supplemental Materials). Such a change would require that Figure 3's independent axes be changed to "stride," but this has the benefit of complementing the presentation that is already in Figure 5.
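      To make the stride-indexed formulation in point (2) concrete, the update u(k+1) = r * (u(k) - p(k)) can be iterated directly. This sketch takes the reviewer's "something akin to" expression literally; the interpretation of u as motor output, p as perturbation, and r as a fixed fraction, as well as the sign convention and the example values, are assumptions and not the authors' fitted model.

```python
def simulate_update(perturbation, r, u0=0.0):
    """Iterate u(k+1) = r * (u(k) - p(k)) over strides k.

    Returns the list [u(0), u(1), ..., u(N)] for N perturbation values.
    """
    u = [u0]
    for p in perturbation:
        u.append(r * (u[-1] - p))
    return u


# Hypothetical example: constant perturbation of 1.0 for three strides, r = 0.5.
print(simulate_update([1.0, 1.0, 1.0], r=0.5))
```

      Plotting the returned sequence against stride number would give the stride-based independent axis that the comment proposes for Figure 3.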

    4. Reviewer #3 (Public review):

      Summary:

      In this work, Rossi et al. use a novel split-belt treadmill learning task to reveal distinct sub-components of gait adaptation. The task involved following a standard adaptation phase with a "ramp-down" phase that helped them dissociate implicit recalibration and more deliberate SR map learning. Combined with modeling and re-analysis of previous studies, the authors show multiple lines of evidence that both processes run simultaneously, with implicit learning saturating based on intrinsic learning constraints and SR learning showing sensitivity to a "perceptual" error. These results offer a parallel with work in reaching adaptation showing both explicit and implicit processes contributing to behavior; however, in the case of gait adaptation the deliberate learning component does not appear to be strategic but is instead a more implicit SR learning process.

      Strengths:

      (1) The task design is very clever and the "ramp down" phase offers a novel way to attempt to dissociate competing models of multiple processes in gait adaptation.

      (2) The analyses are thorough, as is the re-analysis of multiple previous data sets.

      (3) The querying of perception of the different relative belt speeds is a very nice addition, allowing the authors to connect different learning components with error perception.

      (4) The conceptual framework is compelling, highlighting parallels with work in reaching but also emphasizing differences, especially w/r/t SR learning versus strategic behaviors. Thus the discovery of an SR learning process in gait adaptation would be both novel and also help conjoin different siloed subfields of motor learning research.

      Weaknesses:

      (1) The behavior in the ramp-down phase does indeed appear to support multiple learning processes. However, I may have missed something, but I have a fundamental worry about the specific modeling and framing of the "SR" learning process. If I correctly understand, the SR process learns by adjusting to perceived L/R belt speed differences (Figure 7). What is bugging me is why that process would not cause the SR system to still learn something in the later parts of the ramp-down phase when the perceived speed differences flip (Figure 4). I do believe this "blunted learning" is what the SR component is actually modeled with, given this quote in the caption to Figure 7: "When the perturbation is perceived to be opposite than adaptation, even if it is not, mapping is zero and the Δ motor output is constant, reflecting recalibration adjustments only." It seems a priori odd and perhaps a little arbitrary to me that a SR learning system would just stop working (go to zero) just because the perception flipped sign. Or for that matter "generalize" to a ramp-up (i.e., just learn a new SR mapping just like the system did at the beginning of the first perturbation). What am I missing that justifies this key assumption? Or is the model doing something else? (if so that should be more clearly described).

      (2) A more minor point, but given the sample size it is hard to be convinced about the individual difference analysis for structure learning (Figure 5). How clear is it that these two groups of subjects are fully separable and not on a continuum? The lack of clusters in another data set seems like a somewhat less than convincing control here.

    1. Author response:

      Reviewer #1 (Public Review):

      In this work, the authors aimed to understand how titins derived from different nuclei within the syncytium are organized and integrated after cell fusion during skeletal muscle development and remodeling. The authors developed mCherry titin knock-in mice with the fluorophore mCherry inserted into titin's Z-disk region to track titin during cell fusion. The results suggested that titin exhibited a homogeneous distribution after cell fusion. The authors also probed how titin behaves during muscle injury by implanting titin-eGFP myoblasts into adult mCherry-titin mice. Interestingly, titin is retained at the proximal nucleus and does not diffuse across the whole syncytium in this system. The findings of the study are novel and interesting. The experimental approaches are appropriate. The results are described well. However, the manuscript needs revisions to enhance its clarity.

      (1) In this work, the authors have not described the statistical analysis appropriately. In most of the figures, significance levels are not described. The information on the biological and technical replicates is missing in almost all the figures. This information is critical for understanding the strength of the experimentation.

      Thank you for this feedback. We have added the missing information to the figure legends.

      (2) The in vivo experiments are underpowered. The authors have used only 3 animals in the cardiotoxin injury experiment and eliminated another 3 animals from the analysis. How did they determine insufficient myoblast integration?

      The experimental design was targeted at using transplantation of myoblasts into skeletal muscle to obtain information on the ability of transplanted cells to fuse with cells in the injured area – and if those myoblasts could provide titin protein beyond the confinement of the transplanted cells (as would be expected after cell fusion). The goal was not to optimize cell transplantation with improved force generation of lesioned muscle. For this, we agree, the experiments would be underpowered.

      Here, we use a different approach and successfully demonstrate the integration of titin protein from transplanted cells into sarcomeres of host muscle fibers. Only 5 animals per group were approved by the local authorities, in agreement with the scope of our proposed hypothesis on cell fusion contributing titin beyond the transplanted cell, and in agreement with the 3R guidelines and the necessity to address our research question in as few animals as possible. We proposed the need for at least 3 animals per implantation group and included 2 additional animals for compensation in case there was insufficient myoblast integration (no detection of GFP+ cells). The resulting n=3 and n=4 animals provided enough fusion events to address our hypothesis and to show that, even after 3 weeks, titin protein remains confined: if titin were homogeneously distributed after cell fusion, we would have expected red and green striation throughout the fiber. This was not the case. In 8 out of 8 fused cells we had a segregation of green and red titin molecules as depicted in figure 6 and S5.

      (3) Similarly, the in vitro imaging experiments, especially the in vitro titin mobility assays, used only 3 cells (Fig 2b) or 6-9 cells (Fig 2c-2e). The number of cells imaged is insufficient to derive a valid conclusion. What is the variability in the results between cells? Do all the cells behave similarly in titin mobility assays?

      For Figure 2 we had described our replicates insufficiently. Quantification in 2b-e comprises a total of 9 cells from 3 independent experiments (3 per experiment). For 2d one outlier (Grubbs test) was excluded for the GFP signal. For 2e we only included cells that could be fitted with a two-phase association curve. That resulted in 6 cells for the GFP signal and 7 cells for the mCherry signal.

      (4) Figure 1c-e, Figure 2a, Figure 3, Figure 4, Figure 5, Figure 6- please describe the replicates and also if possible, quantify the data and present them as separate figures.

      1) Figure 1d (former 1c) is the validation that titin is properly integrated into the sarcomere and that the mCherry signal localizes to the Z-disk, overlapping with actinin. This is qualitative, not quantitative, information and is replicated and confirmed in figure 2. 1e (former 1d) is a representative image for the quantification in 1f (former 1e) with 3 biological replicates (=cells) and 3 technical replicates (=Z-disks) each; every time point was significantly different with p<0.001, tested by 2-way ANOVA.

      2) 2a: representative image (with the corresponding profile) for the quantification in 2b (9 biological replicates (=cells) measured on 3 different experiment days) (see answer to 1-3).

      3) Representative images: Cells were seeded on several cover slips and fusion was started. This was done on 4 occasions (=technical replicates) with different stainings (see supplement) and 30+ images were taken in total with at least 5 images per staining. The images of different fusion stages were later classified based on the distribution of the differentially labeled titin.

      4) a-c: representative image that shows two independent fusion events; fusion experiments were performed on 4 days with a total of 13 fusion events captured (6 only immature cells, 7 with one mature cell). For quantification in d+e, very small (< 1000 μm2) and very large (> 10,000 μm2) syncytia were excluded to minimize the effect of large size differences of the syncytia, so that 5 immature and 4 mature fusion events remained for comparative analysis.

      5) The smFISH experiment was repeated on 2 days and 6 images of fusion events were taken. Since they were in different stages of fusion and 4 elements contributed to the images (mCherry-RNA, GFP-RNA, mCh-Titin protein, GFP-Titin protein), they were difficult to compare. However, we added the quantification to Fig. S4 (b and c) and added a corresponding paragraph to the results. There seems to be a smaller overlap region for the RNA than for the protein signal.

      6) Representative images with n=6 (but 3 excluded due to insufficient myoblast integration) biological replicates (mice) for the CTX+cells group (main experiment group), n=4 for the cells-only control, and n=1 for the CTX-only control, based on 3R regulation of animal experiments. From each mouse (n=11) the contralateral TA muscle was harvested as well to serve as an uninjured control without cell transplantation.

      (5) Figure 2- the authors excluded samples with an obvious decrease in cell quality during imaging from the analysis. How do the authors assess the cell quality? Simply by visual examination? Or were the samples that did not show fluorescence recovery eliminated? I am wondering what percentage of cells showed poor cell quality. How do they avoid the bias? I recommend that the authors include these cells also for the analysis of data presented in Figures 2b, 2c, and 2f.

      Cells were not excluded for their recovery status, but only if they showed signs of cell death (collapse of sarcomere structures, membrane bubbling, etc.). All cells that stayed alive during the imaging showed a fluorescence recovery. Cells that had only a slower or incomplete recovery were not excluded from the complete analysis. One cell was excluded from the comparison of exchange half-life (Fig. 2d), since it was a significant outlier. For Figure 2e (fast phase), only cells for which we were able to fit a two-phase association curve could be included.
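      As an illustration of the two-phase fit mentioned above, the sketch below evaluates a two-phase association curve in Python. The parameterization (a plateau approached as the sum of a fast and a slow exponential phase) is the conventional form of such a model and is an assumption here; the response does not state the exact equation or software used, and all function and parameter names are hypothetical.

```python
import math


def two_phase_association(t, y0, span_fast, k_fast, span_slow, k_slow):
    """Recovery at time t as the sum of a fast and a slow exponential phase.

    Assumed form (not confirmed by the source):
        y(t) = y0 + span_fast * (1 - exp(-k_fast * t))
                  + span_slow * (1 - exp(-k_slow * t))
    """
    fast = span_fast * (1.0 - math.exp(-k_fast * t))
    slow = span_slow * (1.0 - math.exp(-k_slow * t))
    return y0 + fast + slow


def half_life(k):
    """Half-life of a single exponential phase with rate constant k."""
    return math.log(2.0) / k
```

      With this form, a cell whose recovery is described well by a single exponential gives no stable estimate for the second phase, which is one plausible reason why only a subset of cells can contribute to a fast-phase comparison.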

      (6) It is unclear how the authors identified the different stages of cell fusion in the microscopy images i.e. early fusion, distribution, and complete distribution.

      Early fusion was defined as the stage at which two cells had connected their membranes but the differentially labeled titin had not yet mixed. Distribution was defined as the stage at which titin mixing had started but was not yet complete.

    2. eLife Assessment

      This study offers valuable information on how titin derived from different nuclei within the syncytium is organized and integrated during skeletal muscle development and remodeling. The authors developed novel mCherry-titin knock-in mice with the fluorophore mCherry inserted into titin's Z-disk region to track titin during cell fusion. The approach using mCherry adds to the understanding of the role and localization of titin in controlling the stiffness of striated muscles and fine-tuning contraction. The results demonstrate that the integration of titin into the sarcomere is tightly regulated, with its unexpected mobility aiding in the uniform distribution of titin post-cell fusion. Although the experimental approach is convincing, the work is very qualitative in its approaches, and the data need rigorous statistical analysis. There is a need for some clarification concerning the numbers of animals and control groups. Future studies will need more rigorous data analysis and interpretation.

    3. Reviewer #1 (Public Review):

      In this work, the authors aimed to understand how titins derived from different nuclei within the syncytium are organized and integrated after cell fusion during skeletal muscle development and remodeling. The authors developed mCherry titin knock-in mice with the fluorophore mCherry inserted into titin's Z-disk region to track the titin during cell fusion. The results suggested that titin exhibited homogenous distribution after cell fusion. The authors also probed on how titin behaves during muscle injury by implantation of titin-eGFP myoblasts into adult mCherry-titin mice. Interestingly, titin is retained at the proximal nucleus and does not diffuse across the whole syncytium in this system. The findings of the study are novel and interesting. The experimental approaches are appropriate. The results are described well. However, the manuscript needs revisions to enhance its clarity.

      (1) In this work, the authors have not described the statistical analysis appropriately. In most of the figures, significance levels are not described. The information on the biological and technical replicates is missing in almost all the figures. This information is critical for understanding the strength of the experimentation.

      (2) The in vivo experiments are underpowered. The authors have used only 3 animals in the cardiotoxin injury experiment and eliminated another 3 animals from the analysis. How did they determine insufficient myoblast integration?

      (3) Similarly, the in vitro imaging experiments, especially the in vitro titin mobility assays used only 3 cells (Fig 2b) or 6-9 cells (Fig 2c-2e). The number of cells imaged is insufficient to derive a valid conclusion. What is the variability in the results between cells? Whether all the cells behave similarly in titin mobility assays?

      (4) Figure 1c-e, Figure 2a, Figure 3, Figure 4, Figure 5, Figure 6- please describe the replicates and also if possible, quantify the data and present them as separate figures.

      (5) Figure 2- the authors excluded samples with an obvious decrease in cell quality during imaging from the analysis. How do the authors assess the cell quality? Simply by visual examination? Or were the samples that did not show fluorescence recovery eliminated? I am wondering what percentage of cells showed poor cell quality. How do they avoid the bias? I recommend that the authors include these cells also for the analysis of data presented in Figures 2b, 2c, and 2f.

      (6) It is unclear how the authors identified the different stages of cell fusion in the microscopy images i.e. early fusion, distribution, and complete distribution.

    4. Reviewer #2 (Public Review):

      The titin protein, a large component of striated muscle, plays a crucial role in the formation of the sarcomere during muscle development. As myocytes merge, titin integrates into the sarcomere structure, creating a stable myofilament system. The authors of the present study have shed light on the intricate process of myofilament assembly and disassembly, which is made possible by tracking labeled sarcomere components. In this study, they introduced the mCherry marker into titin's Z-disk to investigate its role in skeletal muscle development and remodeling. Their findings demonstrate that the integration of titin into the sarcomere is tightly regulated, with its unexpected mobility aiding in the uniform distribution of titin post-cell fusion. This distribution is crucial for the formation and maturation of skeletal muscle syncytium. In adult mice with mCherry-labeled titin, treating muscle injuries by introducing titin-eGFP myoblasts illustrates how myocytes integrate, fuse, and contribute to a seamless myofilament system across cell boundaries. The manuscript is well written, and the study is very novel.

    5. Reviewer #3 (Public Review):

      Hüttemeister et al. describe a study where researchers utilized a genetic modification technique to knock in a red fluorescent protein variant, mCherry, into titin, a giant muscle protein, at the Z-disk in order to investigate skeletal muscle development and remodeling. The study revealed that titin's integration into the sarcomere is tightly regulated during muscle development, and its mobility allows for a homogeneous distribution of titin after cell fusion, which is crucial for syncytium formation and skeletal muscle maturation. Furthermore, in adult mice with mCherry-tagged titin, the researchers observed the process of muscle injury treatment by implanting myoblasts containing titin tagged with another fluorescent protein, eGFP. This experiment provided insights into how myocytes integrate, fuse, and contribute to the continuous myofilament system across cell boundaries during muscle regeneration. Interestingly, the behavior of titin proteins differed between immature primary cells and adult muscle tissue. The manuscript points out an interesting observation that could inform the development of treatment protocols that target the early postnatal patient or consider in utero cell therapy approaches based on controlling the ratio of therapeutic to diseased cells. Though the approach is very interesting, the paper is very qualitative in its approaches. The community will benefit from better quantification of the data, as most of them are microscopy data that require quantification.

    1. Author response:

      Reviewer #1 (Public Review):

      Lactobacillus plantarum is a beneficial bacterium renowned for its positive physiological effects and probiotic functions. Fu et al. conducted an investigation into the involvement of this bacterium in host purine metabolism. Initially, they employed microbiomics to analyze changes in L. plantarum within a hyperuricemia model, followed by isolation of the bacterium from this model. The gene map associated with purine nucleoside metabolism was determined through whole-genome analysis. Metabolic shifts in L. plantarum under nucleoside-enriched conditions were assessed using HPLC and metabolomics, while underlying mechanisms were explored through gene knockout experiments. Finally, the efficacy of L. plantarum was validated in hyperuricemia models involving goslings and mice. The authors presented their findings coherently and logically, addressing key questions using appropriate methodologies and yielding significant and innovative results. The authors demonstrated that host-derived Lactobacillus plantarum alleviates host hyperuricemia by influencing purine metabolism. However, their study primarily focused on this bacterium without delving deeper into the mechanisms underlying hyperuricemia beyond verification through two models. Nevertheless, these findings are sufficient to support their conclusion effectively. Additionally, further research is warranted to investigate the metabolites of Lactobacillus plantarum.

      We appreciate the reviewers' suggestions. We have studied Lactobacillus plantarum in detail, focusing specifically on its role in the purine nucleoside metabolism of the host, confirmed through in vitro and in vivo experiments. Our key finding demonstrates how Lactobacillus plantarum contributes to this process. We also examined the expression of hepatic uric acid synthesis proteins and renal uric acid excretion proteins related to alleviating host hyperuricaemia (Figure 9). While discussing the metabolites of Lactobacillus plantarum may fall outside the scope of this article, we plan to investigate this further. Our goal is to identify a signature metabolite via in vitro and in vivo studies and explore how it may help reduce hyperuricaemia in the host.

      Reviewer #2 (Public Review):

      Summary:

      Purine nucleoside metabolism in intestinal flora is integral to the purine nucleoside metabolism in the host. This study identified the iunH gene in Lactobacillus plantarum that regulates its purine nucleoside metabolism. Oral gavage of Lactobacillus plantarum and subsequent analysis showed it maintains homeostasis of purine nucleoside metabolism in the host.

      Strengths:

      This study presents sufficient evidence for the role of Lactobacillus plantarum in alleviating hyperuricaemia, combining microbiomics, whole genomics, in vitro bacterial culture, and metabolomics. These results suggest the iunH gene of Lactobacillus plantarum is crucial in host purine nucleoside metabolism. The experimental design is robust, and the data are of high quality. This study makes significant contributions to the fields of hyperuricaemia, purine nucleoside metabolism, and Lactobacillus plantarum investigation.

      We appreciate the reviewers' encouraging feedback.

      Weaknesses:

      A key limitation of this manuscript is the absence of an in-depth study on the alleviation metabolism of Lactobacillus plantarum. Notable questions include: What overall metabolic changes occur in a purine nucleoside-enriched environment? How do the metabolites of Lactobacillus plantarum vary? Do these metabolites influence host purine nucleoside metabolism? These areas merit further investigation.

      Thank you! The Supplementary Material link includes intracellular and extracellular metabolomics data for Lactobacillus plantarum, detailing the overall metabolic changes. We agree with the reviewer that the effect of metabolites on host purine nucleoside metabolism is worth investigating, but it has not been explored in depth in this paper, which focuses more on the changes in the metabolites of the purine nucleosides themselves. We plan to explore this topic further in future research.

      Reviewer #3 (Public Review):

      Fu et al. present a multi-model study using goose and mouse that investigates the protective effects of Lactobacillus plantarum against hyperuricaemia. They highlight this strain's significance and clarify its role in responding to intestinal nucleoside levels and affecting uric acid metabolism through modulation of host signaling pathways.

      Strengths:

      (1) Fu et al. created two animal models for validation, yielding more reliable and extensive data. In addition, the in vitro tests were repeatedly tested by a multitude of methods, proving to be convincing.

      (2) This study integrates microbiomics, whole genomics, in vitro bacterial culture, and metabolomics, providing a wealth of data and valuable insights for future research.

      We thank the reviewer for their encouraging assessment.

      Weakness:

      Fu et al. clearly described the role of Lactobacillus plantarum, but it is also important to explore its other mechanisms influencing uric acid metabolism in the host. While changes in hepatic and renal uric acid metabolism were confirmed, the gut's role in this process deserves investigation, particularly regarding whether Lactobacillus plantarum or its metabolites act within the gut. The authors have effectively conveyed the story outlined in the article's title, and the remainder can be explored later. In addition, further discussion is needed to highlight how this strain of Lactobacillus plantarum differs from other Lactobacillus strains or how its innovative functions differ from those reported in the literature.

      Thank you! We fully acknowledge the importance of investigating the role of the gut in this process, especially whether Lactobacillus plantarum or its metabolites have an effect within the gut, which would be an interesting topic for a follow-up study. We fully agree that it is crucial to highlight how this Lactobacillus plantarum differs from other strains and those reported in the literature regarding its innovative functions, as discussed in detail in lines 343 to 376. Previous studies indicate that Lactobacillus plantarum can reduce hyperuricaemia, but its specific uric acid-lowering mechanism and the process of nucleoside degradation remain unclear. We investigated the nucleoside hydrolysis function of Lactobacillus plantarum, identified key genes, and validated them by gene knockout. Our findings suggest that host-derived Lactobacillus plantarum plays an antagonistic role against hyperuricaemia.

    2. eLife Assessment

      The landmark significance of this manuscript is based on the mechanistic description of purine metabolism by Lactobacillus plantarum, which helps to alleviate hyperuricemia, which is a phenotype that underlies multiple disease symptoms. The evidence provided for L. plantarum's involvement in reducing hyperuricemia was exceptional, combining microbiomics, whole genomics, in vitro bacterial culture, gene knock-outs, and metabolomics. Collectively, the study shows a clear link between the gut microbiota and hyperuricemia, providing a pathway for modification to help alleviate this condition.

    3. Reviewer #1 (Public Review):

      Lactobacillus plantarum is a beneficial bacterium renowned for its positive physiological effects and probiotic functions. Fu et al. conducted an investigation into the involvement of this bacterium in host purine metabolism. Initially, they employed microbiomics to analyze changes in L. plantarum within a hyperuricemia model, followed by isolation of the bacterium from this model. The gene map associated with purine nucleoside metabolism was determined through whole-genome analysis. Metabolic shifts in L. plantarum under nucleoside-enriched conditions were assessed using HPLC and metabolomics, while underlying mechanisms were explored through gene knockout experiments. Finally, the efficacy of L. plantarum was validated in hyperuricemia models involving goslings and mice. The authors presented their findings coherently and logically, addressing key questions using appropriate methodologies and yielding significant and innovative results. The authors demonstrated that host-derived Lactobacillus plantarum alleviates host hyperuricemia by influencing purine metabolism. However, their study primarily focused on this bacterium without delving deeper into the mechanisms underlying hyperuricemia beyond verification through two models. Nevertheless, these findings are sufficient to support their conclusion effectively. Additionally, further research is warranted to investigate the metabolites of Lactobacillus plantarum.

    4. Reviewer #2 (Public Review):

      Summary:

      Purine nucleoside metabolism in intestinal flora is integral to the purine nucleoside metabolism in the host. This study identified the iunH gene in Lactobacillus plantarum that regulates its purine nucleoside metabolism. Oral gavage of Lactobacillus plantarum and subsequent analysis showed it maintains homeostasis of purine nucleoside metabolism in the host.

      Strengths:

      This study presents sufficient evidence for the role of Lactobacillus plantarum in alleviating hyperuricaemia, combining microbiomics, whole genomics, in vitro bacterial culture, and metabolomics. These results suggest the iunH gene of Lactobacillus plantarum is crucial in host purine nucleoside metabolism. The experimental design is robust, and the data are of high quality. This study makes significant contributions to the fields of hyperuricaemia, purine nucleoside metabolism, and Lactobacillus plantarum investigation.

      Weaknesses:

      A key limitation of this manuscript is the absence of an in-depth study on the alleviation metabolism of Lactobacillus plantarum. Notable questions include: What overall metabolic changes occur in a purine nucleoside-enriched environment? How do the metabolites of Lactobacillus plantarum vary? Do these metabolites influence host purine nucleoside metabolism? These areas merit further investigation.

    5. Reviewer #3 (Public Review):

      Fu et al. present a multi-model study using goose and mouse that investigates the protective effects of Lactobacillus plantarum against hyperuricaemia. They highlight this strain's significance and clarify its role in responding to intestinal nucleoside levels and affecting uric acid metabolism through modulation of host signaling pathways.

      Strengths:

      (1) Fu et al. created two animal models for validation, yielding more reliable and extensive data. In addition, the in vitro tests were repeatedly tested by a multitude of methods, proving to be convincing.

      (2) This study integrates microbiomics, whole genomics, in vitro bacterial culture, and metabolomics, providing a wealth of data and valuable insights for future research.

      Weakness:

      Fu et al. clearly described the role of Lactobacillus plantarum, but it is also important to explore its other mechanisms influencing uric acid metabolism in the host. While changes in hepatic and renal uric acid metabolism were confirmed, the gut's role in this process deserves investigation, particularly regarding whether Lactobacillus plantarum or its metabolites act within the gut. The authors have effectively conveyed the story outlined in the article's title, and the remainder can be explored later. In addition, further discussion is needed to highlight how this strain of Lactobacillus plantarum differs from other Lactobacillus strains or how its innovative functions differ from those reported in the literature.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this paper, Steinemann et al. characterized the nature of stochastic signals underlying the trial-averaged responses observed in the lateral intraparietal cortex (LIP) of non-human primates (NHPs) while they performed the widely used random dot direction discrimination task. Ramp-up dynamics in the trial-averaged LIP responses were reported in numerous papers before. However, the temporal dynamics of these signals at the single-trial level have been subject to debate. Using large-scale neuronal recordings with Neuropixels in NHPs allows the authors to settle this debate rather compellingly. They show that drift-diffusion-like computations account well for the observed dynamics in LIP.

      Strengths:

      This work uses innovative technical approaches (Neuropixel recordings in behaving macaque monkeys). The authors tackle a vexing question that requires measurements of simultaneous neuronal population activity and hence leverage this advanced recording technique in a convincing way.

      They use different population decoding strategies to help interpret the results.

      They also compare how decoders relying on the data-driven approach using dimensionality reduction of the full neural population space compare to decoders relying on more traditional ways to categorize neurons that are based on hypotheses about their function. Intriguingly, although the functionally identified neurons are a modest fraction of the population, decoders that only rely on this fraction achieve comparable decoding performance to those relying on the full population. Moreover, decoding weights for the full population did not allow the authors to reliably identify the functionally identified subpopulation.

      Weaknesses:

      No major weaknesses beyond a few, largely clarification issues, detailed below.

      We thank Reviewer 1 (R1) for this summary. The revised manuscript incorporates R1’s suggestions, as detailed below.

      Reviewer #2 (Public Review):

      Steinemann, Stine, and their co-authors studied the noisy accumulation of sensory evidence during perceptual decision-making using Neuropixels recordings in awake, behaving monkeys. Previous work has largely focused on describing the neural underpinnings through which sensory evidence accumulates to inform decisions, a process which on average resembles the systematic drift of a scalar decision variable toward an evidence threshold. The additional order of magnitude in recording throughput permitted by the methodology adopted in this work offers two opportunities to extend this understanding. First, larger-scale recordings allow for the study of relationships between the population activity state and behavior without averaging across trials. The authors’ observation here of covariation between the trial-to-trial fluctuations of activity and behavior (choice, reaction time) constitutes interesting new evidence for the claim that neural populations in LIP encode the behaviorally-relevant internal decision variable. Second, using Neuropixels allows the authors to sample LIP neurons with more diverse response properties (e.g. spatial RF location, motion direction selectivity), making the important question of how decision-related computations are structured in LIP amenable to study. For these reasons, the dataset collected in this study is unique and potentially quite valuable.

      However, the analyses at present do not convincingly support two of the manuscript’s key claims: (1) that “sophisticated analyses of the full neuronal state space” and “a simple average of Tconin neurons” yield roughly equivalent representations of the decision variable; and (2) that direction-selective units in LIP provide the samples of instantaneous evidence that these Tconin neurons integrate. Supporting claim (1) would require results from sophisticated population analyses leveraging the full neuronal state space; however, the current analyses instead focus almost exclusively on 1D projections of the data. Supporting claim (2) convincingly would require larger samples of units overlapping the motion stimulus, as well as additional control analyses.

      We thank the reviewer (R2) for their careful reading of our paper and the many useful suggestions.

      As detailed below, the revised manuscript incorporates new control analyses, improved quantification, and statistical rigor, which now provide compelling support for key claim #1. We do not regard claim #2 as a key claim of the paper. It is an intriguing finding with solid support, worthy of dissemination and further investigation. We have clarified the writing on this matter.

      Specific shortcomings are addressed in further detail below:

      (1) The key analysis, the correlation between trial-by-trial activity fluctuations and behavior presented in Figure 5, is opaque and would be more convincing with negative controls. To strengthen the claim that the relationship between fluctuations in (a projection of) activity and fluctuations in behavior is significant/meaningful, some evidence should be brought that this relationship is specific - e.g. do all projections of activity give rise to this relationship (or not), or what level of leverage is achieved with respect to choice/RT when the trial-by-trial correspondence with activity is broken by shuffling.

      We do not understand why R2 finds the analysis opaque, but we are grateful for the lucid recommendations. The relationships between fluctuations in neural activity and behavior are indeed “specific” in the sense that R2 uses this term. In addition to the shuffle control, which destroys both relationships (Reviewer Figure 1), we performed additional control analyses that preserve the correspondence of neural signals and behavior on the same trial. We generated random coding directions (CDs) by establishing weight vectors that were either chosen from a standard normal distribution or by permuting the weights assigned to PC-1 in each session. The latter is the more conservative measure. Projections of the neural responses onto these random coding directions render 𝑆rand(𝑡). Specifically, the degree of leverage is effectively zero or greatly reduced. These analyses are summarized in a new Supplementary Figure S10. The bottom row of Figure S10 also addresses the question, “What degree of leverage and mediation would be expected for a theoretical decision variable?” This is accomplished by simulating decision variables using the drift-diffusion model fits in Figure 1c. The simulation is consistent with the leverage and (incomplete) mediation observed for the populations of Tcon neurons. For details see Methods, Simulated decision variables and Leverage of single-trial activity on behavior.
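      The random coding-direction control described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the array layout (neurons by time points), and the rescaling of each random vector to unit length are all hypothetical choices made for the example.

```python
import numpy as np


def random_coding_direction(pc1_weights, rng, method="permute"):
    """Return a random unit-length weight vector with one weight per neuron.

    method="permute": shuffle the PC-1 weights across neurons
                      (the more conservative control).
    method="normal":  draw weights from a standard normal distribution.
    Rescaling to unit length is an assumption of this sketch.
    """
    w = np.asarray(pc1_weights, dtype=float)
    if method == "permute":
        w_rand = rng.permutation(w)
    elif method == "normal":
        w_rand = rng.standard_normal(w.shape[0])
    else:
        raise ValueError(f"unknown method: {method}")
    return w_rand / np.linalg.norm(w_rand)


def project(rates, weights):
    """Project population activity (n_neurons x n_timepoints) onto a
    weight vector, giving a one-dimensional signal over time."""
    return weights @ np.asarray(rates, dtype=float)
```

      Because the projection is applied to the same trials as the original coding direction, the correspondence between neural signal and behavior on each trial is preserved, which is what distinguishes this control from the shuffle.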

      (2) The choice to perform most analysis on 1D projections of population activity is not wholly appropriate for this unique type of dataset, limiting the novelty of the findings, and the interpretation of similarity between results across choices of projection appears circular:

      We disagree with the characterization of our argument as circular, but R2 raises several important points that will probably occur to other careful readers. We address them as subpoints 2.1–2.4, below. Importantly, we are neither claiming nor assuming that the LIP population activity is one-dimensional. We have revised the paper to avoid giving this impression. We are also not claiming that the average of Tin neurons (or the 1D projections) explains all features of the LIP population, nor would we expect it to, given the diversity of response fields across the population. Our objective is to identify the specific dimension within population activity that captures the decision variable (DV), which has been characterized successfully as a one-dimensional stochastic process—that is, a scalar function of time. We have endeavored to clarify our thinking on this point in the revised manuscript (e.g., lines 97–98, 103–104).

      (2.1) The bulk of the analyses (Figure 2, Figure 3, part of Figure 4, Figure 5, Figure 6) operate on one of several 1D projections of simultaneously recorded activity. Unless the embedding dimension of these datasets really does not exceed 1 (dimensionality using e.g. participation ratio in each session is not quantified), it is likely that these projections elide meaningful features of LIP population activity.

      We now report the participation ratio (4.4 ± 0.4, mean ± s.e. across sessions), and we state that the first 3 PCs explain 67.1±3.1% of the variance of time- and coherence-dependent signals used for the PCA. We agree that the 1D projections may elide meaningful features of LIP population activity. Indeed, we make this point through our analysis of the Min neurons. We do not claim that the 1D projections explain all of the meaningful features of LIP population activity. They do, however, reveal the decision variable, which is our main focus. These 1D signals contain features that correlate with events in the superior colliculus, summarized in Stine et al. (2023), attesting to their biological relevance.
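      For readers unfamiliar with the participation ratio, the sketch below uses a common definition: the squared sum of the covariance eigenvalues divided by the sum of their squares. Whether the authors used exactly this definition, and on which data matrix, is not stated in the response, so the function name and input layout (samples by neurons) are assumptions.

```python
import numpy as np


def participation_ratio(X):
    """Participation ratio of data X with shape (n_samples, n_neurons).

    PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues),
    computed on the eigenvalues of the sample covariance matrix.
    PR is 1 when all variance lies along one dimension and equals the
    number of dimensions when variance is spread evenly across them.
    """
    X = np.asarray(X, dtype=float)
    cov = np.cov(X, rowvar=False)
    eig = np.linalg.eigvalsh(cov)
    eig = np.clip(eig, 0.0, None)  # remove tiny negative rounding errors
    return eig.sum() ** 2 / np.sum(eig ** 2)
```

      Under this definition, a value near 4.4 would indicate that the variance is spread over a handful of dimensions rather than confined to one, consistent with the statement that the 1D projections do not capture every feature of the population activity.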

      (2.2) Further, the observed similarity of results across these 1D projections may not be meaningful/interpretable. First, the rationale behind deriving Sramp was based on the ramping historically observed in Tin neurons during this task, so should be expected to resemble Tin.

      The Reviewer is correct that we would expect 𝑆ramp to resemble the ramping observed in Tin neurons. We refer to this approach as hypothesis-driven. It captures the drift component of drift-diffusion. It is true that the Tconin neurons exhibit such ramps in their trial average firing rates, but this does not guarantee that the single-trial population firing rates would manifest as drift-diffusion. Indeed Latimer et al. (2015) concluded that the ramp-like averages comprise stepping from a low to a high firing rate on each trial at a random time. Therefore, while R2 is right to characterize the similarity of Tconin to the ramp direction in trial-averaged activity as unsurprising, their similarity on single trials is not guaranteed.

      (2.3) Second, Tin comprises the largest fraction of the neuron groups sampled during most sessions, so SPC1 should resemble Tin too. The finding that decision variables derived from the whole population’s activity reduce essentially to the average of Tin neurons is thus at least in part ’baked in’ to the approach used for deriving the decision variables.

      This is incorrect. The Tconin neurons constitute only 14.5% of the population, on average, across the sessions (see Table 1). This misunderstanding might contribute to R2’s concern about the importance of these neurons in shaping PC1. It is not simply because they are over-represented. Also, addressing R2’s concern about circularity, we would like to remind R2 that the selection of Tin neurons was based only on their spatial selectivity in the delayed saccade task. We do not see how it could be baked-in/guaranteed that a simple average of these neurons (i.e. zero degrees of freedom) yields dynamics and behavioral correlations that match those produced by dimensionality-reduction techniques that (𝑖) have degrees of freedom equal to the number of neurons and (𝑖𝑖) are blind to the neurons’ spatial selectivity. We have additionally modified what is now Supplementary Figure S13 (old Supplementary Figure S8), which portrays the mean accuracy of choice decoders trained on the neural activity of all neurons, only Tin neurons, all but the Tin neurons, and all but Tin and Min neurons, respectively. Figure S13 now highlights how much more readily choice can be decoded from the small population of Tin neurons than the remainder of the population.

      (2.4) The analysis presented in Figure S6 looks like an attempt to demonstrate that this isn’t the case, but is opaque. Are the magnitudes of weights assigned to units in Tin larger than in the other groups of units with preselected response properties? What is their mean weighting magnitude, in comparison with the mean weight magnitude assigned to other groups? What is the null level of correspondence observed between weight magnitude and assignment to Tin (e.g. a negative control, where the identities of units are scrambled)?

      The revised Figure S6 (now Figure S9) displays more clearly that the weights assigned to Tconin and Tipsiin neurons (purple & yellow, respectively) are larger in magnitude than those assigned to other neurons (gray). Author response table 1 shows a more detailed breakdown of the groups. Note that the length of the vector of weights is one. We are unsure what R2 means by “the null level of correspondence.” Perhaps it helps to know that the mean weight of the “other neurons” is close to zero for all four coding directions. However, it is the overlap of the weights and the relative abundance of non-Tin neurons that is more germane to the point we are making. To wit, knowing the weight (or percentile) of a neuron is a poor predictor that it belongs to the Tin category. This point is most clearly supported by the logistic regression (Fig. S9, bottom row). In other words, the large group of non-Tin neurons contribute substantially to all four coding directions examined in Figure S9. Thus, the similarity between Tin neurons and PC1 is not simply due to an over-representation of Tin neurons as suggested in item 2.3.

      Author response table 1.

      Mean weights assigned to neuron classes in four coding directions.

      (3) The principal components analysis normalization procedure is unclear, and potentially incorrect and misleading: Why use the chosen normalization window (±25ms around 100ms after motion stimulus onset) for standardizing activity for PCA, rather than the typical choice of mean/standard deviation of activity in the full data window? This choice would specifically squash responses for units with a strong visual response, which distorts the covariance matrix, and thus the principal components that result. This kind of departure from the standard procedure should be clearly justified: what do the principal components look like when a standard procedure is used, and why was this insufficient/incorrect/unsuitable for this setting?

      We used the early window because it is a robust measure of overall excitability, but we now use a more conventional window that spans the main epoch of our analyses, 200–600 ms after motion onset. This method yields results qualitatively similar to the original method. We are persuaded that this is the more sensible choice. We thank R2 for raising this concern.

      (4) Analysis conclusions would generally be stronger with estimates of variability and control analyses: This applies broadly to Figures 2-6.

      We have added estimates of variability and control analyses where appropriate.

      Figure 2 shows examples of single-trial signals. The variability is addressed in Figure 3a and the new Supplementary Figure S5.

      Figure 3 now contains error bars derived by bootstrapping (see Methods, Variance and autocorrelation of smoothed diffusion signals). We have also added Supplementary Figure S5, which substantiates the sublinearity claim using simulations.

      Figure 4: (i) We now indicate the s.e.m. of decoding accuracy (across sessions) by the shading in Figure 4a. (ii) The black symbols in new Supplementary Figure S8 show the mean ± s.e.m. for all pairwise comparisons shown in Figure 4d & e. (iii) Supplementary Figure S8 also summarizes two control analyses that deploy random coding directions (CDs) in neuronal state space. The upper row of Figure S8 compares the observed cosine similarity (CoSim), between the CD identified by the graph title and the other four CDs labeled along the abscissa, with values obtained with 1000 random CDs established by random permutations of the weight assignments. The brown symbols are the mean ± s.d. of the CoSim (N=1000). The error bars are smaller than the symbols. We use the cumulative distribution of CoSim under permutation to estimate p-values (p<0.001 for all comparisons). We used a similar approach to estimate the distribution of the analogous correlation statistics between signals rendered by random directions in state space (Figure S8, lower row). For additional details, please see Methods, Similarity of single-trial signals.
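
      The permutation control described in (iii) can be illustrated with a minimal sketch. It assumes that each coding direction is a one-dimensional weight vector with one entry per neuron and that the null is built by shuffling the weight assignments of one of the two vectors; the function names and the one-sided p-value convention are illustrative assumptions, not the authors' code.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two weight vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def permutation_null(cd_ref, cd_other, n_perm=1000, seed=0):
    """Compare an observed cosine similarity with a permutation null.

    The null shuffles which neuron receives which weight in cd_other
    (hypothetical helper, assumed for illustration).
    """
    rng = np.random.default_rng(seed)
    cd_ref = np.asarray(cd_ref, dtype=float)
    cd_other = np.asarray(cd_other, dtype=float)
    observed = cosine_similarity(cd_ref, cd_other)
    null = np.array([
        cosine_similarity(cd_ref, rng.permutation(cd_other))
        for _ in range(n_perm)
    ])
    # One-sided p-value from the empirical distribution; the +1 terms
    # keep the estimate away from exactly zero.
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, null, p_value
```

      With 1000 permutations, the smallest p-value this estimator can return is 1/1001, which is why a bound such as p<0.001 is the natural way to report an observed value that exceeds every permuted value.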

      Figure 5: The rigor of all claims associated with this figure is adduced from two control analyses and a simulation. The first control breaks the trial-by-trial correspondence between neural signals and behavior (Reviewer Figure 1). The second control shows that neural activity does not have substantial leverage on behavior when projected onto random directions in state space (Supplementary Figure S10, top). Simulations of decision variables using parameters derived from the fits to the behavioral data (Figure 1) support a degree of leverage and mediation comparable to the values observed for 𝑆Tconin (Supplementary Figure S10, bottom). For additional details, please see Methods (Leverage of single-trial activity on behavior) and the reply to item 1, above.

      Figure 6: Panels c & d show estimates of variability across neurons and experimental sessions, respectively. The reported p-value is based on a permutation test (see Methods, Correlations between Min and Tconin). The correlations shown in panel e (heatmap) are derived from pooled data across sessions. The reported p-value is based on a permutation test (see Methods, Correlations between Min and Tconin).

      Reviewer #3 (Public Review):

      Summary:

      The paper investigates which aspects of neural activity in LIP of the macaque give rise to individual decisions (specificity of choice and reaction times) in single trials, by recording simultaneously from hundreds of neurons. Using a variety of dimensionality reduction and decoding techniques, they demonstrate that a population-based drift-diffusion signal, which relies on a small subset of neurons that overlap choice targets, is responsible for the choice and reaction time variability. Analysis of direction-selective neurons in LIP and their correlation with decision-related neurons (Tconin neurons) suggests that evidence integration occurs within area LIP.

      Strengths:

      This is an important and interesting paper, which resolves conflicting hypotheses regarding the mechanisms that underlie decision-making in single trials. This is made possible by exploiting novel technology (Primatepixels recordings), in conjunction with state-of-the-art analyses and well-established dynamic random dot motion discrimination tasks.

      General recommendations:

      (1) Please tone down causal language. You present compelling correlative evidence for the idea that LIP population activity encodes the drift-diffusion DV. We feel that claims beyond that (e.g., “Single-trial drift-diffusion signals control the choice and decision time”) would require direct interventions, and are only partially supported by the current evidence. Further examples are provided in point 1) of Reviewer 1 below.

      We have adopted the recommendation to “tone down the causal language.” Throughout the manuscript, we strive to avoid conveying the false impression that the present findings provide causal support for the decision mechanism. However, other causal studies of LIP support causality in the random dot motion task (Hanks et al., 2006; Jeurissen et al., 2022). It is therefore justifiable to use terms that imply causality in statements intended to convey hypotheses about mechanism. We agree that we should not give the false impression that the present support for said mechanism is adduced from causal perturbations in this study, as there were none.

      (2) Please provide a commonly used, data-driven quantification of the dimensionality of the population activity – for example, using participation ratio or the number of PCs explaining 90 % of the variance. This will help readers evaluate the conclusions about the dimensionality of the data.

      Principal component analysis reveals a participation ratio of 4.4 ± 0.4 (mean ±s.e., across sessions), and the first 3 PCs explain 67.1 ± 3.1 percent of the variance. The dimensionality of the data is low, but greater than one. We state this in Methods (Principal Component Analysis) and in Results (Single-trial drift-diffusion signals approximate the decision variable, lines 200–201).
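
      For readers unfamiliar with the participation ratio, a minimal sketch follows. It uses the standard definition, PR = (Σλ)² / Σλ², where λ are the eigenvalues of the covariance matrix of the population activity. The input layout (samples × neurons) and the function names are assumptions for illustration; the preprocessing that the authors apply before PCA is described in their Methods and is not reproduced here.

```python
import numpy as np

def pca_eigenvalues(activity):
    """Eigenvalues of the covariance matrix, largest first.

    activity: array of shape (n_samples, n_neurons); np.cov centers
    each neuron's activity before computing the covariance.
    """
    cov = np.cov(np.asarray(activity, dtype=float), rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)[::-1]
    # Clip tiny negative values that arise from floating-point error.
    return np.clip(eigvals, 0.0, None)

def participation_ratio(eigvals):
    """(sum of eigenvalues)^2 / (sum of squared eigenvalues)."""
    eigvals = np.asarray(eigvals, dtype=float)
    return float(eigvals.sum() ** 2 / np.sum(eigvals ** 2))

def variance_explained(eigvals, n_components):
    """Fraction of total variance captured by the leading components."""
    eigvals = np.asarray(eigvals, dtype=float)
    return float(eigvals[:n_components].sum() / eigvals.sum())
```

      The ratio equals 1 when all variance lies along a single dimension and equals the number of dimensions when variance is spread evenly, so a value near 4 indicates activity that is low-dimensional but not one-dimensional.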

      (3) Please justify the normalization procedure used for PCA: Why use the chosen normalization window (±25ms around 100ms after motion stimulus onset) for standardizing activity for PCA, rather than the more common quantification of mean/standard deviation across the full data window? What do the first principal components look like when the latter procedure is used?

      We now use a more conventional window that spans the main epoch of our analyses, 200–600 ms after motion onset. This method yields results qualitatively similar to the original method. We are persuaded that this is the more sensible choice.

      (4) Please provide estimates of variability for variance and autocorrelation in Fig. 3 (e.g., through bootstrapping). Further, simulations could substantiate the claim about the expected sub-linearity at later time points (Fig. 3a) due to the upper stopping bound and limited firing rate range.

      We thank the reviewers for these helpful recommendations. The revised Fig. 3 now contains error bars derived by bootstrapping (see Methods, Variance and autocorrelation of smoothed diffusion signals). We have also added Supplementary Figure S5, which substantiates the sub-linearity claim using simulations.
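
      As an illustration of how bootstrapped error bars of this kind can be obtained, the sketch below resamples trials with replacement and recomputes the across-trial variance at each time point. It is a simplified example under stated assumptions: the signals are arranged as trials × time points, the toy data are an unbounded random walk, and the smoothing and autocorrelation analyses described in the Methods are omitted.

```python
import numpy as np

def bootstrap_variance(signals, n_boot=500, seed=0):
    """Across-trial variance at each time point, with bootstrap s.e.

    signals: array of shape (n_trials, n_timepoints).
    Returns (variance, standard_error), each of length n_timepoints.
    """
    rng = np.random.default_rng(seed)
    signals = np.asarray(signals, dtype=float)
    n_trials = signals.shape[0]
    estimate = signals.var(axis=0, ddof=1)
    boot = np.empty((n_boot, signals.shape[1]))
    for b in range(n_boot):
        # Resample whole trials with replacement.
        idx = rng.integers(0, n_trials, size=n_trials)
        boot[b] = signals[idx].var(axis=0, ddof=1)
    # The spread of the bootstrap estimates approximates the s.e.
    return estimate, boot.std(axis=0, ddof=1)

# Toy diffusion: cumulative sums of independent Gaussian steps, for
# which the variance grows linearly with time (no stopping bound).
toy_rng = np.random.default_rng(1)
toy_signals = np.cumsum(toy_rng.standard_normal((200, 50)), axis=1)
variance, std_err = bootstrap_variance(toy_signals)
```

      Resampling whole trials, rather than individual time points, preserves the within-trial temporal structure on which a diffusion-like signal depends.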

      (5) Please add controls and estimates of variability for decoding across sessions in Fig. 4: what are the levels of within-trial correlation/cosine similarity for random coding directions? What is the variability in the estimates of values shown in a/d/e?

      We have addressed each of these items. (1) Figure 4a now shows the s.e.m. of decoding accuracy (across sessions). (2) Regarding the variability of estimates shown in Figure 4d & e, the standard errors are displayed in the new supplementary Figure S8. It makes sense to show them there because there is no natural way to represent error on the heat maps in Figure 4, and Figure S8 concerns the comparison of the values in Figure 4d&e to values derived from random coding directions. (3) Random coding directions lead to values of cosine similarity and within-trial correlation that do not differ significantly from zero. We show this in several ways, summarized in our reply to Public Review item 4. Additional details are in the revised manuscript (Methods, Similarity of single-trial signals) and the new Supplementary Figure S8.

      (6) Please perform additional analysis to strengthen the claim from Fig. 6, that Min represents the integrand and not the integral. The analysis in Fig. 6d could be repeated with the integral (cumulative sum) of the single-trial Min signals. Does this yield an increase in leverage over time?

      The short answer is, yes in part. Reviewer Figure 2a provides support for leverage of the integral on choice, and this leverage, like 𝑆Tconin(𝑡), increases as a function of time. The effect is present in all seven sessions that have both Mleftin and Mrightin neurons (all 𝑝 < 1e−10). However, as shown in panel b, the same integral fails to demonstrate more than a hint of leverage on RT. All correlations are barely negative, and the magnitude does not increase as a function of time. We suspect, but cannot prove, that this failure arises because of limited power and the expected weak effect. Recall that the mediation analysis of RT is restricted to longer trials. Moreover, the correlation between the Min difference and the Tin signal is less than 0.1 (heatmap, Fig. 6e), implying that the Min difference explains less than 1% of the variance of 𝑆Tin(𝑡). We considered including Reviewer Figure 2 in the paper, but we feel it would be disingenuous (cherry-picking) to report only the positive outcome of the leverage on choice. If the editors feel strongly about it, we would be open to including it, but leaving these analyses out of the revised manuscript seems more consistent with our effort to deëmphasize this finding. In the future, we plan to record simultaneously from populations of MT and LIP neurons (Min and Tin, of course) and optimize Min neuron yield by placing the RDM stimulus in the periphery.

      (7) Please describe the complete procedure for determining spatially-selective activity. E.g.: What response epoch was used, what was the spatial layout of the response targets, were responses to all ipsi- vs contralateral targets pooled, what was the spatial distribution of response fields relative to the choice targets across the population?

      We thank the reviewers for pointing out this oversight. We now explain this procedure in the Methods (lines 629–644):

      Neurons were classified post hoc as Tin by visual inspection of spatial heatmaps of neural activity acquired in the delayed saccade task. We inspected activity in the visual, delay, and perisaccadic epochs of the task. The distribution of target locations was guided by the spatial selectivity of simultaneously recorded neurons in the superior colliculus (see Stine 2023 for details). Briefly, after identifying the location of the SC response fields, we randomly presented saccade targets within this location and seven other, equally spaced locations at the same eccentricity. In monkey J we also included 1–3 additional eccentricities, spanning 5–16 degrees. Neurons were classified as Tin if they displayed a clear, spatially-selective response in at least one epoch to one of the two locations occupied by the choice targets in the main task. Neurons that switched their spatial selectivity in different epochs were not classified as Tin. The classification was conducted before the analyses of activity in the motion discrimination task. The procedure was meant to mimic those used in earlier single-neuron studies of LIP (e.g., Roitman & Shadlen 2002) in which the location of the choice targets was determined online by the qualitative spatial selectivity of the neuron under study. The Tconin neurons in the present study were highly selective for either the contralateral or ipsilateral choice target used in the RDM task (AUC = 0.89±0.01; 𝑝 < 0.05 for 97% of neurons, Wilcoxon rank sum test). Given the sparse sampling of saccade target locations, we are unable to supply a quantitative estimate of the center and spatial extent of the RFs.

      (8) Please clarify if a neuron could be classified as both Tin and Min. Or were these categories mutually exclusive?

      These categories are mutually exclusive. If a neuron has spatially-selective persistent activity, as defined by the method described above, it is classified as a Tin neuron and not as an Min neuron even if it also shows motion-selective activity during passive motion viewing. We now specify this in the Methods (lines 831–832).

      Reviewer #1 (Recommendations For The Authors):

      𝑅∗1.1a Causal language (Line 23-24): “population activity represents […] drift” and “we provide direct support for the hypothesis that drift-diffusion signal is the quantity responsible for the variability in choice and RT” reads at first sight as if the authors claim that they present evidence for a causal effect of LIP activity on choice. The authors are otherwise nuanced and careful to point out that their evidence is correlational. What seems to be meant is that the population activity/drift-diffusion signal “approximates the DV that gives rise to the choices […]” (cf. line 399). I would recommend using such alternative phrasing to avoid confusion (and the typically strong reactions by readers against misleading causal statements).

      We have adopted the reviewer’s recommendation and have modified the text throughout to reduce causal language. See our response to General Recommendation 1.

      𝑅∗1.1b Relatedly, any discussion about the possibility of LIP being causally involved in evidence integration (e.g. lines 429-445 [Au: now 462–478]) should also comment on the possibility of a distributed representation of the decision variable given that neural correlates of the DV have been reported in several areas including PFC, caudate and FEF.

      We believe this is possible. However, we hope to avoid discussions about causality given that it is not a focus of the paper. Although it is somewhat tangential, we have shown elsewhere that LIP is causal in the sense that causal manipulations affect behavior, but it is also true that causality does not imply necessity, and similarly, lack of necessity does not imply “only correlation.” Regarding distributed representations, it is worth keeping in mind the cautionary counter-example furnished by the SC study (Stine et al., 2023). The firing rates measured by averaging over trials are similar in SC and LIP; both manifest as coherence and direction-dependent ramps, leading to the suggestion that they form a distributed representation of the decision variable. With single-trial resolution, we now know that LIP and SC exhibit distinct dynamics—drift-diffusion and bursting, respectively. It remains to be seen if single-trial resolution achievable by simultaneous Neuropixels recordings from prefrontal areas and LIP reveal shared or distinct dynamics.

      𝑅∗1.2 How was the spatially selective activity determined? The classification of Tin neurons is critical to this study - how was their spatial selectivity determined? Please describe this in similar detail as the description of direction selectivity on lines 681-690 [Au: now 824–832]. E.g.: what response epoch was used, what was the spatial layout of the response targets, were responses to all ipsi- vs contralateral targets pooled, and what was the spatial distribution of response fields relative to the choice targets across the population?

      We now explain the selection procedure in Methods (lines 629–644). Please see our reply to General Recommendation 7, above.

      𝑅∗1.3 Could a neuron be classified as both Tin and Min, or were these categories mutually exclusive? Please clarify. (This goes beyond the scope of the current study: but did the authors find evidence for topographic organization or clustering of these categories of neurons?)

      These categories are mutually exclusive. Please see our response to General Recommendation 8, above.

      𝑅∗1.4 Contrary to the statement on line 121, the trial averages in Fig. 2a, 2b show coherence dependency at the time of the saccade in saccade-aligned traces for the coding strategies, except for STin (fig. 2c). Is this a result of the choice for t1 (= 0.1s)? (The authors may want to change their statement on line 121.) Relatedly, do the population responses for the two coding strategies Sramp and SPC1 depend on the epoch used to derive weights for individual neurons?

      We have revised the description to accommodate R2’s observation. 𝑆ramp retains weak coherence-dependence before saccades towards the choice target contralateral to the recording site. This was true in four of the eight sessions. For 𝑆PC1, there is no longer a coherence dependency for the Tin choices, owing to the change in normalization method (see revised Figure 2b).

      We also corrected an error in the Methods section. Specifically, the ramp ends at 𝑡1 = 0.05 s before the time of the saccade, not 𝑡1 = 0.1 s. While we no longer emphasize the similarity of traces aligned to saccade, it is reasonable to find issue with the observation that they retain a dependency on coherence (𝑆ramp only) because, according to theory, traces associated with Tin choices should reach a common positive threshold at decision termination. That said, for the Ramp direction there may be a reason to expect this discrepancy from theory. The deterministic part of drift-diffusion includes an urgency signal that confers positive convexity to the deterministic drift. This accelerating nonlinearity is not captured by the ramp, and it is more prominent at longer decision times, and thus at low coherences. We do not share this interpretation in the revised manuscript, in part because retention of coherence dependency is present in only half the sessions (see Reviewer Figure 3). The correction to the definition of 𝑡1 also provides an opportunity to address R2’s final question (“Relatedly,…?”). This particular variation in 𝑡1 does not affect 𝑆ramp, and 𝑆PC1 no longer retains coherence dependency for Tin choices. Note that our choice of 𝑡0 and 𝑡1 is based on the empirical observation that the ramping activity in response averages of Tin neurons typically begins 200 ms after motion onset and ends 50–100 ms before initiation of the saccadic choice. The starting time (𝑡0) is also supported by the observation that the decoding accuracy of a choice-decoder begins to diverge from chance at this time (Figure 4a).

      𝑅∗1.5 It is intriguing that Sramp and SPC1 show dynamics that look so similar (fig. 2a, 2b). How do the weights assigned to each neuron in both strategies compare across the population?

      The weights assigned to each neuron are very similar across the two strategies as indicated by a cosine similarity (0.65 ± 0.04, mean ±s.e.m. across sessions).

      𝑅∗1.6 Tin neurons, which show dynamics closely resembling different coding directions (fig. 2) and the decoders do not have weights that can distinguish them from the rest of the population in each of these analyses (fig. S7). Is it fair to interpret these findings as evidence for broad decision-related co-variability in the recorded neural population in LIP?

      Yes, our results are consistent with this interpretation. However, it is worth reiterating that decoding performance drops considerably when Tin neurons are not included (see Supplementary Figure S13). Thus, this broad decision-related co-variability is present but weak.

      𝑅∗1.7 It is intriguing that the decoding weights of the different decoders did not allow the authors to reliably identify Tin neurons. Could this be, in part, due to the low dimensionality of the population activity and task that the animals are presumably overtrained on? Or do the authors expect this finding to hold up if the population activity and task were higher dimensional?

      Great question! We can only speculate, but it seems possible that a more complex, “higher dimensional” task could make it easier to identify Tin neurons. For example, a task with four choices instead of two may decrease correlations among groups of neurons with different response fields. We have added this caveat to the discussion (lines 459–461). One minor semantic objection: The animal has learned to perform a highly contrived task at low signal-to-noise. The animal is well-trained, not over-trained.

      𝑅∗1.8 Lines 135-137 [Au: now 141–142]: The similarity in the single trial traces from different coding strategies (fig. 2a-2c, left) is not as evident to me as the authors suggest. It might be worthwhile computing the correlation coefficients between individual traces for each pair of strategies and reporting the mean correlation to support the author’s point.

      We report the mean correlation between single-trial signals generated by the chosen dimensionality reduction methods in Figure 4e. We show the variability in this measure in Supplementary Figure S8. We have also adjusted the opacity of the single-trial traces in Figure 2, left.

      𝑅∗1.9 Minor/typos:

      -line 74: consider additionally citing Hyafil et al. 2023.

      -line 588: ”that were strongly correlated”?

      -line 615: ”were the actual drift-diffusion process were...”.

      -line 717: ”a causal influence” -> ”no causal influence”.

      Fig. 6: panel labels e vs d are swapped between the figure and caption.

      Fig. 3c: labels r1,3 & r2,3 are flipped.

      We have addressed all of these items. Thank you.

      Reviewer #2 (Recommendations For The Authors):

      𝑅∗2.1 (Figure 2) Determine whether restricting the analysis to 1D projections of the data is a suitable approach given the actual dimensionality of the datasets being analyzed:

      - Should show some quantification of the dimensionality of the recorded activity; could do this by quantifying the dimensionality of population activity in each session, e.g. with participation ratio or related measures (like # PCs to explain some high proportion of the variance, e.g. 90 %). If much of the variation is not described in 1 dimension, then the paper would benefit from some discussion/analysis of the signals that occupy the other dimensions.

      We now report the participation ratio (4.4 ± 0.4, mean ±s.e. across sessions), and we state that the first 3 PCs explain 67.1 ± 3.1% of the variance of the time- and coherence-dependent signals used for the PCA (mean ±s.e). We agree that the 1D projections may elide meaningful features of LIP population activity. Indeed, we make this point through our analysis of the Min neurons. To reiterate our response above, we do not claim that the 1D projections explain all of the meaningful features of LIP population activity. They do, however, reveal the decision variable, which is our main focus. These 1D signals contain features that correlate with events in the superior colliculus, summarized in Stine et al. (2023), attesting to their biological relevance.

      The Reviewer is correct that our approach presupposes a linear embedding of the 1D decision variable in the population activity. In other words, a nonlinear representation of the 1D decision variable in population activity could have an embedding dimensionality greater than 1, and there may well be a non-linear method that reveals this representation. To test this possibility, we decoded choice on each trial from population activity using (1) a linear decoder (logistic classifier) or (2) a multi-layer neural network, which can exploit non-linearities. We found that, for each session, the two decoders performed similarly: the neural network outperforms the logistic decoder (barely) in just one session. The analysis suggests that the assumption of linear embedding of the decision variable is justified. We hope this analysis convinces the reviewer that “sophisticated analyses of the full neuronal state space” and “a simple average of [Tconin] neurons” do indeed yield roughly equivalent representations of the decision variable. We have included the results of this analysis in Supplementary Figure S12. See also item 2 of the Public response.

      𝑅∗2.2 (Figure 3) Add estimates of variability for variance and autocorrelation through time from single-trial signals:

      –   E.g. by bootstrapping. Would be helpful for making rigorous the discussion of when the deviation from the theory is outside what would be expected by chance, even if it doesn’t change the specific conclusions here.

      –   If possible, it would help (by simulations, or maybe an added reference if it exists) to substantiate the claim about the expected sub-linearity at later time-points (Figure 3a) due to the upper stopping bound and limited firing rate range.

      We thank the reviewer for this helpful comment. The revised Fig. 3 now contains error bars derived by bootstrapping (see Methods, §Variance and autocorrelation of smoothed diffusion signals). We have also added Supplementary Figure S5, which substantiates the sub-linearity claim using simulations.

      𝑅∗2.3 (Figure 4) Add controls and estimates of variability for decoding across sessions:

      –   As a baseline - what is the level of within-trial correlation/cosine similarity when random coding directions are used?

      –   What is the variability in the estimates of values shown in a/d/e?

      We have addressed each of these items. (1) Figure 4a now shows the s.e.m. of decoding accuracy (across sessions). (2) Regarding the variability of estimates shown in Figure 4d & e, the standard errors are displayed in the new Supplementary Figure S8. It makes sense to show them there because (i) there is no natural way to represent error on the heat maps in Figure 4, and (ii) S8 concerns the comparison of the values in Figure 4d & e to values derived from random coding directions. (3) Random coding directions lead to values of cosine similarity and within-trial correlation that do not differ significantly from zero. We show this in several ways, summarized in our reply to Public Review item 4. Additional details are in the revised manuscript (Methods: Similarity of single-trial signals) and the new Supplementary Figure S8. We also provide this information in response to Recommendation 5, above.

      𝑅∗2.4 (Figure 5) Add negative controls and significance tests to support claims about trends in leverage:

      –   What is the level of increase in leverage attained from random 1D projections of the data, or other projections where the prior would be no leverage?

      –   What is the range of leverage values fit for a simulated signal with a ground-truth of no trend?

      We have added two control analyses. In addition to a shuffle control, which destroys the relationship (Reviewer Figure 1), we performed additional analyses that preserve the correspondence of neural signals and behavior on the same trial. We generated random coding directions (CDs) by establishing weight vectors that were either drawn from a Normal distribution or obtained by permuting the weights assigned to PC-1 in each session. The latter is the more conservative measure. Projections of the neural responses onto these random coding directions render 𝑆rand(𝑡). For these signals, the degree of leverage is effectively zero or very much reduced. These analyses are summarized in a new Supplementary Figure S10. The distributions of our test statistics (e.g., leverage on choice and RT) under the variants of the null hypothesis also support traditional metrics of statistical significance. Figure S10 (bottom row) also provides an approximate answer to the question: What degree of leverage and mediation would be expected for a theoretical decision variable? Briefly, we simulated 60,000 trials using the race model that best fits the behavioral data of monkey M. For any noise-free representation of a Markovian integration process, the leverage of an early sample of the DV on behavior would be mediated completely by later activity, as the later sample (up to the time of commitment) subsumes all variability captured by the earlier sample. We, therefore, generated 𝑆sim(𝑡) by first subsampling the simulated data to match the trial numbers of each session. To evaluate a DV approximated from the activity of 𝑁 Tconin neurons per session rather than the true DV represented by the entire population, we generated 𝑁 noisy instantiations of the signal for each of the subsampled, simulated trials. The noisy decision variable, 𝑆sim(𝑡), is the mean activity of these 𝑁 noise-corrupted signals. The simulation is consistent with the leverage and incomplete mediation observed for the populations of Tconin neurons. For additional details, see Methods (§Leverage of single-trial activity on behavior) and Supplementary Figure S10, caption. See also our response to item 1 of the Public Response.
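
      The construction of a noisy decision variable from 𝑁 noise-corrupted copies can be sketched as follows. This is a toy illustration only: the additive Gaussian noise, its standard deviation, the unbounded random walk standing in for the simulated DV, and the function names are all assumptions made here, and none of them is taken from the race-model simulation described in the Methods.

```python
import numpy as np

def noisy_population_dv(true_dv, n_neurons, noise_sd=2.0, seed=0):
    """Average n_neurons noise-corrupted copies of a decision variable.

    true_dv: array of shape (n_trials, n_timepoints).
    Noise model (assumed): independent additive Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    true_dv = np.asarray(true_dv, dtype=float)
    noise = rng.normal(0.0, noise_sd, size=(n_neurons,) + true_dv.shape)
    return (true_dv[None, ...] + noise).mean(axis=0)

def correlation(a, b):
    """Pearson correlation between two arrays of the same shape."""
    return float(np.corrcoef(np.ravel(a), np.ravel(b))[0, 1])

# Toy stand-in for the simulated DV: an unbounded random walk.
toy_rng = np.random.default_rng(1)
toy_dv = np.cumsum(toy_rng.standard_normal((500, 40)), axis=1)
corr_small = correlation(noisy_population_dv(toy_dv, 2), toy_dv)
corr_large = correlation(noisy_population_dv(toy_dv, 50), toy_dv)
```

      In this toy setting the averaged signal tracks the underlying DV more closely as 𝑁 grows, but for any finite 𝑁 some residual noise remains, which is one intuition for why mediation by a later sample of a measured signal could be incomplete.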

      𝑅∗2.5 The analysis is performed across several signed coherence levels, with data detrended for each signed coherence and choice to enable comparison of fluctuations relative to the relevant baseline; are results similar for the different coherences?

      The results are qualitatively similar for individual coherences. There is less power, of course, because there are fewer trials. The analyses cannot be performed for coherences ≥ 12.8% because there are not enough trials that satisfy the inclusion criteria (presence of left and right choice trials with RT ≤ 670 ms). Nonetheless, leverage on choice and RT is statistically significant for 27 of the 30 combinations of motion strengths < 12.8% × three signals (𝑆ramp, 𝑆PC1 and 𝑆Tin) × behavioral measures (RT and choice) (RT: all 𝑝 < 0.008, Fisher-z; choice: all 𝑝 < 0.05, t-test). The three exceptions are trials with 6.4% coherence rightward motion, which do not correlate significantly with RT on leftward choice trials. Reviewer Figure 4 shows the results of the leverage and mediation analyses, using only the 0% coherence trials.

      𝑅∗2.6 (Figure 6) Additional analysis to strengthen the claim that Min represents the integrand and not the integral:

      a. Repeating the analysis in Figure 6d with the integral (cumulative sum) of the single-trial Min signals and instead observing a significant increase in leverage over time would be strong evidence for this interpretation. If you again see no increase, then it suggests that the activity of these units (while direction selective) may not be strongly yoked to behavior. This scenario (no increasing leverage of the integral of Min on behavior through time) also raises an intriguing alternative possibility: that the noise driving the ’diffusion’ of drift-diffusion here may originate in the integrating circuit, rather than just reflecting the complete integration of noise in the stream of evidence itself.

      b. Repeating the analysis in Figure 6d with the projection of the M subspace onto its own first PC (e.g. take the union of units {Mrightin, Mleftin} [our ], do PCA just on those units’ single-trial activities, identify the first PC, and project those activities on that dimension to obtain SPC1-M).

      c. Ameliorating the sample-size limitation by relaxing the criteria for inclusion in Min - performing the same analyses shown, but including all units with visual RFs overlapping the motion stimulus, irrespective of their direction selectivity.

      a. Reviewer Figure 2a provides support for leverage of the integral on choice, and this leverage, like , increases as a function of time. The effect is present in all seven sessions that have both and neurons (all 𝑝 < 1𝑒 − 10). However, as shown in panel b, the same integral fails to demonstrate more than a hint of leverage on RT (all correlations are negative) and the magnitude does not vary as a function of time. We suspect, but cannot prove, that this failure arises because of limited power and the expected weak effect. Recall that the mediation analysis of RT is restricted to longer trials and that the correlation between the Min difference and the signal is less than 0.1 over the heatmap in Fig. 6e, implying that the Min difference explains less than 1% of the variance of 𝑆Tin(𝑡). We considered including Reviewer Figure 2 in the paper, but we feel it would be disingenuous (cherry-picking) to report only the positive outcome of the leverage on choice. If the editors feel strongly about it, we would be open to including it, but leaving these analyses out of the revised manuscript seems more consistent with our effort to deëmphasize this finding. In the future, we plan to record simultaneously from populations of MT and LIP neurons (Min and Tin, of course) and optimize Min neuron yield by placing the RDM stimulus in the periphery. We also provide this information in response to Recommendation (6) above.

      b.  We tried the reviewer’s suggestion to apply PCA to the union of Min neurons, fully expecting PC1 to comprise weights of opposite sign for the right and left preferring neurons, but that is not what we observed. Instead, the direction selectivity is distributed over at least two PCs. We think this is a reflection of the prominence of other signals, such as the strong visual response and normalization signals (see Shushruth et al., 2018). In the spirit of the reviewer’s suggestion, we also established an “evidence coding direction” using a regression strategy similar to the Ramp CD applied to the union of Min neurons. The strategy produced a coding direction with opposite signed weights dominating the right and left subsets. The projection of the neural data on this evidence CD yields a signal similar to the difference variable used in Fig. 6e (i.e., signals that are approximately constant firing rates vs time and scale as a function of signed coherence). These unintegrated signals exhibit weak leverage on choice and RT, consistent with Figure 6d. However, the integrated signal has leverage on choice but not RT, similar to the integral of the difference signal in Reviewer Figure 2.

      c.   We do not understand the motivation for this analysis. We could apply PCA or dPCA (or the regression approach, described above) to the population of units with RFs that overlap the motion stimulus, but it is hard to see how this would test the hypothesis that direction-selective neurons similar to those in area MT supply the momentary evidence. As mentioned, we have very few Min neurons (as few as two in session 3). Future experiments that place the motion stimulus in the periphery would likely increase the yield of Min neurons and would be better suited to study this question. As such, we do not see the integrand-like responses of Min neurons as a major claim of the paper. Instead, we view it as an intriguing observation that deserves follow-up in future experiments, including simultaneous recordings from populations of MT and LIP neurons (Min and Tin, of course). We have softened the language considerably to make it clear that future work will be needed to make strong claims about the nature of Min neurons.

      𝑅∗2.7 Other questions: Figure 2c is described as showing the average firing rate of units in Tconin on single trials, but must also incorporate some baseline subtraction (as the shown traces dip into negative firing rates). What baseline is subtracted? Are these residual signals, as described for later figures, or is a different method used? (Presumably, a similar procedure is used also for Figure 2a/b, given that all single-trial traces begin at 0.) Is the baseline subtraction justified? If the dataset really does reflect the decision variable with single-trial resolution, eliminating the baseline subtraction when visualizing single-trial activity might actually help to make the point clearer: trials which (for any reason) begin with a higher projection on the particular direction that furnishes the DV would be predicted to reach the decision bound, at any fixed coherence, more quickly than trials with a smaller projection onto this direction.

      We thank the reviewer for this comment. For each trial, the mean activity between 175 ms and 225 ms after motion onset was subtracted when generating the single-trial traces. The baseline subtraction was only applied for visualization to better portray the diffusion component in the signal. Unless otherwise indicated, all analyses are computed on non-baseline corrected data. We now describe in the caption of Figure 2 that “For visualization, single-trial traces were baseline corrected by subtracting the activity in a 50 ms window around 200 ms.” Examples of the raw traces used for all follow-up analyses are displayed in Reviewer Figure 6.
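
      For concreteness, the visualization-only baseline correction can be sketched as follows. This is an illustration under stated assumptions: the trace is sampled at known times in ms relative to motion onset, the window endpoints are treated as inclusive, and the function and variable names are hypothetical.

```python
def baseline_correct(times_ms, trace, lo_ms=175, hi_ms=225):
    """Subtract the mean of `trace` within [lo_ms, hi_ms] from every sample."""
    window = [v for t, v in zip(times_ms, trace) if lo_ms <= t <= hi_ms]
    if not window:
        raise ValueError("no samples fall inside the baseline window")
    baseline = sum(window) / len(window)
    return [v - baseline for v in trace]

# Toy single-trial trace (arbitrary units), sampled every 25 ms.
times_ms = [150, 175, 200, 225, 250]
trace = [1.0, 2.0, 3.0, 4.0, 5.0]
corrected = baseline_correct(times_ms, trace)
```

      In this toy example the samples at 175, 200 and 225 ms define the baseline, so the corrected trace passes through zero at 200 ms.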

      Reviewer #3 (Recommendations For The Authors):

      I only have a few comments to make the paper more accessible:

      𝑅∗3.1 I struggle to understand how the linear fitting from -1 to 1 was done. More detail about how the single cell single-trial activity was generated to possibly go from -1 to 1 or do I completely misunderstand the approach? I assume the data standardization does that job?

      We have rephrased and added clarifying detail to the section describing the derivation of the ramp signal in the Methods (Ramp direction).

      We applied linear regression to generate a signal that best approximates a linear ramp, on each trial, 𝑖, that terminates with a saccade to the choice-target contralateral to the hemisphere of the LIP recordings. The ramps are defined in the epoch spanning the decision time: each ramp begins at 𝑓𝑖(𝑡0) = −1, where 𝑡0 = 0.2 s after motion onset, and ends at 𝑓𝑖(𝑡1) = 1, where 𝑡1 = 𝑡sac − 0.05 s (i.e., 50 ms before saccade initiation). The ramps are sampled every 25 ms and concatenated using all eligible trials to construct a long saw-tooth function (see Supplementary Figure S2). The regression solves for the weights assigned to each neuron such that the weighted sum of the activity of all neurons best approximates the saw-tooth. We constructed a time series of standardized neural activity, sampled identically to the saw-tooth. The spike times from each neuron are represented as delta functions (rasters) and convolved with a non-causal 25 ms boxcar filter. The mean and standard deviation of all sampled values of activity were used to standardize the activity for each neuron (i.e., Z-transform). The coefficients derived by the regression establish the vector of weights that define 𝑆ramp. The algorithm ensures that the population signal 𝑆ramp(𝑡), but not necessarily individual neurons, has amplitudes ranging from approximately −1 to 1.
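
      The construction of the regression target can be sketched in a few lines. The sketch covers only the saw-tooth (one ramp from −1 to 1 per trial, sampled every 25 ms and concatenated) and omits the regression itself; working in integer milliseconds and the function names are assumptions made for illustration.

```python
def ramp_samples(t_sac_ms, t0_ms=200, pre_sac_ms=50, step_ms=25):
    """Linear ramp from -1 at t0 to +1 at t1 = t_sac - pre_sac, sampled every step."""
    t1_ms = t_sac_ms - pre_sac_ms
    if t1_ms <= t0_ms:
        raise ValueError("trial too short to define a ramp")
    return [-1.0 + 2.0 * (t - t0_ms) / (t1_ms - t0_ms)
            for t in range(t0_ms, t1_ms + 1, step_ms)]

def sawtooth(saccade_times_ms):
    """Concatenate the per-trial ramps into one long regression target."""
    target = []
    for t_sac in saccade_times_ms:
        target.extend(ramp_samples(t_sac))
    return target

# Two hypothetical trials with saccades 500 ms and 400 ms after motion onset.
target = sawtooth([500, 400])
```

      The neural regressors would be sampled at the same time points, so that each row of the design matrix lines up with one sample of this target.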

      𝑅∗3.2 It is difficult to understand how the urgency signal is derived, to then generate fig S4.

      The urgency signal is estimated by averaging 𝑆𝑥(𝑡) at each time point relative to motion onset, using only the 0% coherence trials. We have clarified this in the caption of Supplementary Figure S4.
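
      A minimal sketch of this estimate follows. It assumes each trial is stored as a coherence value plus a list of signal samples aligned to motion onset, and that trials of different duration contribute only to the time bins they cover; both the data layout and that handling of unequal durations are assumptions, not details confirmed by the text.

```python
def urgency_signal(trials):
    """Average the signal across 0%-coherence trials at each time bin."""
    zero_coh = [signal for coherence, signal in trials if coherence == 0.0]
    if not zero_coh:
        raise ValueError("no 0% coherence trials available")
    n_bins = max(len(signal) for signal in zero_coh)
    urgency = []
    for t in range(n_bins):
        # Only trials that are still ongoing at bin t contribute.
        values = [signal[t] for signal in zero_coh if t < len(signal)]
        urgency.append(sum(values) / len(values))
    return urgency

# Toy data: (coherence, samples of S_x(t)).
trials = [(0.0, [1.0, 2.0, 3.0]),
          (0.0, [3.0, 4.0]),
          (0.128, [10.0, 10.0, 10.0])]
u = urgency_signal(trials)
```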

      Author response image 1.

      Shuffle control for Fig. 5. Breaking the within-trial correspondence between neural signal, 𝑆(𝑡), and choice suppresses leverage to near zero.

      Author response image 2.

      Leverage of the integrated difference signal on choice and RT. Traces are the average leverage across seven sessions. Same conventions as in Figure 5.

      Author response image 3.

      Trial-averaged 𝑆ramp activity during individual sessions. Same as Figure 2b for individual sessions for Monkey M (left) and Monkey J (right). The figure is intended to illustrate the consistency and heterogeneity of the averaged signals. For example, the saccade-aligned averages lose their association with motion strength before left (contra) choices in sessions 1, 2, 5, and 6 but retain the association in sessions 3, 4, 7, and 8.

      Author response image 4.

      Drift-diffusion signals have measurable leverage on choice and RT even when only 0%-coherence trials are included in the analysis.

      Author response image 5.

      Raw single-trial activity for three types of population averages. Representative single-trial activity during the first 300 ms of evidence accumulation using two motion strengths: 0% and 25.6% coherence toward the left (contralateral) choice target. Unlike in Figure 2 in the paper, single-trial traces are not baseline corrected by subtracting the activity in a 50 ms window around 200 ms. We highlight a number of trials with thick traces and these are the same trials in each of the rows.

    1. eLife Assessment

      This work explores the physical principles underlying fluid flow and luminal transport within the endoplasmic reticulum. Its important contribution is to highlight the strong physical constraints imposed by viscous dissipation in nanoscopic tubular networks. In particular, the work presents convincing evidence based on theoretical analysis that commonly discussed mechanisms such as tubular contraction are unlikely to be at the origin of the observed transport velocities. As such, it will be of relevance to cell biologists and physicists interested in organelle dynamics. As this study is solely theoretical and deals with order of magnitude estimates, its main conclusions await experimental validation.

    2. Reviewer #1 (Public review):

      Theoretical principles of viscous fluid mechanics are used here to assess likely mechanisms of transport in the ER. A set of candidate mechanisms are evaluated, making good use of imaging to represent ER network geometries. Evidence is provided that contraction of peripheral sheets provides a much more credible mechanism than contraction of individual tubules, junctions or perinuclear sheets.

      The work has been conducted carefully and comprehensively, making good use of underlying physical principles. There is good discussion of the role of slip; sensible approximations (low volume fraction, small particle size, slender geometries, pragmatic treatment of boundary conditions) allow tractable and transparent calculations; clear physical arguments, including an analysis of energy budgets, provide useful bounds; stochastic and deterministic features of the problem are well integrated.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Theoretical principles of viscous fluid mechanics are used here to assess likely mechanisms of transport in the ER. A set of candidate mechanisms is evaluated, making good use of imaging to represent ER network geometries. Evidence is provided that the contraction of peripheral sheets provides a much more credible mechanism than the contraction of individual tubules, junctions, or perinuclear sheets.

      The work has been conducted carefully and comprehensively, making good use of underlying physical principles. There is a good discussion of the role of slip; sensible approximations (low volume fraction, small particle size, slender geometries, pragmatic treatment of boundary conditions) allow tractable and transparent calculations; clear physical arguments provide useful bounds; stochastic and deterministic features of the problem are well integrated.

      We thank the reviewer for their positive assessment of our work.

      There are just a couple of areas where more discussion might be warranted, in my view.

      (1) The energetic cost of tubule contraction is estimated, but I did not see an equivalent estimate for the contraction of peripheral sheets. It might be helpful to estimate the energetic cost of viscous dissipation in generated flows at higher frequencies.

      This is a good point. We have now included an energetic cost estimate for the contractions of peripheral sheets in the revised manuscript.

      The mechanism of peripheral sheet contraction is unclear: do ATP-driven mechanisms somehow interact with thermal fluctuations of membranes?

      The new energetic estimates in the revision might help constrain possible hypotheses for the mechanism(s) driving peripheral sheet contraction, and suggest if a dedicated ATP-driven mechanism is required.

      (2) Mutations are mentioned in the abstract but not (as far as I could see) later in the manuscript. It would be helpful if any consequences for pathologies could be developed in the text.

      We are grateful for this suggestion. The need to rationalise pathology associated with the subtle effects of mutations of ER-morphogens is indeed pointed out as one factor motivating the study of the interplay between ER structure and performance. In the revised manuscript, we have included a brief discussion potentially linking the malfunction of ER morphogens to luminal transport, referencing freshly published findings.

      Reviewer #2 (Public Review):

      Summary:

      This study explores theoretically the consequences of structural fluctuations of the endoplasmic reticulum (ER) morphology called contractions on molecular transport. Most of the manuscript consists of the construction of an interesting theoretical flow field (physical model) under various hypothetical assumptions. The computational modeling is followed by some simulations.

      Strengths:

      The authors are focusing their attention on testing the hypothesis that a local flow in the tubule could be driven by tubular pinching. We recall that trafficking in the ER is considered to be mostly driven by diffusion at least at a spatial scale that is large enough to account for averaging of any random flow occurring from multiple directions [note that this is not the case for plants].

      We thank the reviewer. Indeed, the trafficking in the ER was historically presumed to be driven by passive diffusion but this has been challenged by recent findings suggesting that the transport may also involve an active super-diffusional component (the short-lasting flows). These findings include: the dependence of ER luminal transport on ATP-derived energy observed in the historical and recent publications cited here; fast and directional single-particle motion; and a linear scaling of photoactivated signal arrival times with distance. On a larger scale, indeed, the motion can be seen as a faster effective diffusion, as there is no persistent circulatory directionality of the currents.

      Weaknesses:

      The manuscript extensively details the construction of the theoretical model, occupying a significant portion of the manuscript. While this section contains interesting computations, its relevance and utility could be better emphasized, perhaps warranting a reorganization of the manuscript to foreground this critical aspect.

      Overall, the manuscript appears highly technical with limited conclusive insights, particularly lacking predictions confirmed by experimental validation. There is an absence of substantial conclusions regarding molecular trafficking within the ER.

      We sought to balance the theoretical/computational details of our model with the biophysical conclusions drawn from its predictions. Given the model's complexity and novelty, it was essential to elucidate the theoretical underpinnings comprehensively, in order to allow others to implement it in the future with additional, or different, parameters. To maintain clarity and focus in the main text, we have judiciously relegated extensive technical details to the methods section or supplementary materials, and divided the text into stand-alone section headings allowing the reader to skip through to conclusions.

      The primary focus of our manuscript is to introduce and explore, via our theoretical model, the interplay between ER structure dynamics and molecular transport. Our approach, while in silico, generates concrete predictions about the physical processes underpinning luminal motion within the ER. For instance, our findings challenge the previously postulated role of small tubular contractions in driving luminal flow, instead highlighting the potential significance of local flat ER areas—empirically documented entities—for facilitating such motion.

      Furthermore, by deducing what type of transport may or may not occur within the range of possible ER structural fluctuations, our model offers detailed predictions designed to bridge the gap between theoretical insight and experimental verification. These predictions detail the spatial and temporal parameters essential for effective transport, delineating plausible values for these parameters. We hope that the model’s predictions will invite experimentalists to devise innovative methodologies to test them. We have introduced text edits to the revised version to clarify the reviewer’s point as per the detailed comments below.

      Recommendations for the authors:

      Editor comments (Recommendations For The Authors):

      The two reviewers have different opinions about the strengths and weaknesses of this work. The editors do believe that this work is a valuable contribution to the field of ER dynamics and transport, and could stimulate experiments.

      We thank both reviewers and the editors for the time and care they have invested in reviewing our manuscript.

      Nevertheless, discussing further the role of diffusion vs. advection in ER luminal transport, including conflicting values of measured diffusion coefficients, would be valuable. For instance, it is possible that the active contraction-driven mechanism results in an effective diffusion over a long time, which could be quantified and compared to experiments.

      In our study we focus on tubule-scale transport because the statistics of transport at this scale have been measured and the origin of the observed transport is an outstanding problem. We already know from Holcman et al. (2018) that transport at the tubule scale involves an active, possibly advective, component beyond passive molecular diffusion. Although we do touch briefly on a network-scale phenomenon in our section on mixing/content homogenisation, our main focus is on trying to understand tubule-scale transport. Although we agree that a substantial exploration of effective diffusion over a network scale would be of value and would increase the breadth of our paper, we feel that this is beyond the scope of the current paper. We believe the “conflicting” diffusion coefficients, in fact, characterise motion at different time and length scales: the global diffusion coefficient pointed out to us in the reviews may pertain to network-scale effective diffusion over long time scales, but this is different to the Brownian motion on the scale of tubules/tubular junctions relevant to our in silico model.

      Reviewer #1 (Recommendations For The Authors):

      I congratulate the authors on their work and do not have any substantial further recommendations, beyond two minor points.

      (1) Before (13), say "Using the expression (7) for Q_2, ..."

      (2) Typo on p.25: "principal" rather than "principle" (two instances)

      We thank the reviewer for spotting these and have addressed both points.

      Reviewer #2 (Recommendations For The Authors):

      Here are some specific comments:

      (1) Insufficient Influence of ER Tubule Contraction:

      The conclusion regarding weak fluid flows generated by ER tubule contractions may seem obvious. It would be more intriguing if the authors explored conditions necessary to achieve faster flows, such as those around 20 µm/s, within tubules.

      We agree these are important conditions to explore and it is extensively covered in Fig. 4e-f, which show that tubule contraction sites the length of entire tubules and occurring at 5 and 10 times the experimentally measured rates produce mean average edge traversal speeds of ~30 and 60 µm/s respectively. These pinch parameters seemed unlikely and motivated exploring other conceivable scenarios.

      (2) Limited Impact of ER Network Geometry:

      The comparison across different ER network structures seems insufficiently documented. A comparison between distal and proximal ER from the nucleus could provide deeper insights.

      We have added text in the new paragraph 4 of the introduction to better articulate the core principles of the ER’s structural elements. As established by historical EM and light microscopy, the ER is universally composed of tubules, with 3-way junctions, and small (peripheral) or large perinuclear sheets. We establish that the specific shaping of these elements influences the nanofluidics we investigate here. While the proportion of these elements may vary across different cell types and cellular regions, the fundamental structure, and therefore the impact on local mobility remains consistent. Our categorisation of the ER into its elements reflects these ubiquitous components, allowing us to analyse the impact of shaping at the relevant scale, covering the perinuclear and peripheral ER.

      (3) Ineffectiveness of Tubule Junction Contraction:

      The study's negative result on ER tubule junction contraction's impact on molecular exchange may not capture broad interest without experimental validation. Conducting experiments to test this hypothesis could strengthen the study.

      We agree that experimental testing of this prediction in the future, when appropriate tools become available to correlate molecular motion speed and fast contractions of nanoscopic tubular junctions, will be needed for its validation.

      (4) Potential Role of Peripheral Sheets:

      While the speculation on the contraction of peripheral ER sheets is intriguing, further experimental investigation is warranted to validate this hypothesis, especially considering the observed slow diffusion in ER sheets.

      We agree with the reviewer that our study is theoretical in nature and on the necessity of further experimental investigation before we are able to make a definitive conclusion on peripheral sheets.

      In summary, while the study underscores the complexity of ER morphology dynamics and its implications for molecular transport, its novelty and broad implications seem limited. Given its reliance on computational simulations and dense theoretical language, submission to a computational journal could be more appropriate. In addition, given there is an absence of substantial conclusions regarding molecular trafficking within the ER, publication in a specialized journal of fluid mechanics or physics may be appropriate.

      Comments:

      - The manuscript is hard to read. There is no smooth transition from Figure 1 to Figure 2.

      To smoothen the transition, we edited the text at the beginning of results and added a reference there to the introductory Fig. 1.

      - Figure 8 serves no purpose. To make the text easier, C0, C1, C2... should be presented in Figure 2 and merged with Figure 10 with a table summarizing the information of these networks. It is not clear why 5 networks are needed. They look similar. Could you add the number of nodes per network?

      We have now merged Fig 8 and Fig 10 from the previous version into one figure (which is now Fig 9). We have also added information about the number of nodes and added a sentence in the manuscript to clarify that it showcases the source data used to model/reconstruct realistic ER structures.

      - Figure 13: seems out of context. What is the message? The ER does not show any large flow--from early FRAP and recent photoactivation - the material seems to diffuse at long distances made by few tubules.

      Fig 13 (now Fig 12 in the revised version) does not illustrate any flow. Its purpose is to illustrate the computational methodology used to simulate flows and transport due to contraction of perinuclear sheets. (Note that we have spotted and fixed a small but important typo in the caption: “peripheral” → “perinuclear”.) It is worth noting that FRAP provides a relative estimate of mobility but contains no information as to the mode of motion. Whether the motion is diffusive or otherwise must be presumed in FRAP analyses and this presumption then can be used to extract metrics such as the diffusion coefficient. Photoactivation analyses suffer from the same limitation but analysis of how photoactivated signal arrival times scale with distance was recently suggested as a workaround. These measurements suggest a superdiffusive ER transport (https://doi.org/10.1016/j.celrep.2024.114357). Although a different approach used in a recent preprint to photoactivation signal analysis suggests that at long-distances transport can be approximated as diffusion (https://doi.org/10.1101/2023.04.23.537908), improved measurement in the future would be needed to address the seeming discrepancies.

      - Figure 1: what is the difference between a and b? How do you do your cross-section? This probability needs a drawing at least to understand how you define it.

      We expanded the explanation in the third last paragraph of Section I.

      - Figure 2: this manuscript is not a review. It is not clear why part of a figure is copied and pasted from another manuscript. It should be removed. Are the authors using the quantification [peaks of different color]? Where? The title should be given to explain each panel.

      We have chosen to keep the inset, which was not in the main text of the cited paper but its supplementary information, and provides a direct benchmark for our work.

      Why is the mean flow in a stochastic? With large excursion for large values? Could you plot the Fourier or spectrogram so we can understand the frequencies? Are there regular patterns of bursts?

      The mean (i.e. cross-sectionally averaged) flow is stochastic because the pinching events are random (more precisely, they follow a Poisson distribution, as explained in the paper). Large excursions are rare and caused by interactions between pinches. We have prescribed the distributions of pinch durations and frequencies as per experimentally measured distributions and we do not expect to recover from a Fourier analysis more information than we have prescribed.
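
      To illustrate why such a flow is stochastic, random pinch onset times can be sketched as a homogeneous Poisson process with exponentially distributed waiting times. This is a toy illustration under that assumption, not the simulation used in the paper; the rate and duration below are hypothetical placeholders rather than the measured values.

```python
import random

def pinch_times(rate_hz, duration_s, rng):
    """Onset times of a homogeneous Poisson process on [0, duration_s)."""
    times = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_hz)   # exponential waiting time between pinches
        if t >= duration_s:
            break
        times.append(t)
    return times

rng = random.Random(0)
onsets = pinch_times(rate_hz=2.0, duration_s=10.0, rng=rng)  # hypothetical values
```

      Because each realisation yields a different set of onsets, any flow driven by these events fluctuates from run to run, and occasional closely spaced onsets are where interactions between pinches would arise.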

      What do we learn from the fit of Fig 2b-c? Is it a constant flow?

      The conclusion of Fig 2b-c is that the in silico simulation model based on the pinching tubule hypothesis produces solute transport, as quantified by the instantaneous particle speeds (Fig 2b) and the average edge traversal speeds (Fig 2c), that is much weaker than experimentally measured. This is one of the main results of our paper and explained in Section IIA, paragraph 3. Fig 2a tells us that the flow is not constant (flows in this system can only be generated transiently, with directionality persistence considered unlikely).

      Figure 14: Estimation of the area is unclear. The legend is largely insufficient. Why did the authors report only nine regions of contractions? Is it so rare? How many samples have they used? Nine among how many?

      Thank you; the details of area estimation are included in the main text, in Section I.4. The nine regions are an arbitrary selection of a sample we deemed representative of this phenomenon.

      - Abstract: this is misleading, it should start by explaining that diffusion is the consensus of trafficking in the ER.

      - "the content motion in actively contracting nanoscopic tubular networks" is misleading. We should recall that this is an assumption that has not been proven.

      The current abstract is a succinct summary of the question in scope and results. The sentence highlighted by the referee specifically refers to the model we study in the paper; we modified it in order to remove any ambiguity and to make clear that we are testing a proposed mechanism. We also point out that although the biological origins of the tubule contractions or their effects on solute transport have not been established, these contractions have been documented.

      Minor comments:

      Introduction: "Thus past measurements indicate that the transport of proteins in ER is not consistent with Brownian motion" is misleading. You should explain that this depends on the time scale. At large timescale, diffusion is a coarse-grained description and is actually accurate from FRAP and photoactivation data [see J. Lippincott-Schwartz publications over the past 20 years]. The super-diffusion [9] "A photoactivation chase technique also measured a superdiffusive behaviour of luminal material spread through the ER network [9]." This is not clear and is probably due to an artifact of measurements or interpretation.

      We thank the reviewer for this comment. We expanded in paragraph 2 of the introduction to better reflect the state of knowledge around this point.

      Page 2 "Strocytes" does not exist you may be meant "Astrocytes".

      Thank you; typo fixed.

      Page 5: The value of the flow seems incompatible with previous literature ~ 20 mu m/s. Again where 0.6 is found? Where in [7]: if there is no diffusion in the tubule, why compare with 0.6 mu m^2/s? The global diffusion coefficient is much higher ~ 5 mu m^2/s.

      The value of 0.6 µm^2/s is the intranodal diffusion coefficient reported empirically in Supplementary Figure 3b of Holcman et al., Nat Cell Biol, 2018 (Ref. [7] in the unrevised version of our article), for ER in COS-7 cells. Motion inside the tubule would in general consist of a combination of advection and diffusion; since the same fluid occupies the tubules and the junctions, and the junction sizes are similar to the tubule diameters, it is reasonable to take 0.6 µm^2/s as the diffusion coefficient in tubules as well. Holcman et al. (2018) does not mention diffusion inside tubules because (i) the study reports a dominantly advective (or at least active) transport across tubules (the driving mechanism of which remains unknown) but this does not mean diffusion is not there as well; and (ii) the time resolution in these experiments is too low to capture the fine details of solute motion inside tubules, and the transport is captured only as “jumps” between junctions. We point out also that the higher global diffusion coefficient may pertain to network-scale effective diffusion over long time scales, which is different to the Brownian motion at the scale of tubules/tubular junctions relevant to our in silico model.

      Page 5: "The distributions of the average edge traversal speeds appeared insensitive to ER structure variations for both pinching-induced and exclusively diffusion transport." is rather trivial. Similar to "the presumed pinching parameters would be inadequate to facilitate ER luminal material exchange."

      These sentences, and the surrounding text, report the observed outcome of our numerical simulations: pinch-induced transport statistics show little variation across different ER geometries, and pinching does not facilitate luminal content mixing. This conclusion was not clear to us without running the simulation, and hence we deemed it nontrivial and relevant to comment on.

      Page 7: The authors mention that they could measure "typical edge traversal speed of 45 µm/s".

      I am not aware of such a measurement. Could they explain where this number comes from?

      These measurements were reported in Supplementary Figure 3b (bottom right) of Holcman et al. (2018) and are for the ER in a COS-7 cell. The main figure Fig. 2g reports analogous measurements (mean speed 20 µm/s) for a HEK-293 cell. We have worked with the speed measurements for COS-7 cells because the tubule contraction data reported in this work pertain to COS-7 cells.

      A contraction that leads to 3.9 mu m/s over a distance of a few microns would be interesting. Is this a prediction of the present model?

      Yes, as stated in Section II, C paragraph 2. The present model with the experimentally measured averages for the tubule contraction parameters does indeed predict that a particle, in the absence of diffusion, is transported by a single tubule contraction at a maximum speed of 3.9 µm/s over 0.19 µm.
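
      As a rough illustration of the scales discussed in this response, the quoted numbers can be compared directly. The sketch below is not part of the authors' model: it assumes simple one-dimensional scaling, uses only the values quoted above (0.6 µm^2/s, 3.9 µm/s, 0.19 µm), and the function names are hypothetical.

```python
# Back-of-the-envelope comparison of advection and diffusion over one
# contraction-induced displacement. All three numbers are quoted in the
# response above; the 1D scaling is an assumption made for illustration.

D_UM2_PER_S = 0.6     # intranodal diffusion coefficient, µm^2/s
V_MAX_UM_PER_S = 3.9  # maximum contraction-induced speed, µm/s
LENGTH_UM = 0.19      # distance covered by a single contraction, µm


def advection_time(length_um, speed_um_s):
    """Time to cover length_um at constant speed speed_um_s (seconds)."""
    return length_um / speed_um_s


def diffusion_time_1d(length_um, diff_um2_s):
    """Time for 1D Brownian motion to reach an RMS displacement of
    length_um, from <x^2> = 2 D t (seconds)."""
    return length_um ** 2 / (2.0 * diff_um2_s)


def peclet_number(speed_um_s, length_um, diff_um2_s):
    """Dimensionless ratio of advective to diffusive transport, v L / D."""
    return speed_um_s * length_um / diff_um2_s


if __name__ == "__main__":
    # The advection time is a lower bound because 3.9 µm/s is a maximum speed.
    print("advection time (s):", advection_time(LENGTH_UM, V_MAX_UM_PER_S))
    print("diffusion time (s):", diffusion_time_1d(LENGTH_UM, D_UM2_PER_S))
    print("Peclet number:", peclet_number(V_MAX_UM_PER_S, LENGTH_UM, D_UM2_PER_S))
```

      Under these assumptions the two time scales are of the same order, which is consistent with the response's point that diffusion cannot be neglected at the scale of a single tubule.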

    1. eLife Assessment

      The useful studies described here are broadly applicable to all antibody discovery subfields, even though they are not a significant improvement over published methods. The findings are incomplete with respect to the methodology, since details that are crucial in order to repeat the experiments are lacking (such as a timestamp). They also do not take into account multiple recent papers that have tested similar strategies. These studies will be of interest to a specialized audience working on generating antibodies to infectious agents.

    2. Reviewer #1 (Public review):

      Summary:

      This paper by Watanabe et al described an expression system that can express the paired heavy and light chains of IgG antibodies from single cell B cells. In addition, they used FACS sorting for specific antigen to screen/select the specific populations for more targeted cloning of mAb genes. By staining with multiple antigens, they were able to zoom in to cross-reactive antibodies.

      Strengths:

      A highly efficient process which combines selection/screening with dual expression of both antibody chains. It is particularly suitable for isolation of cross-reactive antibodies against conserved epitopes of different antigens, such as surface proteins of related viruses.

      Weaknesses:

      (1) The overall writing is very difficult to follow and the authors need to work on significant re-writing.

      (2) The paper in its current form really lacks detail and it is not possible for readers to repeat or follow their methods. For example: a) It is not clear whether the authors checked the serum to see if the mice were producing antibodies before they sacrificed them to harvest spleen/blood i.e. using ELISA? b) How long after administration of the second dose were the mice sacrificed? c) What cell types are taken for single B cell sorting? Splenocytes or PBMC? These are just some of the questions which need to be addressed.

      (3) According to the authors, 77 clones were sorted from the PR8+ and H2+ double positive quadrant. It is surprising that after transfection and re-analysing of bulk antibody presenting EXPI cells on FACS, only 13 clones (or 8 clones? - unclear) seemed to be truly cross reactive. If that is the case, the approach is not as efficient as the authors claimed.

      The authors have adequately addressed the issues raised.

    3. Reviewer #2 (Public review):

      Summary:

      Watanabe, Takashi et al. investigated the use of the Golden Gate dual-expression vector system to enhance the modern standard for rapid screening of recombinant monoclonal antibodies. The presented data builds upon modern techniques that currently use multiple expression vectors to express heavy and light chain pairs. In a single vector, they express the linked heavy and light chain variable genes with a membrane-bound Ig which allows for rapid and more affordable cell-based screening. The final validation of H1 and H2 strain influenza screening resulted in 81 "H1+", 48 "H2+", and 9 "cross" reactive clones. The kinetics of some of the soluble antibodies were tested via SPR and validated with a competitive inhibition with classical well-characterized neutralizing clones.

      Strengths:

      In this study, Watanabe, Takashi et al. further develop and refine the methodologies for the discovery of monoclonal antibodies. They elegantly merge newer technologies to speed up turnaround time and reduce the cost of antibody discovery. Their data supports the feasibility of their technique.

      This study will have an impact on pandemic preparedness and antibody-based therapies.

      Weaknesses:

      Limitations of this new technique are as follows: there is a significant loss of cells during FACs, transfection and cloning efficiency are critical to success, and well-based systems limit the number of possible clones (as the author discussed in the conclusions).

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) The overall writing is very difficult to follow and the authors need to work on significant re-writing. 

      Thank you for your comment. We have rewritten the text and asked an immunology expert, who is also a native English speaking editor, to review it.

      (2) The paper in its current form really lacks detail and it is NOT possible for readers to repeat or follow their methods. For example: a) It is not clear whether the authors checked the serum to see if the mice were producing antibodies before they sacrificed them to harvest spleen/blood i.e. using ELISA? b) How long after administration of the second dose were the mice sacrificed? c) What cell types are taken for single B cell sorting? Splenocytes or PBMC?

      Thank you for your comment. We have revised the methodology section thoroughly to ensure that the readers can follow and replicate the method. Our responses to the specific examples raised are as follows:

      a) We did not examine the serum titer after immunization. An increased serum titer, as determined by ELISA, does not always reflect the number of cross-reactive B cells because we expected the serum titer to consist of polyclonal antibodies, which are a mixture of PR8-reactive, H2-reactive, and cross-reactive clones. We thus anticipated that we would not obtain enough cross-reactive B cells after a series of immunizations. After comparing various immunization methods, including different adjuvants and immunization sites, using the readout of the number of cross-reactive B cells, we decided to adopt the immunization protocol presented in this paper.

      b) We sacrificed the mice two weeks after the second immunization (see Supplementary Figure 5).

      c) For this experiment, we used CD43 MACS B cells from the spleen purified with negatively charged beads (see Supplementary Figure 6).

      (3) According to the authors, 77 clones were sorted from the PR8+ and H2+ double positive quadrant. It is surprising that after transfection and re-analyzing of bulk antibody presenting EXPI cells on FACS, only 13 clones (or 8 clones? - unclear) seemed to be truly cross-reactive. If that is the case, the approach is not as efficient as the authors claimed.

      Thank you for your comment. To isolate high affinity antibodies, we gated the high fluorescent intensity population of cross-reactive B cells during Ig-expressing 293 cell sorting, as shown in Fig 2B, while we collected a wide intensity population of cross-reactive cells during splenocyte sorting. The narrow gating reduced the number of clones. Although we cannot quantify how many clones we lost in the process, we achieved a cloning efficiency exceeding 75%. To avoid any confusion, we have clarified this point by attaching additional supplementary figures (Supplementary Figures 5 and 6).

      Reviewer #2 (Public Review):

      (4) A His tagged antigen was used for immunization and H1-his was used in all assays. Either the removal of His specific clones needs to be done before selection, or a different tag needs to be used in the subsequent assays.

      Thank you for your comment. As pointed out, the possibility of antibody generation in regions other than HA cannot be ruled out since the immunized antigen and the detection antigen were the same. However, as shown in Table 1, the cross-reactive antibodies obtained in this study exhibited characteristic binding abilities to each of the six types of HA. If these were antibodies recognizing His, they would bind to all six types of HA. This indicates that these cross-reactive antibodies were not His-specific clones.

      We have incorporated information on this potential caveat into the discussion (page 12, lines 4-9).

      (5) This assay doesn't directly test the neutralization of influenza but rather equates viral clearance to competitive inhibition. The results would be strengthened with the demonstration of a functional antibody in vivo with viral clearance.

      Thank you for your constructive comment. While we agree that demonstration of a functional antibody in vivo with viral clearance would strengthen our results, this is clearly out of the scope of our current study and will be subject of future research.

      (6) Limitations of this new technique are as follows: there is a significant loss of cells during FACs, transfection and cloning efficiency are critical to success, and well-based systems limit the number of possible clones (as the author discussed in the conclusions). Early enrichment of the B cells could improve efficiency, such as selection for memory B cells.

      Thank you for your comment. Our cloning efficiency for sorted B cells exceeded 75%. However, we selected high binders of cross-reactive B cells during Ig-expressing 293 functional screening on purpose, as shown in Figure 2B, while we collected all cross-reactive B cells during B cell sorting (see attached Supplementary Figure 5). This functional selection step reduced the number of clones. We clarified this point by attaching additional supplementary figures (Supplementary Figures 5 and 6).

      Our sorted cross-reactive B cells are most likely CD38+ memory B cells, as shown in Supplementary Figure 6.

      Reviewer #1 (Recommendations For The Authors):

      a) It is advised for the authors to provide a flow chart with time stamps to prove the many statements made in the paper. For example, it is stated that "we demonstrated efficient isolation of influenza cross-reactive antibodies with high affinity from mouse germinal B cells over 4 days". It is not clear how this was calculated.

      Thank you for your comment. We have prepared a time-stamped flow chart (Supplementary Figure 5).

      b) The papers cited by the authors are relatively old if not outdated. There are many papers published focusing on efficient isolation of mAbs for SARS-CoV-2 research. For example, the paper by Lima et al (Nat Comm 2022, 13:7733) used a very similar strategy for rapid isolation of cross-reactive mAbs by FACS sorting followed by cloning of paired heavy and light chains from single B cells. The authors need to incorporate citations from the latest publications in this field.

      Thank you for your comment. The paper by Lima et al. (Nat Comm 2022, 13:7733) has been cited in the Discussion as ref 28.

      c) Figure 2 needs much more detail for readers to follow.

      Thank you for your comment. We have revised the legend of Figure 2 accordingly and added additional supplementary figures (Supplementary Figures 5 and 6) to increase clarity.

    1. eLife Assessment

      This important study shows the effect of gut dysbiosis on the colonization of mycobacteria in the lung. The data with comprehensive analysis of gene expression profiles in the lung with dysbiotic mice is compelling and goes beyond the current state of the art. However, the mechanistic insight, where the lung epithelial cell line was used, and the experiments with Mtb infection are currently incomplete.

    2. Reviewer #1 (Public review):

      Summary:

      This work sought to demonstrate that gut microbiota dysbiosis may promote the colonization of mycobacteria, and they tried to prove that Nos2 down-regulation was a key mediator of such gut-lung pathogenesis transition.

      Strengths:

      They did large-scale analysis of RNAs in lungs to analyze the gene expression of mice upon gut dysbiosis in MS-infected mice. This might help provide overview of gene pathways and critical genes for lung pathology in gut dysbiosis. This data is somewhat useful and important for the TB field.

      Weaknesses:

      (1) They did not use wild-type Mtb strain (e.g. H37Rv) to develop mouse TB infection models, and this may lead to the failure for establishment of TB granuloma and other TB pathology icons.

      (2) The usage of in vitro assays based on A549 to examine the regulation function of Nos2 expression on NO and ROS may not be enough. A549 is not the primary Mtb infection target in the lungs.

      (3) They did not examine the lung pathology upon gut dysbiosis to examine the true significance of increased colonization of Mtb.

      (4) Most of the studies are based on MS-infected mouse models with lack of clinical significance.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This work sought to demonstrate that gut microbiota dysbiosis may promote the colonization of mycobacteria, and they tried to prove that Nos2 down-regulation was a key mediator of such gut-lung pathogenesis transition.

      Strengths:

      They did large-scale analysis of RNAs in lungs to analyze the gene expression of mice upon gut dysbiosis in MS-infected mice. This might help provide an overview of gene pathways and critical genes for lung pathology in gut dysbiosis. This data is somewhat useful and important for the TB field.

      Weaknesses:

      (1) They did not use wild-type Mtb strain (e.g. H37Rv) to develop mouse TB infection models, and this may lead to the failure of the establishment of TB granuloma and other TB pathology icons.

      The colonization of M.tb in the lungs and the amount of colonization are the first and primary conditions for the occurrence of TB. Our aim in this study is to explore the impact of gut microbiota dysbiosis on the colonization of M.tb in the lungs. However, because our laboratory lacks the necessary biosafety conditions, some highly infectious bacteria (such as M.tb) are not allowed to be cultured, and establishing an M.tb infection animal model in our laboratory would not meet biosafety requirements. Hence, we used the model strain of M.tb, M.smegmatis (MS), and established an animal infection model to explore the effect of gut microbiota dysbiosis on MS colonization in mice lungs. However, the MS infection model may not necessarily produce typical TB granulomas and other TB pathology signs. We have discussed the limitations of the current study in the discussion part of the manuscript. The suggested revisions are shown in lines 21-39 of page 15. In future studies, we plan to adopt the reviewer's suggestion and will use a wild-type M.tb strain to establish the TB infection model in a laboratory that meets biosafety standards to further verify the results of the current study.

      (2) The usage of in vitro assays based on A549 to examine the regulation function of Nos2 expression on NO and ROS may not be enough. A549 is not the primary Mtb infection target in the lungs.

      Thanks for the reviewer’s comments. Although alveolar epithelial cells (AECs) are not the main target cells of Mtb infection, they are among the cells that are contacted early in M.tb infection. Early M.tb invasion of AECs is essential for the establishment of infection (PMID 11479618). AECs are usually the initial site of the lung’s response against M.tb. Available literature suggests that freshly isolated AECs are more permissive to M.tb growth than macrophages (PMID 33228849). As a cellular reservoir for M.tb, AECs are capable of facilitating rapid bacterial growth while potentially escaping recognition by phagocytes in the alveolus. Immune cells such as macrophages are the primary targets of M.tb infection, where M.tb survives and proliferates, leading to the formation and maintenance of granulomas. However, AECs are subjected to the same density of infection, and the bacteria invade and replicate in these cells and induce cell apoptosis and necrosis, which is considered a major mechanism implicated in extra-pulmonary dissemination (PMID 12925134, PMID 32849525). Besides their direct barrier role, AECs also directly respond to M.tb infection by producing mediators such as cytokines, chemokines, and antimicrobial agents (PMID 35017314). Therefore, it is reasonable to use the alveolar epithelial cell line A549 to explore in vitro how the intestinal microbiota affects M.tb colonization.

      (3) They did not examine the lung pathology upon gut dysbiosis to examine the true significance of increased colonization of Mtb.

      We have added the results of the lung pathological section in the revised manuscript. The results of lung pathological sections are shown in lines 11-13 of page 4, and Figure S2 of supplement information.

      (4) Most of the studies are based on MS-infected mouse models with a lack of clinical significance.

      The first and primary condition of any pathogen infection is that the bacteria must invade the host through colonization and multiply in the target organ. This study aimed to investigate the effect of intestinal microbial dysbiosis on the colonization of mycobacterium in mouse lungs. Our laboratory does not meet the biosafety standard for culturing highly infectious bacteria such as Mycobacterium tuberculosis. So, we used Mycobacterium smegmatis as a model strain for M.tb to establish the infected mice model in the current research. Although M. smegmatis is generally considered nonpathogenic, it is closely related to M.tb in biochemical characteristics, genetic information, cell structure, and metabolism (PMID 32674978). M.smegmatis is regarded as a valuable model organism in the study of M.tb and has been widely used to explore the biological characteristics of M.tb such as physiological state, stress response, non-culture state reactivation, antimicrobial activity, and biochemical protection (PMID 32674978). It has also been reported that M.smegmatis could be used as a model strain to study the molecular mechanism of interaction between M.tb and its host (PMID 30546046, PMID 25970481, PMID 29568875). In this preclinical study, we used M. smegmatis as the object of study. Instead of focusing on the pathological changes caused by M.smegmatis in the host lungs, we mainly focused on the influence of intestinal microbiota on the colonization of mycobacterium in the lungs and its possible mechanism, which provides a reliable model to study the prevention of early infection and spread of M.tb through regulating the intestinal microbiota. It has important clinical significance for the further development of new measures for the prevention and control of tuberculosis.

      If experimental conditions permit, an infection model with wild-type M.tb can be established to verify the findings of the present study, which may provide important clinical guidelines.

      Reviewer #2 (Public Review):

      The manuscript entitled "Intestinal microbiome dysbiosis increases Mycobacteria pulmonary colonization in mice by regulating the Nos2-associated pathways" by Han et al reported that using clindamycin, an antibiotic to selectively disorder anaerobic Bacteroidetes, intestinal microbiome dysbiosis resulted in Mycobacterium smegmatis (MS) colonization in the mice lungs. The authors found that clindamycin induced damage of the enterocytes and gut permeability and also enhanced the fermentation of cecum contents, which finally increased MS colonization in the mice's lungs. The study showed that gut microbiota dysbiosis up-regulated the Nos2 gene-associated pathways, leading to increased nitric oxide (NO) levels and decreased reactive oxygen species (ROS) and β-defensin 1 (Defb1) levels. These changes in the host's immune response created an antimicrobial and anti-inflammatory environment that favored MS colonization in the lungs. The findings suggest that gut microbiota dysbiosis can modulate the host's immune response and increase susceptibility to pulmonary infections by altering the expression of key genes and pathways involved in innate immunity. The authors reasonably provided experimental data and subsequent gene profiles to support their conclusion. Although the overall outcomes are convincing, there are several issues that need to be addressed:

      (1) In Figure S1, the reviewer suggests checking the image sizes of the pathological sections of intestinal tissue from the control group and the CL-treatment group. When compared to the same intestinal tissue images in Figure S4, they do not appear to be consistently magnified at 40x. The numerical scale bars should be presented instead of just magnification such as "40x".

      Thanks for the precise comments. We have carefully checked the pathological section in Figure S1 and Figure S5 and added the numerical scale bars to the figure. The revised sections are added in the supplementary materials.

      (2) In Figure 4d, the ratio of Firmicutes in the CL-FMT group decreased compared to the CON-FMT group, whereas the CL-treatment group showed an increase in Firmicutes compared to the Control group in Figure 3b. The author should explain this discrepancy and discuss its potential implications on the study's findings.

      The success of fecal microbial transfer (FMT) is influenced by many factors, such as host intestinal microbiota, immunity, and genetic factors (PMID 37167953). During the FMT procedure, not all microbiota in the donor feces have the same colonization ability in the recipients. Some research has revealed that the colonization success rate of Bacteroidetes is higher than that of Firmicutes (PMID 24637796). In this study, the difference between Figure 4d and Figure 3b arose because, during FMT, the colonization of Firmicutes decreased in the CL-FMT recipients after transplantation, while the colonization of Bacteroides increased, resulting in a decrease in the Firmicutes/Bacteroides ratio in the CL-FMT group. However, we considered the gut microbiota as a whole in the present study. After FMT, we found that 85.11% of bacterial genera and 52.38% of fungal genera present in the CL inocula were successfully transferred to the CL-recipient mice, and 91.45% of bacterial genera and 56.36% of fungal genera in the CON inocula were successfully transferred to the CON-recipient mice (Figure 4g). The trans-kingdom network analyses between bacteria and fungi showed that the trends of the gut microbiome in recipient mice were consistent with those in the donor mice. Therefore, we consider the FMT model established in this study to be successful. To clarify this point, we have added explanations in the discussion part of the manuscript. See lines 8-29 of page 12 for details.

      (3) In Figure 6, did the authors have a specific reason for selecting Nos2 but not Tnf for further investigation? The expression level of the Tnf gene appears to be the most significant in both RT-qPCR and RNA-sequencing results in Figure 5f. Tnf is an important cytokine involved in immune responses to bacterial infections, so it is also a factor that can influence NO, ROS, and Defb1 levels.

      Thank you for the valuable comment. By analyzing the transcriptome data, we found that there were 8 genes strongly associated with TB infection in the KEGG pathway, including Nos2, Cd14, Tnf, Cd74, Clec4e, Ctsd, Cd209a, and Il6. Then, we performed KO pathway analysis and found that the Nos2 gene was strongly associated with multiple pathways including "cytokine activity", "chemokine activity", and "nitric oxide synthase binding". Moreover, in a clinical study on tuberculosis, the expression level of Nos2 in the plasma of patients with newly diagnosed tuberculosis was significantly higher than that of healthy people, indicating that Nos2 is associated with the occurrence of tuberculosis (PMID 34847295). Therefore, we selected Nos2 as the main target gene in the current study to conduct the correlation pathway analysis. As an important cytokine involved in the immune response to bacterial infection, Tnf, as mentioned by the reviewer, may also be a factor affecting the levels of NO, ROS, and Defb1, which provides a new idea for our future research.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      First, they need to use a true Mtb-infected mouse model to determine the relationship between gut dysbiosis and increased lung infection of Mtb.

      Second, the mechanism by which nos2-mediated NO and ROS production need to be further analyzed in the real Mtb infection process (either in vivo or in vitro).

      Third, Lung pathology should be included in addressing the increased colonization of mycobacteria. Addressing these problems may help improve this work.

      (1) Our laboratory does not meet the biosafety standard for culturing highly infectious bacteria such as Mycobacterium tuberculosis. So, we used Mycobacterium smegmatis as a model strain for M.tb to establish the infected mice model in the current research. Although M. smegmatis is generally considered nonpathogenic, it is closely related to M.tb in biochemical characteristics, genetic information, cell structure, and metabolism (PMID 32674978). M.smegmatis is regarded as a valuable model organism in the study of M.tb, and has been widely used to explore the biological characteristics of M.tb such as physiological state, stress response, non-culture state reactivation, antimicrobial activity, and biochemical protection (PMID 32674978). It has also been reported that M.smegmatis was used as a model strain to study the molecular mechanism of interaction between M.tb and its host (PMID 30546046, PMID 25970481, PMID 29568875). In this preclinical study, we mainly focused on the influence of intestinal microbiota on the colonization of mycobacterium in the lungs and its possible mechanism, which provides a reliable model to study the prevention of early infection and spread of M.tb through regulating the intestinal microbiota.

      (2) In the future, we will establish an infection model with wild-type M.tb to verify the mechanism by which Nos2-mediated NO and ROS production promotes M.tb colonization.

      (3) We have added the results in the lung pathological section in the revised manuscript. The results of lung pathological sections are shown in lines 11-13 of page 4, and Figure S2 of supplement information.

    1. eLife Assessment

      This valuable study investigated the appearance of ultrasonic vocalizations around 44 kHz that occurs in response to prolonged fear conditioning in male rats. Evidence in support of the conclusions is solid and may be of interest to some researchers also investigating distress-related ultrasonic vocalizations.

    2. Reviewer #2 (Public Review):

      Olszyński et al. claim that they identified a "new-type" ultrasonic vocalization around 44 kHz that occurs in response to prolonged fear conditioning (using foot-shocks of relatively high intensity, i.e. 1 mA) in rats. Typically, negative 22-kHz calls and positive 50-kHz calls are distinguished in rats, commonly by using a frequency threshold of 30 or 32 kHz. Olszyński et al. now observed so-called "44-kHz" calls in a substantial number of subjects exposed to 10 tone-shock pairings, yet call emission rate was low (according to Fig. 1G around 15%, according to the result text around 7.5%). They also performed playback experiments and concluded that "the responses to 44-kHz aversive calls presented from the speaker were either similar to 22-kHz vocalizations or in-between responses to 22-kHz and 50-kHz playbacks".

      Strengths: Detailed spectrographic analysis of a substantial data set of ultrasonic vocalizations recorded during prolonged fear conditioning, combined with playback experiments.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      The exclusive use of males is a major concern lacking adequate justification and should be disclosed in the title and abstract to ensure readers are aware of this limitation. With several reported sex differences in rat vocal behaviors this means caution should be exercised when generalizing from these findings. The occurrence of an estrus cycle in typical female rats is not justification for their exclusion. Note also that male rodents experience great variability in hormonal states as well, distinguishing between individuals and within individuals across time. The study of endocrinological influences on behavior can be separated from the study of said behavior itself, across all sexes. Similarly, concerns about needing to increase the number of animals when including all sexes are usually unwarranted (see Shansky [2019] and Phillips et al. [2023]).

      As suggested by the Reviewer, we have disclosed the use of males in the title and the abstract. Also, we have added the statement that research on female rat subjects is required: “Here we are showing introductory evidence that 44-kHz vocalizations are a separate and behaviorally-relevant group of rat ultrasonic calls. These results require further confirmations and additional experiments, also in form of repetition, including research on female rat subjects.”

      Regarding the analysis where calls were sorted using DBSCAN based on peak frequency and duration, my comment on the originally reviewed version stands. It seems that the calls are sorted by an (unbiased) algorithm into categories based on their frequency and duration, and because 44kHz calls differ by definition on frequency and duration the fact that the algorithm sorts them as a distinct category is not evidence that they are "new calls [that] form a separate, distinct group". I appreciate that the authors have softened their language regarding the novelty and distinctness of these calls, but the manuscript contains several instances where claims of novelty and specificity (e.g. the subtitle on line 193) is emphasized beyond what the data justifies.

      We further softened our language regarding novelty and distinctness of 44-kHz vocalizations – including the aforementioned subtitle. However, in response, we would like to bring to the readers’ attention that all major groups of calls, i.e., long 22-kHz calls, short 22-kHz calls, and 50-kHz vocalizations, are also defined in our manuscript and in the literature by their frequency and duration. However, none of these groups was identified separately by DBSCAN clustering except the 44-kHz vocalizations. If they were not a distinct group, we would expect the 44-kHz and 50-kHz vocalizations to blend first (because of the similar frequencies) or 44-kHz and 22-kHz calls to merge first (because of the similar durations), but they do not in this unbiased examination.

      The behavioral response to call playback is intriguing, although again more in line with the hypothesis that these are not a distinct type of call but merely represent expected variation in vocalization parameters. Across the board animals respond rather similarly to hearing 22 kHz calls as they do to hearing 44 kHz calls, with occasional shifts of 44 kHz call responses to an intermediate between appetitive and aversive calls. This does raise interesting questions about how, ethologically, animals may interpret such variation and integrate this interpretation in their responses. However, the categorical approach employed here does not address these questions fully.

      This paragraph is exactly the same as in the previous review. There was no comment regarding our previous answer. Here is the previous answer:

      “We are unsure of the Reviewer’s critique in this paragraph and will attempt to address it to the best of our understanding. Our finding of up to >19% of long, seemingly aversive 44-kHz calls at a frequency in the defined appetitive ultrasonic range (usually >32 kHz) is unexpected rather than “expected”. We would agree that aversive call variation is expected, but not in the appetitive frequency range.

      Kindly note the findings by Saito et al. (2019), which claim that frequency band plays the main role in rat ultrasonic perception. It is possible that the higher peak frequency of 44-kHz calls may be a strong factor in their perception by rats, which is, however, modified by the longer duration and the lack of modulation. 

      Also, from our experience, it is quite challenging to demonstrate different behavioral responses of naïve rats to pre-recorded 22-kHz (aversive) vs. 50-kHz (appetitive) vocalizations. Therefore, we would expect demonstrating a difference in response to two distinct, potentially aversive calls, i.e., 22-kHz vs. 44-kHz calls, to be even more difficult (to our knowledge, a comparable experiment between short vs. long 22-kHz ultrasonic vocalizations has not been done before).

      Therefore, we do not take lightly the surprising and interesting finding that “animals respond rather similarly to hearing 22 kHz calls as they do to hearing 44 kHz calls, with occasional shifts of 44 kHz call responses to an intermediate between appetitive and aversive calls”. We would rather put this description in analogous words: “the rats responded similarly to hearing 44-kHz calls as they did to hearing aversive 22-kHz calls, especially regarding heart-rate change, despite the 44-kHz calls occupying the frequency band of appetitive 50-kHz vocalizations” and “other responses to 44-kHz calls were intermediate, they fell between response levels to appetitive vs. aversive playback” – which we added to the Discussion.

      Finally, we acknowledge that our findings do not present a finite and complete picture of the discussed aspects of behavioral responses to the presented ultrasonic stimuli (44-kHz vocalizations). Therefore, we have incorporated the Reviewer’s suggestion in the discussion. The added sentence reads: “Overall, these initial results raise further questions about how, ethologically, animals may interpret the variation in hearing 22-kHz vs. 44-kHz calls and integrate this interpretation in their responses.”

      I appreciate the amendment in discussing the idea of arousal being the key determinant for the increased emission of 44kHz, and the addition of other factors. Some of the items in this list, such as annoyance/anger and disgust/boredom, don't really seem to fit the data. I'm not sure I find the idea that rats become annoyed or disgusted during fear conditioning to be a particularly compelling argument. As such the list appears to be a collection of emotion-related words, with unclear potential associations with the 44kHz calls.

      We agree that most of the factors listed are not supported by the data. These are hypotheses and speculations only – hence, an assumption / tentative statement, i.e., “It could also be argued that…”. We have changed it into “It could also be speculated that…”.

      Later in the Discussion the authors argue that the 44kHz aversive calls signal an increased intensity of a negative valence emotional state. It is not clear how the presented arguments actually support this. For example, what does the elongation of fear conditioning to 10 trials have to do with increased negative emotionality? Is there data supporting this relationship between duration and emotion, outside anthropomorphism? Each of the 6 arguments presented seems quite distant from being able to support this conclusion.

      We have added a description summarizing the literature that expounds the differences in employing one-two vs. five-ten foot-shocks during fear-conditioning training. It says:

      “Importantly, it has been demonstrated multiple times that training rats with several electric foot-shocks (i.e., 5-10 shocks) produces a qualitatively different kind of fear-memory compared to training with only 1-2 shocks. Training with more numerous shocks has been shown to result in augmented freezing (e.g., Fanselow and Bolles, 1979, Haubrich et al., 2020, Haubrich and Nader, 2023, Poulos et al., 2016, Wang et al., 2009) which reflects a more intense fear-memory that is resistant to extinction (Haubrich et al., 2020, Haubrich and Nader, 2023), resistant to reconsolidation blockade (Haubrich et al., 2020, Wang et al., 2009, Finnie and Nader, 2020), associated with downregulation of NR2B NMDA-receptor subunits as well as elevated amyloid-beta concentrations in the lateral and basal amygdala (Finnie and Nader, 2020, Wang et al., 2009). Additionally, it involves activation of the noradrenaline-locus coeruleus system (Haubrich et al., 2020) and collective changes in connectivity across multiple brain regions within the neural network (Haubrich and Nader, 2023). 

      Notably, it has also been shown that higher freezing as a result of fear-conditioning training correlates with increased concentrations of the stress hormone corticosterone in the blood (Dos Santos Correa et al., 2019). The rats subjected to 6- and 10-trial fear conditioning, whose results are reported herein (Tab. 1/Exp. 2/#7,8,11,12; n = 73), also demonstrated higher freezing than rats subjected to 1-trial conditioning (Tab. 1/Exp. 2/#6,10; n = 33), which is reported elsewhere (Olszynski et al., 2021, Fig. S1C-E; Olszynski et al., 2022, Fig. S1D-G). Therefore, we postulate that emission of 44-kHz calls is associated with increased stress and the training regime forming robust memories.”

      In sum, rather than describing the 44kHz long calls as a new call type, it may be more accurate to say that sometimes aversive calls can occur at frequencies above 22 kHz. Individual and situational variability in vocalization parameters seems to be expected, much more so than all members of a species strictly adhering to extremely non-variable behavioral outputs.

      This paragraph is exactly the same as in the previous review. There was no comment regarding our previous answer. Here is the previous answer:

      “The surprising fact that there are presumably aversive calls that are beyond the commonly applied thresholds, i.e., >32 kHz, while sharing some characteristics with 22-kHz calls, is the main finding of the current publication. Whether they be finally assigned as a new type, subtype, i.e. a separate category or become a supergroup of aversive calls with 22-kHz vocalizations is of secondary importance to be discussed with other researchers of the field of study. 

      However, we would argue, by way of comparison, that 22-kHz calls occur at durations of <300 ms and also >300 ms, and are usually referred to in the literature as short and long 22-kHz vocalizations, respectively (not introduced with a description that “sometimes 22-kHz calls can occur at durations below 300 ms”). These are then regarded and investigated as separate groups or classes, usually referred to as two different “types” (e.g., Barker et al., 2010) or “subtypes” (e.g., Brudzynski, 2015). Analogously, 44-kHz vocalizations can also be regarded as a separate type or a subtype of 22-kHz calls. The problem with the latter is that 22-kHz vocalizations are traditionally and predominantly defined by the 18–32 kHz frequency bandwidth (Araya et al., 2020; Barroso et al., 2019; Browning et al., 2011; Brudzynski et al., 1993; Hinchcliffe et al., 2022; Willey & Spear, 2013).”

      Reviewer #1 (Recommendations For The Authors):

      Additional considerations:

      Abstract: The 19.4% seems to be the percentage of 44 kHz calls observed during the 9th trial of the 10-trial experiment, not the percentage of calls that were 44 kHz during bouts of freezing.

      We clarified the sentence. It now says:

      “We observed 44-kHz calls to be associated with freezing behavior during fear conditioning training, during which they constituted up to 19.4% of all calls and most of them appeared next to each other forming groups of vocalizations (bouts).”

      Abstract: "We hope that future investigations of 44-kHz calls in rat models of human diseases will contribute to expanding our understanding and therapeutic strategies related to human psychiatric conditions." This sounds like far too strong an implication, given that the link between these calls and models of human psychiatric conditions is not clear.

      We agree, the link is not clear. Therefore we only express our hope. We hope “the link” is there. While other ultrasonic calls are already being investigated in such animal models, training regimes employing numerous electric shocks are used as models of PTSD, helplessness etc.

      Line 101: Seems a strong assumption to state the authors of the other publication were inspired by this paper, unless there is personal communication corroborating this.

      The wording of the sentence has been changed.

      It is still not clear why both Friedman and Wilcoxon tests were used, especially in situations where only one result seems to be referenced (for example on line 108-109).

      We added the explanation within Methods: “In particular, the Friedman test was used to assess the presence of change within the sequence of several ITI, while the Wilcoxon test was used for the difference between the first and the last ITI analyzed.”
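      The division of labor between the two tests can be illustrated with a small sketch: a Friedman statistic computed across all inter-trial intervals (ITI), and a Wilcoxon signed-rank statistic computed between the first and the last ITI only. The call counts below are hypothetical, the functions return test statistics without p-values, and no tie correction is applied. In practice a statistics package would be used instead.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks

def friedman_statistic(rows):
    """Friedman chi-square; rows are subjects, columns are repeated conditions."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)

def wilcoxon_statistic(first, last):
    """Wilcoxon signed-rank statistic: the smaller of the two signed rank sums."""
    diffs = [b - a for a, b in zip(first, last) if b != a]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)

# Hypothetical number of calls per rat (rows) in three consecutive ITI (columns).
calls_per_iti = [[1, 2, 3], [2, 4, 6], [0, 1, 5], [3, 5, 4]]

q = friedman_statistic(calls_per_iti)              # change across the whole sequence
w = wilcoxon_statistic([r[0] for r in calls_per_iti],
                       [r[-1] for r in calls_per_iti])  # first vs. last ITI
print(q, w)
```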

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations for The Authors):

      Q1: Please replace lymphocytes with lymphatic endothelial cells throughout the manuscript.

      A1: Thank you for your conscientious review. Per your suggestion, we have replaced “lymphocytes” with “lymphatic endothelial cells (LECs)” throughout the manuscript.

      Q2: Please re-analyse lymphatics using LYVE1 and CD68 or another macrophage marker, as Lyve1 is NOT specific for lymphatics.

      A2: Thank you for your suggestion. We completely agree with your opinion. Because both the CD68 (CST, 97778S) and LYVE1 (Abcam, ab14917) antibodies are rabbit multiclonal antibodies, and to label cardiac lymphatics more accurately, we performed immunofluorescence co-staining using LYVE1 and PDPN antibodies (Thermo, 53-5381-82) and re-measured the lymphatic vessel area using ImageJ software (version 1.53). The result is shown in Figure 1A and 1B. Further, we performed co-staining with PDPN and CD68 to observe the relationship between macrophage and cardiac lymphatic vessel distributions at different time points post-myocardial infarction (MI) (Figure 1-figure supplement 1F). Consistent with your comment, some macrophages in the heart tissue are LYVE1-positive but may be PDPN-negative. We have added the catalog numbers of anti-PDPN and anti-CD68 to the Methods (Page 10, Lines 351‒352) and updated them in the KRT template and MDAR checklist.

      Q3: Rephrase title 2.6, 2.7 to fit the results in these sections that are purely descriptive and do not add any insight into the functional relevance of the findings.

      A3: Thank you for your suggestion. We have rephrased titles 2.6 and 2.7 as follows:

      2.6 AQP1 in LEC is correlated with myocardial edema occurrence and resolution post-MI.

      2.7 Gal9 secreted by LEC can affect macrophage migration.

      Q4: Please refrain from extensive discussion of non-significant findings, such as Figures 6D, and 7A, B, and M (ifng vs ifng + antiGal9 is n.s).

      A4: Thank you for your suggestions. Lymphatic endothelial cells (LECs) are a type of cell that exists in the myocardial tissue in small quantities. Owing to the extremely small number of LECs, elucidating their biological functions and regulation may be challenging during MI. To gain a deeper understanding of the role of the lymphatic system post-MI, we attempted to analyze the transcriptomic changes of LEC subsets at different time points after MI by combining single-cell sequencing and spatial transcriptomics data. We have selected relevant molecules with significant differences in transcription levels and conducted the validation analysis in LECs at different time points after MI. Among them, AQP1 and GAL9 showed significant differences. CD44, as a receptor for GAL9, showed significant differences in its expression in macrophages at different time points after MI. Therefore, we have added the relevant information to the discussion section (marked with yellow) on Page 9, Lines 299‒312.

      Q5: Please explain the method used to calculate lymphatic areas in Figure 1.

      A5: Thank you for your observation. The method we used is consistent with that described in previous studies [1,2] (PMID: 30582443 and PMID: 32404007). The detailed methods have been described in the Methods as follows (Page 10, Lines 358‒363):

      For quantification of vessel area, vessels with visible co-staining were measured using ImageJ software. First, we selected an image, converted it to 8-bit, and then applied a suitable threshold adjustment (to display co-stained areas wherever possible). Second, five equally sized squares were selected in the respective zones (remote, infarct, and border zones) of each slice. ROI manager tools were used to analyze the automatic signal intensity quantification by the software in the area inside each square. Finally, GraphPad software was used to plot the results as a bar graph.
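      As a rough illustration of the thresholding and square-ROI steps described above, the sketch below computes the fraction of above-threshold pixels inside square regions of an 8-bit image. This is an assumption about what the measurement amounts to, not a reproduction of the ImageJ workflow. The tiny image, the threshold, and the ROI positions are hypothetical.

```python
def area_fraction(image, top, left, size, threshold):
    """Fraction of pixels >= threshold inside a square ROI of an 8-bit image."""
    roi = [row[left:left + size] for row in image[top:top + size]]
    pixels = [p for row in roi for p in row]
    return sum(p >= threshold for p in pixels) / len(pixels)

# Hypothetical 4x4 8-bit image (0-255); bright pixels stand in for co-stained signal.
image = [
    [0,  10, 200, 220],
    [5,   0, 210,   0],
    [0,   0,   0,   0],
    [0, 255,   0,   0],
]

# Two hypothetical 2x2 square ROIs given as (top, left); the text above uses five per zone.
rois = [(0, 2), (2, 0)]
fractions = [area_fraction(image, t, l, size=2, threshold=128) for t, l in rois]
mean_fraction = sum(fractions) / len(fractions)
print(fractions, mean_fraction)
```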

      Q6: In Figure 1 supp C, the upper and lower panels don't seem to have the same zoom factor.

      A6: Thank you for pointing this out. The upper and lower images in Figure S1C have the same magnification. To facilitate your review, we have added a 1× image and re-labeled the position and scale information of the image. The revised Figure S1C has been added to the manuscript.

      Q7: In Figure 2d please include aqp1 among displayed genes.

      A7: Thank you for your suggestion. The Aqp1 gene is already displayed in the 11th, and we have labeled it.

      Q8: In Figure 2f include markers of LECs such as Prox1, Flt4, Itga9, and also show Aqp1 here.

      A8: Thank you for your valuable comment. We have updated Figure 2f.

      Q9: Please indicate in Figure 3a what the y axis means? % of total LECs? % of total LECs at a given time point? The data is really not clear.

      A9: Thank you for your suggestion. The y-axis represents the percentage of the total number of LECs at d1, d3, d7, d14, and d28 post-MI, relative to the number of LECs at d0, which is used as the reference value set at 100%. Meanwhile, different colors were applied to represent the proportion of different cell subtypes at different time points. We have updated Figure 3a.
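      One possible reading of this normalization is sketched below: each time point's total LEC count is expressed relative to d0 (set to 100%), and each subtype contributes a stacked segment proportional to its count. The counts are hypothetical and the exact construction of Figure 3a may differ.

```python
# Hypothetical LEC counts per subtype at two time points.
counts = {
    "d0": {"LEC ca I": 20, "LEC ca II": 10, "LEC ca III": 5, "LEC co": 15},
    "d3": {"LEC ca I": 10, "LEC ca II": 5, "LEC ca III": 5, "LEC co": 5},
}

def relative_to_baseline(counts, baseline="d0"):
    """Express every subtype count as a percentage of the baseline total."""
    base_total = sum(counts[baseline].values())
    return {
        timepoint: {subtype: 100.0 * n / base_total for subtype, n in subtypes.items()}
        for timepoint, subtypes in counts.items()
    }

segments = relative_to_baseline(counts)
# Bar height at each time point is the sum of its stacked subtype segments.
totals = {timepoint: sum(parts.values()) for timepoint, parts in segments.items()}
print(totals)
```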

      Q10: Add n of LECs per time point in Figures 3a and b.

      A10: Thank you for your suggestion. We have updated Figure 3b.

      Q11: For Figure 3c please explain what marker genes were used to identify LEC enriched areas. What was the spatial resolution of the transcriptomic screens? How do these images relate to the localization of lymphatics in the heart?

      A11: We appreciate your observation. We have added the required information to the Methods on Page 13, Lines 442‒448, as follows:

      “We conducted spatial transcriptome data analysis using the deconvolution algorithm. The deconvolution algorithm refers to the application of feature genes to infer the full matrix information of single-cell transcriptome of cell subclusters. We then compared and anchored the matrix information of the single-cell transcriptome with the information of each SPOT in the spatial transcriptome, predicting cell types based on the similarity between the two sets of information.”
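      The similarity-based anchoring described in this passage can be sketched as follows. The sketch assumes cosine similarity between a spot's expression vector and each subcluster signature. The response does not state which similarity measure or deconvolution tool was used, and all expression values here are hypothetical.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length expression vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_match(spot, signatures):
    """Name of the subcluster signature most similar to the spot."""
    return max(signatures, key=lambda name: cosine(spot, signatures[name]))

# Gene order for every vector below; the values are hypothetical.
genes = ["Prox1", "Flt4", "Aqp1", "Lgals9"]
signatures = {
    "LEC ca I": [1.0, 1.0, 3.0, 0.5],
    "LEC ca II": [1.0, 1.0, 0.5, 3.0],
}
spot = [0.8, 1.2, 2.5, 0.4]  # expression measured in one spatial spot

predicted = best_match(spot, signatures)
print(predicted)
```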

      Q12: For Figure 6, please explain the y-axis in panel A, the timepoint in panel G, and the absence of Aqp1 staining in blood vessels in images d1 and d3 in panel D.

      A12: Thank you for your suggestion. The y-axis in Figure 6A (Figure to reviewer 7A) shows Aqp1 expression in LECs at different time points from the scRNA-seq data. We have also added the timepoint in Figure 6G, which is 24 hours. To show the expression trend of AQP1 more clearly, we performed immunofluorescence staining of AQP1 and LYVE1 at different time points after MI (d0, d1, d3, d7, and d14). The results are shown in Figure to reviewer 7C. AQP1 expression was found to be increased in the border zone of the infarct at d3 post-MI, adjacent to the LYVE1-positive area.

      Q13: Explain the y-axis unit in Figure 7a.

      A13: Thank you for your comment. The y-axis in Figure 7A shows Lgals9 gene expression in LECs at different time points from the scRNA-seq data.

      Q14: In Figure 7c, d how was the induction of cell death excluded as a cause of IFNg-mediated effects in LECs?

      A14: Thank you for your suggestion. To exclude interference from apoptosis, we performed TUNEL staining of LECs after stimulation with different concentrations of IFN-γ for 24 h. As shown in the Figure to reviewer 9, little apoptosis of LECs was observed in this concentration range. Therefore, we can exclude the potential impact of IFN-γ-induced cell apoptosis.

      Author response image 1.

      TUNEL staining of LECs after stimulation with different concentrations of IFN-γ for 24 h.

      Q15: Results with hypoxia in Figure 7 are mentioned but not shown.

      A15: Thank you for your observation. In the revised article, we have added the detection of Gal9 expression after hypoxic stimulation. We conducted hypoxia intervention experiments using two methods. First, we applied stimulation at 1% oxygen concentration and detected Gal9 expression at 0 h, 2 h, 4 h, 8 h, 12 h, and 24 h. Second, we applied CoCl2 to activate HIF1α expression and simulate cell hypoxia, and then detected Gal9 expression. Both results confirmed that hypoxia could not stimulate LECs to secrete galectin 9. The results are presented in Figure 7-figure supplement 1 (A-D).

      Reviewer #3 (Recommendations For The Authors):

      Q1: In Figure 1, the so-called "LYVE1-labeled lymphatic capillaries with discontinuous walls" might be macrophages. The authors measured lymphatic area by measuring "vessels with visible lumens", which is unclear. This may underestimate the number of capillaries that expand after MI in the border zone of the infarct area. The authors need to use CD68 and Pdpn markers, as Lyve1 is not specific for lymphatics and also stains macrophages, and Pdpn is more reliable for assessing lymphatic identity.

      A1: Thank you for your good suggestion. We totally agree with your opinion. Because both the CD68 (CST, 97778S) and LYVE1 (Abcam, ab14917) antibodies are rabbit multiclonal antibodies, and to label cardiac lymphatics more accurately, we performed immunofluorescence co-staining using LYVE1 and PDPN antibodies (Thermo, 53-5381-82) and re-measured the lymphatic vessel area using ImageJ software (version 1.53). The result is shown in Figure to reviewer 1 (Figure 1A and 1B in manuscript). Further, we performed co-staining with PDPN and CD68 to observe the relationship between macrophage and cardiac lymphatic vessel distributions at different time points post-myocardial infarction (Figure to reviewer 2, and Figure 1-figure supplement 1F in manuscript). Consistent with your comment, some macrophages in the heart tissue are LYVE1-positive but may be PDPN-negative. We have added the catalog numbers of anti-PDPN and anti-CD68 to the Methods (Page 10, Lines 351‒352) and updated them in the KRT template and MDAR checklist.

      Q2: It is not clear how they analyse the lymphatic area in Figure 1, please explain.

      A2: Thank you for your observation. The method we used is consistent with that described in previous studies [1,2] (PMID: 30582443 and PMID: 32404007). The detailed methods have been described in the Methods as follows (Page 10, Lines 347‒352):

      For quantification of vessel area, vessels with visible co-staining were measured using ImageJ software. First, we selected an image, converted it to 8-bit, and then applied a suitable threshold adjustment (to display co-stained areas wherever possible). Second, five equally sized squares were selected in the respective zones (remote, infarct, and border zones) of each slice. ROI manager tools were used to analyze the automatic signal intensity quantification by the software in the area inside each square. Finally, GraphPad software was used to plot the results as a bar graph.

      Q3: Figure 1-supplement 1D: The authors claim that the observed structure is a lymphatic valve, however in 2D sections, this shape might result from membrane destruction due to the cutting and staining process. To accurately identify valves, the authors should employ 3D imaging of the lymphatic network, such as using a clearing protocol followed by lightsheet microscopy.

      A3: Thank you for your good suggestion. We performed a 3D scan of another slice using a confocal microscope. The results are shown in Figure 1-supplement 1D. We believe the structure more closely resembles a lymphatic valve than debris from membrane destruction.

      Q4: In Figure 2, the number of LECs is too little. Indeed, 242 LECs were identified over 44860 total cell numbers and 5688 endothelial cells cannot be representative and cannot afford to distinguish 4 different clusters.

      A4: We further analyzed the percentage of LECs in the adult mouse heart in the physiological state at d0, based on single-nucleus sequencing results from a public database (GSE214611). A total of 292 LECs were obtained from the 26,779 cells captured across three samples, meaning that LECs account for 1.09% of cells in the normal adult mouse heart. Cardiac LECs are indeed rare, and enrichment methods for cardiac LECs, such as flow cytometry and magnetic bead separation, are under active investigation and may provide more conclusive evidence in future studies.
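      The proportions quoted in Q4 and A4 can be checked with a few lines of arithmetic. The counts are taken from the text above and only the helper name is introduced here.

```python
def percent(part, total):
    """Share of `part` in `total`, in percent."""
    return 100.0 * part / total

# A4: 292 LECs among 26,779 captured cells (GSE214611, d0).
lec_share_public = round(percent(292, 26779), 2)

# Q4: 242 LECs among 44,860 total cells and among 5,688 endothelial cells.
lec_share_all_cells = round(percent(242, 44860), 2)
lec_share_endothelial = round(percent(242, 5688), 2)

print(lec_share_public, lec_share_all_cells, lec_share_endothelial)
```

      The first value reproduces the 1.09% stated in the response. The other two come to roughly 0.54% of all cells and 4.25% of endothelial cells.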

      Q5: The authors claimed that there is transcriptional heterogeneity in regenerated cardiac LECs post-MI, based on their over-clusterization. However, to substantiate this claim, they need to include a control comparison. Currently, the observed differences in cardiac LEC profiles lack a direct connection to the disease condition.

      A5: Thank you for pointing this out. Because we could not download spatial transcriptome data for day d0 in the public database (GSE214611) or from the authors, we have used data of 1 h after IR as a reference for approximating the physiological state in Figure 3 and in Supplemental Figure 1.

      Q6: Line 131, what is the regeneration ratio the authors cite here?

      A6: Thank you for the comment. Regeneration ratio is an inappropriate use of the word, and we apologize for this confusion. We were actually referring to the regenerative potential of LECs.

      Q7: Line 132, it is not clear what is the "normal myocardial tissue" in the graphs presented Figures 3A and B. Is it d0 time point?

      A7: Thank you for your suggestion. The d0 time point means LECs in the normal adult mouse heart.

      Q8: In Figure 2D, please add more lymphatic markers as Ccl21, Flt4, Itga9, FoxC2 and Aqp1.

      A8: Thank you for your suggestion. We have added these markers (except Ccl21, whose gene expression is too low to mark) in Figure 2D in the revised manuscript.

      Q9: The authors must replace "lymphocyte" with "lymphatic" from 2.5, where they start to present interactions between lymphatic and immune cells.

      A9: Thank you for your good comments. We have corrected these words.

      Q10: In Figure 3, please indicate what the color scale means.

      A10: Thank you for your suggestion. We have supplied a color scale label.

      Q11: In Figures 3C and D, the authors distinguished the same LECs clusters in the spatial transcriptomic as in the scRNAseq analysis. This is not clear whether they used the same markers.

      A11: We appreciate your observation. We have added the required information to the Methods on Page 12, Lines 429‒434, as follows:

      “We conducted spatial transcriptome data analysis using the deconvolution algorithm. The deconvolution algorithm refers to the application of feature genes to infer the full matrix information of single-cell transcriptome of cell subclusters. We then compared and anchored the matrix information of the single-cell transcriptome with the information of each SPOT in the spatial transcriptome, predicting cell types based on the similarity between the two sets of information.”

      Q12: In 2.5, it is not clear whether the main message is about macrophage interactions with lymphocytes or with lymphatics (LEC interact with others).

      A12: Thank you for your suggestion. We have revised the title 2.5 as “Assessment of Cell-Cell Communication between LECs and immune cells,” which is clearer for the reader.

      Q13: In 2.6, the authors claim that they reveal "that fluid retention occurs in LEC ca I and LEC co". They don't show any data supporting this.

      A13: Thank you for your comment. “…that fluid retention occurs in LEC ca I and LEC co” is mainly supported by the KEGG enrichment in Figure 3D. LEC ca I is related to vasopressin-regulated water reabsorption, and LEC co is related to renin secretion.

      Q14: In Figure 6A, please add statistical values, as the authors claim a significant correlation. Please also add a figure to support the correlation between Aqp1 and edema score, as mentioned in 2.6.

      A14: Thank you for pointing this out. We have presented the information on statistical values in Figure 6A. Moreover, we calculated the correlation between Aqp1 and edema score in Figure 6D (shown in Author response image 2).

      Author response image 2.

      Correlation between Aqp1 expression intensity and edema score.

      Q15: In Figure 6B, myocardial edema assessment using H&E staining is not accurate. If the authors wish to analyse cardiac edema, they must use gravimetry or MRI techniques.

      A15: Thank you for your comment. We totally agree with your opinion. However, owing to limitations in experimental conditions, we could not perform MRI detection of mouse myocardial injury. To evaluate whether edema occurred in the mouse heart tissue, we used classic pathological evaluation methods described in the literature (PMID: 30582443). This method has been described in detail as follows (Page 11, Lines 365‒370):

      Four high-power (40×) representative images were chosen per animal under the H&E stained section; each image must have a clear border of the section visible. Images were blinded, and five visual fields per sample were evaluated. Subsequently, an edema score was determined for each sample (Score 1=no edema, 2=mild edema, 3=severe edema). Graphs represent the average score value per animal.
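      The per-animal averaging described here can be sketched briefly. The animal identifiers and scores below are hypothetical. Only the three-level scale (1 = no edema, 2 = mild edema, 3 = severe edema) comes from the text.

```python
from statistics import mean

SCORE_LABELS = {1: "no edema", 2: "mild edema", 3: "severe edema"}

def average_edema_score(scores):
    """Mean edema score for one animal; rejects values outside the 1-3 scale."""
    for s in scores:
        if s not in SCORE_LABELS:
            raise ValueError(f"invalid edema score: {s}")
    return mean(scores)

# Hypothetical blinded scores, one per evaluated visual field.
scores_by_animal = {
    "mouse_A": [1, 1, 2, 1, 1],
    "mouse_B": [2, 3, 3, 2, 3],
}
averages = {animal: average_edema_score(s) for animal, s in scores_by_animal.items()}
print(averages)
```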

      Q16: Line 227, please correct "LVEC" with "LEC".

      A16: Thank you for your careful review. We have revised this in the manuscript.

      Q17: In Figure 6D, IF co-staining of Aqp1 and lymphatic vessels is mentioned as "significantly reduced". However, we don't see any quantification data supporting this.

      A17: Thank you for your comment. To show the expression trend of AQP1 more clearly, we performed immunofluorescence staining of AQP1 and LYVE1 at different time points post-MI (d0, d1, d3, d7, and d14). The results are shown in the corrected Figure 6-figure supplement 1A. AQP1 expression increased in the border zone of the infarct at d3 post-MI, adjacent to the LYVE1-positive area.

      Q18: As Gal9 was not significantly impaired in LECs post-MI, Figure 7A does not support any real finding concerning the role of this molecule in monocytes/macrophages interaction with cardiac lymphatics.

      A18: Thank you for your comment. The Lgals9 gene is significantly impaired in LEC post-MI, as well as the Cd44 gene in macrophage. We have updated them in Figures 7A and 7B.

      Q19: In Figure 7, please replace INF with IFN.

      A19: Thank you for your careful review. We have revised this in the manuscript.

    2. eLife Assessment

      This study presents useful, yet preliminary, findings on the transcriptomic changes in cardiac lymphatic cells after myocardial infarction in mice. The conclusions of the authors remain uncertain as sample sizes for lymphatic endothelial cells are very low. The single-cell transcriptomic data were analyzed using solid advanced methodology and may be used as a starting point for future studies of the impact of lymphatic cells on heart disease.

    3. Reviewer #1 (Public review):

      Summary:

      Assessment of cardiac LEC transcriptomes post-MI may yield new targets to improve lymphatic function. scRNAseq is a valid approach as cardiac LECs are rare compared to blood vessel endothelial cells.

      Strengths:

      Extensive bioinformatics approaches employed by the group

      Weaknesses:

      Too few cells included in scRNAseq data set and the spatial transcriptomics data that was exploited has little relevance, or rather specificity, for cardiac lymphatics. This study seems more a collection of preliminary transcriptomic data than a true scientific report to help advance the field.

    4. Reviewer #2 (Public review):

      Summary:

      This study integrated single-cell sequencing and spatial transcriptome data from mouse heart tissue at different time points post-MI. They identified four transcriptionally distinct subtypes of lymphatic endothelial cells and localized them in space. They observed that LECs subgroups are localized in different zones of infarcted heart with functions. Specifically, they demonstrated that LEC ca III may be involved in directly regulating myocardial injuries in the infarcted zone concerning metabolic stress, while LEC ca II may be related to the rapid immune inflammatory responses of the border zone in the early stage of MI. LEC ca I and LEC collection mainly participate in regulating myocardial tissue edema resolution in the middle and late stages post-MI. Finally, cell trajectory and Cell-Chat analyses further identified that LECs may regulate myocardial edema through Aqp1, and likely affect macrophage infiltration through the galectin9-CD44 pathway. The authors concluded that their study revealed the dynamic transcriptional heterogeneity distribution of LECs in different regions of the infarcted heart and that LECs formed different functional subgroups that may exert different bioeffects in myocardial tissue post-MI.

      Strengths:

      The study addresses a significant clinical challenge, and the results are of great translational value. All experiments were carefully performed, and their data support the conclusion.

      Weaknesses:

      (1) Language expression must be improved. Many incomplete sentences exist throughout the manuscript. A few examples: Line 70-71: In order to further elucidate the effects and regulatory mechanisms of the lymphatic vessels in the repair process of myocardial injury following MI. Line 71-73. This study, integrated single-cell sequencing and spatial transcriptome data from mouse heart tissue at different time points after MI from publicly available data (E-MTAB-7895, GSE214611) in the ArrayExpress and gene expression omnibus (GEO) databases. Line 88-89: Since the membrane protein LYVE1 can present lymphatic vessel morphology more clearly than PROX1.

      (2) The type of animal models (i.e., permanent MI or MI plus reperfusion) included in ArrayExpress and gene expression omnibus (GEO) databases must be clearly defined as these two models may have completely different effects on lymphatic vessel development during post-MI remodeling.

      (3) Line 119-120: Caution must be taken regarding Cav1 as a lymphocyte marker because Cav1 is expressed in all endothelial cells, not limited to LEC.

      (4) Figure 1 legend needs to be improved. RZ, BZ, and IZ need to be labeled in all IF images. Day 0 images suggest that RZ is the tissue section from the right ventricle. Was RZ for all other time points sampled from the right ventricular tissue section?

      (5) The discussion section needs to be improved and better focused on the findings from the current study.

    1. eLife Assessment

      This is a potentially interesting study regarding the role of gasdermin D in experimental psoriasis. The study contains useful data from murine models of skin inflammation; however, the main claims (on neutrophil pyroptosis) are incompletely supported in their current form and require additional experimental support to justify the conclusions made.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, Liu, Jiang, Diao et al. investigated the role of GSDMD in psoriasis-like skin inflammation in mice. The authors used full-body GSDMD knock-out mice and Gsdmd floxed mice crossed with S100A8-Cre mice. In both lines, GSDMD deficiency ameliorated the skin phenotype induced by imiquimod. The authors also analyzed RNA sequencing data from psoriatic patients to show elevated expression of GSDMD in psoriatic skin.

      Overall, this is a potentially interesting study; however, the manuscript in its current format is not entirely novel.

      Strengths:

      It has the potential to unravel the new role of neutrophils.

      Weaknesses:

      The main claims are only partially supported and have scope to improve.

    3. Reviewer #2 (Public review):

      Summary:

      The authors describe elevated GSDMD expression in psoriatic skin, and knock-out of GSDMD abrogates psoriasis-like inflammation.

      Strengths:

      The study is well conducted with transgenic mouse models: GSDMD knock-out mice show abrogated inflammation, and GSDMD fl/fl mice without neutrophils show a reduced phenotype.

      I fear that some of the conclusions cannot be drawn by the suggested experiments. My major concern would be the involvement of other inflammasome and GSDMD bearing cell types, esp. Keratinocytes (KC), which could be an explanation why the experiments in Fig 4 still show inflammation.

      Weaknesses:

      The experiments do not entirely support the conclusions towards neutrophils.

      Specific questions/comments:

      Fig 1b: mainly in KC and Neutrophils?

      Fig 2a: PASI includes erythema, scaling, thickness and area. Guess area could be tricky, esp. in an artificially induced IMQ model (WT) vs. the knock-out mice.

      Fig 2d: interesting finding. I thought that CASP-1 is cleaving GSDMD. Why would it be downregulated?

      Line 313: as mentioned before (see Fig 1b), KC also show strong GSDMD staining positivity and are known producers of IL-1b and inflammasome activation. Guess here the relevance of KC in the whole model needs to be evaluated.

      Fig 4i - guess here the conclusion would be that neutrophils are important for the pathogenesis in the IMQ model, which is true. This experiment does not support that this is done by pyroptosis.

    4. Author response:

      We sincerely appreciate the positive assessment regarding the significance of our study, as well as the valuable suggestions provided by the editor and the reviewers.

      In response to the reviewers’ comments, we will modify the manuscript to include co-staining of CD66b and GSDMD in the whole skin samples of clinical patients, which will further clarify the expression of GSDMD in neutrophils.

      Additionally, we plan to conduct further analyses using publicly available data to elucidate the changes in neutrophil pyroptotic signaling in IMQ-induced psoriatic mice tissue, thereby strengthening our conclusions about the role of neutrophil pyroptosis in the progression of psoriasis.

      Moreover, while our research primarily focuses on the role of neutrophil pyroptosis in psoriasis, this does not conflict with existing reports indicating that KC cell pyroptosis also contributes to disease progression. Both studies underscore the significant role of GSDMD-mediated pyroptotic signaling in psoriasis, and the consistent involvement of KC cells and neutrophils further emphasizes the potential therapeutic value of targeting GSDMD signaling in psoriasis treatment. We will expand upon this discussion in the revised manuscript.

      In our model, to accurately assess the disease condition in mice, we standardized the drug treatment area on the dorsal side (2*3 cm). Therefore, the area was not factored into the scoring process, and we will include a detailed description of this in the revised manuscript.

      Regarding the downregulation of CASP in GSDMD KO mouse skin tissue, existing studies indicate that GSDMD generates a feed-forward amplification cascade via the mitochondria-STING-Caspase axis (PMID: 36065823, DOI: 10.1161/HYPERTENSIONAHA.122.20004). We hypothesize that the absence of GSDMD attenuates STING signaling’s activation of Caspase.

      Furthermore, in the revised manuscript, we will address the reviewers’ other comments to enhance the manuscript quality, such as providing further clarification on relevant issues in the discussion section, refining the key experiments in the methods section, and adding details about the antibodies used, including their associated clones and catalog numbers, as well as including sample sizes (n numbers) in the figure legends.

      We believe that the new data and further discussions and clarifications included in the revised manuscript will adequately address all the concerns raised by the reviewers and better support our conclusions.

      Finally, we would like to express our gratitude once again to the editor and reviewers for their invaluable feedback on this work!

    1. eLife Assessment

      This valuable study presents a machine learning model to recommend effective antimicrobial drugs from patients' samples analysed with mass spectrometry. The evidence supporting the claims of the authors is convincing. This work will be of interest to computational biologists, microbiologists, and clinicians.

    2. Joint Public Reviews:

      De Waele et al. framed the mass-spectrum-based prediction of antimicrobial resistance (AMR) as a drug recommendation task. Neural networks were trained on the recently available DRIAMS database of MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) mass spectrometry data and their associated antibiotic susceptibility profiles (Weis et al. 2022). Weis et al. (2022) also introduced the benchmark models, which take as input the spectra of a single species and are trained to predict resistance to a single drug. Here, instead, a drug-spectrum pair is fed to two neural network models to predict a resistance probability. In this manner, knowledge from different drugs and species can be shared through the model parameters. Questions asked: What is the best way to encode the drugs? Does the dual neural network outperform the single spectrum-drug network?

      The authors showed consistent performance of their strategy to predict antibiotic susceptibility for different spectrum and antibiotic representations (i.e., embedders). Remarkably, the authors showed how small datasets collected at one location can improve the performance of a model trained with limited data collected at a second location. The authors also showed that species-specific models (trained in multiple antibiotic resistance profiles) outperformed both the single recommender model and the individual species-antibiotic combination models.

      Strengths:

      • A single antimicrobial resistance recommender system could potentially facilitate the adoption of MALDI-TOF based antibiotic susceptibility profiling into clinical practices by reducing the number of models to be considered, and the efforts that may be required to periodically update them.

      • The authors tested multiple combinations of embedders for the mass spectra and antibiotics while using different metrics to evaluate the performance of the resulting models. Models trained using different spectrum embedder-antibiotic embedder combinations had remarkably good performance for all tested metrics. The average ROC AUC scores for global and species-specific evaluations were above 0.8.

      • Authors developed species-specific recommenders as an intermediate layer between the single recommender system and single species-antibiotic models. This intermediate approach achieved maximum performance (with one type of the species-specific recommender achieving a 0.9 ROC AUC), outlining the potential of this type of recommenders for frequent pathogens.

      • Authors showed that data collected in one location can be leveraged to improve the performance of models generated using a smaller number of samples collected at a different location. This result may encourage researchers to optimize data integration to reduce the burden of data generation for institutions interested in testing this method.

      Weaknesses:

      • Authors do not offer information about the model features associated with resistance. While reviewers understand that it is difficult to map mass spectra to specific pathways or metabolites, mechanistic insights are much more important in the context of AMR than in the context of bacterial identification. For example, this information may offer additional antimicrobial targets. Thus, authors should at least identify mass spectra peaks highly associated with resistance profiles. Are those peaks consistent across species? This would be a key step towards a proteomic survey of mechanisms of AMR. See previous work on this topic (Hrabak et al. 2013, Torres-Sangiao et al. 2022).

      References:

      Hrabak et al. (2013). Clin Microbiol Rev 26. doi: 10.1128/CMR.00058-12.

      Torres-Sangiao et al. (2022). Front Med 9. doi: 10.3389/fmed.2022.850374.

      Weis et al. (2022). Nat Med 28. doi: 10.1038/s41591-021-01619-9.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1:

      Section 4.3 ("expert baseline model"): the authors need to explain how the probabilities defined as baselines were exactly used to predict individual patient susceptible profiles.

      We have added a more detailed and mathematically formal explanation of the “simulated expert’s best guess” in Section 4.3.

      This section now reads:

      “More formally, considering all training spectra as S_train, all training labels corresponding to one drug j and species t are gathered:

      The "simulated expert's best guess" predicted probability for any spectrum s_i and drug d_j then corresponds to the fraction of positive labels in their corresponding training label set:
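
      The baseline described in this passage can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the record layout `(species, drug, label)`, the function names, the toy data, and the `default` fallback for unseen species-drug pairs are all assumptions, and labels are assumed to be binary with 1 as the positive class.

```python
from collections import defaultdict


def fit_expert_baseline(train_records):
    """Return {(species, drug): fraction of positive training labels}.

    train_records: iterable of (species, drug, label), label in {0, 1}.
    The record layout is an assumption made for this sketch.
    """
    counts = defaultdict(lambda: [0, 0])  # (species, drug) -> [positives, total]
    for species, drug, label in train_records:
        counts[(species, drug)][0] += label
        counts[(species, drug)][1] += 1
    return {key: pos / total for key, (pos, total) in counts.items()}


def predict_expert_baseline(table, species, drug, default=0.5):
    """Predicted probability for one spectrum of a known species and one drug.

    `default` is a hypothetical fallback for pairs never seen in training.
    """
    return table.get((species, drug), default)


# Toy training labels (invented for illustration only).
train = [
    ("E. coli", "ciprofloxacin", 1),
    ("E. coli", "ciprofloxacin", 0),
    ("E. coli", "ciprofloxacin", 0),
    ("E. coli", "ciprofloxacin", 0),
    ("S. aureus", "oxacillin", 1),
    ("S. aureus", "oxacillin", 1),
]
table = fit_expert_baseline(train)
```

      Every spectrum of a given species receives the same prediction for a given drug, which is why this baseline acts as a floor that spectrum-aware models are expected to beat.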

      Authors should explain in more detail how a ROC curve is generated from a single spectrum (i.e., per patient) and then average across spectra. I have an idea of how it's done but I am not completely sure.

      We have added a more detailed explanation in Section 3.2. It reads:

      To compute the (per-patient average) ROC-AUC, for any spectrum/patient, all observed drug resistance labels and their corresponding predictions are gathered. Then, the patient-specific ROC-AUC is computed on that subset of labels and predictions. Finally, all ROC-AUCs per patient are averaged to a "spectrum-macro" ROC-AUC.
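
      The per-patient averaging described above can be sketched as follows. This is an illustrative reading under stated assumptions: the data layout (`patient_id -> (labels, scores)`), the function names, and the choice to skip patients whose labels contain only one class (for whom the AUC is undefined) are not taken from the manuscript.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as 0.5.
    Returns None when only one class is present (AUC undefined).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def spectrum_macro_auc(per_patient):
    """Average the patient-specific AUCs over all patients with a defined AUC."""
    aucs = [roc_auc(labels, scores) for labels, scores in per_patient.values()]
    aucs = [a for a in aucs if a is not None]  # assumption: skip undefined cases
    return sum(aucs) / len(aucs)


# Toy data: each patient has a few drug labels and predicted probabilities.
patients = {
    "p1": ([1, 0, 0], [0.9, 0.2, 0.4]),
    "p2": ([1, 1, 0, 0], [0.8, 0.3, 0.5, 0.1]),
    "p3": ([0, 0], [0.2, 0.1]),  # single class: no AUC can be computed
}
macro_auc = spectrum_macro_auc(patients)
```

      In this toy case the first patient's drugs are ranked perfectly, the second patient has one misordered pair out of four, and the third contributes nothing, so the macro average reflects only the first two.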

      In addition, our description under Supplementary Figure 8 (showing the ROC curve) provides additional clarification:

      Note that this ROC curve is not a traditional ROC curve constructed from one single label set and one corresponding prediction set. Rather, it is constructed from spectrum-macro metrics as follows: for any possible threshold value, binarize all predictions. Then, for every spectrum/patient independently, compute the sensitivity and specificity for the subset of labels corresponding to that spectrum/patient. Finally, those sensitivities and specificities are averaged across patients to obtain one point on the above ROC curve.
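
      The construction of this spectrum-macro curve can be sketched as below. The sketch assumes that every patient has at least one positive and one negative label (so both metrics are defined) and that a prediction counts as positive when its score is at or above the threshold; the names and toy data are illustrative only.

```python
def sens_spec(labels, preds):
    """Sensitivity and specificity for one patient's binarized predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def macro_roc_points(per_patient, thresholds):
    """One (threshold, mean sensitivity, mean specificity) tuple per threshold."""
    points = []
    for t in thresholds:
        sens, spec = [], []
        for labels, scores in per_patient.values():
            preds = [1 if s >= t else 0 for s in scores]  # binarize at threshold t
            se, sp = sens_spec(labels, preds)
            sens.append(se)
            spec.append(sp)
        points.append((t, sum(sens) / len(sens), sum(spec) / len(spec)))
    return points


patients = {
    "p1": ([1, 0, 0], [0.9, 0.2, 0.4]),
    "p2": ([1, 1, 0, 0], [0.8, 0.3, 0.5, 0.1]),
}
points = macro_roc_points(patients, thresholds=[0.0, 0.5, 1.0])
```

      Plotting mean sensitivity against one minus mean specificity over a fine grid of thresholds would give a curve of the kind described for Supplementary Figure 8.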

      Section 3.2 & reply # 1: can the authors compute and apply the Youden cutoff that gives max precision-sensitivity for each ROC curve? In that way the authors could report those values.

      We have computed this cut-off on the curve shown in Supplementary Figure 8. The Figure now shows the sensitivity and specificity at the Youden cutoff in addition to the ROC. We have chosen only to report these values for this model as we did not want to inflate our manuscript with additional metrics (especially since the ROC-AUC already captures sensitivities and specificities). We do, however, see the value of adding this once, so that biologists have an indication of what kind of values to expect for these metrics.
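
      For readers unfamiliar with the Youden cutoff, a brief sketch: Youden's J statistic is sensitivity + specificity − 1, and the cutoff is the threshold that maximizes it. The operating points below are invented for illustration and are not values from the manuscript; the tuple layout `(threshold, sensitivity, specificity)` is an assumption.

```python
def youden_cutoff(points):
    """Return the (threshold, sensitivity, specificity) tuple maximizing Youden's J.

    J = sensitivity + specificity - 1. If several thresholds tie, the first
    one in `points` is returned.
    """
    return max(points, key=lambda p: p[1] + p[2] - 1.0)


# Hypothetical operating points on a ROC curve.
roc_points = [
    (0.0, 1.00, 0.00),
    (0.3, 0.90, 0.55),
    (0.5, 0.75, 0.75),
    (0.8, 0.40, 0.95),
    (1.0, 0.00, 1.00),
]
best_threshold, best_sens, best_spec = youden_cutoff(roc_points)
```

      Applied to a spectrum-macro curve, the inputs would be the patient-averaged sensitivities and specificities rather than values computed from one pooled label set.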

      Related to reply #5: assuming that different classifiers are trained on the same data, with the same number of replicates, could the authors use the DeLong test to compare ROC curves? If not, please explain why.

      We thank the reviewer for bringing DeLong’s test to our attention. It does indeed seem true that this test is appropriate for comparing two ROC-AUCs using the same ground truth values.

      We have chosen not to use this test for one conceptual and one practical reason:

      (1) Our point still stands that in machine learning one chooses the test set, and hence one can artificially increase statistical power by simply allocating a larger fraction of the data to test.

      (2) DeLong’s test is defined for single AUCs (i.e. to compare two lists of predictions against one list of ground truths), but here we report the spectrum/patient-macro ROC-AUC. It is not clear how to adjust the test to macro-evaluated AUCs. One option may be to apply the test per patient ROC curve, and perform multiple testing correction, but then we are not comparing models, but models per patient. In addition, the number of labels/predictions per patient is prohibitively small for statistical power.

      Reviewer #2 (Recommendations For The Authors):

      After revision, all issues have been resolved.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1 (Public Review):

      Comment 1. Clinical Data on Patient Brain Samples: The inclusion of specific details such as postmortem intervals and the age at disease onset for patient brain samples would be valuable. These factors could significantly affect the quality of the tissues and their relevance to the study. Moreover, given the large variation in disease duration between PD and PDD, it’s important to consider disease duration as a potential confounding factor, especially when concluding that PDD patients have a more severe form of synucleinopathy compared to PD.

      We thank the reviewer for this valuable comment. We have included the post-mortem interval (PMI) and age of death in Table S1, showing the clinicopathological information. Changes on page 16. As suggested by the reviewer, we included the discussion on the large variation in disease duration between PD and PDD cases. We noted that DLB cases also have shorter disease durations but still demonstrate seeding kinetics similar to PDD. Therefore, we hypothesise that the molecular differences we observed between different diseases were due to the strain properties or higher pathological load (seen in both PDD and DLB) and are unlikely due to the disease duration. Changes on pages 9-11, lines 204-212.

      Comment 2. Inclusion of Healthy Controls in Multiple Tests: Given the importance of healthy controls in scientific studies, especially those involving human brain samples, the authors could consider using healthy controls in more tests to strengthen the robustness of the findings. Expanding the use of healthy controls in biochemical profiling and phosphorylation profiles would provide a better basis for comparison and clarify the significance of results in a disease context. This will help the authors to elaborate on the interpretation of results, for example, in Figure 3, where the authors claim that PD brains show mostly monomeric αSyn forms (line 119 and 120, and also in 222 and 223). Whether it implies the absence of alpha-syn pathology in PD brains? If there are differences from healthy controls? What are these low molecular weight bands (<15kD) (line 125-126) and whether they are also present in healthy controls? Also, we do not have a perfect pS129-specific (anti-pαSyn) antibody. They are known for non-specific labeling. Investigating the phosphorylation levels in healthy controls and comparing them to PD brains, especially considering the predominance of monomeric (healthy αSyn?) in PD brains, would help clarify the observed changes.

      We agree with the reviewer’s assessment and consider this an important suggestion. We performed biochemical profiling and immunogold imaging with the three HC cases and presented the results in Figure 4. aSyn in healthy controls was completely digested by PK. The low MW bands were absent in PD and HC, and there was no difference in the PK profiles. However, this may be due to the low pathology load and amount of pathological aSyn in the selected PD brains. Additional comments were added to the results. Changes are on pages 4 (lines 136-137) and page 7 (Figure 4).

      Comment 3. Age of Healthy Controls: Providing information about the age at death for healthy controls is crucial, as age can impact the accumulation of aSyn. Also include if the brain samples were age-matched, or analyses were age-adjusted.

      We have described the age of each patient, and the analyses were age-adjusted. Changes on page 16 (Table S1).

      Comment 4. Braak Staging Discrepancy: The study reports the same Braak staging for both PD and PDD, despite the significant difference in disease duration. Maybe other reviewers with clinical experience might have a better take on this. This observation merits discussion in the paper, allowing readers to better understand the implications of this finding.

      Addressed: Our PD and PDD cases are Braak stage 6, indicating that the LB pathology had progressed to the neocortex. It’s important to note that Braak stage represents only where the LB pathology has spread and does not indicate anything about the load of LBs. However, our immunohistochemistry results (page 20) show that PDD demonstrates a higher LB load than PD cases in the entorhinal cortex. As the reviewer has suggested, this comment has been amended in the manuscript. Changes on pages 9-11, lines 204-212.

      Comment 5. Citation of Relevant Studies: The paper should consider citing and discussing a recent celebrated study on PD biomarkers that used thousands of cerebrospinal fluid (CSF) samples from different PD patient cohorts to demonstrate the effectiveness of SAA as a biochemical assay for diagnosing PD and its subtypes.

      As suggested by the reviewer, we included this study in the discussion. Changes on page 12, lines 275-278.

      Reviewer 3 (Public Review):

      The experiments are missing two important controls. 1) what do fibrils generated by different in vitro fibril preparations made from recombinant synuclein protein look like; and 2) the use of CSF from the same patients whose brain tissue was used to assess whether CSF and brain seeds look and behave identically. The latter is perhaps the most important question of all - namely how representative are CSF seeds of what is going on in patients’ brains?

      We thank the reviewers for this valuable comment. Although in vitro preformed fibrils (PFFs) made out of recombinant aSyn are still important sources for cellular and animal studies to generate disease models and investigate mechanisms, many studies have now turned to using human brain amplified fibrils, considering them to more closely represent the human structure. Therefore, our study was designed to specifically address this hypothesis by comparing human-derived and SAA-amplified fibrils. It would be interesting to compare these structures also to PFFs, but this was beyond the scope of our study. Comparing the CSF and brain seeds from the same patients would be very interesting indeed but also difficult, as this would require biosample collection during life followed by brain donation. The SAA cannot be done from the PM CSF due to contamination with blood. However, we are in a privileged position to examine such a comparison soon with our longitudinal Discovery cohort, where some participants have donated their brains. These future studies will address the critical question of whether the CSF seeds reflect those in the brain.

      In their discussion the authors do not comment on the obvious differences in the conditions leading to the formation of seeds in the brain and in the artificial conditions of the seeding assay. Why should the two sets of conditions be expected to yield similar morphologies, especially since the extracted fibrils are subjected to harsh conditions for solubilization and re-suspension.

      We agree with the reviewer that the formation of seeds in the brain and the SAA reaction conditions are very different, and one would not expect similar fibrillar morphologies. However, the theory is that pathological seeds are known to amplify through templated seeding, where seeds copy their intrinsic properties to the growing SAA fibrils. Thus, numerous studies use the SAA fibrils as model fibrils to investigate the different aSyn strains. Our study aimed to test whether the SAA fibrils are representative models of the brain fibrils. We included a more explicit comment on this discussion. Changes on page 3, lines 78-83.

      Finally, the key experiment was not performed - would the resultant seeds from SAA preparations from the different nosological entities produce different pathologies when injected into animal brains? But perhaps this is the subject of a future manuscript.

      We agree this is an essential experiment to build on our conclusion. Animal studies would be imperative to assess whether the SAA fibrils reflect the brain fibrils’ toxicity. However, these were beyond the scope of the present study but are being performed in collaboration with some expert groups.

      Furthermore, the authors comment on phosphorylation patterns, stating that the resultant seeds are less heavily phosphorylated than the original material. Again, this should not be surprising, since the SAA assay conditions are not known to contain the enzymes necessary to phosphorylate synuclein. The discussion of PTMs is limited to pS-129 phosphorylation. What about other PTMs? How does the pattern of PTMs affect the seeding pattern?

      We agree with the reviewer that other PTMs should be explored, but this was beyond the scope of this study. Here, we could focus on pS129, which has multiple reliable antibodies that also work with immunogold-TEM.

      Lastly, the manuscript contains no data on how the diagnostic categories were assigned at autopsy. This information should be included in the supplementary material.

      Clinical and neuropathological diagnostic criteria are now included in Table S1. Changes on page 16, lines 448-461.

      Reviewer 1 (Recommendations for the authors):

      (1) Remove a duplicate sentence in line 94-96.

      Addressed: Thank you for pointing this out. The duplicated sentence has been corrected. Changes are on page 4, lines 105-106.

      (2) Figure 1 Placement of Healthy Controls: Moving the graph representing healthy controls from the supplementary materials to the main figures could help readers better appreciate the results of diseased states.

      The healthy control SAA curves were moved to the main figure. Changes are on page 5, Figure 2.

      (3) Commenting on Case 2 Healthy Control: In the discussion section, you may comment on the case of the healthy control that showed amplification towards the end. While definitive conclusions may be challenging, acknowledging the possibility of incidental Lewy bodies or the prodromal phase of the disease would add depth to the analysis? But make sure to include the age information for healthy controls.

      We believe this is an important point to discuss in the manuscript. We have referenced other studies with similar observations and stated that it is currently unknown what this phenomenon reflects (page 11, lines 221-226). The age information of the healthy control subjects was added to Table S1.

      (4) Figure S3 Clarity: To enhance the clarity of Figure S3, consider adding a reference marker or arrow in the low-magnification image that points to the region being magnified in the insets. This visual cue will make it easier for readers to connect the detailed insets with the corresponding area in the broader image.

      In Figure S3, we included a reference arrow in the low-magnification images to clarify where the higher-magnification images are taken. Changes are on page 19, Figure S3.

      Reviewer 2 (Recommendations for the authors):

      (1) A major issue confronting the field is the conflation of the PMCA and RT-QuIC assays (the latter of which was used here). The decision to rename and combine the two under the umbrella of SAAs does a major disservice to the field for many reasons. Recognizing that the push for this did not come from the authors, clarifying the differences in their Introduction would be very useful. I suggest this, in large part, because in the prion field, PMCA is known to amplify prion strains with high fidelity whereas the product from RT-QuIC does not. In fact, the RT-QuIC product for PrP is not even infectious, while the synuclein field uses it as a means to generate material for subsequent studies. Highlighting these differences would certainly strengthen the arguments the authors are making about the inadequacy of the synuclein RT-QuIC approach in research.

      We thank the reviewers for these very valuable comments. We have included a further introduction on PMCA and RT-QuIC, explaining the differences and clearly stating our selection of the RT-QuIC method in this paper (page 3, lines 55-68). In addition, we have highlighted that, unlike PMCA, the RT-QuIC end-products are non-infectious and biologically dissimilar to the seed protein. Combined with our results, the findings demonstrate the methodological limitation of RT-QuIC in reproducing the seed fibrils and replicating their intrinsic biophysical information.

      (2) On page 4, sentences starting on lines 94 and 95 are a duplication.

      The duplicated sentence has been corrected. Changes are on page 4, lines 105-106.

      (3) In the Results, noting that the pSyn staining on the RT-QuIC fibrils is coming from the human patient sample used to seed the reaction would be useful. This is mentioned in the Discussion, but the lack of mention in the Results made me pause reading to double check the methods. I think this could also be addressed a bit more clearly in the Abstract.

      We have clarified this in the Results and Abstract. Changes on page 1 (lines 21-22) and page 9 (lines 192-194)

      (4) On page 8 line 188, change was to were in the sentence, ”First, faster seeding kinetics was...”

      This grammar error has been corrected. Changes are on page 9, line 200.

      (5) The authors may want to comment on the unexpected finding that despite the RT-QuIC fibrils having a difference in twisted vs straight filaments, all 4 seeded reactions gave identical results in the conformational stability assay.

      Addressed: We want to thank the reviewer for this comment and have highlighted the unexpected finding with a comment on what could be causing the identical results in the conformational stability assay. Changes are on page 12, lines 297-303.

    2. eLife Assessment

      This important work compares the strain properties of a-synuclein fibrils isolated from LBD and MSA patient samples with the resulting amplified fibrils following SAA. Using orthogonal biochemical and structural approaches to strengthen their analyses, the authors provide solid evidence that the SAA-amplified fibrils do not recapitulate the disease-relevant strains present in the patient samples. CryoEM would further strengthen this data but it is outside the scope of the work. This work should be considered in the widespread applications of SAA in synucleinopathies and its potential limitations.

    3. Reviewer #2 (Public review):

      Most neurodegenerative diseases are characterized by the self-templated misfolding of a particular protein in a manner that enables progressive spread throughout the central nervous system. In diseases including Parkinson's disease (PD) and multiple system atrophy (MSA), the protein a-synuclein misfolds into unique strains, which use this self-replicating mechanism to encode disease-specific information. Previous research suggests that a major contributor to the lack of successful clinical trials across neurodegenerative diseases is the lack of disease-relevant strains used in preclinical testing. While MSA patient samples are known to replicate efficiently in cell and mouse models of disease, Lewy body disease (LBD) patient samples do not. To overcome this obstacle, the seeding amplification assay (SAA) uses recombinant a-synuclein to amplify the misfolded protein structure present in a human patient sample. The resulting fibrils are then widely used by many laboratories as a model of PD. In this manuscript, Lee et al., set out to compare the strain properties of a-synuclein fibrils isolated from LBD and MSA patient samples with the resulting amplified fibrils following SAA. Using orthogonal biochemical and structural approaches to strengthen their analyses, the authors report that the SAA-amplified fibrils do not recapitulate the disease-relevant strains present in the patient samples. Moreover, their data suggest that regardless of which strain is used to seed the SAA reaction, the same strain is generated. These results clearly demonstrate that the SAA-amplified material is likely not disease-relevant. SAA fibrils are broadly used throughout academic and pharmaceutical laboratories. They are used in ongoing drug discovery efforts and recombinant fibrils broadly inform much of what is known about a-synuclein strain biology in LBD patients. The implications of the reported work are, therefore, expansive. 
These findings add to the growing ledger of reasons that the use of SAA fibrils in research should be halted until improved methods for amplification with high fidelity are developed.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #2 (Public Review):

      The authors responded that they would lose statistical power by studying RTE subfamilies with limited microarray probes, which is a fair point. However, the suggested analysis could have been conducted using the RNA-seq data they explored in the second round of revision. Choosing not to leverage RNA-seq to increase the granularity of their analysis is a matter of choice. In my opinion, however, the authors could have acknowledged in the discussion that some smaller yet potentially influential RTE species may be masked by their global approach.

      We will add one sentence addressing this in the Version of Record.


      The following is the authors’ response to the original reviews.

      We thank Reviewer #1 for their constructive comments.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Tsai and Seymen et al. investigate associations between RTE expression and methylation and age and inflammation, using multiple public datasets. Compared to the previous round of review, the text of the manuscript has been polished and the phrasing of several findings has been made clearer and more precise. The authors also provided ample discussion to the prior reviewer comments in their rebuttal, including new analyses. All these changes are in the correct direction, however, I believe that part of the content of the rebuttal should be incorporated in the main text, for reasons that I will outline below. 

      Both reviewers found the reliance on microarray expression data to detract from the study. The authors argued that their choices are supported by existing publications which performed a similar quantification of TE expression using microarray data. It could still be argued that (as far as I can tell) Reichmann et al. used a substantially larger number of probes than this study, as a consequence of starting from different arrays, however, this is a minor point which the authors do not need to address. It is still undeniable that including the validation with RNA-seq data performed in the rebuttal would strengthen the manuscript. I especially believe that many readers would want to see this analysis be prominent in the manuscript, considering that both reviewers independently converged on the issue with microarray expression data. Personally, I would have included an RNA-seq dataset next to the microarray data in the main figures, however, I understand that this would require considerable restructuring and that placing RNAseq data besides array data might be misleading. Instead, I would ask that the authors include their rebuttal figures R1 and R2 as supplementary figures. 

      I would suggest introducing a new paragraph, between the section dedicated to expression data and the one dedicated to DNA methylation, mentioning the issues with microarray data (some of which were mentioned by the reviewers and others by the authors in the discussion and introduction) and then introducing the validation with RNA-seq data.

      We appreciate the reviewer’s understanding and detailed feedback. As suggested, Author response images 1 and 2 were added as supplementary figures to the manuscript, and one paragraph was added to the section investigating the correlation between RTE expression and chronological age. We have also added new descriptions to the introduction, discussion, and BAR analysis sections.

      Author response image 3 is also a good addition and should be expanded to include the GTP and MESA study and possibly mentioned in the paragraph titled "RTE expression positively correlates with BAR gene signature scores except for SINEs." 

      We have updated Author response image 3 (now Author response image 1) to include the GTP and MESA cohorts in the analysis. As shown in Author response image 1, except for the IFN-I and senescence scores in the MESA cohort, which positively correlate with chronological ageing, the gene signatures display no positive correlation with chronological ageing.

      Author response image 1 was originally created to separate the effect of chronological age and RTE expression on BAR gene signature scores. As it was meant to discriminate between BAR and chronological age, it doesn't provide additional information regarding the positive correlation between RTE expression and BAR gene signature that was not already present in the manuscript. Therefore, we did not add it to the manuscript.

      Author response image 1.

      Generalized linear model (GLM) analysis (BAR gene signature scores ~ RTE expression + chronological age). We fitted a separate GLM for each RTE family. Age (RTE family) indicates chronological age when used in the design formula for that specific RTE family.
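      For readers less familiar with the design formula in the caption above, the following is a minimal sketch of such a fit. It assumes a Gaussian family with an identity link, in which case the GLM reduces to ordinary least squares; the family and link actually used by the authors are not stated here. The function names and the toy values are hypothetical and serve only to show how the RTE and age coefficients are estimated jointly, so that each is adjusted for the other.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_gaussian_glm(score, rte, age):
    """Fit score ~ rte + age (identity link) via the normal equations.

    Returns [intercept, beta_rte, beta_age].
    """
    rows = [[1.0, r, a] for r, a in zip(rte, age)]
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(rows, score)) for i in range(3)]
    return solve_linear(xtx, xty)


# Hypothetical toy data: the score depends on RTE expression but not on age.
rte = [0.1, 0.4, 0.3, 0.9, 0.7, 0.2]
age = [30.0, 45.0, 60.0, 52.0, 71.0, 38.0]
score = [1.0 + 2.0 * r + 0.0 * a for r, a in zip(rte, age)]

intercept, beta_rte, beta_age = fit_gaussian_glm(score, rte, age)
print(f"RTE coefficient: {beta_rte:.3f}, age coefficient: {beta_age:.3f}")
```

      In this toy setting the age coefficient comes out at essentially zero while the RTE coefficient is recovered, which is the kind of separation the per-family models are meant to probe.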

      "In this study, we did not compare MESA with GTP etc. We have analysed each dataset separately based on the available data for that dataset. Therefore, sacrificing one analysis because of the lack of information from the other does not make sense. We would do that if we were after comparing different datasets. Moreover, the datasets are not comparable because they were collected from different types of blood samples." 

      Indeed, the datasets are not compared directly, but the associations between age, BAR and TE expression for each dataset are plotted and discussed right next to each other. It is therefore natural to wonder if the differences between datasets are due to differences in the type of blood sample or if they are a consequence of the different probe sets. Using a common set of probes would help answer that question.

      We understand that the reviewer is proposing a method to eliminate possible causes of differences across datasets. However, incorporating such a change would compromise the statistical power of the MESA and GARP cohorts, alter the structure of our analysis, and digress from our main focus. Hence, we decline to use an identical set of probes for all three cohorts.

    2. eLife Assessment

      The study by Tsai et al. employed multi-omics approaches, including transcriptomic, methylomic, and single-cell RNA-seq, and provided a solid and comprehensive analysis of the correlation between retrotransposable element (RTE) expression and biological aging in human blood. Their findings highlight the differential roles of RTE families, providing valuable insights for understanding the mechanisms of human aging.

    3. Reviewer #1 (Public Review):

      Tsai and Seymen et al. investigate associations between RTE expression and methylation and age and inflammation, using multiple public datasets. The text of the manuscript has been polished and the phrasing of several findings has been made clearer and more precise. The authors also provided ample discussion to the prior reviewer comments in their rebuttal, including new analyses.

    4. Reviewer #2 (Public Review):

      Summary:

      Yi-Ting Tsai and colleagues conducted a systematic analysis of the correlation between the expression of retrotransposable elements (RTEs) and aging, using publicly available transcriptional and methylome microarray datasets of blood cells from large human cohorts, as well as single-cell transcriptomics. Although DNA hypomethylation was associated with chronological age across all RTE biotypes, the authors did not find a correlation between the levels of RTE expression and chronological age. However, expression levels of LINEs and LTRs positively correlated with DNA demethylation, and inflammatory and senescence gene signatures, indicative of "biological age". Gene set variation analysis showed that the inflammatory response is enriched in the samples expressing high levels of LINEs and LTRs. In summary, the study demonstrates that RTE expression correlates with "biological" rather than "chronological" aging.

      Strengths:

      The question the authors address is both relevant and important to the fields of aging and transposon biology.

      Comments on latest version:

      The authors introduced the analysis of RNA-seq data, addressing the key concerns raised by Reviewer #1 and myself. They also adopted more explicit terminology in their latest version, reducing ambiguity. The RNA-seq analysis demonstrating that the expression of different transposon groups is not associated with chronological aging is convincing, though, in my opinion, it still lacks granularity.

      I have two minor points:

      (1) Previously, I have mentioned the following:

      "The authors pool signals from RTEs by class or family, despite the fact that these groups include subfamilies and members with very different properties and harmful potentials. For example, while older subfamilies might be expressed through readthrough transcription, certain members of younger groups could be autonomously reactivated and cause inflammation... The aggregation of signals from different RTE biotypes may obscure potential reactivation of smaller groups or specific subfamilies."

      The authors responded that they would lose statistical power by studying RTE subfamilies with limited microarray probes, which is a fair point. However, the suggested analysis could have been conducted using the RNA-seq data they explored in the second round of revision. Choosing not to leverage RNA-seq to increase the granularity of their analysis is a matter of choice. In my opinion, however, the authors could have acknowledged in the discussion that some smaller yet potentially influential RTE species may be masked by their global approach.

      (2) Previously, I mentioned that 10x scRNA-seq is not ideal for analysing RTEs and requested a classical UMAP plot to visualize RTE expression across cell populations. The authors argued that they could only achieve sufficient statistical power by quantifying RTE classes through cumulative read counts for each cell type, which I accept. However, they divided cells into "high" and "low" BAR gene signature groups. I am surprised that the comparison of BAR signature expression between these groups was not presented using standard visualization methods commonly applied in scRNA-seq data analysis.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The study "Endogenous oligomer formation underlies DVL2 condensates and promotes Wnt/β-catenin signaling" by Senem Ntourmas et al. contributes to the understanding of phase separation in Dishevelled (DVL) proteins, specifically focusing on DVL2. It builds upon existing research by investigating the endogenous complexes of DVL2 using ultracentrifugation and contrasting them with DVL1 and DVL3 behavior. The study identifies a DVL2-specific region involved in condensate formation and introduces the "two-step" concept of DVL2 condensate formation, enriching the field's knowledge.

      Strengths: 

      A notable strength of this study is the validation of endogenous DVL2 complexes, providing insights into its behavior compared to DVL1 and DVL3. The functional validation of the DVL C-terminus (here termed conserved domain 2 (CD2)) and the identification of DVL2-specific regions (here termed LCR4) involved in condensate formation are significant contributions that complement the current knowledge on the importance of the DVL DIX domain, the DEP domain and the intrinsically disordered regions between the DIX and PDZ domains. Additionally, the introduction of the concept where oligomerization (step 1) precedes condensate formation (step 2) is an interesting hypothesis, which can be further experimentally challenged in the future.

      We thank the reviewer for her/his interest in our work and for acknowledging our significant contributions to the understanding of DVL2 phase separation.   

      Weaknesses: 

      However, the applicability of the findings to the full-length DVL2 protein, and hence their physiological relevance, is limited. This is mostly because the authors depend almost completely on a set of DVL2 mutants that lack (i) the DEP domain and (ii) the nuclear export signal (NES). These variants fail to establish DEP domain-mediated interactions, including those with FZD receptors. Of note, the DEP domain itself represents a dimerization/tetramerization interface, which could affect the protein condensate formation of these mutants. Possibly even more importantly, the mutants used localize to the nucleus, which has different biochemical and biophysical properties from the cytoplasm, where DVL typically resides, which in turn affects condensate formation. On top of that, most of the DVL binding partners, including relevant kinases that were reported to affect protein condensate formation, are missing in the nucleus.

      The most convincing way to address this valid concern and to support a physiologically relevant role of our findings is to extend our experiments with full-length DVL2, which we did in line with the suggestion in point two (please see below). In addition, we address the specific issues as follows:

      We completely agree that interaction through the DEP domain contributes to condensate formation, which was thoroughly demonstrated in studies by Melissa Gammons and Mariann Bienz, and to complex formation (Fig. 2B, C). We deleted this domain on purpose for our mapping experiments, since we obtained more consistent results without any additional contribution of the DEP domain. Once we mapped CFR and identified crucial amino acids within CFR (VV, FF), we demonstrated that CFR-mediated interaction contributes to complex formation, condensate formation and pathway activation in the context of full-length DVL2 (Fig. 7A-G).

      We also agree that the nuclear localization may affect condensate formation because of the reasons mentioned by the reviewer or others, such as differences in DVL2 protein concentration. However, later proof-of-concept experiments in full-length DVL2 confirmed that CFR and its identified crucial amino acids (VV, FF), which were mapped in this rather artificial nuclear context, contribute to the typical cytosolic condensate formation of DVL2 (Fig. 7C, D). Moreover, we also observed cells with cytosolic condensates for the NES-lacking DVL2 constructs, although to a lower extent as compared to cells with nuclear condensates. A new analysis of NES-lacking key constructs focusing exclusively on cells with cytosolic condensates revealed similar differences between the DVL2 mutants as were observed before when investigating cells with nuclear (and cytosolic) condensates (new Fig. S3E, F), suggesting that the detected differences are not due to nuclear localization but reflect the overall condensation capacity. 

      In addition, our condensate-challenging experiments (osmotic shock, 1,6-hexanediol) suggested that cytosolic condensates of full-length DVL2 and nuclear CFR-mediated condensates of deletion proteins lacking the DEP domain behave quite similarly (Fig. 6A-C).

      Second, the use of an overexpression system, while suitable for comparing DVL2 protein condensate features, falls short in functional assays. The study could benefit from employing established "rescue systems" using DVL1/2/3 knockout cells and re-expression of DVL variants for more robust functional assessments. 

      We used the suggested established rescue system of DVL1/2/3 knockout cells (T-REx DVL1/2/3 triple knockout cells and T-REx DVL1/2/3 RNF43 ZNRF3 penta knockout cells, which are even more sensitive towards DVL re-expression as they lack RNF43/ZNRF3-mediated degradation of DVL activating receptors; both cell lines from the Bryja lab). Upon overexpression, our key mutants DVL2 VV-AA FF-AA and ∆CFR showed markedly reduced pathway activation compared to WT DVL2 (new Figs. 7F and S5J), as we observed before. Especially in the DVL1/2/3 triple knockout cells, DVL2 VV-AA FF-AA hardly activated the pathway and was as inactive as the established M2 mutant (new Fig. 7F). Most importantly, while re-expression of WT DVL2 at close to endogenous expression levels fully rescued Wnt3a-induced pathway activation in DVL1/2/3 knockout cells, DVL2 VV-AA FF-AA revealed significantly reduced rescue capacity and was almost as inactive as DVL2 M2 (new Figs. 7G and S5K). 

      Furthermore, the discussion and introduction overlook some essential aspects of DVL biology. One such example is the importance of the open/close conformation of DVL and its effects on DVL phase separation and activity. In the context of this study, it is important to say that this conformational plasticity is mediated by the DVL C-terminus (CD2 in this study). The second example is the reported roles of DVL1 and DVL3, which can both mediate the Wnt3a signal. How can this be interpreted, given that DVL1 and DVL3 lack LCR4 and still form condensates?

      We included the open/close conformation of DVL in our manuscript (introduction p. 3 and new discussion paragraph p. 10) and discussed it in the context of our findings. It is intriguing to speculate that Wnt-induced opening of DVL2 increases the accessibility of LCR4 and CD2, thereby triggering pre-oligomerization and subsequent phase separation of DVL2 (see discussion).

      We extended the last paragraph of the discussion to interpret the roles of DVL1 and DVL3 lacking LCR4 (see p. 10). In short, the general ability of DVL1 and DVL3 to form condensates and to activate the Wnt pathway can be potentially explained through the other interaction sites (DIX, DEP, intrinsically disordered region). However, previous studies suggest that the DVL paralogs exhibit (quantitative) differences in Wnt pathway activation and that all three paralogs have to interact at a certain ratio for optimal pathway activation. In this context, a physiologic role for DVL2 LCR4 may be to promote the formation of these DVL1/2/3 assemblies and/or to enhance the stability of these assemblies.

      In order to increase the physiological relevance of the study, I would recommend analyzing several key mutants in the context of the full-length DVL2 protein using the rescue/complementation system. Further, a more thorough discussion and connections with the existing literature on DVL protein condensates/puncta/LLPS can improve the impact of the study. 

      We thank the reviewer for her/his suggestions to improve our study, which we addressed as detailed above.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors aimed to identify which regions of DVL2 contribute to its endogenous/basal clustering, as well as the relevance of such domains to condensate/phase separation and WNT activation. 

      Strengths: 

      A strength of the study is the focus on endogenous DVL2 to set up the research questions, as well as the incorporation of various techniques to tackle them. I also found it quite interesting that DVL2-CFR addition to DVL1 increased its MW in density gradients.

      We thank the reviewer for her/his interest in our work and the constructive suggestions to improve our study.

      Weaknesses: 

      I think that several of the approaches of the manuscript are subpar to achieve the goals and/or support several of the conclusions. For example: 

      (1) Although endogenous DVL2 indeed seems to form complexes (Figure 1A), neither the number of proteins involved nor whether those are homo-complexes can be determined with a density gradient. Super-resolution imaging or structural analyses are needed to support these claims. 

      We agree that it will be very interesting to study the nature of the detected endogenous complexes in detail and we will consider this for any follow-up study, as structural analyses were out of scope for the revision of the presented manuscript. To address the issue, we mentioned that the calculation of about eight DVL2 molecules per complex is based on the assumption of homotypic complexes (results p. 4) and we discussed why we think that homotypic complexes are the most likely assumption based on the currently available (limited) data (discussion p. 8).

      (2) Follow-up analyses of the relevance of the DVL2 domains solely rely on overexpressed proteins. However, there were previous questions arising from o/e studies that prompted the focus on endogenous, physiologically relevant DVL interactions, clustering, and condensate formation.

      Although the title, conclusions, and relevance all point to the importance of this study for understanding endogenous complexes, only Figures 1A and B deal with endogenous DVL2. 

      We think that the biochemical detection of endogenous DVL2 complexes itself represents a valuable contribution to the understanding of endogenous DVL clustering, especially (i) since it is still actively debated in the field whether and to what extent endogenous DVL assemblies exist (see introduction) and (ii) since recent studies addressing this issue rely on fluorescent tagging of the endogenous protein, which, despite its benefits, carries the risk of artificially affecting DVL assembly. The follow-up analysis predominantly strengthens this key finding through (i) associating the detected complexes with established (DEP domain) and newly mapped (LCR4) DVL2 interaction sites, which we think is crucial to validate our biochemical approach, and (ii) linking the complexes with condensate formation and pathway activation for functional insights.

      In addition, we performed new experiments with re-expression of DVL2 and our key mutants at close to endogenous expression levels in DVL1/2/3 knockout cells, supporting a physiologically relevant role of our findings (new Figs. 7G and S5K, please also see point (5) below).

      (3) Mutants lacking activity/complex formation, e.g. DVL2_1-418, may need further validation. For instance, DVL2_1-506 (same mutant but with DEP) seems to form condensates and it is functional in WNT signalling (King et al., 20223). These differences could be caused by the lack of DEP domain in this particular construct and/or folding differences. 

      We would definitely expect that DVL2 1-506 exhibits increased condensate formation and pathway activation as compared to DVL2 1-418, since the DEP domain was thoroughly characterized as interaction domain in the Bienz lab and the Gammons lab (see references), which we confirmed in our assays (Fig. 2B-D). However, as the DEP domain is an established DVL2 interaction site, we were not interested to further characterize the DEP domain but to explain the marked difference in complex formation between DVL2 ∆DEP and 1-418 (Fig. 2A-C), which could not be associated with any known DVL2 interaction site and which we finally mapped to CFR (Fig. 4A-D). 

      Since fusion of the newly characterized interaction site CFR to DVL2 1-418 (1-418+CFR) rescued complex formation, condensate formation and signaling activity (Fig. 3B-E and Fig. 4C, D), we think that the lacking activity/complex formation of DVL2 1-418 is more likely due to missing interaction sites than due to folding problems. However, as it is hard to exclude folding differences of deletion mutants, we confirmed the CFR activity through loss-of-function experiments in the context of full-length DVL2 with minimal point mutations (Fig. 7A-G; VV, FF).

      (4) The key mutants, DeltaCFR and VV/FF, only show mild phenotypes. The authors' results suggest that these regions contribute to, but are not necessary for, (i) complex formation (density gradient, Figures 7A and B), (ii) condensate formation (Figures 7C and D), and (iii) WNT activity (Figure 7E). Of note, Figure 7C shows examples for the mutants with no condensates, while the quantification indicates that 50% of the cells do have condensates.

      Condensate formation and Wnt pathway activation by DVL VV-AA FF-AA were reduced by more than 50% as compared to WT (Fig. 7D, E). We consider these marked differences, since loss of function always ranges between 0% and 100%. In newly performed experiments in DVL1/2/3 knockout cells, the differences were even more pronounced, see point (5) below.

      Yes, Fig. 7C shows an example to qualitatively visualize the change in condensate formation, while Fig. 7D provides the corresponding quantification allowing quantitative assessment of the differences.

      (5) Most of the o/e analyses (including all reporter assays) should be performed in DVL1-3 KO cells in order to explore specifically the behaviour of the investigated mutants. 

      As suggested, we employed DVL1/2/3 knockout cells for performing reporter assays (T-REx DVL1/2/3 triple knockout cells and T-REx DVL1/2/3 RNF43 ZNRF3 penta knockout cells, which are even more sensitive towards DVL re-expression as they lack RNF43/ZNRF3-mediated degradation of DVL activating receptors; both cell lines from the Bryja lab). Here, we focused on key mutants in the context of full-length DVL2, as they are closest to the physiologic situation. Upon overexpression, DVL2 VV-AA FF-AA and DVL2 ∆CFR showed markedly reduced pathway activation as compared to WT DVL2 (new Figs. 7F and S5J). Especially in the DVL1/2/3 triple knockout cells, DVL2 VV-AA FF-AA hardly activated the pathway and was as inactive as the established M2 mutant (new Fig. 7F). Moreover, re-expression at close to endogenous expression levels revealed that DVL2 VV-AA FF-AA less efficiently rescued Wnt3a-induced pathway activation as compared to WT (Figs. 7G and S5K).

      (6) How comparable are condensates found in the cytoplasm (usually for wt DVL) with those located in the nucleus (DEP mutants)? 

      In principle, cytosolic condensates could differ from nuclear condensates for various reasons, such as different protein concentrations, different availability of interaction partners or different biochemical/biophysical properties (please also see point 1 of reviewer 1). In our condensate-challenging experiments (osmotic shock, 1,6-hexanediol), cytosolic condensates of full-length DVL2 and nuclear condensates of DVL2 mutants behaved quite similarly (Fig. 6A-C).

      We are confident that the differences between different DEP mutants in our mapping experiments are not due to nuclear localization but reflect the overall condensation capacity because later proof-of-concept experiments demonstrated that CFR, which was identified in these mapping experiments, contributes to cytosolic condensate formation in the context of full-length DVL2 (Fig. 7C, D). Moreover, a new analysis focusing only on cells with cytosolic condensates, which can also be observed for DEP mutants to a low extent, revealed similar differences between key DEP mutants as observed before (Fig. S3E, F; for details please also see point 1 of reviewer 1).

      Several studies in the last two decades have analysed the relevance of DVL homo- and hetero-clustering by relying on overexpressed proteins. Recent studies also explored the possibility of DVL undergoing liquid-liquid phase separation following similar principles. As highlighted by the authors in the introduction, there is a need to understand DVL dynamics under endogenous/physiological conditions. Recent super-resolution studies aimed at that question by characterising endogenously edited DVL2. The authors seemed to aim in the same direction with their initial findings (Figure 1A) but quickly moved to o/e proteins without going back to the initial question. This reviewer thinks that to support their conclusions and advance in this important question, the authors should introduce the relevant mutations in the endogenous locus (e.g. by Cas9 + donor template encoding the required 3' exons, as done by others before for WNT components, including DVL2) and determine their impact in the above-indicated processes.

      We agree that genomic editing of the DVL2 locus would be the cleanest system to study the relevance of CFR at endogenous expression levels. As we did not have the resources to generate the suggested cells, we, as an alternative, transiently re-expressed DVL2 and the respective mutants at low levels that were really close to the endogenous expression levels in DVL1/2/3 triple knockout cells (Fig. S5K). These experiments revealed that DVL2 VV-AA FF-AA less efficiently rescued Wnt3a-induced pathway activation as compared to DVL2 WT (Fig. 7G).

    2. eLife Assessment

      This valuable study contributes to the understanding of phase separation in Dishevelled (DVL) proteins, by investigating the endogenous complexes of DVL2 using ultracentrifugation and contrasting them with DVL1 and DVL3 behaviour and the functional validation of the DVL2 intrinsically disordered regions mediating the protein condensate. The study includes a solid characterisation of several overexpression constructs, including in KO cells. However, investigations of the roles of the described DVL2 regions at the endogenous level remain to be carried out.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors aimed to identify which regions of DVL2 contribute to its endogenous/basal clustering, as well as the relevance of such domains to condensate/phase separation and WNT activation.

      Strengths:

      A strength of the study is the focus on endogenous DVL2 to set up the research questions, as well as the incorporation of various techniques to tackle them. I also found it quite interesting that DVL2-CFR addition to DVL1 increased its MW in density gradients.

      Weaknesses:

      The authors have addressed important drawbacks regarding the overexpression experiments, most notably by including DVL tKO cells in collaboration with Vita. I think that this part has clearly improved.

      Unfortunately, I still stand by my key concern: at this stage in the field, with many papers on DVL overexpression, there is a clear need to address how endogenous DVL behaves. While the attempts to o/e low levels of DVL mutants in tKO cells are useful for validation experiments, the manuscript does not, in my opinion, address the requirements of DVL2 condensation for WNT signalling. Of note, several of the described effects are partial, including in tKO cells, often indicating that the targeted domains contribute to, but are not required for, these processes. I understand that generating endogenous tagged lines or targeting specific endogenous domains is not trivial. But, as indicated in both initial reviews, I think that monitoring endogenous proteins is required to fully address the proposed research question.

      In my opinion, the current manuscript A) shows that endogenous DVL2 forms large complexes in a higher proportion than DVL1/3, B) identifies and describes a couple of motifs that contribute to clustering and signalling in overexpressed DVL, including in tKO cells*, and C) shows that one of those motifs (CFR) rewires o/e DVL1 into behaving similarly to DVL2.

      *On a minor note, I am not sure how DVL tKO cells partially react to Wnt3a in Figure 7G

    1. eLife Assessment

      This work describes a highly complex automated algorithm for analyzing vascular imaging data from two-photon microscopy. This tool has the potential to be extremely valuable to the field and to fill gaps in knowledge of hemodynamic activity across a regional network. The biological application provided, however, has several problems that make many of the scientific claims in the paper incomplete.

    2. Reviewer #1 (Public review):

      Summary:

      In the manuscript, the authors describe a new pipeline to measure changes in vasculature diameter upon optogenetic stimulation of neurons. The work is useful for better understanding the hemodynamic response at a network/graph level.

      Strengths:

      The manuscript provides a pipeline that detects changes in vessel diameter and simultaneously locates the neurons driven by stimulation. The resulting data could provide interesting insights into the graph-level mechanisms regulating activity-dependent blood flow.

      Weaknesses:

      (1) The manuscript contains (new) wrong statements and (still) wrong mathematical formulas.

      (2) The manuscript does not compare results to existing pipelines for vasculature segmentation (open-source or commercial). Comparing the performance of the pipeline to a random forest classifier (ilastik) on images that are not preprocessed (i.e. corrected for background etc.) does not seem a particularly useful comparison.

      (3) The manuscript does not clearly visualize the performance of the segmentation pipeline (e.g. via 2D sections, also highlighting errors etc.). Thus, it is unclear how good the pipeline is, under what conditions it fails or what kind of errors to expect.

      (4) The pipeline is not fully open-source due to its use of MATLAB. Also, the pipeline code was not made available during review, contrary to the authors' claims (the provided link did not lead to a repository). Thus, the utility of the pipeline was difficult to judge.

      Detailed remarks to the revision and new manuscript:

      - Generalizability: The authors addressed the point of generalizability by applying the pipeline to other data sets. This demonstrates that their pipeline can be applied to other data sets and makes it more useful.

      However, the visualizations do not make clear how well the pipeline performs or where it fails. The 3D visualizations are not particularly helpful in this respect.

      In addition, the Dice measure seems quite low, indicating that roughly 20-40% of voxels do not overlap between the inferred and ground-truth segmentations. I did not notice this high discrepancy earlier. A thorough discussion of the errors appearing in the segmentation pipeline would be necessary, in my view, to better assess the quality of the pipeline.
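      As a reminder of what the Dice measure quantifies, the following is a minimal sketch on flattened binary masks. The function name and the toy masks are hypothetical and are not taken from the pipeline under review; the sketch only illustrates the standard definition, twice the overlap divided by the summed sizes of the two segmentations.

```python
def dice_coefficient(predicted, truth):
    """Dice = 2 * |P & T| / (|P| + |T|) for two flattened binary masks."""
    predicted_on = {i for i, v in enumerate(predicted) if v}
    truth_on = {i for i, v in enumerate(truth) if v}
    if not predicted_on and not truth_on:
        return 1.0  # two empty masks agree perfectly by convention
    overlap = len(predicted_on & truth_on)
    return 2 * overlap / (len(predicted_on) + len(truth_on))


# Hypothetical toy masks: a four-voxel vessel segment shifted by one voxel.
predicted = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
truth = [0, 1, 1, 1, 1, 0, 0, 0, 0, 0]

score = dice_coefficient(predicted, truth)
print(f"Dice: {score:.2f}")
```

      For thin structures such as capillaries, a shift of a single voxel already removes a large fraction of the overlap, which is one reason why per-slice error visualizations are more informative than the aggregate score alone.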

    3. Reviewer #2 (Public review):

      The authors have addressed most of my concerns sufficiently. There are still a few serious concerns I have. Primarily, the temporal resolution of the technique still makes me dubious about nearly all of the biological results. It is good that the authors have added some vessel diameter time courses generated by their model. But I still maintain that data sampling every 42 seconds - or even 21 seconds - is problematic. First, the evidence for long vascular responses is lacking. The authors cite several papers:

      Alarcon-Martinez et al. 2020 show and explicitly state that their responses (stimulus-evoked) returned to baseline within 30 seconds. The responses to ischemia are long lasting but this is irrelevant to the current study using activated local neurons to drive vessel signals.

      Mester et al. 2019 show responses that all seem to return to baseline by around 50 seconds post-stimulus.

      O'Herron et al. 2022 and Hartmann et al. 2021 use opsins expressed in vessel walls (not neurons as in the current study) and directly constrict vessels with light. So this is unrelated to neuronal activity-induced vascular signals in the current study.

      There are other papers including Vazquez et al 2014 (PMID: 23761666) and Uhlirova et al 2016 (PMID: 27244241) and many others showing optogenetically-evoked neural activity drives vascular responses that return back to baseline within 30 seconds. The stimulation time and the cell types labeled may be different across these studies which can make a difference. But vascular responses lasting 300 seconds or more after a stimulus of a few seconds are just not common in the literature and so are very suspect - likely at least in part due to the limitations of the algorithm.

      Another major issue is that the time courses provided show that the same vessel constricts at certain points and dilates later. So where in the time course the data is sampled will have a major effect on the direction and amplitude of the vascular response. In fact, I could not find how the "response" window is calculated. Is it from the first volume collected after the stimulation - or an average of some number of volumes? But clearly down-sampling the provided data to 42 or even 21 second sampling will lead to problems. If the major benefit to the field is the full volume over large regions that the model can capture and describe, there needs to be a better way to capture the vessel diameter in a meaningful way.

      It still seems possible that if responses are bi-phasic, then depth dependencies of constrictors vs dilators may just be due to where in the response the data are being captured - maybe the constriction phase is captured in deeper planes of the volume and the dilation phase more superficially. This may also explain why nearly a third of vessels are not consistent across trials - if the direction the volume was acquired is different across trials, different phases of the response might be captured.
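
      The sampling concern can be illustrated with a minimal sketch. It assumes a purely hypothetical biphasic radius response (an early constriction followed by a later dilation, modelled as two Gaussian bumps) and a 42-second volume in which each plane is imaged at a time proportional to its position in the scan. The function names, amplitudes, and time constants are illustrative assumptions and are not taken from the manuscript.

```python
import math

VOLUME_TIME = 42.0  # seconds per volume, as in the original acquisition


def biphasic_response(t, a_con=0.5, t_con=8.0, w_con=5.0,
                      a_dil=0.8, t_dil=35.0, w_dil=12.0):
    """Hypothetical radius change (um) at time t (s) after stimulation.

    An early constriction is followed by a later dilation; all parameter
    values are made up for illustration.
    """
    constriction = -a_con * math.exp(-((t - t_con) / w_con) ** 2)
    dilation = a_dil * math.exp(-((t - t_dil) / w_dil) ** 2)
    return constriction + dilation


def sample_time(depth_fraction, top_down=True):
    """Time after stimulation at which a plane is imaged in the first volume.

    depth_fraction is 0 at the surface and 1 at the deepest plane.
    """
    fraction = depth_fraction if top_down else 1.0 - depth_fraction
    return fraction * VOLUME_TIME


# The same underlying response, sampled at different depths and scan directions.
for depth in (0.2, 0.8):
    for top_down in (True, False):
        t = sample_time(depth, top_down)
        change = biphasic_response(t)
        label = "dilation" if change > 0 else "constriction"
        print(f"depth={depth} top_down={top_down} t={t:.1f}s -> {label}")
```

      Under these assumptions, an identical response is scored as a constriction or a dilation depending only on when the plane happens to be imaged, which is the confound described above.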

      I still have concerns about other aspects of the responses but these are less strong. Particularly, these bi-phasic responses are not something typically seen and I still maintain that constrictions are not common. The authors are right that some papers do show constriction. Leaving out the direct optogenetic constriction of vessels (O'Herron 2022 & Hartmann 2021), the Alarcon-Martinez et al. 2020 paper and others such as Gonzales et al 2020 (PMID: 33051294) show different capillary branches dilating and constricting. However, these are typically found either with spontaneous fluctuations or due to highly localized application of vasoactive compounds. I am not familiar with data showing activation of a large region of tissue - as in the current study - coupled with vessel constrictions in the same region. But as the authors point out, typically only a few vessels at a time are monitored so it is possible - even if this reviewer thinks it unlikely - that this effect is real and just hasn't been seen.

      I also have concerns about the spatial resolution of the data. It looks like the data in Figure 7 and Supplementary Figure 7 have a resolution of about 1 micron/pixel. It isn't stated so I may be wrong. But detecting changes of less than 1 micron, especially given the noise of an in vivo prep (brain movement and so on), might just be noise in the model. This could also explain constrictions as just spurious outputs in the model's diameter estimation. The high variability in adjacent vessel segments seen in Figure 6C could also be explained the same way, since these also seem biologically and even physically unlikely.

      I still think the difference in distance-to-nearest-neuron between dilators and constrictors is insignificant. These points were not addressed - the difference in neuronal density between cortical layers and the ~ 5 micron difference in this parameter between dilators and constrictors. Given the concerns raised above, there is very little confidence in even knowing which vessels constricted and which dilated.

      All-in-all, I think this is potentially a very useful pipeline for automating numerous tasks which are very time consuming and vulnerable to subjective judgements (which leads to reproducibility problems and others). However, I think the challenge of capturing large volumes at high speed and with high resolution is very real and hasn't been adequately accomplished for the claims the authors want to make about their data. It is encouraging that with the right technology, such data could be captured and this pipeline would be excellent for processing it. But given the limitations in the data collection here, I think that many of the biological claims are hard to fully accept.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Review:

      We would like to thank the reviewers for providing constructive feedback on the manuscript. To address their concerns, we have performed additional experiments, analyzed the new data, and revised the manuscript.

      (1) The utility of a pipeline depends on the generalization properties.

      While the proposed pipeline seems to work for the data the authors acquired, it is unclear if this pipeline will actually generalize to novel data sets possibly recorded by a different microscope (e.g. different brand), or different imaging conditions (e.g. illumination or different imaging artifacts) or even to different brain regions or animal species, etc.

      The authors provide a 'black-box' approach that might work well for their particular data sets and image acquisition settings but it is left unclear how this pipeline is actually widely applicable to other conditions as such data is not provided.

      In my experience, without well-defined image pre-processing steps and without training on a wide range of image conditions pipelines typically require significant retraining, which in turn requires generating sufficient amounts of training data, partly defying the purpose of the pipeline.

      It is unclear from the manuscript, how well this pipeline will perform on novel data possibly recorded by a different lab or with a different microscope.

      To address the generalizability of our DL segmentation model, we have performed several validation experiments, deploying our model on out-of-distribution data that 1) had distinct channels, 2) were acquired in a different species (rat) with a different vascular fluorescent label and a different imaging protocol, and 3) were acquired on a different microscope and with a different vascular label. We first used our model to segment images (507x507 um lateral FOV, 170-250 um axial range) from three C57BL/6 mice imaged on the same two-photon fluorescence microscope following the same imaging protocol. The vasculature was labelled by intravenous injection of Texas Red dextran (70 kDa MW, Thermo Fisher Scientific Inc, Waltham MA), as in the current experiment. In lieu of the EYFP signal from pyramidal neurons that was present in the original data, we added Gaussian noise with a mean and standard deviation identical to the acquired vascular channel in the out-of-distribution dataset. Second, we applied our model to images (507x507 um lateral FOV, 300-400 um axial range) from two Fischer rats that were injected with 2000-kDa Alexa680-dextran via a tail vein catheter. These rats were imaged on the same two-photon fluorescence microscope, but with galvano scanners (instead of resonant scanners). As before, a second channel of Gaussian noise was added to simulate the missing EYFP signal. Finally, we segmented an image of vasculature from an ex-vivo cleared mouse brain (1665x1205x780 um) acquired on a light sheet fluorescence microscope (Miltenyi UltraMicroscope Blaze), with Lectin-DyLight 649 labelling the vessel walls. The Dice score, precision, recall, Hausdorff 95%, and mean surface distance were reported for segmentations of the 2PFM data sets, following the generation of ground truth images by assisted manual segmentation in ilastik. Examples of the generated segmentation masks are presented in Supplementary Figure 9 for visual comparison.
We have described the image pre-processing steps/transforms before model inference in the revised Methods section. In general, should the segmentation results on a data set be deemed unsatisfactory, our model can be further fine-tuned on out-of-distribution data. Furthermore, the image analyses downstream from segmentation are applicable irrespective of the method utilized to arrive at a robust vascular segmentation.
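
      For readers less familiar with the overlap metrics named above, the following is a minimal sketch of how the Dice score, precision, and recall are conventionally computed from a predicted and a ground-truth binary mask. It uses the standard textbook definitions on flattened 0/1 sequences; the function name and the toy masks are assumptions for illustration and do not reproduce the authors' evaluation code.

```python
def overlap_metrics(pred, truth):
    """Dice, precision and recall for two flattened binary masks (0/1 values)."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Count true positives, false positives and false negatives voxel by voxel.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    # Two empty masks are treated as a perfect match.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return dice, precision, recall


# Toy example: ten voxels, one false positive and one false negative.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
dice, precision, recall = overlap_metrics(pred, truth)
print(dice, precision, recall)
```

      Because Dice counts the overlap twice in the numerator, it is not the same as the fraction of overlapping voxels, which is worth keeping in mind when reading the reported scores.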

      Author response table 1.

      Dataset performance comparison for UNETR

      (2) Some of the chosen analysis results seem to not fully match the shown data, or the visualization of the data is hard to interpret in the current form.

      We have updated the visualizations to make them more accessible and ensure close correspondence between tables and figures.

      (3) Additionally, some measures seem not fully adapted to the current situation (e.g. the efficiency measure does not consider possible sources or sinks). Thus, some additional analysis work might be required to account for this.

      Thank you for your comment. The efficiency metric was selected as it does not consider sources or sinks. We do agree that accounting for vessel subtypes in the analysis (thus classifying larger vessels as either suppliers/sources or drainers/sinks) would be very useful; however, this classification is extremely laborious, as we have noted in our prior work (1). We are therefore leveraging machine learning in a parallel project to afford vessel classification by type. Notwithstanding, the source/sink analysis based on in vivo 2PFM data is confounded by the small FOV.

      (4) The authors apply their method to in vivo data. However, there are some weaknesses in the design that make it hard to accept many of the conclusions and even to see that the method could yield much useful data with this type of application. Primarily, the acquisition of a large volume of tissue is very slow. In order to obtain a network of vascular activity, large volumes are imaged with high resolution. However, the volumes are scanned once every 42 seconds following stimulation. Most vascular responses to neuronal activation have come and gone in 42 seconds so each vessel segment is only being sampled at a single time point in the vascular response. So all of the data on diameter changes are impossible to compare since some vessels are sampled during the initial phase of the vascular response, some during the decay, and many probably after it has already returned to baseline. The authors attempt to overcome this by alternating the direction of the scan (from surface to deep and vice versa). But this only provides two sample points along the vascular response curve and so the problem still remains.

      We thank the Reviewer for bringing up this important point. Although vessels can show relatively rapid responses to perturbation, vascular responses to photostimulation of Channelrhodopsin-2 in neighbouring neurons are long-lasting: they do not come and go in 42 seconds. To demonstrate this point, we acquired higher temporal-resolution images of smaller volumes of tissue over 5 minutes preceding and 5 minutes following the 5-s photoactivation with the original photostimulation parameters. The imaging protocol was different in that we utilized a piezoelectric motor, a smaller field of view (512 um x (80-128) um x (34-73) um), and only 3x frame averaging, resulting in a temporal resolution of 1.57-3.17 seconds per frame. This acquisition was repeated at different cortical depths in three Thy1-ChR2 mice and the vascular radii were estimated using our presented pipeline. Significantly responding vessels here were selected via an F-test of radius estimates before vs. after stimulation. LOESS fits to the time-dependent radius of significantly responding vessels are shown in Supplementary Figure 5. Vessels shorter than 20 um in length were excluded from the analysis so as to focus on vessel segments where averaging the vascular radius over many vertices was possible. A video of one of the acquisitions is shown along with the timecourses of select vessels’ calibre changes in Author response image 1. The vascular calibre changes following photostimulation persisted for several minutes, consistent with earlier observations by us and others (2-5). These small-volume acquisitions demonstrated that dilations repeatedly lasted longer than 42 seconds (i.e. our original temporal resolution).

      Our temporal sampling was chosen to permit a large field of view acquisition while still being well within the span of the vascular response to look at larger scale vascular coordination that has not previously been studied. The pipeline readily adapts to smaller fields of view at a finer temporal sampling, though such an acquisition precludes the study of the response coordination across hundreds of vessels. While a greater number of baseline frames would help with the baseline variability estimation, maintaining animals under anesthesia during prolonged imaging is exceedingly difficult, precluding us from extending our total acquisition time.

      Author response image 1.

      Estimated vascular radius at each timepoint for select vessels from the imaging stack shown in the following video: https://flip.com/s/kB1eTwYzwMJE

      (5) A second problem is the use of optogenetic stimulation to activate the tissue. First, it has been shown that blue light itself can increase blood flow (Rungta et al 2017). The authors note the concern about temperature increases but that is not the same issue. The discussion mentions that non-transgenic mice were used to control for this with "data not shown". This is very important data given these earlier reports that have found such effects and so should be included.

      We have updated the manuscript to incorporate the data on volumetric scanning in (nontransgenic) C57BL/6 mice undergoing blue light stimulation, with parameters identical to those used in Thy1-ChR2 mice (Supplementary Figure 8). As before, responders were identified as vessels that, following blue light stimulation, showed a radius change greater than twice their baseline radius standard deviation: their estimated radius changes are shown in Supplementary Figure 8. There was no statistical difference between the radius distributions of any of the photostimulation conditions and the pre-photostimulation baseline.

      (6) Secondly, there doesn't seem to be any monitoring of neural activity following the photo-stimulation. The authors repeatedly mention "activated" neurons and claim that vessel properties change based on distance from "activated" neurons. But I can't find anything to suggest that they know which neurons were active versus just labeled. Third, the stimulation laser is focused at a single depth plane. Since it is single-photon excitation, there is likely a large volume of activated neurons. But there is no way of knowing the spatial arrangement of neural activity and so again, including this as a factor in the analysis of vascular responses seems unjustified.

      Given the high fidelity of Channelrhodopsin-2 activation with blue light photostimulation found by us and others (3), we assume that all labeled neurons within the volume of photostimulation are being activated. Depending on their respective connectivities, their postsynaptic neurons (whether or not they are labeled) may also get activated. We therefore agree with the reviewer that the spatial distribution of neuronal activation is not well defined. The manuscript has been revised to update the terminology from activated to labeled neurons and to stress in the Discussion that the motivation for assessing the distance to the closest labeled neuron as one of our metrics is purely to demonstrate the possibility of linking vascular responses to activations in neighbouring neurons and of including morphological metrics in the computational pipeline.

      (7) The study could also benefit from more clear illustration of the quality of the model's output. It is hard to tell from static images of 3-D volumes how accurate the vessel segmentation is. Perhaps some videos going through the volume with the masks overlaid would provide some clarity. Also, a comparison to commercial vessel segmentation programs would be useful in addition to benchmarking to the ground truth manual data.

      We generated a video demonstrating the deep-learning model outputs and have made the video available here: https://flip.com/s/_XBs4yVxisNs. We aimed to develop an open-source method for the research community as the vast majority of groups do not have access to commercial software for vessel segmentation.

      (8) Another useful metric for the model's success would be the reproducibility of the vessel responses. Seeing such a large number of vessels showing constrictions raises some flags and so showing that the model pulled out the same response from the same vessels across multiple repetitions would make such data easier to accept.

      We have generated a figure demonstrating the repeatability of the vascular responses following photostimulation in a volume and presented them next to the corresponding raw acquisitions for visual inspection (Supplementary Figure 6). It is important to note that there is significant biological variability in vessels’ responses to repeated stimulation, as described previously (3,6): a well-performing model should be able to quantify biological heterogeneity, as it may itself be of interest. Constrictions have been reported in the literature by our group and others (1,2,4,5,7), though their prevalence has not been systematically studied to date. Concerning the reproducibility of our analysis, we have demonstrated model reproducibility (as a metric of its success) on a dataset where vessels visually appeared to dilate consistently following 452 nm light stimulation: these results are now presented in Supplementary Figure 6 of the revised Manuscript. We thus observed that the model repeatedly detected the vessels that appeared to dilate on visual inspection as dilating. Examples of vessels constricting repeatedly were also examined and maximal intensity projections of the vessel before and after photostimulation inspected, confirming their repeated constriction (Author response image 2).

      It is also worth noting that while the presence of the response (defined as change above 2 standard deviations of the radius across baseline frames) was infrequent (2107 vessels responded at least once, out of a total of 10,552 unique vessels imaged), the direction of the response was highly consistent across trials. Given twice the baseline variability as the threshold for response, of the vessels that responded more than once, 31.7% dilated on some trials while constricting on others; 41.1% dilated on each trial; and 27.2% constricted on each trial. (Note that some trials use 1.1 vs. 4.3 mW/mm2 and some have opposite scanning directions).
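
      The thresholding rule described above (a response is a change exceeding twice the baseline standard deviation) and the per-vessel tally of direction across trials can be sketched as follows. This is an illustrative reading only: how the post-stimulation radius is summarised is not specified in this response, so taking the mean over post-stimulation frames is an assumption, as are the function names and the toy numbers.

```python
from statistics import mean, stdev


def classify_response(baseline_radii, post_radii, n_sd=2.0):
    """Label one vessel on one trial as 'dilation', 'constriction' or 'none'.

    baseline_radii: radius estimates (um) across baseline frames.
    post_radii: radius estimates (um) after stimulation (assumed to be averaged).
    """
    base_mean = mean(baseline_radii)
    base_sd = stdev(baseline_radii)  # sample SD across baseline frames
    change = mean(post_radii) - base_mean
    if abs(change) <= n_sd * base_sd:
        return "none"
    return "dilation" if change > 0 else "constriction"


def direction_consistency(trial_labels):
    """Summarise the response direction of a vessel that responded more than once."""
    responses = [label for label in trial_labels if label != "none"]
    if len(responses) < 2:
        return "insufficient"
    if all(label == "dilation" for label in responses):
        return "always dilated"
    if all(label == "constriction" for label in responses):
        return "always constricted"
    return "mixed"


# Toy vessel: four baseline frames, then three trials with different outcomes.
baseline = [2.0, 2.1, 1.9, 2.0]
labels = [classify_response(baseline, post) for post in ([2.5], [1.7], [2.1])]
print(labels, direction_consistency(labels))
```

      With only a handful of baseline frames the standard deviation estimate is itself noisy, so the threshold, and therefore the responder count, can shift noticeably between vessels.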

      Author response image 2.

      Sample capillary constrictions from maximum intensity projections at repeated time points following optogenetic stimulation. The baseline (pre-stimulation) image is shown on the left and the post-stimulation image is on the right, with the estimated radius changes listed to the left.

      (9) A number of findings are questionable, at least in part due to these design properties. There are unrealistically large dilations and constrictions indicated. These are likely due to artifacts of the automated platform. Inspection of these results by eye would help understand what is going on.

      Some of the dilations were indeed large in magnitude. We present select examples of large dilations and constrictions ranging in magnitude from 2.08 to 10.80 um for visual inspection (Author response image 3). For reference, the magnitude of radius changes, averaged across vessels and stimuli, was 0.32 +/- 0.54 um. Diameter changes above 5 um were visually inspected.

      Author response image 3.

      Additional views of diameter change in maximum intensity projections ranging in magnitude from 2.08 um to 10.80 um.

      (10) In Figure 6, there doesn't seem to be much correlation between vessels with large baseline level changes and vessels with large stimulus-evoked changes. It would be expected that large arteries would have a lot of variability in both conditions and veins much less. There is also not much within-vessel consistency. For instance, the third row shows what looks like a surface vessel constricting to stimulation but a branch coming off of it dilating - this seems biologically unrealistic.

      We now plot photostimulation-elicited vessel-wise radius changes vs. their corresponding baseline radius standard deviations (Author response image 4). The Pearson correlation coefficient between the baseline standard deviation and the radius change was 0.08 (p<1e-5) for 552 nm 4.3 mW/mm^2 stimulation, -0.08 (p<1e-5) for 458 nm 1.1 mW/mm^2 stimulation, and -0.04 (p<1e-5) for 458 nm 4.3 mW/mm^2 stimulation. For non-control (i.e. blue) photostimulation conditions, the change in the radius is thus negatively correlated to the vessel’s baseline radius standard deviation: this small negative correlation indicates that there is little correlation between vessel radius change and the baseline variability in the vessel radius. Classification of vessels by type (arteries vs. veins) is needed before we can comment on differences between these vascular components. The between-vessel (i.e. between parent vessels and their daughter branches separated by branch points) consistency is explicitly evaluated by the assortativity metric, in Figure 9: vessels do somewhat tend to react similarly to their downstream branches: we observed a mean assortativity of 0.4. As for the instance of a surface vessel constricting while a downstream vessel dilates, it is important to remember that the 2PFM FOV restricts us to imaging a very small portion of the cortical microvascular network: one (among many) daughter vessels showing changes in the opposite direction to the parent vessel is not violating the conservation of mass; in addition, mural cells on adjacent branches can respond differently.

      Author response image 4.

      Vessel radius change elicited by photostimulation vs. baseline radius standard deviation across all vessels. The threshold level for response identification is shown as the black line.

      (11) As mentioned, the large proportion of constricting capillaries is not something found in the literature. Do these happen at a certain time point following the stimulation? Did the same vessel segments show dilation at times and constriction at other times? In fact, the overall proportion of dilators and constrictors is not given. Are they spatially clustered? The assortativity result implies that there is some clustering, and the theory of blood stealing by active tissue from inactive tissue is cited. However, this theory would imply a region where virtually all vessels are dilating and another region away from the active tissue with constrictions. Was anything that dramatic seen?

      The kinetics of the vascular responses are not accessible via the current imaging protocol and acquired data; however, this computational pipeline can readily be adapted to test hypotheses surrounding the temporal evolution of the vascular responses, as shown in Supplementary Figure 2 (with higher temporal-resolution data). Some vessels dilate at some time points and constrict at others as shown in Supplementary Figure 2. As listed in Table 2, 4.4% of all vessels constrict and 7.5% dilate for 452nm stimulation at 4.3 mW/mm^2. There was no obvious spatial clustering of dilators or constrictors: we expect such spatial patterns to be more common with different modes of stimulation and/or in the presence of pathology. The assortativity peaked at 0.4 (quite far from 1 where each vessel’s response exactly matches that of its neighbour).

      (12) Why were nearly all vessels > 5um diameter not responding >2SD above baseline? Did they have highly variable baselines or small responses? Usually, bigger vessels respond strongly to local neural activity.

      In Author response image 5, we now present the stimulation-induced radius changes vs. baseline radius variability across vessels with a radius greater than 5 um. The Pearson correlation between the radius change and the baseline radius standard deviation across time was low: r=0.05 (p=0.5) for 552 nm 4.3 mW/mm^2 stimulation, r=-0.27 (p<1e-5) for 458 nm 1.1 mW/mm^2 stimulation, and r=-0.31 (p<1e-5) for 458 nm 4.3 mW/mm^2 stimulation. These results demonstrate that the changes following optogenetic stimulation are lower than twice the baseline standard deviation across time for most of these vessels. The pulsatility of arteries results in significant variability in their baseline radius (8); in turn, literature to date suggests very limited radius changes in veins. Both of these effects could contribute to the radius response not being detected in many larger vessels.

      Author response image 5.

      The change in the vessel radius elicited by photostimulation vs. baseline vessel radius standard deviation in vessels with a baseline radius greater than 5 um. The threshold level for response identification is shown as the black line.

      References

      (1) Mester JR, Rozak MW, Dorr A, Goubran M, Sled JG, Stefanovic B. Network response of brain microvasculature to neuronal stimulation. NeuroImage. 2024;287:120512. doi:10.1016/j.neuroimage.2024.120512

      (2) Alarcon-Martinez L, Villafranca-Baughman D, Quintero H, et al. Interpericyte tunnelling nanotubes regulate neurovascular coupling. Nature. 2020;585(7823):91-95. doi:10.1038/s41586-020-2589-x

      (3) Mester JR, Bazzigaluppi P, Weisspapir I, et al. In vivo neurovascular response to focused photoactivation of Channelrhodopsin-2. NeuroImage. 2019;192:135-144. doi:10.1016/j.neuroimage.2019.01.036

      (4) O’Herron PJ, Hartmann DA, Xie K, Kara P, Shih AY. 3D optogenetic control of arteriole diameter in vivo. Nelson MT, Calabrese RL, Nelson MT, Devor A, Rungta R, eds. eLife. 2022;11:e72802. doi:10.7554/eLife.72802

      (5) Hartmann DA, Berthiaume AA, Grant RI, et al. Brain capillary pericytes exert a substantial but slow influence on blood flow. Nat Neurosci. Published online February 18, 2021:1-13. doi:10.1038/s41593-020-00793-2

      (6) Mester JR, Bazzigaluppi P, Dorr A, et al. Attenuation of tonic inhibition prevents chronic neurovascular impairments in a Thy1-ChR2 mouse model of repeated, mild traumatic brain injury. Theranostics. 2021;11(16):7685-7699. doi:10.7150/thno.60190

      (7) Hall CN, Reynell C, Gesslein B, et al. Capillary pericytes regulate cerebral blood flow in health and disease. Nature. 2014;508(7494):55-60. doi:10.1038/nature13165

      (8) Meng G, Zhong J, Zhang Q, et al. Ultrafast two-photon fluorescence imaging of cerebral blood circulation in the mouse brain in vivo. Proc Natl Acad Sci U S A. 2022;119(23):e2117346119. doi:10.1073/pnas.2117346119

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Line 207: a superfluous '.' before the references.

      This has been corrected.

      Line 273 ff:

      While the metrics are described in mathematical terms which is very useful, the appearing distances (d) and mathematical symbols are not. While mostly intuitively clear, precise definitions of all symbols introduced should be given to avoid ambiguities.

      The description has been clarified.

      This applies to all formulas appearing in the manuscript and the authors might want to check them carefully.

      We have updated them wherever needed.

      The mean surface distance seems not to reflect the mean MINIMAL surface distance but just the overall mean surface distance. Or a different definition of the appearing symbols is used, highlighting the need for introducing every mathematical symbol carefully.

      The definitions have been updated for clarity, specifying the distinction between Hausdorff 95% distance and mean surface distance.

      Line 284:

      It is unclear to me why center-line detection was performed in MATLAB and not Python. Using multiple languages/software packages and in addition relying on one that is not freely available/open source makes this tool much less attractive as a real open-source tool for the community. The authors stress in the manuscript abstract that their pipeline is an open and accessible tool, the use of MATLAB defies this logic to some extent in my view.

      Centerline detection for large volumetric data is available in Python, see e.g. the SciPy packages, as well as ClearMap or VesselVio for large data sets.

      We tested the centerline detection in Python (scipy 1.9.3) and MATLAB. We found that the MATLAB implementation performed better due to its inclusion of a branch length parameter for the identification of terminal branches, which greatly reduced the number of false branches; the Python implementation does not include this feature (in any version) and its output had many more such “hair” artifacts. ClearMap skeletonization uses an algorithm by Palagyi & Kuba (1999) to thin segmentation masks, which does not include hair removal. VesselVio uses a parallelized version of the scipy implementation of the Lee et al. (1994) algorithm, which does not do hair removal based on a terminal branch length filter; instead, VesselVio performs a threshold-based hair removal that is frequently overly aggressive (it removes true positive vessel branches), as highlighted by the authors.

      Moreover, the authors mention that robust center-line detection was critical. In my view, robust center-line extraction typically requires some additional processing of the binarized data, e.g. using a binary smoothing step. Various binary smoothers are available in the literature and as Python code.

      Indeed, binary smoothing was performed: background “holes” located within the vasculature were filled; the masks were dilated (3x) and then eroded to the centreline. Scipy’s binary closing function smoothes the morphology of binary segmentation masks by dilating and then eroding the segmentation masks (as a part of the selected skeletonization algorithm).
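
      As a rough illustration of the dilate-then-erode (binary closing) smoothing mentioned above, here is a minimal, dependency-free sketch on a small 2D mask. It assumes a 4-connected structuring element and pads the mask with background so that the image border does not distort the erosion; these choices and all names are assumptions for illustration, not the authors' implementation. In practice a library routine such as `scipy.ndimage.binary_closing` would be used on the 3D masks.

```python
CROSS = ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connected neighbourhood


def _pad(mask, n):
    """Surround the mask with n rows/columns of background."""
    cols = len(mask[0]) + 2 * n
    top = [[0] * cols for _ in range(n)]
    body = [[0] * n + list(row) + [0] * n for row in mask]
    bottom = [[0] * cols for _ in range(n)]
    return top + body + bottom


def _step(mask, dilate):
    """One dilation (any neighbour set) or erosion (all neighbours set) step."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [
                mask[r + dr][c + dc]
                if 0 <= r + dr < rows and 0 <= c + dc < cols else 0
                for dr, dc in CROSS
            ]
            out[r][c] = int(any(vals)) if dilate else int(all(vals))
    return out


def binary_closing(mask, iterations=1):
    """Dilate `iterations` times, then erode `iterations` times (iterations >= 1)."""
    work = _pad(mask, iterations)
    for _ in range(iterations):
        work = _step(work, dilate=True)
    for _ in range(iterations):
        work = _step(work, dilate=False)
    n = iterations
    return [row[n:-n] for row in work[n:-n]]


# Toy vessel cross-section with a one-pixel hole in the middle.
vessel = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
closed = binary_closing(vessel)
```

      In this toy case the hole is filled while the outer boundary of the vessel is left unchanged, which is the property that makes closing useful before skeletonization.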

      Line 303:

      'RBC' is not defined (red blood cells?)

      This has been updated.

      Line 398:

      pPhotonsimulation -> Photostimulation

      This has been corrected.

      Line 400 ff: Efficiency:

      I am not sure how useful the measure really is without any information about the 'sources' (i.e. arteries) and sinks (i.e. veins) as blood does not need to be moved between any two arbitrary nodes.

      While blood reversals are observed, blood is typically not moved arbitrarily between two arbitrary nodes in capillary networks.

      We agree with the reviewer that classifying the vessels by type is important and are currently working on deep learning-based algorithms for the classification of microvasculature into arterioles and venules for future work.

      In addition, short paths between two nodes with low resistivity will potentially dominate the sum and the authors excluded vessels 10um and above. This threshold seems arbitrary.

      The 10-um diameter threshold was not applied in the computation of the network metrics. The 10-um thresholding was restricted to “capillary” identification in Figure 8: the 10-um cutoff for referring to a vessel as a capillary has long been applied in the literature [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11].
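
      For context on the efficiency discussion, the following is a minimal sketch of the standard global efficiency of a graph: the mean of the inverse shortest-path length over all ordered node pairs, with no notion of sources or sinks. It assumes an unweighted graph where path length is the number of vessel segments traversed; the manuscript's exact edge weighting (for example by resistivity) is not given in this exchange, so this is illustrative only and the names are hypothetical.

```python
from collections import deque


def shortest_hops(adj, src):
    """Breadth-first search: number of edges from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj):
    """Mean inverse shortest-path length over all ordered pairs of distinct nodes.

    Unreachable pairs contribute zero, so disconnected graphs are handled.
    """
    nodes = list(adj)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for u in nodes:
        dist = shortest_hops(adj, u)
        total += sum(1.0 / d for v, d in dist.items() if v != u)
    return total / (n * (n - 1))


# Toy networks: branch points as nodes, vessel segments as edges.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
loop = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
print(global_efficiency(chain), global_efficiency(loop))
```

      Because every node pair is weighted equally, short paths contribute the largest terms to the sum, which is the behaviour the reviewer points to above.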

      Figure 3:

      It's unclear what the units are for the Mean Surface and Hausdorff Distances (pixel or um?).

      The units have now been specified (um).

      Figure 4:

      The binarized data, and particularly the crops are difficult to interpret in black and white. It would be much more useful to present the segmentation results in a way that is interpretable (e.g. improving the rendering of the 3d information, particularly in the crops by using shadows or color codes for depth, etc).

      We have updated these visualizations and shaded them based on cortical depth.

      Panel C indicates that ilastik is performing badly due to changes in imaging conditions (much higher background level). As pointed out before, in my view, a reasonable pipeline should start by removing and standardizing background levels as well as dynamic ranges and possibly other artifacts before performing a more detailed analysis. This would also make the pipeline more robust against data from other microscopes etc as only a few preprocessing parameters might need to be adjusted.

      I wonder whether after such a pre-processing step, UNET / UNETR would still perform in a way that was superior to ilastik, as ground truth data was generated with the aid of ilastik initially.

      The Ilastik model is based on semi-automatically generated foreground labels in small batches. We had to break the data up into small groups during manual labelling as larger groups were not able to run due to the computational limits of Ilastik. Ilastik is typically trained in an iterative fashion on a few patches at a time because it takes 2-3 hours per patch to train and the resulting model does not generalize to the remaining patches or to out-of-distribution data, even with image pre-processing steps. Following the reviewer's comment, we did try inputting normalized images into Ilastik, but this did not improve its results. UNET and UNETR inputs have been normalized for signal intensities.

      Typical pre-processing/standard computer vision techniques with parameter tuning do not generalize to out-of-distribution data with different image characteristics, motivating the shift to DL-based approaches.

      Figure 5:

      This is a validation figure that might be better shown in an appendix or as a supplement.

      Since this is a methodological paper, we think it is important to highlight the validation of the proposed method.

      Line 476:

      It's surprising that the number of vessel segments almost doubles when taking the union. Is the number of RBC plugs expected to be so high?

      The etiology of discontinuities includes, but is not limited to, RBC plugs; we expect discontinuities to arise also from a very short pixel dwell time (0.067us) of the resonant scanning and have indeed observed apparent vessel discontinuities on resonant scanning that are not present with Galvano scanning using a pixel dwell time of 2us.

      Section 4.4 / 4.5 :

      The analysis in these sections provides mostly tables with numbers that are more difficult to read and hides possible interesting structures in the distribution of the various measures/quantities. For example, why is 5um a good choice to discriminate between small and large vessels, why not resolve this data more precisely via scatter plots?

      Some distributions are shown in the appendix and could be moved to the main analysis.

      Generally, visualizing the data and providing more detailed insights into the results would make this manuscript more interesting for the general reader.

      The radius of vessel segments drops off after 5.0 um, as shown in Supplementary Figure 4A. The 10-um diameter thresholding is based on prior literature [1], [12], [13], [14], [15], [16], [17], [18], [19] and is used to segregate different vessel types in a conservative manner. The smallest capillaries are expected to have pericytes on their vessel walls whereas arteries are expected to have smooth muscle cells on their vessel walls. These differences in mural cells also may lead to differences in respective vessels’ reactivity.

      The data summarized in Tables 1 and 2 are shown as scatter plots in Figures 8, Supplementary Fig 4 and Supplementary Fig 5.

      Line 556:

      The authors deem a certain change in radius as the relevant measure for responding vessels. They deem a vessel responding if it dilates by twice the std deviation in the radius.

      Based on this measure they find that large vessels rarely respond.

      However, I think this analysis might obscure some interesting effects:

      (1) The standard deviation of the radius depends on the correct estimation of the center point. Given the limited spatial resolution the center point (voxel) obtained from the binarization and skeletonization might not lie in the actual center of the vessel. This effect will be stronger for larger vessels. Center point coordinates should thus be corrected to minimize the std in radius.

      (2) Larger vessels will not necessarily have a perfectly circular shape, and thus the std measure is not necessarily a good measure of 'uncertainty' of estimating the actual radius.

      (3) The above reasons possibly contribute to the fact that from Figure 6 it seems vessels with larger radii have higher std in general (as indicated above some more detailed visualization of the data instead of plain tables could reveal such effects better, e.g. scatter radius vs std). This higher std is making it harder to detect changes in larger vessels. However, with respect to the blood flow, the critical factor is the cross-section of the vessel that scales with the radius squared. Thus, a fixed change in radius for a vessel (say 1um) will induce a larger increase in the flow rate in larger vessels as the change in cross-section is also proportional to the radius of the vessel.

      Thus, larger vessels to be deemed responders should probably have lower thresholds, thresholds should be taken on the cross-section change, or at least thresholds should not be higher for larger vessels as it is the case now using the higher std.
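
      The scaling argument in point (3) can be written out as a short worked relation. It assumes a circular cross-section of radius $r$ and a radius change $\Delta r$; both symbols are introduced here for illustration only.

```latex
A = \pi r^{2}, \qquad
\Delta A = \pi (r + \Delta r)^{2} - \pi r^{2}
         = \pi \left( 2 r \, \Delta r + \Delta r^{2} \right)
```

      For a fixed $\Delta r$ = 1 um, this gives $\Delta A \approx 15.7$ um^2 at $r$ = 2 um and $\Delta A \approx 34.6$ um^2 at $r$ = 5 um, so the same radius change corresponds to a larger change in cross-section in the larger vessel.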

      (1) The radius estimate does not depend on the precise placement of the center point as the radius is not being estimated by the distance from the center point to the boundary of the vessel. Instead, our strategy is to estimate the cross-sectional area (A) of the vessel by the Riemann sum of the sectors with the apex at the center point; the radius is then quoted as sqrt(A/pi) (Supplementary figure 3B). The radius estimates from the individual cross-sectional planes, placed every ~1um along the vessel length, are then averaged. The uncertainty in the cross-sectional plane’s vessel radius, the uncertainty in the vessel radius (upon averaging the cross-sectional planes), and the uncertainty in the radius estimate across repeated measures of a state (i.e. across different samples of the baseline vs. post-photostimulation states) are all reported, and the last one is used to define responding vessels.

      To demonstrate the insensitivity to the precise placement of the vessel’s centrepoint, we have jittered the centreline within the plane perpendicular to the vessel tangent at each point along the vessel and then estimated the mean radius in 71 cross-sectional planes of larger vessels (mean radius > 5 um). The percent difference in the estimated radius at our selected vessel centrepoints vs. the jittered centrepoints is plotted above. The percent difference in the mean radius estimated was 0.64±3.44% with 2.45±0.30 um centrepoint jittering. (In contrast, photostimulation was estimated to elicit an average 25.4±18.1% change in the magnitude of the radius of larger vessels, i.e. those with a baseline radius >5um.)

      (2) Indeed, the cross-sectional areas of either large or small vessels are not circles. Consequently, we are placing the vessel boundary, following other published work[20], at the minimum of the signal intensity gradients computed along thirty-six spokes emanating from the centrepoint (cf Figure 2H,K). The cross-sectional area of the vessel in the said cross-sectional plane is then estimated by summing the areas of the sectors flanked by neighbouring spokes. We do not make an assumption about the cross-sectional area being circular. We report radii of circles with the equivalent area as that of the cross-sectional areas merely for ease of communication (as most of the literature to date reports vessel radii, rather than vessel cross-sectional areas.)

      To demonstrate the robustness of this approach, we show the sensitivity of vessel-wise radius estimate on the number of spokes used to estimate the radius in Supplementary Figure 3a. The radius estimate converges after 20 spokes have been used for estimation. Our pipeline utilizes 36 spokes and then excludes minima that lie over 2 STD away from the mean radius estimate across those 36 spokes. With 36 spokes, the vesselwise mean radius estimation was within 0.24±0.62% of the mean of radius estimates using 40-60 spokes.

      (3) Across-baseline sample uncertainty in vessel radius is not dependent on baseline vessel caliber (i.e. this uncertainty is not larger in larger vessels).
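
      The sector-sum estimate described in points (1) and (2) can be sketched as follows. This is a minimal illustration under stated assumptions: spokes are equally spaced in angle, each sector area is approximated as r_i^2 * dtheta / 2, and the exclusion of boundary minima more than 2 STD from the mean is omitted. The function names and the synthetic circular vessel are hypothetical and are not taken from the pipeline.

```python
import math


def equivalent_radius(spoke_lengths_um):
    """Radius of the circle whose area equals the summed sector areas."""
    n = len(spoke_lengths_um)
    dtheta = 2.0 * math.pi / n  # equal angular spacing between spokes
    # Riemann sum of the polar area integral: A = 1/2 * sum(r_i^2 * dtheta)
    area = 0.5 * dtheta * sum(r * r for r in spoke_lengths_um)
    return math.sqrt(area / math.pi)


def spokes_from_offset_apex(radius_um, offset_um, n_spokes=36):
    """Spoke lengths for an ideal circular vessel seen from an off-centre apex."""
    lengths = []
    for k in range(n_spokes):
        theta = 2.0 * math.pi * k / n_spokes
        # Distance from the point (offset, 0) to the circle along direction theta
        lengths.append(
            -offset_um * math.cos(theta)
            + math.sqrt(radius_um ** 2 - (offset_um * math.sin(theta)) ** 2)
        )
    return lengths
```

      For an ideal circular cross-section, moving the apex away from the true centre changes the individual spoke lengths but leaves the area-based radius essentially unchanged, which is the property the jittering analysis probes on real data.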

      Supplementary Figure 5 shows vessel radius changes for large vessels without a threshold defining responding or non-responding vessels. To explore the dependence of the outcomes on the threshold used to identify the responding vessels, we have explored an alternative strategy, whereby responding small vessels are identified as those vessels that show a post-photostimulation (vs. baseline) radius change of more than 10%. These data for capillaries are now plotted in Supplementary Figure 10 and agree with Figure 8. These points are now also discussed in the Discussion section of the revised manuscript:

      “Additionally, alternative definitions of responding vessels may be useful depending on the end goal of a study (e.g., this could mean selecting a threshold for the radius change based on a percentage change from the baseline level).”

      Section 4.5.1

      Why is the distance to the next neuron a good measure here? If two or more neurons are just a bit further away there will be twice or multiple times the 'load' while the measure would only indicate the distance to the shortest neuron. I wonder how the results change if those 'ensemble' effects are taken into account.

      In this direction, looking for network-level effects with respect to the full spatial organization of the neurons would be very interesting to look at.

      We agree with the reviewer that this question is interesting; however, it is not addressable using present data: activated neuronal firing will have effects on their postsynaptic neighbors, yet we have no means of measuring the spread of activation using the current experimental model.

      Figure 8

      The scatter plots shown are only partly described (e.g. what's the line with error bars in C, why does it only appear for the high-intensity stimulation?).

      A quadratic polynomial fit is shown only in C because a significant response was observed only for this condition, i.e. for the higher intensity blue photostimulation.

      From the scatter plots as shown it is not clear to me why dilations happen on average further away. This might be a density effect not well visible in this representation. The data does not seem to show a clear relationship between neuron distance and Delta R.

      Particularly in the right panel (high stimulation) there seems to be a similar number of close by neurons responding in both directions, but possibly a few more contracting at larger distances?

      So, the overall effect does not seem as 'simple' as suggested in the title of section 4.5.1 in my view, but rather more cells start to contract at larger distances while there seems to be a more intricate balance nearby.

      A more thorough analysis and visualization of the densities etc. might be needed to clarify this point.

      The language has been revised to:

      458-nm photostimulation resulted in a mix of constrictions and dilations with 44.1% of significantly responding vessels within 10 um of a labelled pyramidal neuron constricting and 55.1% dilating, while 53.3% of vessels further than 30 um constricted and 46.7% dilated. The cutoff distances from the closest labelled neuron were based on estimates of cerebral metabolic rate of oxygen consumption that showed a steep gradient in oxygen consumption with distance from arteries, CMRO2 being halved by 30 μm away

      We added a probability density plot for significant constrictors and dilators to Figure 8 and Supplementary Figure 5.

      Figure 8 Panel D / Section 4.5.2

      This is a very interesting result in my view found in this study.

      I am unclear how to interpret the effect. The authors state that dilators tend to be closer to the surface. Looking at the scatter plot (without real density information except the alpha value) it seems again the number of responders in both directions is about the same, but in deeper regions the contraction is just larger? This would be different, than how the authors interpret the data. It is unclear from the provided analysis/plots what is actually the case.

      We added a probability density function plot of the constrictors and dilators, which shows a greater incidence of constrictions (vs. dilations). The text of the paper was then clarified to include the proportion of significant constrictors/ dilators closer than 10 um vs. further than 30 um away from the closest labeled neuron.

      For the analyses above involving $\Delta R$, I recommend also looking at how those results change when looking at changes in cross section instead, i.e. taking into account the actual vessel radius as well, as discussed above.

      It would be interesting to speculate here or in the discussion on a reason why vessels in deeper regions might need to contract more?

      Unaddressed is the question if e.g. contraction in a vessel for small stimulation is predictive of contractions for larger stimulation or any other relationships?

      Thank you for your comment. Given its hierarchical organization and high within-vessel response heterogeneity, we believe that the vasculature is best analyzed as a network. Our radius estimates come from averaged cross-sectional estimates allowing us to examine heterogeneity within individual vessel segments.

      The discussion has been updated to include reasons as to why deeper vessels may contract more:

      “As the blue light stimulation power increased, the mean depth of both constricting and dilating vessels increased, likely resulting from higher intensity light reaching ChR2-expressing neurons deeper in the tissue and exciting superficial neurons (and thus their postsynaptic neurons) to a greater level [21], [22]. The blue light would be expected to excite a lower number of neurons farther from the cortical surface at lower powers.”

      Also, how consistent are contractions/dilations observed at a particular vessel etc.

      To look at the consistency of a particular vessel's response to the 1.1 or 4.3 mW/mm^2 blue light photostimulation, we categorized all significant responses as constrictions or dilations, defining a responding vessel as one showing a change that is either > 2 x baseline vessel radius variability or >10% of the vessel’s mean baseline radius.

      Given twice the baseline variability as the threshold for response, of the vessels that responded more than once, 31.7% dilated on some trials while constricting on others; 41.1% dilated on each trial; and 27.2% constricted on each trial. (Note that some trials use 1.1 vs. 4.3 mW/mm2 and some have opposite scanning directions).
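
      The thresholding and per-vessel consistency check described above can be sketched as follows. This assumes that "baseline variability" means the sample standard deviation of the radius across repeated baseline scans; the function names, labels, and the handling of non-responding trials are illustrative choices rather than the authors' implementation.

```python
from statistics import mean, stdev


def classify_response(baseline_radii_um, post_radius_um, n_sd=2.0):
    """Label one trial of one vessel relative to its repeated baseline scans."""
    baseline_mean = mean(baseline_radii_um)
    # Assumption: variability = sample SD across the baseline measurements
    threshold = n_sd * stdev(baseline_radii_um)
    change = post_radius_um - baseline_mean
    if change > threshold:
        return "dilation"
    if change < -threshold:
        return "constriction"
    return "no response"


def consistency(labels):
    """Summarize one vessel's responses across trials."""
    responses = [label for label in labels if label != "no response"]
    if len(responses) < 2:
        return "responded fewer than twice"
    if all(label == "dilation" for label in responses):
        return "dilated on each trial"
    if all(label == "constriction" for label in responses):
        return "constricted on each trial"
    return "mixed"
```

      Replacing the standard-deviation threshold with 10% of the mean baseline radius would give the alternative definition mentioned above.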

      Section 4.5.3

      The results in assortativity are interesting. It would be interesting to look at how the increase in assortativity is mediated. For, example, is this in localized changes in some parts of the graph as visible in A or are there other trends? Do certain sub-graphs that systematically change their radius have certain properties (e.g. do activated neurons cluster there) or are these effects related to some hotspots that also show a coordinated change in control conditions (the assortativity seems not zero there)?

      I already discussed if the efficiency measure is necessarily the best measure to use here without taking into account 'sources' and 'sinks'.

      We plan to address this in future work once we have successfully trained models for the classification of vessels into arteries, veins, and capillaries. Capillaries will be classified based on their branch order from parent arteries to specify where in the network changes are occurring.

      Figure 9

      It's unclear to me why the Ohm symbol needs to be bold?

      It is not bolded (just the font’s appearance).

      Line 707:

      "458-nm photostimulation caused capillaries to dilate when pyramidal neurons were close, and constrict when they were further away."

      In my view, this interpretation is too simple, given the discussion above. A more detailed analysis could clarify this point.

      The discussion on this point has been revised to:

      458-nm photostimulation resulted in a mix of constrictions and dilations, with 44.1% of significantly responding vessels within 10 μm of a labelled pyramidal neuron constricting, and 55.1% dilating; while 53.3% of vessels further than 30 μm constricted and 46.7% dilated. The cutoff distances from the closest labelled neuron were based on estimates of cerebral metabolic rate of oxygen consumption that showed a steep gradient in oxygen consumption with distance from arteries, CMRO2 being halved by 30 μm away [23].

      Line 740:

      "The network efficiency here can be thought of as paralleling mean transit time, i.e., the time it takes blood to traverse the capillary network from the arteries to the veins".

      The network efficiency as defined by the authors seems not to rely on artery/vein information and thus this interpretation is not fully correct in my view.

      The authors might want to reconsider this measure for one that accounts for sources and sinks, if they like to interpret their results as in this line.

      Yes, the efficiency described does not account for sources and sinks. It estimates the resistivity of capillaries, as a proxy for the ease of moving through the observed capillary nexus. Looking at the efficiency metric from graph theory does not require knowledge of the direction of blood flow, and can comment on the resistivity changes across capillary networks.

      For future work, we are investigating methods of classifying vessels as arteries, capillaries, or veins. This type of analysis will provide more detailed information on paths between arteries and veins; it will not provide insight into large-scale network-wide modifications, as those require larger fields of view. 
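
      To make the direction-free nature of such a metric concrete, here is a minimal sketch of a resistance-weighted graph efficiency. It rests on several assumptions that are not confirmed by the text above: each vessel segment is an undirected edge, its resistance follows Poiseuille scaling (length / radius^4, constants dropped), and efficiency is the mean inverse of the lowest-resistance path over all ordered node pairs. All names are hypothetical.

```python
import heapq


def poiseuille_resistance(length_um, radius_um):
    # Poiseuille scaling with constant factors dropped: R ~ L / r^4
    return length_um / radius_um ** 4


def lowest_resistances(adjacency, source):
    """Dijkstra: lowest summed resistance from `source` to each reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adjacency[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def network_efficiency(edges):
    """`edges` is a list of (node_a, node_b, length_um, radius_um) tuples."""
    adjacency = {}
    for a, b, length, radius in edges:
        w = poiseuille_resistance(length, radius)
        adjacency.setdefault(a, []).append((b, w))
        adjacency.setdefault(b, []).append((a, w))
    n = len(adjacency)
    if n < 2:
        return 0.0
    total = 0.0
    for s in adjacency:
        dist = lowest_resistances(adjacency, s)
        # Unreachable pairs contribute zero to the sum
        total += sum(1.0 / d for t, d in dist.items() if t != s)
    return total / (n * (n - 1))
```

      No source or sink labels enter the computation, which is why a metric of this kind describes resistivity changes across the observed network rather than arteriovenous transit.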

      Line 754 Pipeline Limitations and Adaptability

      I think the additional 'problem' of generating new training data for novel data sets or data from other microscopes etc should be addressed or the pipeline tested on such data sets.

      Generating training data is typically the biggest time investment when adapting pipelines.

      The generalization properties of the current pipeline are not discussed (e.g. performance on a different microscope / different brain area / different species etc.).

      The public response to reviews has been updated with out-of-distribution data from other imaging protocols, microscopes, and species showing generalizability. These results have also been added to the paper as Supplementary Table 4, and Figure 6. The performance of our pipeline on these out-of-distribution data is now discussed in the updated Discussion section.

      Line 810

      Code availability should be coupled with the publication of this paper as it seems the main contribution. I don't see how the code can be made available after publication only. It should be directly available once the manuscript is published and it could help to make it available to the reviewers before that. It can be updated later of course.

      The code is being made available.

      Reviewer #2 (Recommendations For The Authors):

      This analytical pipeline could be quite useful but it needs to be better demonstrated. If faster volumetric imaging is not possible, perhaps using it over a small volume would still demonstrate its utility at a smaller but more believable scale.

      The higher temporal resolution scans (over smaller tissue volumes) have now been performed and the results of applying our pipeline to these data are summarized in Supplementary Figure 2.

      Using sensory stimuli for neuronal activation might be a better idea than optogenetic stimulation. It isn't necessary but it would avoid the blue light issue.

      The pipeline is readily applicable for analysis of vasoreactivity following different perturbers; however, the robustness of vessels’ response is higher with blue light photostimulation of ChR2 than with sensory stimuli [24]. Notwithstanding, an example of the vascular response to electrical stimulation of the contralateral forepaw is now included in Supplementary Figure 2.

      This tool could be quite useful even without neural activity mapping. It obviously makes it even more powerful, but again, the utility could be demonstrated with just vascular data or even anatomical neuronal data without function.

      We agree with both points, and have emphasized them in the revised discussion section.

      Line 559 says the average capillary diameter change was 1.04 um. The next sentence and the table below all have different values so this is unclear.

      The wording was updated to make this clearer.

      Line 584 - should 458 be 552?

      458 is correct.

      Figure 1 - the schematic doesn't seem right - the 650 LPF with the notches is positioned to pass short light and reflect long wavelengths and the notch bands.

      The figure has been updated to reflect this. The original layout was done for compactness.

      References

      (1) D. A. Hartmann, V. Coelho-Santos, and A. Y. Shih, “Pericyte Control of Blood Flow Across Microvascular Zones in the Central Nervous System,” Annu. Rev. Physiol., vol. 84, pp. 331–354, Feb. 2022, doi: 10.1146/annurev-physiol-061121-040127.

      (2) J. Batista, “An adaptive gradient-based boundary detector for MRI images of the brain,” in 7th International Conference on Image Processing and its Applications, Manchester, UK: IEE, 1999, pp. 440–444. doi: 10.1049/cp:19990360.

      (3) Y. Le, X. Xu, L. Zha, W. Zhao, and Y. Zhu, “Tumor boundary detection in ultrasound imagery using multi-scale generalized gradient vector flow,” J. Med. Ultrason., vol. 42, no. 1, pp. 25–38, Jan. 2015, doi: 10.1007/s10396-014-0559-3.

      (4) X. Ren, “Multi-scale Improves Boundary Detection in Natural Images,” in Computer Vision – ECCV 2008, D. Forsyth, P. Torr, and A. Zisserman, Eds., Berlin, Heidelberg: Springer, 2008, pp. 533–545. doi: 10.1007/978-3-540-88690-7_40.

      (5) C. Grigorescu, N. Petkov, and M. A. Westenberg, “Contour and boundary detection improved by surround suppression of texture edges,” Image Vis. Comput., vol. 22, no. 8, pp. 609–622, Aug. 2004, doi: 10.1016/j.imavis.2003.12.004.

      (6) J. Tang and S. T. Acton, “Vessel Boundary Tracking for Intravital Microscopy Via Multiscale Gradient Vector Flow Snakes,” IEEE Trans. Biomed. Eng., vol. 51, no. 2, pp. 316–324, Feb. 2004, doi: 10.1109/TBME.2003.820374.

      (7) J. Merkow, A. Marsden, D. Kriegman, and Z. Tu, “Dense Volume-to-Volume Vascular Boundary Detection,” in Medical Image Computing and Computer-Assisted Intervention - MICCAI 2016, S. Ourselin, L. Joskowicz, M. R. Sabuncu, G. Unal, and W. Wells, Eds., Cham: Springer International Publishing, 2016, pp. 371–379. doi: 10.1007/978-3-319-46726-9_43.

      (8) F. Orujov, R. Maskeliūnas, R. Damaševičius, and W. Wei, “Fuzzy based image edge detection algorithm for blood vessel detection in retinal images,” Appl. Soft Comput., vol. 94, p. 106452, Sep. 2020, doi: 10.1016/j.asoc.2020.106452.

      (9) M. E. Martinez-Perez, A. D. Hughes, S. A. Thom, A. A. Bharath, and K. H. Parker, “Segmentation of blood vessels from red-free and fluorescein retinal images,” Med. Image Anal., vol. 11, no. 1, pp. 47–61, Feb. 2007, doi: 10.1016/j.media.2006.11.004.

      (10) A. M. Mendonca and A. Campilho, “Segmentation of retinal blood vessels by combining the detection of centerlines and morphological reconstruction,” IEEE Trans. Med. Imaging, vol. 25, no. 9, pp. 1200–1213, Sep. 2006, doi: 10.1109/TMI.2006.879955.

      (11) A. F. Frangi, W. J. Niessen, K. L. Vincken, and M. A. Viergever, “Multiscale vessel enhancement filtering,” in Medical Image Computing and Computer-Assisted Intervention — MICCAI’98, W. M. Wells, A. Colchester, and S. Delp, Eds., Berlin, Heidelberg: Springer, 1998, pp. 130–137. doi: 10.1007/BFb0056195.

      (12) K. Bisht et al., “Capillary-associated microglia regulate vascular structure and function through PANX1-P2RY12 coupling in mice,” Nat. Commun., vol. 12, no. 1, p. 5289, Sep. 2021, doi: 10.1038/s41467-021-25590-8.

      (13) Y. Wu et al., “Quantitative relationship between cerebrovascular network and neuronal cell types in mice,” Cell Rep., vol. 39, no. 12, p. 110978, Jun. 2022, doi: 10.1016/j.celrep.2022.110978.

      (14) T. Kirabali et al., “The amyloid-β degradation intermediate Aβ34 is pericyte-associated and reduced in brain capillaries of patients with Alzheimer’s disease,” Acta Neuropathol. Commun., vol. 7, no. 1, p. 194, Dec. 2019, doi: 10.1186/s40478-019-0846-8.

      (15) X. Ren et al., “Linking cortical astrocytic neogenin deficiency to the development of Moyamoya disease–like vasculopathy,” Neurobiol. Dis., vol. 154, p. 105339, Jul. 2021, doi: 10.1016/j.nbd.2021.105339.

      (16) J. Steinman, M. M. Koletar, B. Stefanovic, and J. G. Sled, “3D morphological analysis of the mouse cerebral vasculature: Comparison of in vivo and ex vivo methods,” PLOS ONE, vol. 12, no. 10, p. e0186676, Oct. 2017, doi: 10.1371/journal.pone.0186676.

      (17) A.-A. Berthiaume et al., “Dynamic Remodeling of Pericytes In Vivo Maintains Capillary Coverage in the Adult Mouse Brain,” Cell Rep., vol. 22, no. 1, pp. 8–16, Jan. 2018, doi: 10.1016/j.celrep.2017.12.016.

      (18) S. Katz, R. Gattegno, L. Peko, R. Zarik, Y. Hagani, and T. Ilovitsh, “Diameter-dependent assessment of microvascular leakage following ultrasound-mediated blood-brain barrier opening,” iScience, vol. 26, no. 6, p. 106965, Jun. 2023, doi: 10.1016/j.isci.2023.106965.

      (19) J. Drouin-Ouellet et al., “Cerebrovascular and blood-brain barrier impairments in Huntington’s disease: Potential implications for its pathophysiology,” Ann. Neurol., vol. 78, no. 2, pp. 160–177, Aug. 2015, doi: 10.1002/ana.24406.

      (20) K. P. McDowell, A.-A. Berthiaume, T. Tieu, D. A. Hartmann, and A. Y. Shih, “VasoMetrics: unbiased spatiotemporal analysis of microvascular diameter in multi-photon imaging applications,” Quant. Imaging Med. Surg., vol. 11, no. 3, pp. 969–982, Mar. 2021, doi: 10.21037/qims-20-920.

      (21) E. L. Johnson et al., “Characterization of light penetration through brain tissue, for optogenetic stimulation.” bioRxiv, p. 2021.04.08.438932, Apr. 08, 2021. doi: 10.1101/2021.04.08.438932.

      (22) S. I. Al-Juboori, A. Dondzillo, E. A. Stubblefield, G. Felsen, T. C. Lei, and A. Klug, “Light scattering properties vary across different regions of the adult mouse brain,” PloS One, vol. 8, no. 7, p. e67626, 2013, doi: 10.1371/journal.pone.0067626.

      (23) P. Mächler et al., “Baseline oxygen consumption decreases with cortical depth,” PLOS Biol., vol. 20, no. 10, p. e3001440, Oct. 2022, doi: 10.1371/journal.pbio.3001440.

      (24) J. R. Mester et al., “In vivo neurovascular response to focused photoactivation of Channelrhodopsin-2,” NeuroImage, vol. 192, pp. 135–144, May 2019, doi: 10.1016/j.neuroimage.2019.01.036.

    1. Author response:

      We thank the reviewers for their thoughtful criticisms.  This provisional response addresses what we consider the central critiques, with a full, point-by-point reply to follow with the revised manuscript.  Central critiques concern 1) providing further clarity about the apportionment cost of time, 2) generality & scope, and 3) clarifying the meaning of key equations.

      (1) Apportionment cost

      Reviewers commonly identified a need to provide a concise and intuitive definition of apportionment cost, and to explicitly solve and provide for its mathematical expression. 

      We will add the following definition of apportionment cost to the manuscript: “Apportionment cost is the difference in reward that can be expected, on average, between a policy of taking versus a policy of not taking the considered pursuit, over a time equal to its duration.”  While this difference is the apportionment cost of time, the amount that would be expected over a time equal to the considered pursuit under a policy of not taking the considered pursuit is the opportunity cost of time.  Together, they sum to Time’s Cost.  The above definition of apportionment cost adds to other stated relationships of apportionment cost found throughout the paper (Lines 434,435,447,450). 
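
      The verbal definitions above translate directly into a small numeric sketch. It assumes that the two policies are summarized by their average reward rates; the function and argument names are hypothetical, and the manuscript's own equations may be expressed in different terms.

```python
def time_costs(rate_taking, rate_not_taking, duration):
    """Return (apportionment cost, opportunity cost, Time's Cost).

    rate_taking:     average reward rate under a policy of taking the pursuit
    rate_not_taking: average reward rate under a policy of not taking it
    duration:        time required by the considered pursuit
    """
    # Difference in expected reward between the two policies over the duration
    apportionment = (rate_taking - rate_not_taking) * duration
    # Reward expected over the same time under the policy of not taking it
    opportunity = rate_not_taking * duration
    return apportionment, opportunity, apportionment + opportunity
```

      With reward rates of 2.0 and 1.5 per unit time and a pursuit lasting 4 time units, the apportionment cost is 2.0, the opportunity cost is 6.0, and their sum is 8.0.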

      As suggested, we will also add equations of apportionment cost, as below.

      (2) Generality & Scope

      Generality. We will add further examples in support of the generality of these equations for assessing and thinking about the value of initiating a pursuit.  Specifically, this will include 1) illustrating forgo decision making in a world composed of multiple pursuits, as in prey selection, 2) demonstrating and examining worlds in which a sequence of pursuits compose a considered pursuit’s ‘outside’, and 3) clarifying how our framework does contend with variance and uncertainty in reward magnitude and occurrence.

      Scope. In this manuscript, we consider the worth of initiating one or another pursuit having completed a prior one, and not the issue of continuing within a pursuit having already engaged in it.  The worth of continuing a pursuit, as in patch-foraging/give-up tasks, constitutes a third fundamental time decision-making topology which is outside the scope of the current work.  It engages a large and important literature, encompassing evidence accumulation, and requires a paper in its own right that can use the concepts and framework developed here.  We will further consider applying this framework to extant datasets.

      (3) Correction of typographical errors and further explanation of equations.   

      We would like to redress the two typographical errors identified by the reviewers that appeared in the equations on line 277 and on line 306, and provide further explanation to equations that gave pause to the reviewers.

      Typographical errors: 

      The first typographical error in the main text regards equation 2 and will be corrected so that equation 2 appears correctly as…

      Line 277:  

      The second typo regards the definition of the considered pursuit’s reward rate, and will be corrected to appear as…

      Line 306:   

      Regarding equations:

      Cross-references to equations in the main text refer to equations as they appear in the main text.  Where needed, the appendix in which they are derived is also given.   Equation numbering within the appendices refers to equations as they appear in the appendices.  In the revision, we will refer to all equations that appear in the appendices as Ap.#.#. so as to avoid confusion between referencing equations as they appear in the main text and equations as they appear in the appendices.  

      We would also like to clarify that equation 8, as we derive it, is not new, as it is similarly derived and expressed in prior foundational work by McNamara (1982), which is now properly attributed. 

      Equation 1 and Appendix 1

      Equation 1 is formulated to calculate the average reward received and average time spent per unit time spent in the default pursuit. So, $f_i$ is the encounter rate of pursuit $i$ for one unit of time spent in the default pursuit (lines 259-262). Added to the summation in the numerator, we have the average reward obtained in the default pursuit per unit time, and in the denominator we have the time spent in the default pursuit per unit time (1).
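
      As a numeric illustration of this description, the sketch below assumes that equation 1 is the ratio of the two quantities named above: reward per unit time in the default pursuit over time per unit time in the default pursuit. The function name, argument names, and example values are hypothetical.

```python
def reward_rate(encounter_rates, rewards, durations, default_reward_rate=0.0):
    """Average reward rate, computed per unit time spent in the default pursuit.

    encounter_rates[i]: encounters with pursuit i per unit default time
    rewards[i], durations[i]: reward and time of pursuit i
    default_reward_rate: reward earned in the default pursuit per unit time
    """
    # Numerator: reward from encountered pursuits plus reward from the default pursuit
    reward = default_reward_rate + sum(
        f * r for f, r in zip(encounter_rates, rewards)
    )
    # Denominator: time in encountered pursuits plus the unit of default time (1)
    time = 1.0 + sum(f * t for f, t in zip(encounter_rates, durations))
    return reward / time
```

      For a single pursuit encountered 0.1 times per unit default time, paying 10 and lasting 5, with an unrewarded default pursuit, the rate is 1.0 / 1.5.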

      Equation 2 and Appendix 2

      Eq. 2.4 in Appendix 2 calculates the average time spent outside of the considered pursuit, per encounter with the considered pursuit. Breaking down eq. 2.4, the first term in the numerator,

      gives the expected time spent in other pursuits, per unit time spent in the default pursuit, where f_i is the encounter rate of pursuit i per unit time spent in the default pursuit, and t_i is the time required by pursuit i. The second term in the numerator (1, added outside the summation) simply represents the unit of time spent in the default pursuit, over which the encounter rate of each pursuit is calculated. Together, these represent the total time spent outside the considered pursuit, per unit time spent in the default pursuit. The denominator,

      is the frequency with which the considered pursuit is encountered per unit time spent in the default pursuit, so

      is the average time spent within the default pursuit, per encounter with the considered pursuit. By multiplying the average time spent outside of the considered pursuit per unit time spent in the default pursuit by the average time spent within the default pursuit per encounter with the considered pursuit, we get eq. 2.4, the average time spent outside of the considered pursuit, per encounter with the considered pursuit, which is equal to t_out.

                             (eq. 2.4)
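
      Read literally, the verbal description above assembles into the expression below. This is only a sketch reconstructed from the prose, not the typeset equation from the appendix: the symbols t_i (time required by pursuit i) and f_in (encounter rate of the considered pursuit), and the choice to exclude the considered pursuit from the sum, are assumptions made for illustration.

```latex
\[
t_{\mathrm{out}}
  = \underbrace{\Bigl(\sum_{i \neq \mathrm{in}} f_i\, t_i + 1\Bigr)}_{\substack{\text{time outside the considered pursuit,}\\ \text{per unit time in the default pursuit}}}
    \times
    \underbrace{\frac{1}{f_{\mathrm{in}}}}_{\substack{\text{time in the default pursuit,}\\ \text{per encounter with the considered pursuit}}}
  = \frac{\sum_{i \neq \mathrm{in}} f_i\, t_i + 1}{f_{\mathrm{in}}}
\]
```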

    2. eLife Assessment

      This paper undertakes a valuable theoretical treatment of the potential role of foraging-related concepts in several forms of intertemporal choice. While the computational evidence and methodologies employed are novel, some issues with clarity and generality result in incomplete support for the paper's claims.

    3. Reviewer #1 (Public review):

      Summary:

      This theoretical paper addresses how to optimize reward-rate-maximizing decisions in certain foraging-style environments. It presents a series of equations and graphical illustrations for quantities such as reward rates and time-related costs that a decision maker could estimate as a basis for such decisions. One of the main takeaways is that if the hypothetical agent underweights the time spent outside a focal reward pursuit relative to the time spent within it, this can predict a broadly realistic pattern of impatience in two alternative intertemporal choices paired with well-calibrated take-it-or-leave-it decisions. Another takeaway is that if the optimally estimated subjective value of a reward pursuit is plotted as a function of a range of temporal durations, the result resembles a hyperbolic discounting function and is affected in empirically realistic ways by the magnitude and sign of the reward. Thus, the rate-maximization framework might lead to a hypothesis about the basis for the magnitude and sign effects in discounting.

      Strengths:

      The paper makes a useful contribution by broadening the application of reward-rate maximization to time-related decision scenarios. The paper's breadth of scope includes applying the same framework to accept/reject decisions and multi-alternative discounting decisions. The figures take a creative approach to illustrating the internal quantities in the model. It's particularly useful that the paper gives consideration to internal distortions that could give rise to documented anomalies in decision behavior.

      Weaknesses:

      (1) Although there are many citations acknowledging relevant previous work, there often isn't a very granular attribution of individual previous findings to their sources. In the results section, it's sometimes ambiguous when the paper is recapping established background and when it is breaking new ground. For example, around equation 8 in the results (sv = r - rho*t), it would be good to refer to previous places where versions of this equation have been presented. Offhand, McNamara 1982 (Theoretical Population Biology) is one early instance and Fawcett et al. 2012 (Behavioural Processes) is a later one. Line 922 of the discussion seems to imply this formulation is novel here.

      (2) The choice environments that are considered in detail in the paper are very simple. The simplicity facilitates concrete examples and visualizations, but it would be worth further consideration of whether and how the conclusions generalize to more complex environments. The paper considers "forgo" scenarios in which the agent can choose between sequences of pursuits like A-B-A-B (engaging with option B at all opportunities, which are interleaved with a default pursuit A) and A-A-A-A (forgoing option B). It considers "choice" scenarios where the agent can choose between sequences like A-B-A-B and A-C-A-C (where B and C are larger-later and smaller-sooner rewards, either of which can be interleaved with the default pursuit). Several forms of additional complexity would be valuable to consider. One would be a greater number of unique pursuits, not repeated identically in a predictable sequence, akin to a prey-selection paradigm. It seems to me this would cause t_out and r_out (the time and reward outside of the focal prospect) to be policy-dependent, making the 'apportionment cost' more challenging to ascertain. Another relevant form of complexity would be if there were variance or uncertainty in reward magnitudes or temporal durations or if the agent had the ability to discontinue a pursuit such as in patch-departure scenarios.

      (3) I had a hard time arriving at a solid conceptual understanding of the 'apportionment cost' around Figure 5. I understand the arithmetic, but it would help if it were possible to formulate a more succinct verbal description of what makes the apportionment cost a useful and meaningful quality to focus on. I think Figure 6C relates to this, but I had difficulty relating the axis labels to the points, lines, and patterned regions in the plot. I also was a bit confused by how the mathematical formulation was presented. As I understood it, the apportionment cost essentially involves scaling the rest of the SV expression by t_out/(t_in + t_out). The way this scaling factor is written in Figure 5C, as 1/(1 + (1/t_out)t_in), seems less clear than it could be. Also, the apportionment cost is described in the text as being subtracted from SV rather than as a multiplicative scaling factor. It could be written as a subtraction, by subtracting a second copy of the rest of the SV expression scaled by t_in/(t_in + t_out). But that shows the apportionment cost to depend on the opportunity cost, which is odd because the original motivation on line 404 was to resolve the lack of independence between terms in the SV expression.

      (4) In the analysis of discounting functions (line 664 and beyond), the paper doesn't say much about the fact that many discounting studies take specific measures to distinguish true time preferences from opportunity costs and reward-rate maximization. In many of the human studies, delay time doesn't preclude other activities. In animal studies, rate maximization can serve as a baseline against which to measure additional effects of temporal discounting. This is an important caveat to claims about discounting anomalies being rational under rate maximization (e.g., line 1024).

      (5) The paper doesn't feature any very concrete engagement with empirical data sets. This is ok for a theoretical paper, but some of the characterizations of empirical results that the model aims to match seem oversimplified. An example is the contention that real decision-makers are optimal in accept/reject decisions (line 816 and elsewhere). This isn't always true; sometimes there is evidence of overharvesting, for example.

      (6) Related to the point above, it would be helpful to discuss more concretely how some of this paper's theoretical proposals could be empirically evaluated in the future. Regarding the magnitude and sign effects of discounting, there is not a very thorough overview of the several other explanations that have been proposed in the literature. It would be helpful to engage more deeply with previous proposals and consider how the present hypothesis might make unique predictions and could be evaluated against them. A similar point applies to the 'malapportionment hypothesis' although in this case there is a very helpful section on comparisons to prior models (line 1163). The idea being proposed here seems to have a lot in common conceptually with Blanchard et al. 2013, so it would be worth saying more about how data could be used to test or reconcile these proposals.
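
      The arithmetic discussed in point (3) can be checked numerically. The sketch below is illustrative only: it assumes that sv = r - rho*t uses the global reward rate rho = (r_in + r_out)/(t_in + t_out), that the "rest of the SV expression" is r_in - (r_out/t_out)*t_in, and the function names and numeric values are hypothetical rather than taken from the paper.

```python
import math


def sv_rate_form(r_in, t_in, r_out, t_out):
    """sv = r - rho * t, with rho assumed to be the global reward rate."""
    rho_global = (r_in + r_out) / (t_in + t_out)
    return r_in - rho_global * t_in


def sv_scaled_form(r_in, t_in, r_out, t_out):
    """Rest of the expression multiplied by t_out / (t_in + t_out)."""
    rest = r_in - (r_out / t_out) * t_in  # reward minus outside-rate opportunity cost
    return rest * t_out / (t_in + t_out)


def sv_subtractive_form(r_in, t_in, r_out, t_out):
    """Same quantity with the apportionment cost written as a subtraction."""
    rest = r_in - (r_out / t_out) * t_in
    apportionment_cost = rest * t_in / (t_in + t_out)
    return rest - apportionment_cost


# Hypothetical example values (arbitrary units).
r_in, t_in, r_out, t_out = 4.0, 2.0, 3.0, 6.0

values = [
    sv_rate_form(r_in, t_in, r_out, t_out),
    sv_scaled_form(r_in, t_in, r_out, t_out),
    sv_subtractive_form(r_in, t_in, r_out, t_out),
]

# The two ways of writing the scaling factor agree as well.
factor_a = t_out / (t_in + t_out)
factor_b = 1.0 / (1.0 + (1.0 / t_out) * t_in)

print(values, factor_a, factor_b)
```

      Under these assumptions all three forms return the same value, and in the subtractive form the apportionment cost visibly contains the opportunity-cost term, which is the dependence the point above draws attention to.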

    4. Reviewer #2 (Public review):

      Summary:

      This paper from Sutlief et al. focuses on an apparent contradiction observed in experimental data from two related types of pursuit-based decision tasks. In "forgo" decisions, where the subject is asked to choose whether or not to accept a presented pursuit, after which they are placed into a common inter-trial interval, subjects have been shown to be nearly optimal in maximizing their overall rate of reward. However, in "choice" decisions, where the subject is asked which of two mutually-exclusive pursuits they will take, before again entering a common inter-trial interval, subjects exhibit behavior that is believed to be sub-optimal. To investigate this contradiction, the authors derive a consistent reward-maximizing strategy for both tasks using a novel and intuitive geometric approach that treats every phase of a decision (pursuit choice and inter-trial interval) as vectors. From this approach, the authors are able to show that previously reported examples of sub-optimal behavior in choice decisions are in fact consistent with a reward-maximizing strategy. Additionally, the authors are able to use their framework to deconstruct the different ways the passage of time impacts decisions, demonstrating that time cost contains both an opportunity cost and an apportionment cost, as well as examining how a subject's misestimation of task parameters impacts behavior.

      Strengths:

      The main strength of the paper lies in the authors' geometric approach to studying the problem. The authors chose to simplify the decision process by removing the highly technical and often cumbersome details of evidence accumulation that are common in most of the decision-making literature. In doing so, the authors were able to utilize a highly accessible approach that is still able to provide interesting insights into decision behavior and the different components of optimal decision strategies.

      Weaknesses:

      While the details of the paper are compelling, the authors' presentation of their results is often unclear or incomplete:

      (1) The mathematical details of the paper are correct but contain numerous notation errors and are presented as a solid block of subtle equation manipulations. This makes the details of the authors' approach (the main contribution of the paper to the field) highly difficult to understand.

      (2) One of the main contributions of the paper is the notion that time cost in decision-making contains an apportionment cost that reflects the allocation of decision time relative to the world. The authors use this cost to pose a hypothesis as to why subjects exhibit sub-optimal behavior in choice decisions. However, the equation for the apportionment cost is never clearly defined in the paper, which is a significant oversight that hampers the effectiveness of the authors' claims.

      (3) Many of the paper's figures are visually busy and not clearly detailed in the captions (for example, Figures 6-8). Because of the geometric nature of the authors' approach, the figures should be as clean and intuitive as possible, as in their current state, they undercut the utility of a geometric argument.

      (4) The authors motivate their work by focusing on previously-observed behavior in decision experiments and tell the reader that their model is able to qualitatively replicate this data. This claim would be significantly strengthened by the inclusion of experimental data to directly compare to their model's behavior. Given the computational focus of the paper, I do not believe the authors need to conduct their own experiments to obtain this data; reproducing previously accepted data from the papers the authors reference would be sufficient.

      (5) While the authors reference a good portion of the decision-making literature in their paper, they largely ignore the evidence-accumulation portion of the literature, which has been discussing time-based discounting functions for some years. Several papers that are both experimentally (Cisek et al. 2009, Thurs et al. 2012, Holmes et al. 2016) and theoretically (Drugowitsch et al. 2012, Tajima et al. 2019, Barendregt et al. 2022) driven exist, and I would encourage the authors to discuss how their results relate to those in different areas of the field.

    5. Reviewer #3 (Public review):

      Summary:

      The goal of the paper is to examine the objective function of total reward rate in an environment to understand the behavior of humans and animals in two types of decision-making tasks: (1) stay/forgo decisions and (2) simultaneous choice decisions. The main aims are to reframe the equation of optimizing this normative objective into forms that are used by other models in the literature like subjective value and temporally discounted reward. One important contribution of the paper is the use of this theoretical analysis to explain apparent behavioral inconsistencies between forgo and choice decisions observed in the literature.

      Strengths:

      The paper provides a nice way to mathematically derive different theories of human and animal behavior from a normative objective of global reward rate optimization. As such, this work has value in trying to provide a unifying framework for seemingly contradictory empirical observations in literature, such as differentially optimal behaviors in stay-forgo v/s choice decision tasks. The section about temporal discounting is particularly well motivated as it serves as another plank in the bridge between ecological and economic theories of decision-making.

      Weaknesses:

      One broad issue with the paper is readability. Admittedly, this is a complicated analysis involving many equations that are important to grasp to follow the analyses that subsequently build on top of previous analyses.

      But what is missing are intuitive interpretations of some of the terms introduced, especially the apportionment cost, given without reference to the equations in the definition, so that the reader gets a sense of how the decision-maker thinks of this time cost in contrast with the opportunity cost of time.

      Also missing is a re-analysis of some existing empirical data through the lens of the presented objective functions, especially later, where the authors describe sources of error in behavior.

    1. eLife Assessment

      This is a fundamental research study which identifies some of the molecular mechanisms underlying the energy-costly process of memory consolidation. The strength of evidence is exceptional. The paper should be of broad interest because it establishes a clear mechanistic link between long-term memory processes and the energy-producing machinery in neurons.

    2. Reviewer #1 (Public review):

      Summary:

      This is a detailed description of the role of PKCδ in Drosophila learning and memory. The work is based on a previous study (Placais et al. 2017) that has already shown that for the establishment of long-term memory, the repetitive activity of MP1 dopaminergic neurons via the dopamine receptor DAMB is essential to increase mitochondrial energy flux in the mushroom body. In this paper, the role of PKCδ is now introduced. PKCδ is a molecular link between the dopaminergic system and the mitochondrial pyruvate metabolism of mushroom body Kenyon cells. For this purpose, the authors establish a genetically encoded FRET-based fluorescent reporter of PKCδ-specific activity, δCKAR.

      Strengths:

      This is a thorough study on the long-term memory of Drosophila. The work is based on the extensive, high-quality experience of the senior authors. This is particularly evident in the convincing use of behavioral assays and imaging techniques to differentiate and explore various memory phases in Drosophila. The study also establishes a new reporter to measure the activity of PKCδ - the focus of this study - in behaving animals. The authors also elucidate how recurrent spaced training sessions initiate a molecular gating mechanism, linking a dopaminergic punishment signal with the regulation of mitochondrial pyruvate metabolism. This advancement will enable a more precise molecular distinction of various memory phases and a deeper comprehension of their formation in the future.

      Weaknesses:

      The study offers novel insights into the molecular mechanisms underlying long-term memory formation and presents no apparent weaknesses in either content or methodology.

    3. Reviewer #2 (Public review):

      Summary

      This study deepens the authors' former investigations of the mechanisms involved in gating the long-term consolidation of an associative memory (LTM) in Drosophila melanogaster. After having previously found that LTM consolidation 1. costs energy (Plaçais and Préat, Science 2013) provided through pyruvate metabolism (Plaçais et al., Nature Comm 2017) and 2. is gated by the increased tonic activity in a type of dopaminergic neurons ('MP1 neurons') following only the training protocol relevant for LTM, i.e. interspaced in time (Plaçais et al., Nature Neuro 2012), they here dig into the intra-cell signalling triggered by dopamine input and eventually responsible for the increased mitochondrial activity in Kenyon Cells. They identify a particular PKC, PKCδ, as a major molecular interface in this process and describe its translocation to mitochondria to promote pyruvate metabolism, specifically after spaced training.

      Methodological approach

      To that end, they use RNA interference against the isozyme PKCδ, in a time-controlled way and in the whole Kenyon cell population or in the subpopulation forming the α/β lobe. This knock-down decreased the total PKCδ mRNA level in the brain by ca. 30%, and is enough to observe a decrease in the flies' performance for LTM consolidation. Using Pyronic, a sensor for pyruvate for in vivo imaging, and pharmacological disruption of mitochondrial function, the authors then show that PKCδ knock-down prevents a high level of pyruvate from accumulating in the Kenyon cells at the time of LTM consolidation, pointing towards a role of PKCδ in promoting pyruvate metabolism. They further identify the PDH kinase PDK as a likely target for PKCδ since knocking down both PKCδ and PDK led to normal LTM performances, likely counterbalancing PKCδ knock-down alone.

      To understand the timeline of PKCδ activation and to visualise its mitochondrial translocation in a subpart of the mushroom body lobes, they imported into the fruit fly the genetically-encoded FRET reporters of PKCδ, δCKAR and mitochondria-δCKAR (Kajimoto et al 2010). They show that PKCδ is activated to the sensor's saturation only after spaced training, and not other types of training that are 'irrelevant' for LTM. Further, adding thermogenetic activation of dopaminergic neurons and RNA interference against Gq-coupled dopamine receptor to FRET imaging, they identify that a dopamine-triggered cascade is sufficient for the elevated PKCδ-activation.

      Strengths and weaknesses

      The authors use a combination of new fluorescent sensors and behavioral, imaging, and pharmacological protocols they already established to successfully identify the molecular players that bridge the requirement for spaced training/dopaminergic neurons MP1 oscillatory activity and the increased metabolic activity observed during long-term memory consolidation.

      The study is dense in new exciting findings and each methodological step is carefully designed. The experiments one could think of to make this link have been done in this study and the results seem solid.

      The discussion is well conducted, with interesting parallels with mammals, where the possibility that this process takes place as well is yet unknown.

      Impact

      Their findings should interest a large audience:

      They discover and investigate a new function for PKCδ in regulating memory processes in neurons in conjunction with other physiological functions, making this molecule a potentially valid target for neuropathological conditions. They also provide new tools in Drosophila to measure PKCδ activation in cells. They identify the major players for lifting the energetic limitations preventing the formation of a long-term memory.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: 

      This is a detailed description of the role of PKCδ in Drosophila learning and memory. The work is based on a previous study (Placais et al. 2017) that has already shown that for the establishment of long-term memory, the repetitive activity of MP1 dopaminergic neurons via the dopamine receptor DAMB is essential to increase mitochondrial energy flux in the mushroom body. 

      In this paper, the role of PKCδ is now introduced. PKCδ is a molecular link between the dopaminergic system and the mitochondrial pyruvate metabolism of mushroom body Kenyon cells. For this purpose, the authors establish a genetically encoded FRET-based fluorescent reporter of PKCδ-specific activity, δCKAR.

      Strengths: 

      This is a thorough study of the long-term memory of Drosophila. The work is based on the extensive, high-quality experience of the senior authors. This is particularly evident in the convincing use of behavioral assays and imaging techniques to differentiate and explore various memory phases in Drosophila. The study also establishes a new reporter to measure the activity of PKCδ - the focus of this study - in behaving animals. The authors also elucidate how recurrent spaced training sessions initiate a molecular gating mechanism, linking a dopaminergic punishment signal with the regulation of mitochondrial pyruvate metabolism. This advancement will enable a more precise molecular distinction of various memory phases and a deeper comprehension of their formation in the future. 

      Weaknesses: 

      Apart from a few minor technical issues, such as the not entirely convincing visualisation of the localisation of a PKCδ reporter in the mitochondria, there are no major weaknesses. Likewise, the scientific classification of the results seems appropriate, although a somewhat more extensive discussion in relation to Drosophila would have been desirable.

      We are very grateful for this very positive appreciation of our work. Following this comment, we have revised our manuscript to bring more compelling evidence of the mitochondrial localization of the PKCδ reporter. We also developed the discussion of our results with respect to the Drosophila learning and memory literature.

      Reviewer #2 (Public Review):

      Summary 

      This study deepens the authors' former investigations of the mechanisms involved in gating the long-term consolidation of an associative memory (LTM) in Drosophila melanogaster. After having previously found that LTM consolidation 1. costs energy (Plaçais and Préat, Science 2013) provided through pyruvate metabolism (Plaçais et al., Nature Comm 2017) and 2. is gated by the increased tonic activity in a type of dopaminergic neurons ('MP1 neurons') following only the training protocol relevant for LTM, i.e. interspaced in time (Plaçais et al., Nature Neuro 2012), they here dig into the intra-cell signalling triggered by dopamine input and eventually responsible for the increased mitochondrial activity in Kenyon Cells. They identify a particular PKC, PKCδ, as a major molecular interface in this process and describe its translocation to mitochondria to promote pyruvate metabolism, specifically after spaced training.

      Methodological approach 

      To that end, they use RNA interference against the isozyme PKCδ, in a time-controlled way and in the whole Kenyon cell population or in the subpopulation forming the α/β lobe. This knock-down decreased the total PKCδ mRNA level in the brain by ca. 30%, and is enough to observe a decrease in the flies' performance for LTM consolidation. Using Pyronic, a sensor for pyruvate for in vivo imaging, and pharmacological disruption of mitochondrial function, the authors then show that PKCδ knock-down prevents a high level of pyruvate from accumulating in the Kenyon cells at the time of LTM consolidation, pointing towards a role of PKCδ in promoting pyruvate metabolism. They further identify the PDH kinase PDK as a likely target for PKCδ since knocking down both PKCδ and PDK led to normal LTM performances, likely counterbalancing PKCδ knock-down alone.

      To understand the timeline of PKCδ activation and to visualise its mitochondrial translocation in a subpart of the mushroom body lobes, they imported into the fruit fly the genetically-encoded FRET reporters of PKCδ, δCKAR, and mitochondria-δCKAR (Kajimoto et al 2010). They show that PKCδ is activated to the sensor's saturation only after spaced training, and not other types of training that are 'irrelevant' for LTM. Further, adding thermogenetic activation of dopaminergic neurons and RNA interference against Gq-coupled dopamine receptor to FRET imaging, they identify that a dopamine-triggered cascade is sufficient for the elevated PKCδ-activation.

      Strengths and weaknesses 

      The authors use a combination of new fluorescent sensors and behavioral, imaging, and pharmacological protocols they already established to successfully identify the molecular players that bridge the requirement for spaced training/dopaminergic neurons MP1 oscillatory activity and the increased metabolic activity observed during long-term memory consolidation. 

      The study is dense in new exciting findings and each methodological step is carefully designed. Almost all possible experiments one could think of to make this link have been done in this study, with a few exceptions that do not prevent the essential conclusions from being drawn. 

      The discussion is well conducted, with interesting parallels with mammals, where the possibility that this process takes place as well is yet unknown. 

      Impact 

      Their findings should interest a large audience: 

      They discover and investigate a new function for PKCδ in regulating memory processes in neurons in conjunction with other physiological functions, making this molecule a potentially valid target for neuropathological conditions. They also provide new tools in drosophila to measure PKCδ activation in cells. They identify the major players for lifting the energetic limitations preventing the formation of a long-term memory. 

      We warmly thank Reviewer #2 for the enthusiastic assessment of our work. There was no specific point to address in the Public Review.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I have a few comments that could help improve the paper and help the reader navigate the detailed analysis.

      (1) Perhaps the authors could add a sentence or two in the intro about the different PKC genes in Drosophila and whether they are expressed in the MB.

      We thank Reviewer #1 for this suggestion. We now describe in the introduction the various subfamilies of PKCs downstream of Gq signaling, the Drosophila members of those different PKC subfamilies, and their expression in the brain.

      (2) Italicise Drosophila throughout the text.

      We have made this correction.

      (3) In Figure 1, you could change the scheme in Figure F-H and have the timeline always start after training. Then you could see that the training varies in time (perhaps provide the exact duration for each training protocol) and the test interval is constant. Why is it actually measured in a time window and not at an exact time?

      This is indeed a good suggestion to clarify the presentation of our results. We changed the timeline schemes in all the figures so that t=0 starts at the end of the conditioning. Indeed, each conditioning protocol has a different duration as represented on these timelines: one-cycle training lasts 5 min, 5x massed training has a duration of 20 min, and 5x spaced training takes 1 hour and 30 min to be completed, with its 15 min intertrial intervals. In vivo imaging experiments are performed during a certain time window after conditioning during which, according to our previous experience, the activity of MP1 dopamine neurons after spaced training remains constant (Plaçais et al., 2012). This offers the practical advantage that we can image several flies after a given training session, instead of having to perform many consecutive conditioning protocols.

      (4) In Figure 2 you could show the massed training data from the supplement. This is very similar to what is shown in Figure 1. Are there also imaging experiments on massed training?

      The reason why massed training data was initially displayed in the supplementary data is that α/β neurons are known to be crucial for LTM formation but are not required for memory formed after massed training, so that the absence of effect was somewhat expected. Nonetheless, we performed δCKAR imaging in α/β neurons after 5x massed training and found that, as expected, PKCδ activity was not increased post-conditioning (Figure 2C). This experiment was performed in parallel with additional δCKAR imaging in α/β neurons after 5x spaced conditioning as a positive control (these new data were added to Figure 2B). Following Reviewer #1’s suggestion, all data investigating the effect of PKCδ in α/β neurons are now displayed in Figure 2.

      (5) Figure 3: I am not sure if the blue curve in Figure A really represents an upregulated pyruvate flux compared to the control (mentioned in line 210). It may be the case initially, but it is clearly below the control after 40s. Why is that?

      This visual effect is due to the fact that PDBu injection in itself increases the pyruvate level in MB neurons (independently of its effect on PKCδ), before sodium azide injection. As a result, the baseline of the PDBu-treated flies is above that of the DMSO control flies when sodium azide is injected, so the Pyronic sensor saturates more quickly and therefore reaches its plateau before the control when traces were normalized right before sodium azide injection.

      That being said, the measure of the slope in itself following sodium azide injection is not affected by these differences, and is always measured between 10 and 70% of the plateau. 

      Given this remark, and another comment from Reviewer #2 about this experiment, we removed panel 3A and present only the complete recording of this experiment, which is now displayed in Figure 3 – figure supplement 1C.
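
      To make the slope measure concrete, the sketch below fits a line to the part of a trace lying between 10% and 70% of the rise from baseline to plateau. It is a minimal illustration, not the analysis code used for the figures: taking the first sample as baseline and the maximum as plateau, and the function name `slope_between_fractions`, are assumptions, and the traces are synthetic.

```python
def slope_between_fractions(times, values, lo=0.10, hi=0.70):
    """Least-squares slope of the trace between lo and hi fractions of its rise.

    Assumes the baseline is the first sample and the plateau is the maximum.
    """
    baseline = values[0]
    plateau = max(values)
    span = plateau - baseline
    if span <= 0:
        raise ValueError("trace does not rise above its baseline")

    lo_level = baseline + lo * span
    hi_level = baseline + hi * span

    # First samples at which the trace reaches each level.
    start = next(i for i, v in enumerate(values) if v >= lo_level)
    stop = next(i for i, v in enumerate(values) if v >= hi_level)

    xs = times[start:stop + 1]
    ys = values[start:stop + 1]
    if len(xs) < 2:
        raise ValueError("not enough samples between the two levels")

    # Ordinary least-squares slope.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic trace: linear rise of 1 unit per time step, then a plateau.
times = list(range(15))
trace = [float(min(t, 10)) for t in times]

# Same trace with a constant baseline offset added.
shifted = [v + 5.0 for v in trace]

slope_plain = slope_between_fractions(times, trace)
slope_shifted = slope_between_fractions(times, shifted)
print(slope_plain, slope_shifted)
```

      In this toy case a constant offset in the baseline leaves the fitted slope unchanged, because both the 10% and 70% levels shift with it.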

      (6) For me, the localisation of the mitochondrial reporter in the mitochondria is not clear. The image in the supplement is not sufficient to show this clearly. What is missing here is a co-staining in the same brain of UAS-mito-δCKAR and a mitochondrial marker to label the mitochondria and the reporter at the same time in the same animal.

      We agree with Reviewer #1’s remark and added new data to make this point more convincing. As suggested, we co-expressed mito-δCKAR with the mitochondrial reporter mito-DsRed in MB neurons (Lutas et al., 2012). We observed a clear colocalization of both signals by performing confocal imaging in the MB neuron somas, indicating that mito-δCKAR is indeed targeted to mitochondria (Figure 4 – figure supplement 1B and 2).

      (7) Are there controls that the MB expression of the reporters in the flies does not influence the learning ability? In order to make statements about the physiology of the cells, it must also be shown that the cells still have normal activity and allow learning behaviour comparable to wild-type flies.

      This is indeed an important control that we added in the revised version. We tested the memory after 5x spaced, 5x massed and 1x training of flies expressing in the MB the various imaging probes used in our study (cyto-δCKAR, mito-δCKAR and Pyronic). Memory performance was similar to controls in all cases (Figure 1 – figure supplement 1E).  

      (8) Perhaps the authors could go into more detail on two points in the discussion and shorten the comprehensive comparison to the vertebrate system somewhat. It would be nice to know how the local transfer from the peduncle to the vertical lobus is supposed to take place. What is the mechanism here? Any suggestions from the literature? It would also be useful to mention the compartmentalisation of the MB and how the information can overcome these boundaries from the peduncle to the vertical lobe.

      We now elaborate on this question in the discussion (lines 368-386). To sum up, given that the compartmentalization of the MBs is anatomically defined by the presence of specific subsets of MBON and DAN cell types (forming different information-processing units), rather than by physical boundaries per se, we can consider two main hypotheses to explain PKCδ activation transfer from the peduncle to the lobes: passive diffusion of activated PKCδ, or mitochondrial motility that would displace PKCδ from its place of first activation. We indeed found that mitochondrial motility was occurring upon 5x spaced conditioning for LTM formation (Pavlowsky et al. 2024).

      In principle, one could also consider that PKCδ could be activated in the lobes by a relaying neuron. The MVP2 neuron (aka MBON-γ1>pedc) presents dendrites facing MP1 and makes synapses with the α/β neurons at the level of the α and β lobes, which makes it a good candidate. Furthermore, as we show that PKCδ activation in the lobes requires DAMB (Figure 4C, Figure 5A-B, Figure 5 – figure supplement 1), one could imagine the following activation loop: MP1 activates the MB neurons via DAMB, that activate MVP2 at the level of the peduncle, which activates in turn the MB neurons at the level of the lobes. However, we did not retain this hypothesis, because MVP2 is GABAergic, which makes it highly unlikely to be able to activate a kinase like PKCδ.

      Regarding the comparative discussion with mammalian systems, we appreciate Reviewer #1’s remark that it may appear too detailed, but given that Reviewer #2 (public comment) highlighted the ‘interesting parallel with mammals’ in our discussion, we finally chose not to reduce this part in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Fig 1G: is there a decrease in PKCδ activation after mass training as compared to the control, indicating an inhibitory mechanism onto PKCδ following mass training? Or is this an artifact of the PDBu application procedure in the control group? 

      We thank Reviewer #2 for this careful comment. The dent in the time trace following PDBu application after massed training (Figure 1G) is indeed an artifact due to the manual injection of the drug. However, we would like to emphasize that what matters in the determination of PKCδ activity is the level of the baseline before PDBu application after normalization to the final plateau, so that variations around the injection time do not impact the result of the analysis. Moreover, in the revised version, we performed a similar series of experiments using an α/β neuron-specific driver (Figure 2C). In this series of experiments, there were limited injection artifacts, and we reached the same conclusion as in Figure 1G: PKCδ activity is left unchanged by 5x massed conditioning.
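
      The point that samples around the injection time do not enter the readout can be illustrated with a short sketch. It is hypothetical rather than the authors' pipeline: the function name, the guard interval before injection, and the number of samples averaged for the plateau are assumptions chosen for the example.

```python
def normalized_baseline(trace, injection_index, guard=5, plateau_samples=10):
    """Pre-injection baseline expressed as a fraction of the final plateau.

    Assumptions (not from the manuscript): the baseline is the mean of
    all samples up to `guard` samples before the injection, and the
    plateau is the mean of the last `plateau_samples` samples. Samples
    around the injection itself are never used.
    """
    baseline_end = injection_index - guard
    if baseline_end <= 0:
        raise ValueError("no baseline samples before the guard interval")
    if plateau_samples <= 0 or plateau_samples > len(trace):
        raise ValueError("invalid plateau window")

    baseline = trace[:baseline_end]
    plateau = trace[-plateau_samples:]

    mean_baseline = sum(baseline) / len(baseline)
    mean_plateau = sum(plateau) / len(plateau)
    return mean_baseline / mean_plateau
```

      In this sketch, a transient dent confined to the samples around the injection changes neither the baseline mean nor the plateau mean, so the normalized baseline is the same with or without the dent.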

      Fig 3A: I suggest moving this panel in the supplement: I found it difficult to process the effect of PDBu that is unspecific to PKCδ and that leads to a different plateau because of a different baseline. It would be better explained in more detail in the supplement, especially given that the 3B panel can lead to a similar conclusion and does not have this specificity problem. Up to the authors.

      We thank Reviewer #2 for this feedback. We followed the suggestion and now only display the full recording of this experiment on Figure 3 – figure supplement 1C.

      Fig 3C: To go further, one wonders if knocking-down PDK would act as a switch for gating LTM formation, i.e. if done during a 1x training or a 5x massed training would it gate long-term consolidation?

      This is indeed an excellent suggestion. We performed this experiment and showed that in flies expressing the PDK RNAi in adult MB neurons, only one cycle of training was sufficient to induce long-term memory formation (Figure 3A), instead of the 5 spaced cycles normally required. This confirms the model we previously established in Plaçais et al. 2017, where long-term memory formation was observed upon PDK MB knock-down after 2 cycles of spaced training. This new result goes further in characterizing this facilitation effect, now showing that even a single cycle is sufficient. Altogether these data show that mitochondrial metabolic activation is the critical gating step in long-term memory formation. Spaced training achieves this activation through PDK inhibition, mediated by PKCδ.

      What is the level of mRNA in this construct? I don't see a quantification, can you justify it?

      We thank Reviewer #2 for this remark. This PDK RNAi had been used in a previous work in a pyruvate imaging experiment, where it successfully boosted mitochondrial pyruvate uptake. However, we had indeed not validated it at the mRNA level. In the revised version of the present manuscript, we now confirm by RT-qPCR that the PDK RNAi efficiently downregulates PDK expression in neurons (Figure 3 – figure supplement 1A).

      Fig. 4C: Is PKCδ activation increase in Vertical lobe DAMB-dependent? One wonders, because MP1 may somehow activate other neurons that could reach this part of the Kenyon Cells. I do not see in the results what could disprove this possibility. The mechanism linking DAMB activation in the peduncle and PKCδ activation in the VL is mysterious, see also Fig. 5.

      This is a very sound remark. In the revised version we have checked whether PKCδ activation in the vertical lobes is also dependent on DAMB. We performed thermogenetic activation of MP1 neurons and imaged mito-δCKAR signal in the vertical lobes upon DAMB MB knock-down. We found that, as for the peduncle, DAMB was required for PKCδ mitochondrial activation (Figure 4C, right panel). This experiment was performed in parallel with similar measurements in flies that did not express DAMB RNAi, as a positive control (these new control data were added to Figure 4C, left panel).

      This result supports a model where dopamine from MP1 neurons directly acts on Kenyon cells, even for PKCδ activation in the vertical lobes. Thus, this advocates for a diffusion of DAMB-activated PKCδ from the peduncle to the vertical lobes, either by passive diffusion or by mitochondrial motility - two hypotheses that we added in the discussion. 

      Fig. 5: If MP1 neurons release dopamine only to the peduncle, how do you expect PKCδ to be translocated to mitochondria all the way to the vertical lobe? Also is it specific to the vertical lobe and not found in the medial lobe?

      Investigating the spatial distribution of PKCδ is, once again, a very sound suggestion. We re-analyzed our dataset of the mito-δCKAR signal after spaced training for peduncle measurement, as the imaging plane also included the β lobe. We found that PKCδ is also activated at that level, and that its activation also depends on DAMB (Figure 5 – figure supplement 1). We also performed additional pyruvate measurements in the medial lobes, and observed that mitochondrial pyruvate uptake presents the same extension in time in the medial lobes as in the vertical lobes when comparing spaced training (Figure 6 E-F and Figure 6 – figure supplement 1E-F) to 1x training (Figure 6A-B and Figure 6 – figure supplement 1C-D). Therefore, the metabolic action of PKCδ seems not to be restricted to the vertical lobes, but spreads across the whole axonal compartment.

      Altogether, these data suggest that activated PKCδ diffuses from its point of activation, the peduncle, where dopamine is released by MP1 and DAMB is activated, to both the vertical and medial lobes, either by passive diffusion, or by taking advantage of mitochondrial movement from the MB neuron somas to the axons, which was shown to be triggered by spaced training (Pavlowsky et al. 2024). To further characterize the kinetics of PKCδ activation, we measured its activity using the mito-δCKAR sensor at 3 and 8 hours following spaced training. We found that while PKCδ was still active at 3 hours, it was back to its baseline activity level at 8 hours, both at the level of the peduncle and the vertical lobes (Figure 5 C-F). However, at 8 hours, pyruvate metabolism is still upregulated in the lobes, which indicates that an additional mechanism is relaying PKCδ action to maintain the high energy state of the MBs at later time points. As we propose in the revised discussion, the mitochondrial motility hypothesis makes sense here (Pavlowsky et al. 2024), as the progressive increase in the number of mitochondria in the lobes would be able to sustain high mitochondrial metabolism beyond PKCδ activation at 8 hours post-conditioning. This new result and its implications open exciting perspectives for future research about the different mitochondrial regulations occurring after spaced training, their organization over time and their interactions.

      Fig.7:  PDK written in yellow is almost invisible

      This has been changed.

    1. eLife Assessment

      This manuscript provides important new insight into the mechanisms underlying seasonal physiology, using medaka fish as a functional genetic model that naturally exhibits photoperiodic responses. The authors provide a range of data that implicate agrp1 in feeding regulation in response to photoperiod and reproductive status. This paper provides solid evidence connecting the effects of long and short photoperiods on the food intake of female medaka fish and egg production. It will be of relevance for biologists interested in understanding the molecular and cellular underpinnings of environmental effects on animal biology.

    2. Reviewer #1 (Public review):

      Summary:

      The authors use the teleost medaka as an animal model to study the effect of seasonal changes in day-length on feeding behaviour and oocyte production. They report a careful analysis of how day-length affects female medakas and a thorough molecular genetic analysis of genes potentially involved in this process. They show a detailed analysis of two genes and include a mutant analysis of one gene to support their conclusions.

      Strengths:

      The authors pick their animal model well and exploit the possibilities to examine in this laboratory model the effect of a key environmental influence, namely the seasonal changes of day-length. The phenotypic changes are carefully analysed and well-controlled. The mutational analysis of the agrp1 by a ko-mutant provides important evidence to support the conclusions. Thus this report exceeds previous findings on the function of agrp1 and npyb as regulators of food-intake and shows how in medaka these genes are involved in regulating the organismal response to an environmental change. It thus furthers our understanding of how animals react to key exogenous stimuli for adaptation.

      Weaknesses:

      The authors are too modest when it comes to underscoring the importance of their findings. Previous animal models used to study the effect of these neuropeptides on feeding behaviour have either lost or were most likely never sensitive to seasonal changes of day length. Considering the key importance of this parameter on many aspects of plant and animal life it could be better emphasised that a suitable animal model is at hand that permits this. The molecular characterization of the agrp1 ko-mutant that the authors have generated lacks some details that would help to appreciate the validity of the mutant phenotype. Additional data would help in this respect.

    3. Reviewer #2 (Public review):

      Summary:

      The authors investigated the mechanisms behind breeding season-dependent feeding behavior using medaka, a well-known photoperiodic species, as a model. Through a combination of molecular, cellular, and behavioral analyses, including tests with mutants, they concluded that AgRP1 plays a central role in feeding behavior, mediated by ovarian estrogenic signals.

      Strengths:

      This study offers valuable insights into the neuroendocrine mechanisms that govern breeding season-dependent feeding behavior in medaka. The multidisciplinary approach, which includes molecular and physiological analyses, enhances the scientific contribution of the research.

      Weaknesses:

      While medaka is an appropriate model for studying seasonal breeding, the results presented are insufficient to fully support the authors' conclusions.

      Specifically, methods and data analyses are incomplete in justifying the primary claims:

      - the procedure for the food intake assay is unclear;

      - the sample size is very small;

      - the statistical analysis is not always adequate.

      Additionally, the discussion fails to consider the possible role of other hormones that may be involved in the feeding mechanism.

    4. Reviewer #3 (Public review):

      Summary:

      Understanding the mechanisms whereby animals restrict the timing of their reproduction according to day length is a critical challenge given that many of the most relevant species for agriculture are strongly photoperiodic. However, the principal animal models capable of detailed genetic analysis do not respond to photoperiod so this has inevitably limited progress in this field. The fish model medaka occupies a uniquely powerful position since its reproduction is strictly restricted to long days and it also offers a wide range of genetic tools for exploring, in depth, various molecular and cellular control mechanisms.

      For these reasons, this manuscript by Tagui and colleagues is particularly valuable. It uses the medaka to explore links bridging photoperiod, feeding behaviour, and reproduction. The authors demonstrate that in female, but not male medaka, photoperiod-induced reproduction is associated with an increase in feeding, presumably explained by the high metabolic cost of producing eggs on a daily basis during the reproductive period. Using RNAseq analysis of the brain, they reveal that the expression of the neuropeptides agrp and npy that have been previously implicated in the regulation of feeding behaviour in mice are upregulated in the medaka brain during exposure to long photoperiod conditions. Unlike the situation in mice, these two neuropeptides are not co-expressed in medaka neurons, and food deprivation in medaka led to increases in agrp but also a decrease in npy expression. Furthermore, the situation in fish may be more complicated than in mice due to the presence of multiple gene paralogs for each neuropeptide. Exposure to long-day conditions increases agrp1 expression in medaka as the result of increases in the number of neurons expressing this neuropeptide, while the increase in npyb levels results from increased levels of expression in the same population of cells. Using ovariectomized medaka and in situ hybridization assays, the authors reveal that the regulation of agrp1 involves estrogen acting via the estrogen receptor esr2a. Finally, a loss of agrp1 function mutant is generated where the female mutants fail to show the characteristic increase in feeding associated with long-day enhanced reproduction as well as yielding reduced numbers of eggs during spawning.

      Strengths:

      This manuscript provides important foundational work for future investigations aiming to elucidate the coordination of photoperiod sensing, feeding activity, and reproduction function. The authors have used a combination of approaches with a genetic model that is particularly well suited to studying photoperiodic-dependent physiology and behaviour. The data are clear and the results are convincing and support the main conclusions drawn. The findings are relevant not only for understanding photoperiodic responses but also provide more general insight into links between reproduction and feeding behaviour control.

      Weaknesses:

      Some experimental models used in this study, namely ovariectomized female fish and juvenile fish have not been analysed in terms of their feeding behaviour and so do not give a complete view of the position of this feeding regulatory mechanism in the context of reproduction status. Furthermore, the scope of the discussion section should be expanded to speculate on the functional significance of linking feeding behaviour control with reproductive function.

    5. Author response:

      Reviewer #1 (Public review):

      Summary:

      The authors use the teleost medaka as an animal model to study the effect of seasonal changes in day-length on feeding behaviour and oocyte production. They report a careful analysis of how day-length affects female medakas and a thorough molecular genetic analysis of genes potentially involved in this process. They show a detailed analysis of two genes and include a mutant analysis of one gene to support their conclusions.

      Strengths:

      The authors pick their animal model well and exploit the possibilities to examine in this laboratory model the effect of a key environmental influence, namely the seasonal changes of day-length. The phenotypic changes are carefully analysed and well-controlled. The mutational analysis of the agrp1 by a ko-mutant provides important evidence to support the conclusions. Thus this report exceeds previous findings on the function of agrp1 and npyb as regulators of food-intake and shows how in medaka these genes are involved in regulating the organismal response to an environmental change. It thus furthers our understanding of how animals react to key exogenous stimuli for adaptation.

      Weaknesses:

      The authors are too modest when it comes to underscoring the importance of their findings. Previous animal models used to study the effect of these neuropeptides on feeding behaviour have either lost or were most likely never sensitive to seasonal changes of day length. Considering the key importance of this parameter on many aspects of plant and animal life it could be better emphasised that a suitable animal model is at hand that permits this. The molecular characterization of the agrp1 ko-mutant that the authors have generated lacks some details that would help to appreciate the validity of the mutant phenotype. Additional data would help in this respect.

      We would like to thank Reviewer #1 for the really constructive advice. In the revised manuscript, we will try to provide more information on the molecular characterization of the agrp1 KO-mutant and to emphasize the importance of our present animal model that permits the analysis of neuropeptide effects on feeding behavior in response to seasonal changes of day length.

      Reviewer #2 (Public review):

      Summary:

      The authors investigated the mechanisms behind breeding season-dependent feeding behavior using medaka, a well-known photoperiodic species, as a model. Through a combination of molecular, cellular, and behavioral analyses, including tests with mutants, they concluded that AgRP1 plays a central role in feeding behavior, mediated by ovarian estrogenic signals.

      Strengths:

      This study offers valuable insights into the neuroendocrine mechanisms that govern breeding season-dependent feeding behavior in medaka. The multidisciplinary approach, which includes molecular and physiological analyses, enhances the scientific contribution of the research.

      Weaknesses:

      While medaka is an appropriate model for studying seasonal breeding, the results presented are insufficient to fully support the authors' conclusions.

      Specifically, methods and data analyses are incomplete in justifying the primary claims:

      - the procedure for the food intake assay is unclear;

      - the sample size is very small;

      - the statistical analysis is not always adequate.

      Additionally, the discussion fails to consider the possible role of other hormones that may be involved in the feeding mechanism.

      We would like to thank Reviewer #2 for the helpful comments. As the reviewer suggested, we will try to edit the paragraph describing the procedure for the food intake assay to make it much easier for the readers to understand in the revised manuscript. In Figure 1-Supplementary figure 2, RNAseq was performed to search for the candidate neuropeptides, which is why the sample size was minimal. On the other hand, each group in the other experiments consists of n ≥ 5 samples, which is usually accepted to be an adequate sample size in various studies (cf. Kanda et al., Gen Comp Endocrinol., 2011, Spicer et al., Biol Reprod., 2017). As for the statistical analyses, we will revise our manuscript so that the readers may be convinced of their validity.

      Reviewer #3 (Public review):

      Summary:

      Understanding the mechanisms whereby animals restrict the timing of their reproduction according to day length is a critical challenge given that many of the most relevant species for agriculture are strongly photoperiodic. However, the principal animal models capable of detailed genetic analysis do not respond to photoperiod so this has inevitably limited progress in this field. The fish model medaka occupies a uniquely powerful position since its reproduction is strictly restricted to long days and it also offers a wide range of genetic tools for exploring, in depth, various molecular and cellular control mechanisms.

      For these reasons, this manuscript by Tagui and colleagues is particularly valuable. It uses the medaka to explore links bridging photoperiod, feeding behaviour, and reproduction. The authors demonstrate that in female, but not male medaka, photoperiod-induced reproduction is associated with an increase in feeding, presumably explained by the high metabolic cost of producing eggs on a daily basis during the reproductive period. Using RNAseq analysis of the brain, they reveal that the expression of the neuropeptides agrp and npy that have been previously implicated in the regulation of feeding behaviour in mice are upregulated in the medaka brain during exposure to long photoperiod conditions. Unlike the situation in mice, these two neuropeptides are not co-expressed in medaka neurons, and food deprivation in medaka led to increases in agrp but also a decrease in npy expression. Furthermore, the situation in fish may be more complicated than in mice due to the presence of multiple gene paralogs for each neuropeptide. Exposure to long-day conditions increases agrp1 expression in medaka as the result of increases in the number of neurons expressing this neuropeptide, while the increase in npyb levels results from increased levels of expression in the same population of cells. Using ovariectomized medaka and in situ hybridization assays, the authors reveal that the regulation of agrp1 involves estrogen acting via the estrogen receptor esr2a. Finally, a loss of agrp1 function mutant is generated where the female mutants fail to show the characteristic increase in feeding associated with long-day enhanced reproduction as well as yielding reduced numbers of eggs during spawning.

      Strengths:

      This manuscript provides important foundational work for future investigations aiming to elucidate the coordination of photoperiod sensing, feeding activity, and reproduction function. The authors have used a combination of approaches with a genetic model that is particularly well suited to studying photoperiodic-dependent physiology and behaviour. The data are clear and the results are convincing and support the main conclusions drawn. The findings are relevant not only for understanding photoperiodic responses but also provide more general insight into links between reproduction and feeding behaviour control.

      Weaknesses:

      Some experimental models used in this study, namely ovariectomized female fish and juvenile fish have not been analysed in terms of their feeding behaviour and so do not give a complete view of the position of this feeding regulatory mechanism in the context of reproduction status. Furthermore, the scope of the discussion section should be expanded to speculate on the functional significance of linking feeding behaviour control with reproductive function.

      We would like to thank Reviewer #3 for the insightful advice. We will try to revise several pertinent sentences describing the ovariectomized female fish and juvenile fish so that our present experimental results will give a more complete view of their feeding regulatory mechanism in the context of reproduction status. We will also try to expand and revise the discussion section to incorporate the valuable suggestion of Reviewer #3.

    1. eLife Assessment

      This study presents a valuable conceptual approach that cell lineage can be determined using methylation data. However, the evidence supporting the claims of the author is currently inadequate. If the author could carry out some additional experiments as well as explore alternative explanations for the current data, this approach could be of broad interest to neuroscientists and developmental biologists.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Shibata describes a method to assess rapidly fluctuating CpG sites (fCpGs) from single-cell methylation sequencing (sc-MeSeq) data. Assuming that fCpGs are largely consistent over time with changes induced by inheritable events during replication, the author infers lineage relationships in available brain-derived sc-MeSeq. Supplementing current lineage tracing through genomic and mitochondrial mosaic variants is an interesting concept that could supplement current work or allow additional lineage analysis in existing data.

      However, the author failed to convincingly show the power of fCpG analysis to determine lineages in the human brain. While the correlation with cellular division and distinction of cell types appears plausible and strong, the application to detect specific lineages is less convincing. Aspects of this might be due to a lack of clarity in presentation and erroneous use of developmental concepts. However, without addressing these problems it is challenging for a reader to come to the same conclusions as the author.

      On the flip side, this novel application of fCpGs will allow the re-use of existing sc-MeSeq to infer additional features that were previously unavailable, once the biological relevance has been further elucidated.

      Strengths:

      (1) Novel re-analysis application of methylation data to infer the status of fCpGs and the use as a lineage marker.

      (2) Application of this method to an innovative existing data set to benchmark this framework against existing developmental knowledge.

      Weaknesses:

      (1) Insufficient clarity when presenting results (this includes an incredible shortness of the methods section making an informed assessment very difficult). This makes it hard to fully grasp and evaluate the presented results.

      (2) Inconsistent or erroneous use of neurodevelopmental concepts which hinders appropriate interpretation of the results.

      (3) Lack of consideration for alternative explanations for the observed data (i.e., considering fCpGs as a cellular division clock that diverges over 'time').

    3. Reviewer #2 (Public review):

      The manuscript by Shibata proposed a potentially interesting idea that variation in methylcytosine across cells can inform cellular lineage in a way similar to single nucleotide variants (SNVs). The work builds on the hypothesis that the "replication" of methylcytosine, presumably by DNMT1, is inaccurate and produces stochastic methylation variants that are inherited in a cellular lineage. Although this notion can be correct to some extent, it does not account for other mechanisms that modulate methylcytosines, such as active gain of methylation mediated by DNMT3A/B activity and active demethylation mediated by TET activity. In some cases, it is known that the modulation of methylation is targeted by sequence-specific transcription factors. In other words, inaccurate DNMT1 activity is only one of the many potential ways that can lead to methylation variants, which fundamentally weakens the hypothesis that methylation variants can serve as a reliable lineage marker. With that being said (being skeptical of the fundamental hypothesis), I want to be as open-minded as possible and try to propose some specific analyses that might better convince me that the author is correct. However, I suspect that the concept of methylation-based lineage tracing cannot be validated without some kind of lineage tracing experiment, which has been successfully demonstrated for scRNA-seq profiling but not yet for methylation profiling (one example is Delgado et al., Nature, 2022).

      (1) The manuscript reported that fCpG sites are predominantly intergenic. The author should also score the overlap between fCpG sites and putative regulatory elements and report p-values. If fCpG sites commonly overlap with regulatory elements, that would increase the possibility that these sites are actively regulated by enhancer mechanisms other than maintenance methyltransferase activity.

      (2) The overlap between fCpG and regulatory sequence is a major alternative explanation for many of the observations regarding the effectiveness of using fCpG sites to classify cell types correctly. One would expect the methylation level of thousands of enhancers to be quite effective in distinguishing cell types based on the published single-cell brain methylome works.

      (3) The methylation level of fCpG sites is higher in hindbrain structures and lower in forebrain regions. This observation was interpreted as the hindbrain being the "root" of the methylation barcodes, with "progressive demethylation" producing the methylation states in the forebrain. This interpretation does not match what is known about methylation dynamics in mammalian brains; in particular, there is no data supporting the process of "progressive demethylation". In fact, it is known that with the activation of DNMT3A during early postnatal development in mice or humans (Lister et al., 2013, Science), there is a global gain of methylation in both CH and CG contexts. This is part of the broader issue I see in this manuscript, which is that the model might be correct if "inaccurate mC replication" is the only force that drives methylation dynamics. But in reality, active enzymatic processes such as the activation of DNMT3A have a global impact on the methylome, and it is unclear if any signature for "inaccurate mC replication" survives the de novo methylation wave caused by DNMT3A activity.

      (3) Perhaps one way the author could address comment 3 is to analyze methylome data across several developmental stages in the same brain region, to first establish that the signal of "inaccurate mC replication" is robust and does not get erased during early postnatal development when DNMT3A deposits a large amount of de novo methylation.

      (4) The hypothesis that methylation barcodes are homogeneous among progenitor cells and more polymorphic in derived cells is an interesting one. However, in this study, the observation was likely an artifact caused by the more granular cell types in the brain stem, intermediate granularity in inhibitory cells, and highly continuous cell types in cortical excitatory cells. So, in other words, single-cell studies typically classify hindbrain cell types that are more homogenous, and cortical excitatory cells that are much more heterogeneous. The difference in cell type granularity across brain structures is documented in several whole-brain atlas papers such as Yao et al. 2023 Nature part of the BICCN paper package.

      (5) As discussed in comment 2, the author needs to assess whether the successful classification of cell types (brain lineage) using fCpG was, in fact, driven by fCpG sites overlapping with cell-type specific regulatory elements.

      (6) In Figure 5E, the author tried to address the question of whether methylation barcodes inform lineage or post-mitotic methylation remodeling. The Y-axis corresponds to distances in tSNE. However, tSNE involves non-linear scaling, and the distances cannot be interpreted as biological distances. PCA distances or other types of distances computed from high-dimensional data would be more appropriate.
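
      The kind of distance suggested in comment (6) can be illustrated with a minimal sketch. It assumes that each cell's barcode is a list of per-site calls (1 = methylated, 0 = unmethylated, None = site not covered) and computes the fraction of jointly covered sites that differ, directly in the high-dimensional space rather than in tSNE coordinates. The function name and the encoding are hypothetical and are not taken from the manuscript.

```python
from typing import Optional, Sequence

# Hypothetical encoding: 1 = methylated, 0 = unmethylated, None = site not covered.
Barcode = Sequence[Optional[int]]


def barcode_distance(a: Barcode, b: Barcode) -> Optional[float]:
    """Fraction of jointly covered fCpG sites at which two cells differ.

    Returns None when the two cells share no covered site, because the
    distance is then undefined rather than zero.
    """
    if len(a) != len(b):
        raise ValueError("barcodes must cover the same ordered set of sites")
    # Keep only the sites that were observed in both cells.
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None
    mismatches = sum(1 for x, y in shared if x != y)
    return mismatches / len(shared)
```

      Unlike a tSNE distance, this value has a direct interpretation (0 means identical calls at all shared sites, 1 means all shared sites differ), although it remains sensitive to sparse coverage in single-cell data.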

    4. Reviewer #3 (Public review):

      Summary:

      In the manuscript entitled "Human Brain Barcodes", the author sought to use single-cell CpG methylation information to trace cell lineages in the human brain.

      Strengths:

      Tracing cell lineages in the human brain is important but technically challenging. Lineage tracing with single-cell CpG methylation would be interesting if convincing evidence exists.

      Weaknesses:

      As the author noted, "DNA methylation patterns are usually copied between cell division, but the replication errors are much higher compared to base replication". This unstable nature of CpG methylation would introduce significant problems in inferring the true cell lineage. The unreliable CpG methylation status also raises the question of what the "Barcodes" refer to in the title and across this study. Barcodes should be stable in principle and not dynamic across cell generations, as defined in Reference#1. It is not convincing that the "dynamic" CpG methylation fits the "barcodes" terminology. This problem is even more concerning in the last section of results, where CpG would fluctuate in post-mitotic cells.

    5. Author response:

      I thank the Senior Editor and the three reviewers for their consideration and careful assessments, which I find fair and justified. I agree the evidence is inadequate that single cell fluctuating CpG DNA methylation allows for human neuron lineage tracing. I agree with Reviewer #1 that fCpGs essentially function as “a cellular division clock that diverges over time”, but that fCpG methylation also records ancestry because cells with more similar patterns should be more related than cells with different patterns. However, as noted, there are alternative explanations that could explain fCpG DNA methylation pattern neuronal differences, or potentially obscure ancestry recorded by replication errors. Lineage tracing with fCpG methylation previously appeared possible in human intestines, endometrium, and blood, and potentially a similar approach could be used to reconstruct human brain cell ancestries.

      I intend to revise the manuscript in a few weeks to address points raised by reviewers. These include a) editing to improve clarity and correct neurodevelopmental concepts, and b) adding a supplement that explains in much more detail how fCpG methylation may record cell divisions and ancestries. As recommended, additional “experiments” will be added including a) an analysis of single cell zygote to inner cell mass data to illustrate how fCpG brain barcode methylation changes between cell divisions very early in development before neurogenesis, and b) an analysis of newly released single cell brain aging data (Chien et al., 2024, Neuron 112, 2524–2539, August 7, 2024) that should help address issues of reproducibility and barcode stability over time. The evidence for lineage tracing will still be incomplete, but the modifications should help support the idea that fCpG methylation can record somatic cell ancestries.

    1. eLife Assessment

      SCARF1 is a scavenger membrane-bound receptor that binds modified versions of lipoproteins and has a significant role in maintaining lipid homeostasis. This useful study reports the crystal structure of SCARF1 and identifies putative binding sites for modified lipoproteins. Supported by a convincing set of experimental approaches, this study advances our knowledge of how scavenger receptors clear modified lipoproteins to maintain lipid homeostasis.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study provides an incremental advance to the scavenger receptor field by reporting the crystal structures of the domains of SCARF1 that bind modified LDL such as oxidized LDL and acylated LDL. The crystal packing reveals a new interface for the homodimerization of SCARF1. The authors characterize SCARF1 binding to modified LDL using flow cytometry, ELISA, and fluorescent microscopy. They identify a positively charged surface on the structure that they predict will bind the LDLs, and they support this hypothesis with a number of mutant constructs in binding experiments.

      Strengths:

      The authors have crystallized domains of an understudied scavenger receptor and used the structure to identify a putative binding site for modified LDL particles. An especially interesting set of experiments is the SCARF1 and SCARF2 chimeras, where they confer binding of modified LDLs to SCARF2, a related protein that does not bind modified LDLs, and show that the key residues in SCARF1 are not conserved in SCARF2.

      Weaknesses:

      While the data largely support the conclusions, the figures describing the structure are cursory and do not provide enough detail to interpret the model or quality of the experimental X-ray structure data. Additionally, many of the flow cytometry experiments lack negative controls for non-specific LDL staining and controls for cell surface expression of the SCARF constructs. In several cases, the authors interpret single data points as increased or decreased affinity, but these statements need dose-response analysis to support them. These deficiencies should be readily addressable by the authors in the revision.

      The paper is a straightforward set of experiments that identify the likely binding site of modified LDL on SCARF1 but adds little in the way of explaining or predicting other binding interactions. That a positively charged surface on the protein could mediate binding to LDL particles is not particularly surprising. This paper would be of greater importance if the authors could explain the specificity of the binding of SCARF1 to the various lipoparticles that it does or does not bind. Incorporating these mutants into an assay for the biological role of SCARF1 would be powerful.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Wang and colleagues provided mechanistic insights into SCARF1 and its interactions with the lipoprotein ligands. The authors reported two crystal structures of the N-terminal fragments of SCARF1 ectodomain (ECD). On the basis of the structural analysis, the authors further investigated the interactions between SCARF1 and modified LDLs using cell-based assays and biochemical experiments. Together with the two structures and supporting data, this work provided new insights into the diverse mechanisms of scavenger receptors and especially the crucial role of SCARF1 in lipid metabolism.

      Strengths:

      The authors started by determining the crystal structures of two fragments of SCARF1 ECD. The superposition of the two high-resolution structures, together with the predicted model by AlphaFold, revealed that the ECD of SCARF1 adopts a long-curved conformation with multiple EGF-like domains arranged in tandem. Non-crystallographic and crystallographic two-fold symmetries were observed in crystals of f1 and f2 respectively, indicating the formation of SCARF1 homodimers. Structural analysis identified critical residues involved in dimerization, which were validated through mutational experiments. In addition, the authors conducted flow cytometry and confocal experiments to characterize cellular interactions of SCARF1 with lipoproteins. The results revealed the vital role of the 133-221aa region in the binding between SCARF1 and modified LDLs. Moreover, four arginine residues were identified as crucial for modified LDL recognition, highlighting the contribution of charge interactions in SCARF1-lipoprotein binding. The lipoprotein binding region is further validated by designing SCARF1/SCARF2 chimeric molecules. Interestingly, the interaction between SCARF1 and modified LDLs could be inhibited by teichoic acid, indicating potential overlap in or sharing of binding sites on SCARF1 ECD.

      The author employed a nice collection of techniques, namely crystallographic, SEC, DLS, flow cytometry, ELISA, and confocal imaging. The experiments are technically sound and the results are clearly written, with a few concerns as outlined below. Overall, this research represents an advancement in the mechanistic investigation of SCARF1 and its interaction with ligands. The role of scavenger receptors is critical in lipid homeostasis, making this work of interest to the eLife readership.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Wang et al. described the crystal structures of the N-terminal fragments of Scavenger receptor class F member 1 (SCARF1) ectodomains. SCARF1 recognizes modified LDLs, including acetylated LDL and oxidized LDL, and it plays an important role in both innate and adaptive immune responses. They characterized the dimerization of SCARF1 and the interaction of SCARF1 with modified lipoproteins by mutational and biochemical studies. The authors identified the critical residues for dimerization and demonstrated that SCARF1 may function as homodimers. They further characterized the interaction between SCARF1 and LDLs and identified the lipoprotein ligand recognition sites, the highly positively charged areas. Their data suggested that the teichoic acid inhibitors may interact with SCARF1 in the same areas as LDLs.

      Strengths:

      The crystal structures of SCARF1 were high quality. The authors performed extensive site-specific mutagenesis studies using soluble proteins for ELISA assays and surface-expressed proteins for flow cytometry.

      Weaknesses:

      (1) The schematic drawing of human SCARF1 and SCARF2 in Fig 1A did not show the differences between them. It would be useful to have a sequence alignment showing the polymorphic regions.

      The schematic drawing in Fig. 1A is intended to give a brief idea of the two molecules; a sequence alignment would take too much space in the figure. A careful alignment between SCARF1 and SCARF2 can be found in Ref. 24 (Ishii et al., J Biol Chem, 2002, 277, 39696-702), which is also mentioned on p.4.

      (2) The description of structure determination was confusing. The f1 crystal structure was determined by SAD with Pt derivatives. Why did they need molecular replacement with a native data set? The f2 crystal structure was solved by molecular replacement using the structure of the f1 fragment. Why did they need to use EGF-like fragments predicted by AlphaFold as search models?

      The crystal structure of f1 was first determined by SAD using Pt derivatives, but soaking of Pt reduced the resolution of the crystals, therefore we use this structure as a search model for a native data set that had higher resolution for further refinement. For the structural determination of f2, the molecular replacement using f1 structure was not able to show the initial density of the extra region in f2 (residues 133-209), which was missing in f1. Therefore, the EGF-like domains of SCARF1 modeled by AlphaFold were applied as search models for this region (p.18).

      (3) It's interesting to observe that SCARF1 binds modified LDLs in a Ca2+-independent manner. The authors performed the binding assays between SCARF1 and modified LDLs in the presence of Ca2+ or EDTA on Page 9. However, EDTA is not an efficient Ca2+ chelator. The authors should have performed the binding assays in the presence of EGTA instead.

      The binding assays in the presence of EGTA are included in the revised manuscript (Fig. S7) (p.9), and they also suggest that SCARF1 binds OxLDL in a Ca2+-independent manner.

      (4) The authors claimed that SCARF1Δ353-415, the deletion of a C-terminal region of the ectodomain, might change the conformation of the molecule and generate hindrance for the C-terminal regions. Why didn't SCARF1Δ222-353 have a similar effect? Could the deletion change the interaction between SCARF1 and the membrane? Is SCARF1Δ353-415 region hydrophobic?

      The truncation mutants were constructed to roughly locate the lipoprotein-binding region on SCARF1, and the overall results showed that the sites might be located in the region 133-221. Mutant Δ222-353 may also affect the conformation, but it still bound OxLDL like wild type, suggesting the binding sites were retained in this mutant. Mutant Δ353-415 showed reduced binding, implying that the binding sites might be retained but binding was affected; we think this might be due to a conformational change that reduces the binding or accessibility of lipoproteins. Since this region is located closer to the membrane, it is possible that the deletion changes the interaction with the membrane. In the AF model, the Δ353-415 region does not seem to be more hydrophobic than other regions (Fig. S2C).

      (5) What was the point of having Figure 8? Showing the SCARF1 homodimers could form two types of dimers on the membrane surface proposed? The authors didn't have any data to support that.

      Fig. 8 shows a potential model of the SCARF1 dimers on the cell surface by combining the structural information from crystals and AF predictions. The two dimers in the figure are identical but with different viewing angles. The lipoprotein binding sites are also indicated (Fig. 8).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The authors need to show examples of the electron density for both structures.

      Electron density examples of the two structures are shown in Fig. S2A.

      Figure 1)

      The figure does not show enough details of the structure. The text mentions hydrogen-bond and disulfide bonds that stabilize the loops, these should be shown.

      Disulfide bonds of the two structures are shown in Fig. 1.

      Figure 2)

      D) The full gel should be shown.

      E) Rather than just relying on changes in gel filtration elution volumes, the authors do the appropriate experiment and measure the hydrodynamic radius of the WT and mutant ectodomains by DLS. However, they need to show plots of the size distribution, not just mean radial values, in order to show if the sample is monodisperse.

      The full gel and plots of DLS are shown in Fig. S3A-B.

      Figure 3)

      I have concerns about the rigor of the experiments in panels A-D. The authors include a non-transfected control but do not appear to have treated non-transfected cells with the lipoproteins to evaluate the specificity of binding. Every cell binding assay (flow or confocal) must show the data from non-transfected cells treated with each lipoprotein, as each lipoprotein species could have a unique non-specific binding pattern. The authors show these controls in Figure 6, but these controls are necessary in every experiment.

      In Fig. 3A, since several lipoproteins were included in the figure, we use non-transfected cells without lipoprotein treatment as a negative control. The OxLDL or AcLDL treated non-transfected cells were also used as negative controls and shown in Fig. 3B-C. LDL, HDL or OxHDL may have their own non-specific binding patterns, the treatment of LDL, HDL or OxHDL with the transfected cells all gave negative results (Fig. 3A and D).

      Cell-surface expression of the SCARF1 variants is a major concern. The constructs the authors use are tagged with a GFP on the cytosolic side. However, the Methods do not indicate whether they gate on GFP+ transfected cells for analytical flow. Such gating may have been used because the staining experiments in Figures 3 and 4 show uniform cell populations, whereas the staining done with an anti-SCARF1 Ab in S4 shows most of the cells not expressing the protein on the surface. Please clarify.

      Data for the anti-SCARF1 Ab assay are gated for GFP in the revised Fig. S4, and the non-transfected cells are included as a control.

      The authors must demonstrate cell-surface staining with an epitope tag on the extracellular side and clarify if the analyzed cells are gated for surface expression. The anti-SCARF antibody used in S4 may not recognize the truncated or mutant SCARFs equally. Cell-surface expression in the flow experiments cannot be inferred from confocal experiments because the flow experiments have a larger quantitative range.

      Anti-SCARF1 antibody assay provides an estimation of the surface expression of the proteins. If the epitope of the antibody was mutated or removed in the mutants, most likely it would lose binding activity. Including an epitope tag on the ectodomain could be an option, but if truncation or mutation changes the conformation of the ectodomain, the accessibility of the epitope may also be affected, and addition of an extra sequence or domain, such as an epitope tag, may affect the surface expression of proteins sometimes.

      In several places, the authors infer increased or decreased affinity from mean fluorescent intensity values of a single concentration point without doing appropriate dose-curves. These experiments need to be done or else the mentions of changes in apparent affinities should be removed.

      We add a concentration for the WT interaction with OxLDL (Fig. S6, p.9) and the manuscript is also modified accordingly.

      Figure 7

      The concentration of teichoic acid used to inhibit modified LDL binding should be indicated and a dose-curve analysis should be done comparing teichoic acid to some non-inhibitory bacterial polymer.

      The concentration of teichoic acids used in the inhibition assays is 100 mg/ml (p.21). Unfortunately, we do not have other bacterial polymers in the lab and are not sure about their potential inhibitory effects.

      Reviewer #2 (Recommendations For The Authors):

      Major points:

      (1) The SCARF1 ECD contains three N-linked glycosylation sites (N289, N382, N393). It remains unclear whether these modifications are involved in SCARF1 binding to modified LDLs. Is it possible to design some experiments to investigate the effect of N-glycans on the recognition of modified LDLs? In particular, N382 and N393 are included in 353-415aa and the truncation mutant of SCARF1Δ353-415aa resulted in reduced binding with OxLDL in Fig.3G. Or whether the reduced binding is only due to the potential conformational changes caused by the deletion of the C-terminal region of the ECD?

      A previous study regarding the N-glycans (N289, N382, N393) of SCARF1 (ref.17) has shown that they may affect the proteolytic resistance, ligand-binding affinity and subcellular localization of SCARF1, which is not quite surprising as lipoproteins are large particles, the N-glycans on the surface of SCARF1 could affect accessibility or affinity for lipoproteins. But the exact roles of each glycan could be difficult to clarify as they might also be involved in protein folding and trafficking.

      The reduction of the binding of OxLDL for the mutant SCARF1 Δ353-415aa may be due to the conformational change or the loss of the glycans or both.

      (2) The authors speculated that the dimeric form of SCARF1 may be more efficient in recognizing lipoproteins on the cell surface. Please highlight the critical region/sites for ligand binding in Figure 8 and discuss the structural basis of dimerization improving the binding.

      The binding sites for lipoproteins on SCARF1 are indicated in Fig. 8. According to our data, it is possible that the conformation of the dimeric form of SCARF1 makes it more accessible to the ligands on the cell surface, as implied by flow cytometry (p.14-15), but this still needs further evidence.

      (3) Could the two salt bridges (D61-K71, R76-D98) observed in f1 crystals be found in f2 crystals? They seemed to be a little far from the defined dimeric interface (F82, S88, Y94) and how important are these to SCARF1 dimerization?

      The two salt bridges observed in f1 crystal are not found in f2 crystal (distances are larger than 5.0 Å), suggesting they are not required for dimerization (p. 7-8), but may be helpful in some cases.

      (4) The monomeric mutants (S88A/Y94A, F82A/S88A/Y94A) exhibited opposite affinity trends to OxLDL in ELISA and flow cytometry. The authors proposed steric hindrance of the dimers coated onto the plates as the potential explanation for this observation. However, the method of ELISA stated that OxLDLs, instead of SCARF1 ECD, were coated onto the plates. So what's the underlying reason for the inconsistency in different assays?

      Thanks. ELISA was done by coating OxLDLs on the plates as described in the Methods. Even so, a dimeric form of SCARF1 may only bind one OxLDL coated on the plates due to steric hindrance. We correct this on p.12.

      Minor points:

      (1) Figure 2D and Figure S3 - please label the molecular weight marker on the SEC traces to indicate the native size of various purified proteins.

      The elution volume in SEC reflects not only the molecular weight but also the conformation or shape of the protein. Because the ectodomain of SCARF1 has a long curved conformation, the monomeric and dimeric forms of SCARF1 do not align well with the standard molecular weight marker and elute much earlier in SEC. We include the standard molecular weight marker in Fig. S3C-D.

      (2) Could the authors provide SEC profiles of f1 and f2 that were used in crystallographic study?

      The SEC profiles of f1 and f2 for crystallization are shown in Fig. S5 (p.6).

      (3) The legend of Figure 3A states that the NC in flow cytometry assay represents the non-transfected cells, but please confirm whether the NC in Fig. 3A-C corresponds to non-transfected cells or no lipoprotein.

      NC in Fig. 3A represents the non-transfected cells, and no lipoproteins were added in this case as several lipoproteins are included in Fig. 3A. The lipoprotein (OxLDL or AcLDL) treated non-transfected cells (NC) were shown in Fig. 3B-C as negative controls.

    3. Reviewer #1 (Public Review):

      Summary:

      This manuscript provides a solid advance to the scavenger receptor field by reporting the crystal structures of the domains of SCARF1 that bind modified LDL such as oxidized LDL and acylated LDL. The crystal packing reveals a new interface for homodimerization of SCARF1. The authors characterize SCARF1 binding to modified LDL using flow cytometry, ELISA, and fluorescent microscopy. They identify a positively-charged surface on the structure that they predict will bind the LDLs, and they support this hypothesis with several mutant constructs in binding experiments.

      Strengths:

      The authors have crystallized domains of an understudied scavenger receptor and used the structure to identify a putative binding site for modified LDL particles. An especially strong set of experiments are binding studies with chimeras of SCARF1 and SCARF2, where they show gain-of-function results (binding of modified LDLs) by SCARF2, a related protein that does not normally bind modified LDLs. The paper is a straightforward set of experiments that identify the likely binding site of modified LDL on SCARF1.

      Weaknesses:

      In the current revision, the authors addressed my technical concerns. Two remaining considerations that may limit the broader impact of this paper are 1) that it does not explain the structural basis for specificity of the binding of SCARF1 to various lipoproteins (i.e. why SCARF1 binds oxLDL and AcLDL but not LDL or HDL) and 2) a lack of a biological assay to interpret the functional consequences of the SCARF1 mutants. These may be addressed in future work.

    1. eLife Assessment

      The article uses a cell-based model to investigate how mutations and cells spread throughout a tumour. The paper uses published data and the proposed model to understand how growth and death mechanisms lead to the observed data. This work provides an important insight into the early stages of tumour development. From the work provided here, the results are convincing, using a thorough analysis.

    2. Reviewer #1 (Public review):

      Summary:

      Arman Angaji and his team delved into the intricate world of tumor growth and evolution, utilizing a blend of computer simulations and real patient data from liver cancer.

      Strengths:

      Their analysis of how mutations and clones are distributed within tumors revealed an interesting finding: tumors don't just spread from their edges as previously believed. Instead, they expand both from within and from the edges simultaneously, suggesting a unique growth mode. This mode naturally indicates that external forces may play a role in the dispersion of cancer cells within the tumor. Moreover, their research hints at an intriguing phenomenon: a high death rate of progenitor cells and an extremely slow pace of growth in the initial phase of tumor expansion. Understanding this dynamic could significantly impact our comprehension of cancer development.

      Weaknesses:

      It's important to note, however, that this study relies on specific computer models, metrics derived from inferred clones, and a limited number of patient data. While the insights gained are promising, further investigation is essential to validate these findings. Nonetheless, this work opens up exciting avenues for comprehending the evolution of cancers.

      Comments on revised submission:

      The authors have effectively addressed my concerns. This revision is excellent.

    3. Reviewer #2 (Public review):

      Summary:

      The article uses a cell-based model to investigate how mutations and cells spread throughout a tumour. The paper uses published data and the proposed model to understand how growth and death mechanisms lead to the observed data. This work provides an insight into the early stages of tumour development. From the work provided here, the results are solid, showing a thorough analysis. The article is well written and presents a very suitable and rigorous analysis to describe the data. The authors did a particularly nice job of the discussion and decision of their "metrics of interest", though this is not the main aim of this work.

      Strengths:

      Due to the particularly nice and tractable cell-based model, the authors are able to perform a thorough analysis to compare the published data to that simulated with their model. They then used their computational model to investigate different growth mechanisms of volume growth and surface growth. With this approach, the authors are able to compare the metric of interest (here, the direction angle of a new mutant clone, the dispersion of mutants throughout the tumour) to quantify how the different growth models compare to the observed data. The authors have also used inference methods to identify model parameters based on the data observed. The authors performed a rigorous analysis and have chosen the metrics in an appropriate manner to compare the different growth mechanisms.

      Context:

      Improved mechanistic understanding into the early developmental stages of tumours will further assist in disease treatment and quantification. Understanding how readily and quickly a tumour is evolving is key to understanding how it will develop and progress. This work provides a solid example as to how this can be achieved with data alongside simulated models.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public reports:

      In the public reports there is only one point we would like to discuss. It concerns our use of a computational model to analyse spatial tumour growth. Citing from the eLife assessment, which reflects several comments of the referees:

      The paper uses published data and a proposed cell-based model to understand how growth and death mechanisms lead to the observed data. This work provides an important insight into the early stages of tumour development. From the work provided here, the results are solid, showing a thorough analysis. However, the work has not fully specified the model, which can lead to some questions around the model’s suitability.

      The observables we use to determine the (i) growth mode and the (ii) dispersion of cells are model-independent. The method to determine the (iii) rate of cell death does not use a spatial model. Throughout, our computational model of spatial growth is not used to analyze data. Instead, it is used to check that the observables we use can actually discriminate between different growth modes given the limitations of the data. We have expanded the description of the computational model in the revised version, and have released our code on Github. However, the conclusions we reach do not rely on a computational model. Instead, where we estimate parameters, we use population dynamics as described in section S5. The other observables are parameter free and model-independent. We view this as a strength of our approach.

      Recommendations for the authors:

      Reviewer #1:

      (1.1) In Figure 1, the data presented by Ling et al. demonstrate a distinctive “comb” pattern. While this pattern diverges from the conventional observations associated with simulated surface growth, it also differs from the simulated volume growth pattern. Is this discrepancy attributable to insufficient data? Alternatively, could the emergence of such a comb-like structure be feasible in scenarios featuring multiple growth centers, wherein clones congregate into spatial clusters?

      We are unsure what you are referring to. One possibility is that you refer to the honey-comb structure formed by the samples of the Ling et al. data shown in Fig. 1A of the main text. This is an artefact arising from the cutting of the histological cut into four quadrants, see Fig. S1 in the SI of Ling et al. The perceived horizontal and vertical “white lines” in our Fig. 1A stem from the lack of samples near the edges of these quadrants. We have added this information to the figure caption.

      An alternative is you are referring to the peaks in Fig 2A of the main text. The three of these peaks indeed stem from individual clones. We have placed additional figures in the SI (S2 B and S2 C) to disentangle the contribution from different clones. The peaks have a simple explanation: each clone contributes the same weight to the histogram. If a clone only has few offspring, this statistical weight is concentrated on a few angles only, see SI Figure S2 B.
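
      The equal-weight-per-clone histogram described in this response can be sketched as follows. This is only an illustration under stated assumptions: the function name, the angle range of 0 to 180 degrees, and the bin count are hypothetical and are not taken from the manuscript or its released code.

```python
from typing import Dict, List, Sequence


def clone_weighted_histogram(
    angles_by_clone: Dict[str, Sequence[float]],
    n_bins: int,
    lo: float = 0.0,     # assumed lower end of the angle range (degrees)
    hi: float = 180.0,   # assumed upper end of the angle range (degrees)
) -> List[float]:
    """Histogram of direction angles in which every clone has total weight 1."""
    hist = [0.0] * n_bins
    width = (hi - lo) / n_bins
    for angles in angles_by_clone.values():
        if not angles:
            continue  # a clone without offspring contributes nothing
        weight = 1.0 / len(angles)  # the clone's unit weight is split over its angles
        for angle in angles:
            if angle < lo or angle > hi:
                raise ValueError("angle outside the histogram range")
            # Clamp so that angle == hi falls into the last bin.
            k = min(int((angle - lo) / width), n_bins - 1)
            hist[k] += weight
    return hist
```

      With this weighting, a clone with only two offspring at the same angle places its whole unit weight in a single bin, which produces a sharp peak, whereas a clone with many offspring spreads the same weight over many bins.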

      (1.2) I am not sure why there are two sections about “Methods” in the main text: Line 50 as well as Line 293. Furthermore, the methods outlined in the main paper lack the essential details necessary for readers to navigate through the critical aspects of their analysis. While these details are provided in the Supplementary Information, they are not adequately referenced within the methods section of the main text. I would recommend that the authors revise the method sections of the main text to include pertinent descriptions of key concepts or innovations, while also directing readers to the corresponding supplementary method section for further elucidation.

      We have merged the Section “Materials and Methods” at the end of the main text with the SI description of the data in SI 4.2 and placed a reference to this material in the main body.

      (1.3) The impact of the particular push method (proposed in the model) on the resultant spatial arrangement of clones remains unclear. For instance, it’s conceivable that employing a different pushing method (for example, with more strict constraints on direction) could yield a varied pattern of spatial diversity. Furthermore, there is ambiguity regarding the criteria for determining the sequence of the queue housing overlapping cells.

      Regarding the off-lattice dynamics, there are indeed many variants one could use. In non-exhaustive trials, we found that the details of the off-lattice dynamics did not affect the results. The reason may be that at each computational step, each cell only moves a very small amount, and differences in the dynamics tend to average out over time.

      We deliberately do not give constraints on the direction. Such constraints emerge in lattice-based models (when preferred directions arise from the lattice symmetry), but these are artifacts of the lattice.

      At cell division, the offspring is placed in a random direction next to the parent, regardless of whether this introduces an overlap. Cells then push each other along the axis connecting their two centers of mass. Unlike in lattice-based models, a sequence of pushes does not propagate through the tumor straight away but sets off a cascade of pushes. Equal pushing of two cells (i.e. two initial displacements as opposed to pushing one of the two) results in the same patterns of directed, low-dispersion surface growth and undirected, high-dispersion volume growth, but is much harder computationally as it reintroduces overlaps that have been resolved in the previous step.
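
      A single push of this kind can be sketched as follows. The sketch makes several assumptions that are not taken from the SI: cells are spheres of equal radius, only the pushed cell moves, and it is displaced just far enough that the two cells touch. The name `push_apart` is hypothetical.

```python
import numpy as np

def push_apart(pusher, pushed, radius):
    """Return the new centre of `pushed` after being pushed by `pusher`.

    The displacement is along the axis connecting the two centres.
    Assumptions of this sketch: equal radii, and the pushed cell ends up
    exactly 2 * radius away from the pusher.
    """
    delta = pushed - pusher
    dist = np.linalg.norm(delta)
    if dist >= 2.0 * radius:
        return pushed  # no overlap, nothing to resolve
    if dist == 0.0:
        # Coincident centres: the axis is undefined, so pick an arbitrary one.
        direction = np.zeros_like(pusher)
        direction[0] = 1.0
    else:
        direction = delta / dist
    return pusher + 2.0 * radius * direction

parent = np.array([0.0, 0.0])
daughter = np.array([0.5, 0.0])  # overlaps the parent for radius 0.5
moved = push_apart(parent, daughter, radius=0.5)
```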

      We have rewritten the description of the pushing queue in SI Section 1. The choice of the pushing sequence is somewhat arbitrary, but we found that it also has no noticeable effect on the growth mode. Contrasting it with a depth-first approach may help to illustrate this: we tried two queueing schemes for iterating through overlapping cells, width-first and depth-first. In both cases, we begin by scanning a given cell’s (the root’s) neighborhood for overlaps and shuffle the list of overlapping neighbours. In a width-first approach we then add this list to the queue. Subsequent iterations append their lists of overlapping cells to the queue, such that we always resolve overlaps within the neighborhood of the root first. A depth-first approach follows a sequence of pushes by immediately checking a pushed cell’s neighborhood for new overlaps and adding these to the front of the queue (which then works more like a stack). This can be efficiently implemented by recursion but has no noticeable performance advantage and results in the same patterns of directed, low-dispersion surface growth and undirected, high-dispersion volume growth. In our opinion, the width-first approach of first resolving overlaps in the immediate neighborhood is more intuitive, which is why we adopted it for our simulation model.
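
      The two queueing schemes can be contrasted in a short sketch. It is an illustration under stated assumptions rather than the simulation code: cells have equal radius, a push moves only the pushed cell until it touches the pusher, a brute-force neighbour search is used, and the names `overlapping`, `resolve_overlaps` and `max_pushes` are hypothetical.

```python
import random
from collections import deque
import numpy as np

def overlapping(cells, i, radius):
    """Indices of cells whose centres are closer than 2 * radius to cell i."""
    d = np.linalg.norm(cells - cells[i], axis=1)
    return [j for j in range(len(cells)) if j != i and d[j] < 2.0 * radius - 1e-12]

def resolve_overlaps(cells, root, radius, depth_first=False, rng=None, max_pushes=10_000):
    """Resolve overlaps starting from `root`; returns the number of pushes.

    width-first: new (pusher, pushed) pairs are appended to the back of the queue.
    depth-first: new pairs go to the front, so the queue behaves like a stack.
    `cells` is modified in place.
    """
    rng = rng or random.Random(0)
    queue = deque()

    def enqueue(i):
        pairs = [(i, j) for j in overlapping(cells, i, radius)]
        rng.shuffle(pairs)                     # random order of overlapping neighbours
        if depth_first:
            queue.extendleft(reversed(pairs))  # keep order, place at the front
        else:
            queue.extend(pairs)

    enqueue(root)
    pushes = 0
    while queue and pushes < max_pushes:
        i, j = queue.popleft()
        delta = cells[j] - cells[i]
        dist = np.linalg.norm(delta)
        if dist >= 2.0 * radius - 1e-12:
            continue                           # already resolved by an earlier push
        direction = delta / dist if dist > 0 else np.eye(cells.shape[1])[0]
        cells[j] = cells[i] + 2.0 * radius * direction
        pushes += 1
        enqueue(j)                             # the pushed cell may overlap others now
    return pushes

start = np.array([[0.0, 0.0], [0.4, 0.0], [1.2, 0.0]])
cells_wf = start.copy()
pushes_wf = resolve_overlaps(cells_wf, root=0, radius=0.5)
cells_df = start.copy()
pushes_df = resolve_overlaps(cells_df, root=0, radius=0.5, depth_first=True)
```

      In this three-cell example both schemes produce the same cascade (cell 0 pushes cell 1, which then pushes cell 2). The order of pushes differs only once a cell has several overlapping neighbours.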

      (1.4) For the example presented in S5.1, how can the author identify from genomic data that mutation 3 does not replace its ancestral clade mutation 2? In other words, if mutation 2, 3 and 4 are linked meaning clone 4 survives but 2 and 3 dies, how does one know if clone 3 dies before clone 2? I understand that this is a conceptual example, but if one cannot identify this situation from the real data, how can the clade turnover be computed?

      Thank you for this comment, which points to an error of ours in the turnover example of the SI: Clade 3 does in fact replace 2 and contributes to the turnover! (The algorithm correctly annotates clade 3 as orphaned and computes a turnover of 3/15 for this example.) We have corrected this.

      In this example, it does not matter for the clade turnover whether clone 3 dies before clone 2. As long as its ancestor (clone 2) becomes extinct, it adds to the clade turnover. The term “replaces” applies to the clade of 3, which has a surviving subclone and thereby eventually replaces clade 2. The clade turnover is solely based on the presence of the mutations (which define their clade) and not on the individual clones.

      (1.5) After reviewing reference 24 (Li et al.), I noticed that the assertions made therein contradict the findings presented in S3 (Mutation Density on Rings). Specifically, Li et al. state that “peripheral regions not only accumulated more mutations, but also contained more changes in genes related to cell proliferation and cell cycle function” (Page 6) and “Phylogenetic trees show that branch lengths vary greatly with the long-branched subclones tending to occur in peripheral regions” (Page 4). However, upon re-analysis of their data, the authors demonstrated a decrease in mutation density near the surface. It is crucial to comprehend the underlying cause of such a disparity.

      The reason for this disparity is the way Li et al. labelled samples as belonging to peripheral or central regions of the tumour. We have added a new figure in the SI to show this: Fig. S14 shows the number of mutations found in samples of Li et al. against their distances from the centre, along with the classification of samples as center/periphery given in Li et al. In the case of tumor T1, the classification of a sample in reference Li et al. does not agree with the distance from the center: samples classified as core are often more distant from the center than those classified as peripheral. Furthermore, Lewinsohn et al. (see below) show in their Fig. 5 that samples classified as ‘center’ by Li all fall into a single clade, and we believe this affects all results derived from this classification. For this reason, we do not consider the classification in reference 24 (Li et al.) further. We now briefly discuss this in Section S3.3.

      (1.6) The authors consider coinciding mutations to occur when offspring clades align with an ancestral clade. Nevertheless, since multiple mutations can arise simultaneously in a single generation (such as kataegis), it becomes essential to discern its impact on clade turnover and, consequently, the estimation of d/b.

      The mutational signatures found here show no sign of kataegis. Also, the number of polymorphic sites in the whole-exome data is small and the mutations are uniformly spread across the exome. The point is well taken, however: the method requires single mutations per generation. In practice, this can be achieved by subsampling a random part of the genome or exome (see [45]). We tested this point by processing the data from only a fraction of the exome; this did not change the results. In particular, Figure S30 shows the turnover-based inference for different subsampling rates L of the Ling et al. data. Subsampling of sites reduces the exome-wide mutation rate, and the inferred rate scales linearly with L, as expected.

      (1.7) I could not understand Step 2 in Section S2.1, an illustration may be helpful.

      We have added figure S2 explaining the directional angle algorithm to Section S2.1 in the supplementary information.

      (1.8) Figure S2, does a large rhoc lead to volume growth rather than surface growth, not the other way around?

      Thank you for catching this mix-up!

      Reviewer #2

      I do have a few minor comments/questions, but I am confident the authors will be able to address them appropriately.

      (2.1) Line 56: I am not sure what the units of “average read depth 74X” is in terms of SI units?

      This number gives the number of sequence reads covering a particular nucleotide and is dimensionless. We have added this information.

      (2.2) Lines 63 - 68: I am unsure what is meant by the terms “T1 of ca.” and “T2 of ca.”. Can these also be explained/defined please?

      These refer to the approximate (circa) diameters of tumor 1 and tumor 2 in the data by Li et al. We have expanded the abbreviations.

      (2.3) Line 69: I would like to see a more extensive description of the cell-based model here in the main text, such as how do the cells move. Moreover, do cells have a finite reach in space, do they have a volume/area?

      We have expanded the model description in the main body of the paper and placed information there that previously was only in the SI.

      (2.4) Line 76: You have said cells can “push” one another in your model. Do they also “pull” one another? Cell adhesion is known to contribute to tumour integrity - so this seems important for a model of this nature.

      We have not implemented adhesive forces between pairs of cells so far. This would cause a higher pressure under cell growth (which can have important physiological consequences). However, the hard potential enforcing a distance between adjacent cells would still lead to cells pushing each other apart under population growth, so we expect to see the dispersion effect we discuss even when there is adhesion.

      (2.5) Line 80-81: “due to lack of nutrient”. Is nutrient included in this work? It is my understanding it is not. No problem if so, it is just that this line makes it seem like it is and important. If it is not, the authors should mention this in the same sentence.

      Thank you for pointing out this source of misunderstanding. Your understanding is correct and we have modified the text to remove the ambiguity.

      (2.6) Line 94-95: Since you are interested in tissue growth, recent work has indicated how the cell boundary (and therefore tissue boundary) description influences growth. Please also be sure to indicate this when you describe the model.

      We presume you refer to the recent paper by Lewinsohn et al. (Nature Ecology and Evolution, 2023), which reports a phylogenetic analysis based on the Li et al. data. Lewinsohn et al. find that cells near the tumour boundary grow significantly faster than those in the tumour’s core. This is at variance with what we find; we were not aware of this paper at the time of submission. We now refer to this paper in the main text, and also have included a new section S3.4 in the SI accounting for this discrepancy. If you refer to a different paper, please let us know.

      Briefly, we repeat the analysis of Lewinsohn et al., using their algorithm on artificial data generated by our model under volume growth. Samples were placed precisely like they were placed in the tumor analyzed by Li et al. We find that, even though the data was generated by volume growth, the algorithm of Lewinsohn et al. finds a signal of surface growth, in many cases even stronger compared to the signal which Lewinsohn et al. find in the empirical data. We have added subsection S3.4 with new figure S15 in the Supplementary Information.

      (2.7) Line 107: “thus no evidence for enhanced cell growth near the edge of the tumour”. It is unclear to me how this tells us information relative to the tumour edge. It seems to me this is an artifact that at the edge of the tumour, there are less cells to compare with? Could you please expand on this a bit?

      The direction angles tell us if new mutations arise predominantly radially outwards. With this observable, surface growth would lead to a non-uniform distribution of these angles even if we restrict the analysis to samples from the interior of the tumor (which, under surface growth, was once near the surface). So the effect is not linked to fewer cells for comparison. Also, we have checked the direction angles in simulations under different growth modes with the samples placed in the same way as in the data (see Figs. S3 and S4 right panels). We have expanded the text in the main text, section Results accordingly.

      (2.8) I really enjoyed the clear explanation between lines 119 and 122 regarding cell dispersion!

      Thank you!

      (2.9) Figure 2B: Since you are looking at a periodic feature in theta, I would have expected the distribution to be periodic too, and therefore equal at theta=-180=180. Can you explain why it is different, please? Interestingly, your simulated data does seem to obey this!

      The distribution of theta is periodic but the binning and midpoints of bins were chosen badly. We have replotted the diagram with bin boundaries that handle the edge-points -180/180 correctly. Thank you for pointing this out.
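
      One way to bin a periodic angle so that -180 and 180 are treated as the same direction is sketched below. This is only an illustration of the binning issue, not the code used to replot Figure 2B: the function name `circular_histogram` and the choice of 12 bins are assumptions.

```python
import numpy as np

def circular_histogram(theta_deg, n_bins=12):
    """Histogram of angles on the circle, with bin edges at -180 and 180.

    Angles are first wrapped into [-180, 180), so that -180 and 180
    (the same direction) always fall into the same bin.
    """
    theta = (np.asarray(theta_deg, dtype=float) + 180.0) % 360.0 - 180.0
    edges = np.linspace(-180.0, 180.0, n_bins + 1)
    counts, _ = np.histogram(theta, bins=edges)
    return counts, edges

counts, edges = circular_histogram([-180.0, 180.0, 179.0, -179.0, 0.0])
```

      Here the angles -180, 180 and -179 all land in the first bin, while 179 lands in the last bin, which is its neighbour on the circle.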

      (2.10) Figure 3B: This plot does not have a title. Also, what do the red vertical lines in plots 3B, 3C and 3D indicate?

      We have added the title. The red lines indicate the expectation values of the distributions.

      (2.11) Figure 4: I am unsure how to read the plot in 4B. Also, what does the y-axis represent in 4C and 4D?

      We have added explanations for 4B and have placed the labels for 4C and 4D in the correct position on the y-axes.

      (2.12) Lines 194-199: you discuss your inferred parameters here, but you do not indicate how you inferred these parameters. May you please briefly mention how you inferred these, please?

      These were inferred using the turnover method explained in the paragraph above; we have expanded the information. A full account is given in the SI Section S5.

      (2.13) Line 258-260: “... mutagen (aristolochic acid) found in herbal traditional Chinese medicine and thought to cause liver cancer.” I do not see what this sentence adds to the work. Could you please be clearer with the claim you are making here?

      Mutational signatures allow one to infer the underlying mutational processes. The strongest signature found in the data is associated with a mutagen that has in the past been used in traditional Chinese medicines. The patients from whom the tumours were biopsied were from China, so past exposure to this potent mutagen is possible. We are not making a big claim here; the mutational signature of aristolochic acid and its carcinogenic nature have been well studied and are referenced here. The result is interesting in our context because in one of the datasets (Li et al.) the signature is present in early (clonal) mutations but absent in later ones, allowing inferences about the past to be made from present data. We have added the information that the patients were from China.

      (2.14) In your Supplementary Information, S1, I believe your summation should not be over i, as you state in the following it is over cells within 7 cell radii. Please fix this by possibly defining a set which are those within 7 cell radii.

      We have done this.

    1. eLife Assessment

      This important study advances our understanding of maladaptive innate immune training. The experimental evidence supporting the conclusions is convincing with only a few clarifications required. The work will be of high interest to both researchers in the trained immunity field and clinician scientists.

    2. Reviewer #1 (Public review):

      Summary:

      The concept that trained immunity, as defined, can be beneficial to subsequent immune challenges is important in the broad context of health and disease. The significance of this manuscript is the finding that trained immunity is actually a two-edged sword, herein, detrimental in the context of LPS-induced Acute Lung Injury that is mediated by AMs.

      Strengths:

      Several lines of evidence in different mouse models support this conclusion. The postulation that differences in immune responses in individuals are linked to differences in the mycobiome and consequent β-glucan makeup is provocative.

      Weaknesses:

      The findings that the authors state are relevant to sepsis, are actually confined to a specific lung injury model and not classically-defined sepsis. In addition, the ontogeny of the reprogrammed AMs is uncertain. Links in the proposed signaling pathways need to be strengthened.

    3. Reviewer #2 (Public review):

      Summary:

      Prével et al. present an in vivo study in which they reveal an interesting aspect of β-glucan, a known inducer of enhanced immune responses termed trained immunity, in sterile inflammation. The authors show that β-glucan can reprogram alveolar macrophages (AMs) in the lungs through neutrophils and IFNγ signaling, independently of Dectin1. This reprogramming occurs at both transcriptional and metabolic levels. After β-glucan training, LPS-induced sterile inflammation exacerbated acute lung injury via enhanced immunopathology. These findings highlight a new aspect of β-glucan's role in trained immunity and its potential detrimental effects when enhanced pathogen clearance is not required.

      Strengths:

      (1) This manuscript is well-written and effectively conveys its message.

      (2) The authors provide important evidence that β-glucan training is not solely beneficial but, depending on the context, can also enhance immunopathology. This will be important to the field for two reasons. It shows again that trained immunity can also be harmful; Jentho et al. 2021 have already provided further evidence for this aspect. And it highlights anew that LPS application is an insufficient infection model.

      Weaknesses:

      (1) Only a little physiological data is provided by the in vivo models.

      (2) The effects in histology appear to be rather weak.

    1. eLife Assessment

      This useful manuscript shows a set of interesting data including the first cryo-EM structures of human PIEZO1 as well as structures of disease-related mutants in complex with the regulatory subunit MDFIC, which generate different inactivation phenotypes. The molecular basis of PIEZO channel inactivation is of great interest due to its association with several pathologies. This manuscript provides some structural insights that may help to ultimately build a molecular picture of PIEZO channel inactivation. While the structures are of use and clear conformational differences can be seen in the presence of the auxiliary subunit MDFIC, the strength of the evidence supporting the conclusions of the paper, especially the proposed role for pore lipids in inactivation, is incomplete and there is a lack of data to support them.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript by Shan, Guo, Zhang, Chen et al. shows a raft of interesting data including the first cryo-EM structures of human PIEZO1. Clearly, the molecular basis of PIEZO channel inactivation is of great interest and as such this manuscript provides some valuable extra information that may help to ultimately build a molecular picture of PIEZO channel inactivation. However, the current manuscript does not provide any compelling evidence for a detailed mechanism of PIEZO inactivation.

      Strengths:

      This manuscript documents the first cryo-EM structures of human PIEZO1 and the gain of function mutants associated with hereditary anaemia. It is also the first evidence showing that PIEZO1 gain of function mutants are also regulated by the auxiliary subunit MDFIC.

      Weaknesses:

      While the structures are interesting and clear differences can be seen in the presence of the auxiliary subunit MDFIC the major conclusions and central tenets of the paper, especially a role for pore lipids in inactivation, lack data to support them. The post-translational modification of PIEZOs auxiliary subunit MDFIC is not modelled as a covalent interaction.

    3. Reviewer #2 (Public review):

      Summary:

      The mechanically activated PIEZO ion channels have been widely studied for their role in mechanosensory processes like touch sensation and red blood cell volume regulation. The in vivo roles of PIEZOs are further exemplified by the presence of gain-of-function (GOF) or loss-of-function (LOF) mutations in humans that lead to disease pathologies. Hereditary xerocytosis (HX) is one such disease, caused by GOF mutations in human PIEZO1 that are characterized by slow inactivation kinetics (inactivation being the ability of a channel to close in the presence of stimulus). But how these mutations alter PIEZO1 inactivation, or even the underlying mechanisms of channel inactivation, remains unknown. Recently, MDFIC (myoblast determination family inhibitor proteins) was shown to directly interact with mouse PIEZO1 as an auxiliary subunit to prolong inactivation and alter gating kinetics. Furthermore, while lipids are known to play a role in the inactivation and gating of other mechanosensitive channels, whether this mechanism is conserved in PIEZO1 is unknown. Thus, the structural basis for the PIEZO1 inactivation mechanism, and whether lipids play a role in this mechanism, represent important outstanding questions in the field and have strong implications for human health and disease.

      To get at these questions, Shan et al. use cryogenic electron microscopy (cryo-EM) to investigate the molecular basis underlying differences in inactivation and gating kinetics of PIEZO1 and human disease-causing PIEZO1 mutations. Notably, the authors provide the first structure of human PIEZO1 (hPIEZO1), which will facilitate future studies in the field. They reveal that hPIEZO1 has a more flattened shape than mouse PIEZO1 (mPIEZO1) and has lipids that insert into the hydrophobic pore region. To understand how PIEZO1 GOF mutations might affect this structure and the underlying mechanistic changes, they solve structures of hPIEZO1 as well as two HX-causing mild GOF mutations (A1988V and E756del) and a severe GOF mutation (R2456H). Unable to glean much information due to the poor resolution of the mutant channels, the authors also attempt to resolve MDFIC-bound structures of the mutants. These structures show that MDFIC inserts into the pore region of hPIEZO1, similar to its interaction with mPIEZO1, and results in a more curved and contracted state than hPIEZO1 on its own. The authors use these structures to hypothesize that differences in curvature and pore lipid position underlie the differences in inactivation kinetics between wild-type hPIEZO1, hPIEZO1 GOF mutations, and hPIEZO1 in complex with MDFIC.

      Strengths:

      This is the first human PIEZO1 structure. Thus, these studies become the stepping stone for future investigations to better understand how disease-causing mutations affect channel gating kinetics.

      Weaknesses:

      Many of the hypotheses made in this manuscript are not substantiated with data and are extrapolated from mid-resolution structures.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, the authors used structural biology approaches to determine the molecular mechanism underlying the inactivation of the PIEZO1 ion channel. To this end, the authors presented structures of human PIEZO1 and its slow-inactivating mutants. The authors also determined the structures of these PIEZO1 constructs in complexes with the auxiliary subunit MDFIC, which substantially slows down PIEZO1 inactivation. From these structures, the authors suggested an anti-correlation between the inactivation kinetics and the resting curvature of PIEZO1 in detergent. The authors also observed a unique feature of human PIEZO1 in which the lipid molecules plugged the channel pore. The authors proposed that these lipid molecules could stabilize human PIEZO1 in a prolonged inactivated state.

      Strengths:

      Notably, this manuscript reported the first structures of a human PIEZO1 channel, its channelopathy mutants, and their complexes with MDFIC. The evidence that lipid molecules could occupy the channel pore of human PIEZO1 is solid. The authors' proposals to correlate PIEZO1 resting curvature and pore-resident lipid molecules with the inactivation kinetics are novel and interesting.

      Weaknesses:

      However, in my opinion, additional evidence is needed to support the authors' proposals.

      (1) The authors determined the apo structure of human PIEZO1, which showed a more flattened architecture than that of the mouse PIEZO1. Functionally, the inactivation kinetics of human PIEZO1 is faster than its mouse counterpart. From this observation (and some subsequent observations such as the complex with MDFIC), the authors proposed the anti-correlation between curvature and inactivation kinetics. However, the comparison between human and mouse PIEZO1 structure might not be justified. For example, the human and mouse structures were determined in different detergent environments, and the choice of detergent could influence the resting curvature of the PIEZO structures.

      (2) Related to point 1), the 3.7 Å structure of the A1988V mutant presented by the authors showed a similar curvature to the WT but has slower inactivation kinetics.

      (3) Related to point 1), the authors stated that human PIEZO1 might not share the same mechanism as mouse PIEZO1 due to its unique properties. For example, MDFIC only modifies the curvature of human PIEZO1, and lipid molecules were only observed in the pore of the human PIEZO1. Therefore, it may not be justified to draw any conclusions by comparing the structures of PIEZO1 from humans and mice.

      (4) Related to point 1), it is well established that PIEZO1 opening is associated with a flattened structure. If the authors' proposal were true, in which a more flattened structure led to faster inactivation, we would have the following prediction: more opening is associated with faster inactivation. In this case, we would expect a pressure-dependent increase in the inactivation kinetics. Could the authors provide such evidence, or provide other evidence along this direction?

      (5) In Figure S2, the authors showed representative experiments of the inactivation kinetics of PIEZO1 using whole-cell poking. However, poking experiments have high cell-to-cell variability. The authors should also show statistics of experiments obtained from multiple cells.

      (6) In Figure 2 and Figure 5, when the authors show the pore diameter, it could be helpful to also show the side chain densities of the pore lining residues.

      (7) The authors observed pore-plugging lipids in slow inactivating conditions such as channelopathy mutations or in complex with MDFIC. The authors propose that these lipid molecules stabilize a "deep resting state" of PIEZO1, making it harder to open and harder to inactivate once opened. This will lead to the prediction that the slow-inactivating conditions will lead to a higher activation threshold, such as the mid-point pressure in the activation curve. Is this true?

    1. eLife Assessment

      This valuable paper seeks to determine the role of endogenous CNS hemoglobin in protecting mitochondrial homeostasis in hypoxia. There is merit in the work, although it remains incomplete as there is a question as to the validity of the hypoxia model as relevant to CNS-specific ischemia/hypoxia that should be considered. In particular, a whole-body hypoxia model may liberate exosomes from other hypoxic organs, which should be addressed by the authors. Overall, this work has the potential to be of broad interest to the neuroscience and hypoxia communities.

    2. Reviewer #1 (Public Review):

      Summary:

      This study investigates the hypoxia rescue mechanisms of neurons by non-neuronal cells in the brain from the perspective of exosomal communication between brain cells. Through multi-omics combined analysis, the authors revealed this phenomenon and logically validated this intercellular rescue mechanism under hypoxic conditions through experiments. The study proposed a novel finding that hemoglobin maintains mitochondrial function, expanding the conventional understanding of hemoglobin. This research is highly innovative, providing new insights for the treatment of hypoxic encephalopathy.

      Overall, the manuscript is well organized and written, however, there are some minor/major points that need to be revised before this manuscript is accepted.

      Major points:

      (1) Hypoxia can induce endothelial cells to release exosomes carrying hemoglobin; however, how are neurons able to actively take up these exosomes? Is it possible for other cells to take up these exosomes as well? This point needs to be clarified in this study.

      (2) The expression of hemoglobin in neurons is important for mitochondrial homeostasis, but its relationship with mitochondrial homeostasis needs to be further elucidated in the study.

    3. Reviewer #2 (Public Review):

      Summary:

      This is an interesting study with a lot of data. Some of these ideas are intriguing. But a few major points require further consideration.

      Major points:

      (1) What disease is this model of whole animal hypoxia supposed to mimic? If one is focused on the brain, can one just use a model of focal or global cerebral ischemia?

      (2) If this model subjects the entire animal to hypoxia, then other organs will also be hypoxic. Should one also detect endothelial upregulation and release of extracellular vesicles containing hemoglobin mRNA in non-CNS organs? Where do these vesicles go? Into blood?

      (3) What other mRNA are contained in the vesicles released from brain endothelial cells?

      (4) Where do the endothelial vesicles go? Only to neurons? Or to other cells as well?

      (5) Neurons can express endogenous hemoglobin. Is it useful to subject neurons to hypoxia and then see how much the endogenous mRNA goes up? How large is the magnitude of endogenous hemoglobin gene upregulation compared to the hypothesized exogenous mRNA that is supposed to be donated from endothelial vesicles?

      (6) Finally, it may be useful to provide more information and data to explain how the expression of this exogenous endothelial-derived hemoglobin binds to neuronal mitochondria to alter function.

    1. eLife Assessment

      This manuscript reports on the effects of a single dose of methamphetamine vs placebo on a probabilistic reversal learning task with different levels of noise, in a large group of young healthy volunteers. The paper is well written and the methods are rigorous. The findings are valuable and have theoretical or practical implications for a subfield. The strength of the evidence is solid, with the methods, data, and analyses broadly supporting the claims with only minor weaknesses.

    2. Reviewer #1 (Public review):

      The authors examine how probabilistic reversal learning is affected by dopamine by studying the effects of methamphetamine (MA) administration. Based on prior evidence that the effects of pharmacological manipulation depend on baseline neurotransmitter levels, they hypothesized that MA would improve learning in people with low baseline performance. They found this effect, and specifically found that MA administration improved learning in noisy blocks, by reducing learning from misleading performance, in participants with lower baseline performance. The authors then fit participants' behavior to a computational learning model and found that an eta parameter, responsible for scaling learning rate based on previously surprising outcomes, differed in participants with low baseline performance on and off MA.

      Questions:

      (1) It would be helpful to confirm that the observed effect of MA on the eta parameter is responsible for better performance in low baseline performers. If performance on the task is simulated for parameters estimated for high and low baseline performers on and off MA, does the simulated behavior capture the main behavioral differences shown in Figure 3?

      (2) In Figure 4C, it appears that the main parameter difference between low and high baseline performance is inverse temperature, not eta. If MA is effective in people with lower baseline DA, why is the effect of MA on eta and not IT?

      Also, this parameter is noted as temperature but appears to be inverse temperature as higher values are related to better performance. The exact model for the choice function is not described in the methods.
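
      For orientation, the distinction between temperature and inverse temperature can be made concrete with a softmax choice rule, which is a common choice function in reinforcement learning models. Whether the authors' model uses this rule is not stated, so the sketch below is an assumption; the function name `softmax_choice_probs` and the example values are hypothetical.

```python
import numpy as np

def softmax_choice_probs(values, inverse_temperature):
    """Choice probabilities under a softmax rule (assumed, not the authors' model).

    A higher inverse temperature makes choices more deterministic in favour
    of the higher-valued option; a higher temperature does the opposite.
    """
    z = inverse_temperature * np.asarray(values, dtype=float)
    z -= z.max()          # subtract the maximum for numerical stability
    p = np.exp(z)
    return p / p.sum()

low_it = softmax_choice_probs([0.6, 0.4], inverse_temperature=1.0)
high_it = softmax_choice_probs([0.6, 0.4], inverse_temperature=10.0)
```

      Under this rule, a participant with a higher inverse temperature picks the better option more consistently, which would fit the observation that higher parameter values go with better performance.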

    3. Reviewer #2 (Public review):

      Summary:

      Kirschner and colleagues test whether methamphetamine (MA) alters learning rate dynamics in a validated reversal learning task. They find evidence that MA can enhance performance for low-performers and that the enhancement reflects a reduction in the degree to which these low-performers dynamically up-regulate their learning rates when they encounter unexpected outcomes. The net effect is that poor performers show more volatile learning rates (e.g. jumping up when they receive misleading feedback), when the environment is actually stable, undermining their performance over trials.

      Strengths:

      The study has multiple strengths including large sample size, placebo control, double-blind randomized design, and rigorous computational modeling of a validated task.

      Weaknesses:

      The limitations, which are acknowledged, include that the drug they use, methamphetamine, can influence multiple neuromodulatory systems including catecholamines and acetylcholine, all of which have been implicated in learning rate dynamics. They also do not have any independent measures of any of these systems, so it is impossible to know which is having an effect.

      Another limitation that the authors should acknowledge is that the fact that participants were aware of having different experiences in the drug sessions means that their blinding was effectively single-blind (to the experimenters) and not double-blind. Relatedly, it is difficult to know whether subjective effects of drugs (e.g. arousal, mood, etc.) might have driven differences in attention, causing performance enhancements in the low-performing group. Do the authors have measures of these subjective effects that they could include as covariates of no interest in their analyses?

    4. Reviewing Editor (Public Review):

      Summary:

      In this well-written paper, a pharmacological experiment is described in which a large group of volunteers is tested on a novel probabilistic reversal learning task with different levels of noise, once after intake of methamphetamine and once after intake of placebo. The design includes a separate baseline session, during which performance is measured. The key result is that drug effects on learning rate variability depend on performance in this separate baseline session.

      The approach and research question are important, the results will have an impact, and the study is executed according to current standards in the field. Strengths include the interventional pharmacological design, the large sample size, the computational modeling, and the use of a reversal-learning task with different levels of noise.

      (i) One novel and valuable feature of the task is the variation of noise (having 70-30 and 80-20 conditions). This nice feature is currently not fully exploited in the modeling of the task and the data. For example, recently reported new modeling approaches for disentangling two types of uncertainty (stochasticity vs volatility) could be usefully leveraged here (by Piray and Daw, 2021, Nat Comm). The current 'signal to noise ratio' analysis that is targeting this issue relies on separately assessing learning rates on true reversals and learning rates after misleading feedback, in a way that is experimenter-driven. As a result, this analysis cannot capture a latent characteristic of the subject's computational capacity.

      (ii) An important caveat is that all the drug x baseline performance interactions, including for the key computational eta parameter, did not reach the statistical threshold and only tended towards significance.

      (iii) Both the overlap and the differences between the current study and previous relevant work (that is, how this goes beyond prior studies in particular Rostami Kandroodi et al, which also assessed effects of catecholaminergic drug administration as a function of baseline task performance using a probabilistic reversal learning task) are not made explicit, particularly in the introduction.

      (iv) In the discussion, it is stated that the existing literature has, to date, overlooked baseline performance effects, but this is not true in the general sense, given that an accumulating number of studies have shown that the effects of drugs like MA depend on baseline performance on working memory tasks, which often but certainly not always correlates positively with performance on the task under study.

    1. eLife Assessment

      This study presents a useful characterisation of the topographical organisation of the human pulvinar, an associative thalamic subregion crucial for visual perception and attention. The evidence supporting the conclusions is solid given the multimodal validation and replication across datasets, although even higher-resolution imaging data would have strengthened the study. The manuscript would also be strengthened by clarifying how the work extends previous assessments of thalamic connectivity and expanding the results with a more digested interpretation of the findings and validation of the segmentation quality. With these components strengthened, the work would be of interest to neuroscientists, neurologists, and neuropsychiatrists working on pulvinar functioning in health and disease.

    2. Reviewer #1 (Public review):

      Summary:

      The current work explored the link between the pulvinar intrinsic organisation and its functional and structural connectivity patterns of the cortex using different dimensional reduction techniques. Overall they find relationships between pulvinar-cortical organization and cortico-cortical organization, and little evidence for clustered organization. Moreover, they investigate PET maps to understand how neurotransmitter/receptor distributions vary within the pulvinar and along its structural and functional connectivity axes.

      Strengths:

      There is a replication dataset and different modalities are compared against each other to understand the structural and functional organisation of the pulvinar complex.

      Weaknesses:

      (1) What is the motivation of the study and how does this work extend previous assessments of the organization of the complete thalamus within the gradient framework?

      (2) Why is the current atlas chosen for the delineation of the pulvinar and individualised maps not considered? Given the size of the pulvinar, more validation of the correctness of the atlas may be helpful.

      (3) Overall the study feels a little incremental and a repetition of what others have done already in the thalamus. It would be good to know how focussing only on the pulvinar changes interpretation, for example by comparing thalamic and pulvinar gradients?

      (4) Could it be that the gradient patterns stem from lacking anatomical and functional resolutions (or low SNR) therefore generating no sharp boundaries?

    3. Reviewer #2 (Public review):

      Summary:

      The authors aimed to explore and better understand the complex topographical organization of the human pulvinar, a brain region crucial for various high-order functions such as perception and attention. They sought to move beyond traditional histological subdivisions by investigating continuous 'gradients' of cortical connections along the dorsoventral and mediolateral axes. Using advanced imaging techniques and a comprehensive PET atlas of neurotransmitter receptors, the study aimed to identify and characterize these gradients in terms of structural connections, functional coactivation, and molecular binding patterns. Ultimately, the authors aimed to provide a more nuanced understanding of pulvinar anatomy and its implications for brain function in both healthy and diseased states.

      Strengths:

      A key strength of this study lies in the authors' effort to comprehensively combine multimodal data, encompassing both functional and structural connectomics, alongside the analysis of major neurotransmitter distributions. This approach enabled a more nuanced understanding of the overarching organizational principles of the pulvinar nucleus within the broader context of whole-brain connectivity. By employing cortex-wide correlation analyses of multimodal embedding patterns derived from 'gradients,' which provide spatial maps reflecting the underlying connectomic and molecular similarities across voxels, the study offers a thorough characterization of the functional neuroanatomy of the pulvinar.

      Weaknesses:

      Despite its strengths, the current manuscript falls short in presenting the authors' unique perspectives on integrating the diverse biological principles derived from the various neuroimaging modalities. The findings are predominantly reported as correlations between different gradient maps, without providing the in-depth interpretations that would allow for a more comprehensive understanding of the pulvinar's role as a central hub in the brain's network. Another limitation of the study is the lack of clarity regarding the application of pulvinar and its subnuclei segmentation maps to individual brains prior to BOLD signal extraction and gradient reconstruction. This omission raises concerns about the precision and reproducibility of the findings, leaving their robustness less transparently evaluable.

    4. Reviewer #3 (Public review):

      Summary of the Study:

      The authors investigate the organization of the human pulvinar by analyzing DWI, fMRI, and PET data. The authors explore the hypothesis of the "replication principle" in the pulvinar.

      Strengths and Weaknesses of the Methods and Results:

      The study effectively integrates diverse imaging modalities to provide a view of the pulvinar's organization. The use of analysis techniques, such as diffusion embedding-driven gradients combined with detailed interpretations of the pulvinar, is a strength.

      Even though the study uses the best publicly available resolution possible with current MR-technology, the pulvinar is densely packed with many cell bodies, requiring even higher spatial resolution. In addition, the model order selection of gradients may vary with the acquired data quality. Therefore, the pulvinar's intricate organization needs further exploration with even higher spatial resolution to capture gradients closer to the biological organization of the pulvinar.

      Appraisal of the Study's Aims and Conclusions:

      The authors delineate the gradient organization of the pulvinar. The study provides a basis for understanding the pulvinar's role in mediating brain network communication.

      Impact and Utility of the Work:

      This work contributes to the field by offering insights into pulvinar organization.

    1. eLife Assessment

      This important study uncovers a surprising link between two self-cleaving RNAs that belong to the same structural family. The evidence supporting the main conclusions is convincing and based on extensive biochemical and bioinformatic analysis. This research will be of broad interest to RNA molecular biologists and biochemists.

    2. Reviewer #1 (Public review):

      Summary:

      The overall analysis and discovery of the common motif is important and exciting. Very few human/primate ribozymes have been published and this manuscript presents a detailed analysis of two of them. The minimized domains appear to be some of the smallest known self-cleaving ribozymes.

      Strengths:

      The manuscript is rooted in deep mutational analysis of the human OR4K15 and LINE1 ribozymes and subsequently in modeling of their active site based on the closely-related core of the TS ribozyme. The experiments support the HTS findings and provide convincing evidence that the ribozymes are structurally related to the core of the TS ribozyme, which has not been found in primates prior to this work.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The overall analysis and discovery of the common motif are important and exciting. Very few human/primate ribozymes have been published and this manuscript presents a relatively detailed analysis of two of them. The minimized domains appear to be some of the smallest known self-cleaving ribozymes.

      Strengths:

      The manuscript is rooted in deep mutational analysis of the OR4K15 and LINE1 and subsequently in modeling of a huge active site based on the closely-related core of the TS ribozyme. The experiments support the HTS findings and provide convincing evidence that the ribozymes are structurally related to the core of the TS ribozyme, which has not been found in primates prior to this work.

      Weaknesses:

      (1) Given that these two ribozymes have not been described outside of a single figure in a Science Supplement, it is important to show their locations in the human genome, present their sequence and structure conservation among various species, particularly primates, and test and discuss the activity of variants found in non-human organisms. Furthermore, OR4K15 exists in three copies on three separate chromosomes in the human genome, with slight variations in the ribozyme sequence. All three of these variants should be tested experimentally and their activity should be presented. A similar analysis should be presented for the naturally-occurring variants of the LINE1 ribozyme. These data are a rich source for comparison with the deep mutagenesis presented here. Inserting a figure (1) that would show the genomic locations, directions, and conservation of these ribozymes and discussing them in light of this new presentation would greatly improve the manuscript. As for the biological roles of known self-cleaving ribozymes in humans, there is a bioRxiv manuscript on the role of the CPEB3 ribozyme in mammalian memory formation (doi.org/10.1101/2023.06.07.543953), and an analysis of the CPEB3 functional conservation throughout mammals (Bendixsen et al. MBE 2021). Furthermore, the authors missed two papers that presented the discovery of human hammerhead ribozymes that reside in introns (by de la Peña and Breaker), which should also be cited. On the other hand, the Clec ribozyme was only found in rodents and not primates and is thus not a human ribozyme and should be noted as such.

      We thank this Reviewer for his/her input and acknowledgment of this work. To improve the manuscript, we have included the genomic locations in Figure 1A, Figure 6A and Figure 6C. And we have tested the activity of representative variants found in the human genome and discussed the activity of the variants in other primates. All suggested publications are now properly cited.

      Line 62-66: It has been shown that single nucleotide polymorphism (SNP) in CPEB3 ribozyme was associated with an enhanced self-cleavage activity along with a poorer episodic memory (14). Inhibition of the highly conserved CPEB3 ribozyme could strengthen hippocampal-dependent long-term memory (15, 16). However, little is known about the other human self-cleaving ribozymes.

      Line 474-501: Homology search of two TS-like ribozymes. To locate close homologs of the two TS-like ribozymes, we performed cmsearch based on a covariance model (38) built on the sequence and secondary structural profiles. In the human genome, we got 1154 and 4 homolog sequences for LINE-1-rbz and OR4K15-rbz, respectively. For OR4K15-rbz, there was an exact match located at the reverse strand of the exon of OR4K15 gene (Figure 6A). The other 3 homologs of OR4K15-rbz belongs to the same olfactory receptor family 4 subfamily K (Figure 6C). However, there was no exact match for LINE-1-rbz (Figure 6A). Interestingly, a total of 1154 LINE-1-rbz homologs were mapped to the LINE-1 retrotransposon according to the RepeatMasker (http://www.repeatmasker.org) annotation. Figure 6B showed the distribution of LINE-1-rbz homologs in different LINE-1 subfamilies in the human genome. Only three subfamilies L1PA7, L1PA8 and L1P3 (L1PA7-9) can be considered as abundant with LINE-1-rbz homologs (>100 homologs per family). The consensus sequences of all homologs obtained are shown in Figure 6D. In order to investigate the self-cleavage activity of these homologs, we mainly focused on the mismatches in the more conserved internal loops. The major differences between the 5 consensus sequences are the mismatches in the first internal loop. The widespread A12C substitution can be found in majority of LINE-1-rbz homologs, this substitution leads to a one-base pair extension of the second stem (P2) but almost no activity (RA’: 0.03) based on our deep mutational scanning result. Then we selected 3 homologs without A12C substitution for LINE-1-rbz for in vitro cleavage assay (Figure 6E). But we didn’t observe significant cleavage activity, this might be caused by GU substitutions in the stem region. For 3 homologs of OR4K15-rbz, we only found one homolog of OR4K15 with pronounced self-cleavage activity (Figure 6F). 
      In addition, we performed a similar bioinformatic search of the TS-like ribozymes in other primate genomes. Similarly, the majority (15 out of 18) of primate genomes have a large number of LINE-1 homologs (>500) and the remaining three have essentially none. However, there was no exact match. Only one homolog has a single mutation (U38C) in the genome assembly of Gibbon (Figure S15). The majority of these homologs have 3 or more mismatches (Figure S15). For OR4K15-rbz, all representative primate genomes contain at least one exact match of the OR4K15-rbz sequence.

      Line 598-602: According to the bioinformatic analysis result, there are some TS-like ribozymes (one LINE-1-rbz homolog in the Gibbon genome, and some OR4K15-rbz homologs) with in vitro cleavage activity in primate genomes. Unlike the more conserved CPEB3 ribozyme which has a clear function, the function of the TS-like ribozymes is not clear, as they are not conserved, belong to the pseudogene or located at the reverse strand.

      (2) The authors present the story as a discovery of a new RNA catalytic motif. This is unfounded. As the authors point out, the catalytic domain is very similar to the Twister Sister (or "TS") ribozyme. In fact, there is no appreciable difference between these and TS ribozymes, except for the missing peripheral domains. For example, the env33 sequence in the Weinberg et al. 2015 NCB paper shows the same sequences in the catalytic core as the LINE1 ribozyme, making the LINE1 ribozyme a TS-like ribozyme in every way, except for the missing peripheral domains. Thus these are not new ribozymes and should not have a new name. A more appropriate name should be TS-like or TS-min ribozymes. Renaming the ribozymes to lanterns is misleading.

      Although we observed some differences in mutational effects, we agree with the reviewer that it is more appropriate to call them TS-like ribozymes. We have replaced all “lantern ribozyme” with “TS-like ribozyme” as suggested.

      (3) In light of 2) the story should be refocused on the fact the authors discovered that the OR4K15 and LINE1 are both TS-like ribozymes. That is very exciting and is the real contribution of this work to the field.

      We thank this Reviewer for their acknowledgement of this work. To improve the manuscript, we have re-named the ribozymes as suggested.

      (4) Given the slow self-scission of the OR4K15 and LINE1 ribozymes, the discussion of the minimal domains should be focused on the role of peripheral domains in full-length TS ribozymes. Peripheral domains have been shown to greatly speed up hammerhead, HDV, and hairpin ribozymes. This is an opportunity to show that the TS ribozymes can do the same and the authors should discuss the contribution of peripheral domains to the ribozyme structure and activity. There is extensive literature on the contribution of a tertiary contact on the speed of self-scission in hammerhead ribozymes, in hairpin ribozyme it's centered on the 4-way junction vs 2-way junction structure, and in HDVs the contribution is through the stability of the J1/2 region, where the stability of the peripheral domain can be directly translated to the catalytic enhancement of the ribozymes.

      We appreciate your question and the valuable suggestions provided. We have included the citations and discussion about the peripheral domains in other ribozymes.

      Line 570-576: Thus, a more sophisticated structure along with long-range interactions involving the SL4 region in the twister sister ribozyme must have helped to stabilize the catalytic region for the improved catalytic activity. Similarly, previous studies have demonstrated that peripheral regions of hammerhead (49), hairpin (50) and HDV (51, 52) ribozymes could greatly increase their self-cleavage activity. Given the importance of the peripheral regions, absence of this tertiary interaction in the TS-like ribozyme may not be able to fully stabilize the structural form generated from homology modelling.

      (5) The argument that these are the smallest self-cleaving ribozymes is debatable. Lünse et al (NAR 2017) found some very small hammerhead ribozymes that are smaller than those presented here, although those authors suggest that they work only as dimers. The human ribozymes described here should be analyzed for dimerization as well (e.g., by native gel analysis) particularly because the authors suggest that there are no peripheral domains that stabilize the fold. Furthermore, Riccitelli et al. (Biochemistry) minimized the HDV-like ribozymes and found some in metagenomic sequences that are about the same size as the ones presented here. Both of these papers should be cited and discussed.

      We apologize for any confusion caused by our previous statement. To clarify, we highlighted “35 and 31 nucleotides only” because 46 and 47 nt contain the variable hairpin loops which are not important for the catalytic activity. By comparing the conserved segments, the TS-like ribozyme discussed in this paper is the shortest with the simplest secondary structure. And we have replaced the terms “smallest” and “shortest” with “simplest” in our manuscript. The title has been changed to “Minimal twister sister (TS)-like self-cleaving ribozymes in the human genome revealed by deep mutational scanning”. All the publications mentioned have been cited and discussed. Regarding possible dimerization, we did not find any evidence but would defer it to future detailed structural analysis to be sure.  

      Line 605-608: Previous studies also have revealed some minimized forms of self-cleaving ribozymes, including hammerhead (19, 53) and HDV-like (54) ribozymes. However, when comparing the conserved segments, they (>= 36 nt) are not as short as the TS-like ribozymes (31 nt) found here.

      (6) The authors present homology modeling of the OR4K15 and LINE1 ribozymes based on the crystal structures of the TS ribozymes. This is another point that supports the fact that these are not new ribozyme motifs. Furthermore, the homology model should be carefully discussed as a model and not a structure. In many places in the text and the supplement, the models are presented as real structures. The wording should be changed to carefully state that these are models based on sequence similarity to TS ribozymes. Fig 3 would benefit from showing the corresponding structures of the TS ribozymes.

      We thank the reviewer for pointing these out and we have already fixed them. We have replaced all “lantern ribozyme” with “TS-like ribozyme” as suggested. The term “Modelled structures” were used for representing the homology model. And we have included the TS ribozyme structure in Fig 3.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript applies a mutational scanning analysis to identify the secondary structure of two previously suggested self-cleaving ribozyme candidates in the human genome. Through this analysis, minimal structured and conserved regions with imminent importance for the ribozyme's activity are suggested and further biochemical evidence for cleavage activity are presented. Additionally, the study reveals a close resemblance of these human ribozyme candidates to the known self-cleaving ribozyme class of twister sister RNAs. Despite the high conservation of the catalytic core between these RNAs, it is suggested that the human ribozyme examples constitute a new ribozyme class. Evidence for this however is not conclusive.

      Strengths:

      The deep mutational scanning performed in this study allowed the elucidation of important regions within the proposed LINE-1 and OR4K15 ribozyme sequences. Part of the ribozyme sequences could be assigned a secondary structure supported by covariation and highly conserved nucleotides were uncovered. This enabled the identification of LINE-1 and OR4K15 core regions that are in essence identical to previously described twister sister self-cleaving RNAs.

      Weaknesses:

      I am skeptical of the claim that the described catalytic RNAs are indeed a new ribozyme class. The studied LINE-1 and OR4K15 ribozymes share striking features with the known twister sister ribozyme class (e.g. Figure 3A) and where there are differences they could be explained by having tested only a partial sequence of the full RNA motif. It appears plausible, that not the entire "functional region" was captured and experimentally assessed by the authors.

      We thank this Reviewer for his/her input and acknowledgment of this work. Because a similar question was raised by reviewer 1, we decided to name the ribozymes TS-like ribozymes. Regarding the entire regions, we conducted mutational scanning experiments at the beginning of this study. The relative activity distributions (Figure 1B, 1C) have shown that only parts of the sequence contribute to the self-cleavage activity. That is the reason why we decided to focus on those parts of the sequence afterwards.

      They identify three twister sister ribozymes by pattern-based similarity searches using RNA-Bob. Also comparing the consensus sequence of the relevant region in twister sister and the two ribozymes in this paper underlines the striking similarity between these RNAs. Given that the authors only assessed partial sequences of LINE-1 and OR4K15, I find it highly plausible that further accessory sequences have been missed that would clearly reveal that "lantern ribozymes" actually belong to the twister sister ribozyme class. This is also the reason I do not find the modeled structural data and biochemical data results convincing, as the differences observed could always be due to some accessory sequences and parts of the ribozyme structure that are missing.

      We thank the reviewer for raising this question. As explained in our previous response, we now call the ribozymes TS-like ribozymes. We also emphasize that the relative activity data of the original sequences have indicated that the other part did not make any contribution to the activity of the ribozyme. The original sequences provided in the Science paper (Salehi-Ashtiani et al. Science 2006) were generated from biochemical selection of the genomic library. That study did not investigate the contribution of each position to the self-cleavage activity.

      Highly conserved nucleotides in the catalytic core, the need for direct contacts to divalent metal ions for catalysis, the preference of Mn2+ or Mg2+ for cleavage, the plateau in observed rate constants at ~100mM Mg2+, are all characteristics that are identical between the proposed lantern ribozymes and the known twister sister class.

      The difference in cleavage speed between twister sister (~5 min-1) and proposed lantern ribozymes could be due to experimental set-up (true single-turnover kinetics?) or could be explained by testing LINE-1 or OR4K15 ribozymes without needed accessory sequences. In the case of the minimal hammerhead ribozyme, it has been previously observed that missing important tertiary contacts can lead to drastically reduced cleavage speeds.

      We thank the reviewer for this question. We now call the ribozymes TS-like ribozymes. As explained in our previous response, the relative activity data of the original sequences have shown that the other part did not make any contribution to the activity of the ribozyme. Moreover, we have tested different enzyme to substrate ratios to achieve single-turnover kinetics (Figure S13). The difference in cleavage speed should be related to the absence of peripheral regions which do not exist in the original sequences of the LINE-1 and OR4K15 ribozyme. We have included the publications and discussion about the peripheral domains in other ribozymes.

      Line 458-463: The kobs of LINE-1-core was ~0.05 min-1 when measured in 10mM MgCl2 and 100mM KCl at pH 7.5 (Figure S13). Furthermore, the single-stranded ribozymes exhibited lower kobs (~0.03 min-1 for LINE-1-rbz) (Figure S14) when comparing with the bimolecular constructs. This confirms that the stem loop region SL2 does not contribute much to the cleavage activity of the TS-like ribozymes.
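      To put the quoted rate constants in perspective, they can be converted into half-lives under a simple first-order (single-exponential) cleavage model. This is a rough sketch only: the single-exponential form, the function names, and the assumption that cleavage goes to completion are illustrative and are not taken from the manuscript's fitting procedure.

```python
import math


def fraction_cleaved(k_obs, t_min, f_max=1.0):
    """Fraction cleaved after t_min minutes, assuming first-order kinetics."""
    # f_max is the assumed plateau (1.0 means cleavage goes to completion).
    return f_max * (1.0 - math.exp(-k_obs * t_min))


def half_life(k_obs):
    """Time in minutes to reach half of the plateau under first-order kinetics."""
    return math.log(2) / k_obs


# Rate constants quoted above (min^-1): bimolecular LINE-1-core and single-stranded LINE-1-rbz.
t_half_core = half_life(0.05)
t_half_rbz = half_life(0.03)
print(round(t_half_core, 1), round(t_half_rbz, 1))
```

      Under these assumptions, the two constructs would take on the order of 14 and 23 minutes respectively to reach half-maximal cleavage.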

      Line 570-576: Thus, a more sophisticated structure along with long-range interactions involving the SL4 region in the twister sister ribozyme must have helped to stabilize the catalytic region for the improved catalytic activity. Similarly, previous studies have demonstrated that peripheral regions of hammerhead (49), hairpin (50) and HDV (51, 52) ribozymes could greatly increase their self-cleavage activity. Given the importance of the peripheral regions, absence of this tertiary interaction in the TS-like ribozyme may not be able to fully stabilize the structural form generated from homology modelling.

      Reviewer #2 (Recommendations For The Authors):

      Major points

      It would have made it easier to connect the comments to text passages if the submitted manuscript had page numbers or even line numbers.

      We thank the reviewer for pointing this out and we have already fixed it.

      In the introduction: "...using the same technique, we located the functional and base-pairing regions of..." The use of the adjective functional is imprecise. Base-paired regions are also important for the function, so what type of region is meant here? Conserved nucleotides?

      We thank the reviewer for pointing this out. We were describing the regions which were essential for the ribozyme activity. And we have defined the use of “functional region” in introduction.

      Line 95: we located the regions essential for the catalytic activities (the functional regions) of LINE-1 and OR4K15 ribozymes in their original sequences.

      In their discussion, the authors mention the possible flaws in their 3D-modelling in the absence of Mg2+. Is it possible to include this divalent metal ion in the calculations?

      We thank the reviewer for this question. Currently, BriQ (Xiong et al. Nature Communications 2021), which we used for modeling, does not include divalent metal ions.

      Xiong, Peng, Ruibo Wu, Jian Zhan, and Yaoqi Zhou. 2021. “Pairing a High-Resolution Statistical Potential with a Nucleobase-Centric Sampling Algorithm for Improving RNA Model Refinement.” Nature Communications 12: 2777. doi:10.1038/s41467-021-23100-4.

      Abstract:

      It is claimed that ribozyme regions of 46 and 47 nt described in the manuscript resemble the shortest known self-cleaving ribozymes. This is not correct. In 1988, hammerhead ribozymes in newts were first discovered that are only 40 nt long.

      We apologize for any confusion caused by our previous statement. To clarify, we highlighted “35 and 31 nucleotides only” as 46 and 47 nt contain the variable hairpin loops which are not important for the catalytic activity. By comparing the conserved segments, the TS-like ribozyme discussed in this paper is the shortest with the simplest secondary structure. And we have replaced the terms “smallest” and “shortest” with “simplest” in our manuscript. The title has been changed to “Minimal TS-like self-cleaving ribozyme revealed by deep mutational scanning”.

      The term "functional region" is, to my knowledge, not a set term when discussing ribozymes. Does it refer to the catalytic core, the cleavage site, the acid and base involved in cleavage, or all, or something else? Therefore, the term should be 1) defined upon its first use in the manuscript and 2) probably not be used in the abstract to avoid confusion to the reader.

      We apologize for any confusion caused by our previous statement. We have replaced the term “functional region” in the abstract and defined it in the introduction.

      Line 34-37: We found that the regions essential for ribozyme activities are made of two short segments, with a total of 35 and 31 nucleotides only. The discovery makes them the simplest known self-cleaving ribozymes. Moreover, the essential regions are circular permutated with two nearly identical catalytic internal loops, supported by two stems of different lengths.

      Line 95: we located the regions essential for the catalytic activities (the functional regions) of LINE-1 and OR4K15 ribozymes in their original sequences.

      The choice of the term "non-functional loop" in the abstract is a bit unfortunate. The loop might not be important for promoting ribozyme catalysis by directly providing, e.g. the acid or base, but it has important structural functions in the natural RNA as part of a hairpin structure.

      We thank the reviewer for pointing this out and we have re-phrased the sentences.

      Line 33-34: We found that the regions essential for ribozyme activities are made of two short segments, with a total of 35 and 31 nucleotides only.

      Line 283: Removing the peripheral loop regions (Figures 1B and 1C) allows us to recognize that the secondary structure of OR4K15-rbz is a circular permutated version of LINE-1-rbz.

      Results:

      Please briefly explain CODA and MC analysis when first mentioned in the results (Figure 1). The more detailed explanation of these terms for Figure 2 could be moved to this part of the results section (including explanations in the figure legend).

      We thank the reviewer for pointing this out and we included a brief explanation.

      Line 150-154: CODA employed Support Vector Regression (SVR) to establish an independent-mutation model and a naive Bayes classifier to separate bases paired from unpaired (26). Moreover, incorporating Monte-Carlo simulated annealing with an energy model and a CODA scoring term (CODA+MC) could further improve the coverage of the regions under-sampled by deep mutations.

      Please indicate the source of the human genomic DNA. Is it a patient sample, what type of tissue, or is it an immortalized cell line? It is not stated in the methods I believe.

      We thank the reviewer for pointing this out. According to the original Science paper (Salehi-Ashtiani et al. Science 2006), the human genomic DNA (isolated from whole blood) was purchased from Clontech (Cat. 6550-1). In our study, we directly employed the sequences provided in Figure S2 of the Science paper for gene synthesis. Thus, we think it is unnecessary to mention the source of genomic DNA in the methods section of our paper.  

      Please also refer to the methods section when the calculation of RA and RA' values is explained in the main text to avoid confusion.

      We thank the reviewer for pointing this out and we have fixed it.

      Line 207-208: Figure 2A shows the distribution of relative activity (RA’, measured in the second round of mutational scanning) (See Methods) of all single mutations

      For OR4K15 it is stated that the deep mutational scanning only revealed two short regions as important. However, there is another region between approx. 124-131 nt and possibly even at positions 47 and 52 (to ~55), that could contribute to effective RNA cleavage, especially given the library design flaws (see below) and the lower mutational coverage for OR4K15. A possible correlation of the mutations in these regions is even visible in the CODA+MC analysis shown in Figure 1D on the left. Why are these regions ignored in ongoing experiments?

      We thank the reviewer for this question. As shown in Table S1, although the double mutation coverage of OR4K15-ori was low (16.2 %), we got 97.6 % coverage of single mutations. The relative activity of these single mutations was enough to identify the conserved regions in this ribozyme. Mutations at the positions mentioned by the reviewer did not lead to large reductions in relative activity. Since the relative activity of the original sequence is 1, we presumed that only positions with average relative activity much lower than 1 might contribute to effective cleavage.

      Regarding the corresponding correlation of mutations in CODA+MC, we consider them false positives generated by Monte Carlo simulated annealing (MC), because they lack support from the relative activity results.
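
      The reasoning above (the original sequence has a relative activity of 1, and only positions whose mutants fall well below 1 are treated as important) can be illustrated with a small sketch. It assumes, purely for illustration, that activity is the fraction of cleaved reads and that relative activity is the variant's fraction divided by the wild-type fraction; the manuscript's Methods define the actual RA and RA’ calculations, and all counts below are hypothetical.

```python
def fraction_cleaved(cleaved, uncleaved):
    """Fraction of reads observed in the cleaved state (illustrative activity measure)."""
    return cleaved / (cleaved + uncleaved)


def relative_activity(variant_counts, wild_type_counts):
    """Variant activity normalised to wild type, so the wild type scores exactly 1."""
    return fraction_cleaved(*variant_counts) / fraction_cleaved(*wild_type_counts)


# Hypothetical (cleaved, uncleaved) read counts, not taken from the manuscript.
wild_type = (800, 200)
tolerated_variant = (760, 240)
disruptive_variant = (100, 900)

ra_wild_type = relative_activity(wild_type, wild_type)
ra_tolerated = relative_activity(tolerated_variant, wild_type)
ra_disruptive = relative_activity(disruptive_variant, wild_type)
```

      Under these assumptions the tolerated variant stays close to 1, while the disruptive variant drops far below it, which is the pattern used to flag a position as contributing to cleavage.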

      Have the authors performed experiments with their "functional regions" in comparison to the full-length RNA or partial truncations of the full-length RNA that included, in the case of OR4K15, nt 47-131? Also for LINE-1 another stem region was mentioned (positions 14-18 with 30-34) and two additional base pairs. Were they included in experiments not shown as part of this manuscript?

      We appreciate the reviewer for raising this question. We only compared the full-length or partial truncations of the LINE-1 ribozyme. Since the secondary structure predicted from OR4K15-ori data was almost the same as LINE-1, we didn’t perform deep mutagenesis on the partial truncation of the OR4K15. However, the secondary structure of OR4K15 was confirmed by further biochemical experiments.   

      Regarding the second question, the additional base pairs were generated by Monte Carlo simulated annealing (MC). They are considered as false positives because of low probabilities and lack of support from the deep mutational scanning results. The appearance of false positives is likely due to the imperfection of the experiment-based energy function employed in current MC simulated annealing. 

      Are there other examples in the literature, where error-prone PCR generates biases towards A/T nucleotides as observed here? Please cite!

      We thank the reviewer for pointing this out and we have included the corresponding citation.

      Line 161-162: The low mutation coverage for OR4K15-ori was due to the mutational bias (27, 28) of error-prone PCR (Supplementary Figures S1, S2, S3 and S4).

      Line 170-171: whose covariations are difficult to capture by error-prone PCR because of mutational biases (27, 28).

      The authors mention that their CODA analysis was based on the relative activities of 45,925 and 72,875 mutation variants. I cannot find these numbers in the supplementary tables. They are far fewer than the read numbers mentioned in Supplementary Table 2. How do these numbers (45,925 and 72,875) arise? Could the authors please briefly explain their selection process?

      We apologize for any confusion caused by our previous statement. Our CODA analysis only utilized variants with no more than 3 mutations. The number listed in the supplementary tables is the total number of the variants. To clarify, we have included a brief explanation for these numbers.

      Line 203-204: We performed the CODA analysis (26) based on the relative activities of 45,925 and 72,875 mutation variants (no more than 3 mutations) obtained for the original sequence and functional region of the LINE-1 ribozyme, respectively.

      What are the reasons the authors assume their findings from LINE-1 can be used to directly infer the structure for OR4K15? (Third section in results, last paragraph)

      We apologize for any confusion caused by our previous statement. We meant to say that the consistency between LINE-1-rbz and LINE-1-ori results suggested that our method for inferring ribozyme structure was reliable. Thus, we employed the same method to infer the structure of the functional region of OR4K15. To clarify, we have re-phrased the sentence.   

      Line 259-261: The consistent result between LINE-1-rbz and LINE-1-ori suggested that reliable ribozyme structures could be inferred by deep mutational scanning. This allowed us to use OR4K15-ori to directly infer the final inferred secondary structure for the functional region of OR4K15.

      There are several occasions where the authors use the differences between the proposed lantern ribozymes and twister sister data as reasons to declare LINE-1 and OR4K15 a new ribozyme class. As mentioned previously, I am not convinced these differences in structure and biochemical results could not simply result from testing incomplete LINE-1 and OR4K15 sequences.

      We apologize for any confusion caused by our previous statement. Although we observed some differences in mutational effects, we agree with the reviewer that they are not sufficient to claim a new ribozyme class. We have replaced all instances of “lantern ribozyme” with “TS-like ribozyme”, as Reviewer 1 suggested.

      The authors state, that "the result confirmed that the stem loop SL2 region in LINE-1 and OR4K15 did not participate in the catalytic activity". To draw such a conclusion a kinetic comparison between a construct that contains SL2 and does not contain SL2 would be necessary. The given data does not suffice to come to this conclusion.

      We appreciate the reviewer for raising this question. To address this, we performed gel-based kinetic analysis of these two ribozymes (Figure S14).

      Line 458-462: The kobs of LINE-1-core under single-turnover condition was ~0.05 min-1 when measured in 10mM MgCl2 and 100mM KCl at pH 7.5 (Figure S13). Only a slightly lower value of  kobs (~0.03 min-1) was observed for LINE-1-rbz (Figure S14). This confirms that the stem loop region SL2 does not contribute to the cleavage activity of the TS-like ribozymes.

      Construct/Library design:

      The last 31 bp in the OR4K15 ribozyme template sequence are duplicated (Supplementary Table 4). Therefore, there are 2 M13 fwd binding sites and several possible primer annealing sites present in this template. This could explain the lower yield for the mutational analysis experiments. Did the authors observe double bands in their PCR and subsequent analysis? The experiments should probably be repeated with a template that does not contain this duplication. Alternatively, the authors should explain, why this template design was chosen for OR4K15.

      We apologize for this mistake during writing. Our construct design for OR4K15 contains only one M13F binding site. We thank the reviewer for pointing this out and we have fixed the error.

      Figure 5B: Where are the bands for the OR4K15 dC-substrate? They are not visible on the gel, so one has to assume there was no substrate added, although the legend indicates otherwise.

      Also this figure, please indicate here or in the methods section what kind of marker was used. In panels A and B, please label the marker lanes.

      We apologize for this mistake and we have repeated the experiment. The marker lane was removed to avoid confusion caused by the inappropriate DNA marker. 

      The authors investigated ribozyme cleavage speeds by measuring the observed rate constants under single-turnover conditions. To achieve single-turnover conditions enzyme has to be used in excess over substrate. Usually, the ratios reported in the literature range between 20:1 (from the authors citation list e.g.: for twister sister (Roth et al 2014) and hatchet (Li et al. 2015)) or even ~100:1 (for pistol: Harris et al 2015, or others https://www.sciencedirect.com/science/article/pii/S0014579305002061). Can the authors please share their experimental evidence that only 5:1 excess of enzyme over the substrate as used in their experiments truly creates single-turnover conditions?

      We greatly appreciate the reviewer for raising this question. To address it, we performed kinetic analysis using different enzyme-to-substrate ratios (Figure S13). The kobs values differed little across ratios; the highest value, 0.048 min-1, was reached with a 100:1 excess of enzyme over substrate.

      Line 458-460: The kobs of LINE-1-core under single-turnover condition was ~0.05 min-1 when measured in 10mM MgCl2 and 100mM KCl at pH 7.5 (Figure S13).
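
      For readers unfamiliar with how an observed rate constant is obtained from a cleavage time course, the sketch below shows one common approach. It assumes a single-exponential model, F(t) = Fmax(1 - exp(-kobs t)), and a known plateau Fmax; the response does not describe the authors' fitting procedure, and the time points, plateau and function names are hypothetical.

```python
import math


def fit_kobs(times_min, fraction_cleaved, f_max):
    """Estimate kobs (min^-1) by least squares through the origin.

    Linearises F(t) = f_max * (1 - exp(-kobs * t)) into
    -ln(1 - F / f_max) = kobs * t and returns the fitted slope.
    """
    numerator = 0.0
    denominator = 0.0
    for t, f in zip(times_min, fraction_cleaved):
        y = -math.log(1.0 - f / f_max)
        numerator += t * y
        denominator += t * t
    return numerator / denominator


# Hypothetical noise-free time course generated with kobs = 0.05 min^-1.
F_MAX = 0.8
times = [5, 10, 20, 40, 80]
fractions = [F_MAX * (1.0 - math.exp(-0.05 * t)) for t in times]

kobs = fit_kobs(times, fractions, F_MAX)
```

      With real gel data the plateau is usually fitted together with kobs by nonlinear regression; the linearised form is used here only to keep the example dependency-free.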

      Citations:

      In the introduction citation number 12 (Roth et al 2014) is mentioned with the CPEB3 ribozyme introduction. This is the wrong citation. Please also insert citations for OR4K15 and IGF1R and LINE-1 ribozyme in this sentence.

      We thank the reviewer for pointing this out and we now have fixed it.

      Also in the introduction, a hammerhead ribozyme in the 3' UTR of Clec2 genes is mentioned and reference 16 (Cervera et al 2014) is given, I think it should be reference 9 (Martick et al 2008)

      We thank the reviewer for pointing this out and we now have fixed it.

      In the results section it is stated that, "original sequences were generated from a randomly fragmented human genomic DNA selection based biochemical experiment" citing reference 12. This is the wrong reference, as I could not find that Roth et al 2014 describe the use of such a technique. The same sentence occurs in the introduction almost verbatim (see also minor points).

      We thank the reviewer for pointing this out and we now have fixed it.

      Minor points

      Headline:

      Either use caps for all nouns in the headline or write "self-cleaving ribozyme" uncapitalized

      We thank the reviewer for pointing this out and we now have fixed it.

      Abstract:

      1st sentence: in "the" human genome

      "Moreover, the above functional regions are..." - the word "above" could be deleted here

      "named as lantern for their shape"- it should be "its shape"

      "in term of sequence and secondary structure"- "in terms"

      "the nucleotides at the cleavage sites" - use singular, each ribozyme of this class has only one cleavage site

      We thank the reviewer for pointing these out and we now have fixed them.

      Introduction:

      Change to "to have dominated early life forms"

      Change to "found in the human genome"

      Please write species names in italics (D. melanogaster, B. mori)

      Please delete "hosting" from "...are in noncoding regions of the hosting genome"

      Please delete the sentence fragment/or turn it into a meaningful sentence: "Selection-based biochemical experiments (12).

      Change to "in terms of sequence and secondary structure, suggesting a more"

      Please reword the last sentence in the introduction to make clear what is referred to by "its", e.g. probably the homology model of lantern ribozyme generated from twister sister ribozymes?

      Please refer to the appropriate methods section when explaining the calculation of RA and RA'.

      We thank the reviewer for pointing these out and we now have fixed them.

      The last sentence of the second paragraph in the second section of the results states that the authors confirmed functional regions for LINE-1 and OR4K15, however, until that point the section only presents data on LINE-1. Therefore, OR4K15 should not be mentioned at the end of this paragraph.

      In response to the reviewer's suggestions, we have removed OR4K15 from this paragraph.

      Line 225-228: The consistency between base pairs inferred from deep mutational scanning of the original sequences and that of the identified functional regions confirmed the correct identification of functional regions for LINE-1 ribozyme.

      Change to "Both ribozymes have two stems (P1, P2), two internal loops ..."

      We thank the reviewer for pointing this out and we now have fixed it.

      The section naming the "functional regions" of LINE-1 and OR4K15 lantern ribozymes should be moved after the section in which the circular permutation is shown and explained. Therefore, the headline of section three should read "Consensus sequence of LINE-1 and OR4K15 ribozymes" or something along these lines.

      We thank the reviewer for pointing this out and we now have fixed it.

      Line 308-309: Given the identical lantern-shaped regions of the LINE-1-rbz and OR4K15-rbz ribozyme, we named them twister sister-like (TS-like) ribozymes.

      The statement on the difference between C8 in OR4K15 and U38 in LINE-1 should be further classified. As U38 is only 95% conserved. Is it a C in those other instances or do all other nucleotide possibilities occur? Is the high conservation in OR4K15 an "artifact" of the low mutation rate for this RNA in the deep mutational scanning?

      We thank the reviewer for this question. Yes, the high conservation in OR4K15 is an "artifact" of the low mutation rate for this RNA in the deep mutational scanning. That is why the RA’ value is more appropriate for describing the conservation level of each position. We also mention this in the manuscript:

      Line 287-288: The only mismatch U38C in L1 has the RA’ of 0.6, suggesting that the mismatch is not disruptive to the functional structure of the ribozyme.

      Section five, first paragraph: instead of "two-stranded LINE-1 core" use the term "bimolecular", as it is more commonly used.

      We thank the reviewer for pointing this out and we now have changed it.

      Figure caption 3 headline states "Homology modelled 3D structure..."but it also shows the secondary structures of LINE1, OR4K15 and twister sister examples.

      We thank the reviewer for pointing this out and we now have removed “3D”.

      In Figure 3C, we see a nucleobase labeled G37, however in the secondary structure and sequence and 3D structural model there is a C37 at this position. Please correct the labeling.

      We thank the reviewer for pointing this out and we now have fixed it.

      Section 7 "To address the above question..." please just repeat the question you want to address to avoid any confusion to the reader.

      We thank the reviewer for pointing this out and we have re-phrased this sentence.

      Line 364: Considering the high similarity of the internal loops, we further investigated the mutational effects on the internal loop L1s.

      Please rephrase the sentence "By comparison, mutations of C62 (...) at the cleavage site did not make a major change on the cleavage activity...", e.g. "did not lead to a major change" etc.

      Section 8, first paragraph: "This result further confirms that the RNA cleavage in lantern...", please delete "further"

      Change to "analogous RNAs that lacked the 2' oxygen atom in the -1 nucleotide"

      Methods

      Change to "We counted the number of reads of the cleaved and uncleaved..."

      Change to "...to produce enough DNA template for in vitro transcription."

      Change to "The DNA template used for transcription was used..." (delete while)

      We thank the reviewer for pointing these out and we now have fixed them.

      Supplement

      All supplementary figures could use more detailed Figure legends. They should be self-explanatory.

      Fig S1/S2: how is "mutation rate" defined/calculated?

      We thank the reviewer for pointing this out and we now have added a short explanation. The mutation rate was calculated as the proportion of mutations observed at each position for the DNA-seq library.
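
      The definition given above (the proportion of mutations observed at each position of the DNA-seq library) can be sketched as follows. The example assumes reads that are already aligned and have the same length as the reference; the sequences and the function name are hypothetical and are not taken from the manuscript.

```python
def per_position_mutation_rate(reference, reads):
    """Return, for each position, the fraction of reads that differ from the reference."""
    mismatch_counts = [0] * len(reference)
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("reads must be aligned to the full reference length")
        for position, (ref_base, read_base) in enumerate(zip(reference, read)):
            if read_base != ref_base:
                mismatch_counts[position] += 1
    return [count / len(reads) for count in mismatch_counts]


# Hypothetical 4-nt reference and four aligned reads.
reference = "GACT"
reads = ["GACT", "GAAT", "GACT", "TACT"]

rates = per_position_mutation_rate(reference, reads)
```

      Here one of four reads is mutated at the first position and one at the third, so those positions have a rate of 0.25 and the others a rate of 0.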

      Fig S3/S4: axis label "fraction", fraction of what? How calculated?

      We thank the reviewer for pointing this out and we now have added a short explanation. The Y axis “fraction” represents the ratio of each mutation type observed in all variants.

      Fig S5: RA and RA' are mentioned in the main text and methods, but should be briefly explained again here, or it should be clearly referred to the methods. Also, the axis label could be read as average RA' divided by average RA. I assume that is not the case. I assume I am looking at RA' values for LINE-1 rbz and RA values for LINE-1-ori? Also, mention that only part of the full LINE-1-ori sequence is shown...

      We thank the reviewer for pointing this out and we have now added a short explanation. The Y axis represents RA’ for LINE-1-rbz, or RA for LINE-1-ori. The part shown is the overlap region between LINE-1-rbz and LINE-1-ori. We apologize for any confusion caused by our previous statement.

      Fig S9 the magenta for coloring of the scissile phosphate is hard to see and immediately make out.

      We thank the reviewer for pointing this out and we now have added a label to the scissile phosphate.

      Fig S10: Why do the authors only show one product band here? Instead of both cleavage fragments as in Figure 5?

      We thank the reviewer for this question. We purposely used two fluorophores (5’ 6-FAM, 3’ TAMRA) to show the two product bands in Figure 5. In Fig S10, a long incubation was used to distinguish catalysis-based self-cleavage from RNA degradation. This figure was prepared before we purchased the substrate used in Figure 5. The substrate strand used in Fig S10 carries only one fluorophore modification (5’ 6-FAM), and the other product was too short to be visualized by SYBR Gold staining.

      Fig S13: please indicate meaning of colors in the legend (what is pink, blue, grey etc.)

      Please change to "RtcB ligase was used to capture the 3' fragment after cleavage...."

      We thank the reviewer for pointing this out and we now have fixed it.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Materials and Methods section:

      Cell gating and FACS sorting strategies need to be explained. There is no figure legend of supplementary figure 4 which is supposed to explain the gating strategy. Please detail the strategy for each cell types.

      Thank you for your suggestion. We have given a detailed description of the gating and FACS sorting strategies for the different liver cell types in supplementary figure 1. In addition, flow cytometry plots of CD45+Ly6C-CD64+F4/80+ KCs from Bmp9fl/flBmp10fl/flLrat Cre mice are also presented in supplementary figure 1.

      The genetic background of the different mouse strains and the age of the mice should be noted on each figure.

      All the mice used in our study are on a C57BL/6 background (Methods section). The age of the mice is given in each figure.

      The Mann-Whitney test, instead of the two-tailed Student's t-test, should be used for the different statistical analyses. Why are the expression counts statistically analyzed by a two-tailed Student's t-test when they were already identified as DE in the RNAseq statistical analysis?

      Thank you for your suggestion. The statistical methods have been corrected in the revised manuscript.
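
      As a reminder of what the requested test computes, the sketch below implements the Mann-Whitney U statistic and an exact two-sided p-value by enumerating every way of splitting the pooled values into two groups. It is a minimal illustration for small samples, not the authors' analysis; the group values are hypothetical.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for x: number of (xi, yj) pairs with xi > yj, ties counted as 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def exact_two_sided_p(x, y):
    """Exact two-sided p-value from all relabellings of the pooled observations."""
    pooled = list(x) + list(y)
    n_x = len(x)
    mean_u = n_x * len(y) / 2.0
    observed = abs(mann_whitney_u(x, y) - mean_u)
    hits = 0
    total = 0
    for indices in combinations(range(len(pooled)), n_x):
        chosen = set(indices)
        group_x = [pooled[i] for i in indices]
        group_y = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count relabellings at least as extreme as the observed split.
        if abs(mann_whitney_u(group_x, group_y) - mean_u) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical measurements, three animals per group.
control = [10.1, 11.4, 9.8]
knockout = [4.2, 5.0, 3.9]

u_stat = mann_whitney_u(control, knockout)
p_value = exact_two_sided_p(control, knockout)
```

      In this example the groups are completely separated, giving U = 9 and an exact two-sided p-value of 2/20 = 0.1, which is the smallest value attainable with three observations per group.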

      What is the age of the mice and how many are used for each bulk RNAseq?

      This information has been added on the corresponding figure legends.

      Figure 1:

      Figure 1a and c: The qPCR data would be much more interesting if presented as DDct and not as relative value as we do not see the mRNA levels of BMP9 and BMP10 in each Bmp9fl/flBmp10fl/flCre mouse. This would allow to compare the mRNA level of BMP9 versus BMP10. This should be changed in all figures.

      The presentation of the qPCR data in Figure 1a has been changed to allow comparison of the abundance of BMP9 versus BMP10 mRNA. Figure 1c shows only the expression of BMP10, so it is unnecessary to present those qPCR data as DDct. In our bulk RNA sequencing data from liver tissues, we found that BMP9 expression counts are higher than those of BMP10, in line with the data from BioGPS.
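
      The DDct presentation requested by the reviewer refers to the standard 2^(-ΔΔCt) calculation, sketched below. The example assumes roughly 100% amplification efficiency and a single reference gene; all Ct values are hypothetical and the function names are illustrative.

```python
def delta_ct(ct_target, ct_reference):
    """Ct of the target gene normalised to a reference (housekeeping) gene."""
    return ct_target - ct_reference


def fold_change(delta_ct_sample, delta_ct_control):
    """Relative expression 2^(-ddCt) of the sample versus the control."""
    return 2.0 ** -(delta_ct_sample - delta_ct_control)


# Hypothetical Ct values for one target gene and one reference gene.
dct_control = delta_ct(ct_target=24.0, ct_reference=18.0)   # dCt = 6
dct_knockout = delta_ct(ct_target=27.0, ct_reference=18.0)  # dCt = 9

relative_expression = fold_change(dct_knockout, dct_control)
```

      A ΔΔCt of 3 corresponds to an eightfold reduction (0.125). Comparing the ΔCt values of two different transcripts in the same sample is what allows their relative abundance to be judged, subject to similar primer efficiencies.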

      Figure 1e (IF) and f (FACS), the quantification of these data should be added as shown in Fig2d. What is the difference between Fig1e and Fig2d as they both seem to show the quantification of F4/80 in CTL versus Bmp9fl/flBmp10fl/flLratCre mice. Are the cells sorted in Fig1f and 1e and suppl Fig1b? If yes, please specify the strategy. If they are not gated how can the authors obtain 93% of KC? The reference Tillet et al., JBC 2018 should be added in the discussion of figure 1 as it is the first description of BMP10 in HSC.

      The quantitative data for Figures 1e and 1f have been added to our revised manuscript. Unlike other tissue-resident macrophages, KCs exclusively express CLEC4F, a KC-specific marker. In our previous report (PMID: 34874921), we demonstrated that BMP9/10-ALK1 signaling induces the expression of CLEC4F. The data shown in Figure 1e reproduce this phenotype: upon loss of BMP9/10-ALK1 signaling, liver macrophages did not express CLEC4F. F4/80 in Figure 1e was used as an internal positive control. Fig2d shows the quantification of F4/80 and CD64, two pan-macrophage markers, which is a more accurate measure of the number of liver macrophages, especially given that the F4/80 mean fluorescence intensity was reduced in liver macrophages of Bmp9fl/flBmp10fl/flLrat Cre mice. Cells in Fig1f, 1e and suppl Fig1b were not sorted, and the flow cytometry plots of these cells were pre-gated on live CD45+Ly6C-CD64+F4/80+ liver macrophages. The reference Tillet et al., JBC 2018 has been added to our revised manuscript.

      Supplementary 4 should have a detailed figure legend and should appear before gating experiments. What cell subtype is used for each cell type gating? Please add the exact references of all the antibodies used and if they are fluorescently labeled antibodies. Why is the number of lymphocytes noted and how is it calculated? The gating strategy for the Bmp9fl/flBmp10fl/flLratCre mice should also be shown as the number of F4/80+ and Tim4+ cells are decreased.

      A detailed figure legend has been added to the original supplementary figure 4, which has been moved to supplementary figure 1 in our revised manuscript. The antibodies used in our study were also used in our previous report (PMID: 34874921) and by others (PMID: 31561945; PMID: 26813785). The “lymphocytes” label and number appear automatically on the plots when we analyze flow cytometry data; they do not mean that the selected cells are lymphocytes. To avoid misunderstanding, these words have been deleted. The gating strategy for CD45+Ly6C-CD64+F4/80+ liver macrophages in the Bmp9fl/flBmp10fl/flLrat Cre mice is shown in our revised manuscript (Supplementary Figure 1).

      Figure 2:

      Figure 2a: How many mice were used for bulk RNAseq at what age? Please describe the gating strategy for sorting liver macrophages. The PCA should be shown. The genes represented in Fig2c and cited in the text should be shown on the volcano plot and the heatmap (Timd4, Cdh5, Cd5l). A reference for these KC and monocytic markers should be added in the text.

      Control and Bmp9fl/flBmp10fl/flLrat Cre mice at the age of 8-10 weeks (n=3/group) were used for bulk RNAseq. This information has been added in Figure 2a legend. The PCA, Timd4 gene and references for these KC and monocytic markers have been shown in our revised manuscript according to your suggestion.

      Figure 2b: How were the genes represented in the heatmap selected? The top ones? If it is a KC signature the authors should give a reference for this signature.

      These genes were KC signature genes. The reference (PMID: 30076102) has been given in our revised manuscript.

      Fig2e: Please explain what is the Vav1 promoter and in which cells it will delete Alk1and Smad4? The authors also need to show that Alk1 and Smad4 are indeed deleted in these mice and in which cell subtype (EC and KC?). This is an important point as the authors conclude that other molecular mechanisms than Smad4 signaling may affect the phenotypes of liver macrophages in Bmp9fl/flBmp10fl/flLratCre.

      Cre recombinase in Vav1Cre mice is expressed at high levels in hematopoietic stem cells (PMID: 27185381). This strain is widely used to target all hematopoietic cells with high efficiency (PMID: 24857755). In our previous report (PMID: 34874921), we demonstrated that Alk1 (Supplemental Figure 6A) and Smad4 (Supplemental Figure 6G) were efficiently deleted in KCs from Alk1fl/flVav1Cre and Smad4fl/flVav1Cre mice, respectively. This sentence and reference have been added to our revised manuscript. Homozygous loss of ALK1 causes embryonic lethality due to aberrant angiogenesis (PMID: 28213819). EC-specific ALK1 knockout in the mouse, through deletion of the ALK1 gene from an Acvrl12loxP allele with the EC-specific L1-Cre line, results in postnatal lethality at P5, with mice exhibiting hemorrhaging in the brain, lung, and gastrointestinal tract (PMID: 19805914). In contrast, the Alk1fl/flVav1Cre mice generated in our lab did not show this phenotype or body weight loss, and were still alive at 16 weeks of age. Thus, we do not think that ECs are targeted by the Vav1Cre strain, at least in our experimental system.

      Supl Figure 3 (revised Supl Figure 4): The authors need to explain what cell types are affected by Csf1r-Cre and Clec4fDTR. Have the authors tried to perform a similar experiment in Bmp9fl/flBmp10fl/flLratCre? The legend of the Y axis is not clear, why is CD45+ used in the first bar graph while the other two graphs use F4/80+?

      We (PMID: 34874921) and others (PMID: 31587991; PMID: 31561945; PMID: 26813785) have demonstrated that Clec4f is specifically expressed on KCs, and thus only KCs are depleted in Clec4fDTR mice after DT injection. CSF1R, also known as macrophage colony-stimulating factor receptor (M-CSFR), is the receptor for CSF1, the major monocyte/macrophage lineage differentiation factor. Thus, the Csf1r-Cre strain can target monocytes, monocyte-derived macrophages and tissue-resident macrophages, including those of the liver, spleen, intestine, heart, kidney, and muscle, with high efficiency (PMID: 29761406). We did not perform a similar experiment in Bmp9fl/flBmp10fl/flLrat Cre mice, as we have demonstrated that the differentiation of liver macrophages in these mice is inhibited. The other two graphs in Supl Figure 4C were derived from Supl Figure 4B. The flow cytometry plots in Supl Figure 4B are pre-gated on CD45+Ly6C-CD64+F4/80+ liver macrophages, so it is appropriate to use F4/80+ as an internal control.

      Figure 3: Same remarks as in Figure 2. How many mice were used for bulk RNAseq, at what age? The PCA should be shown. How were the genes represented in the heatmap selected? The top ones? A reference should be given for the sinusoidal EC and the continuous EC signatures and large artery signature. Maf and Gata4 should be shown on the volcano plot. A quantification for CD34 IF (Fig3e) as well as for the quantification of the FACS data (Fig 3f) should be added.

      Control and Bmp9fl/flBmp10fl/flLrat Cre mice at the age of 8-10 weeks (n=3/group) were used for bulk RNAseq. According to your suggestion, other revisions have been made.

      Figure 4: A quantification and statistical analysis of Prussian staining area and GS IF should be added not just number of mice which were affected.

      A quantification and statistical analysis of Prussian staining area and GS IF has been added.

      Minor points:

      Few spelling mistakes that should be checked.

      Figure 5a, some bar graphs are missing.

      Spelling mistakes and missing bar graphs in Figure 5a have been corrected.

      Reviewer #2 (Recommendations For The Authors):

      The authors should provide some additional information:

      - Did the single HSC-KO mice for either BMP9 or BMP10 already show partial phenotypes?

      We think that under steady state, the phenotype of KCs and ECs, described in our manuscript, in the livers of single HSC-KO mice for either BMP9 or BMP10 was not altered. However, we don’t know whether the role of BMP9 and BMP10 is still redundant in liver diseases or inflammation, which is worth further studying.

      - The authors should also stain Endomucin, Lyve1, CD32b on liver tissue to assess endothelial zonation/differentiation in addition to FACS analysis.

      In our revised manuscript, we performed immunostaining for Endomucin and Lyve1 and found increased expression of Endomucin and decreased expression of Lyve1 (Figure 3g), suggesting that endothelial zonation/differentiation was disrupted in the liver of Bmp9fl/flBmp10fl/flLrat Cre mice compared to their littermates. We did not stain for CD32b in liver sections, as there is no good antibody against mouse CD32b for frozen sections.

      - Did the authors assess BMP9/BMP10 effects individually and combined in vitro on KC and EC? Are these likely only direct effects or may they also involve each other (i.e. also cross talk between KC and EC in response to BMP9/10?). This could be assessed in co-culture models.

      Using ALK1 reporter mice, we demonstrated that KCs and liver ECs express ALK1. We and others have shown that in vitro stimulation with BMP9/BMP10 can induce the expression of ID1/ID3 and GATA4/Maf in KCs and ECs (PMID: 34874921; PMID: 35364013; PMID: 30964206), respectively. These results suggest that BMP9/BMP10 can act directly on KCs and ECs. We are also interested in the crosstalk between KCs and ECs. However, an in vitro coculture system cannot mimic the interaction between KCs and ECs in the liver, as these cells lose their identity upon isolation from the liver environment. Nevertheless, Bonnardel et al. applied NicheNet bioinformatic analysis to predict that liver ECs provide anchoring sites as well as Notch and CSF1 signals for KCs (PMID: 31561945). Of course, this prediction still needs experimental validation.

      - The abstract should be rephrased and more specific focus on BMP related intercellular crosstalk in the liver and its implications for liver health and disease. At the end of the abstract they should also emphasize for which specific fields/topics/diseases these findings are important.

      Thank you for your suggestion. The abstract has been rephrased, and we hope the revised version is satisfactory.

    2. eLife Assessment

      This valuable study delineates the cellular contributions of BMP signaling in liver development and function. The findings are convincing, and the study employs state-of-the-art molecular, genetic, and cellular approaches to demonstrate that hepatic stellate cells play a central role in liver health by mediating cell-to-cell crosstalk via the production of specific BMP proteins. This study will be of interest to scientists interested in developmental biology and organ physiology.

    3. Reviewer #1 (Public review):

      Summary:

      The aim of the present work is to evaluate the role of BMP9 and BMP10 in liver by depleting Bmp9 and Bmp10 from the main liver cell types (endothelial cells (EC), hepatic stellate cells (HSC), Kupffer cells (KC) and hepatocytes (H)) using cell-specific cre recombinases. They show that HSCs are the main source of BMP9 and BMP10 in the liver. Using transgenic ALK1 reporter mice, they show that ALK1, the high affinity type 1 receptor for BMP9 and BMP10, is expressed on KC and EC. They have also performed bulk RNAseq analyses on whole liver, and cell-sorted EC and KC, and showed that loss of Bmp9 and Bmp10 decreased KC signature and that KC are replaced by monocyte-derived macrophages. EC derived from these Bmp9fl/flBmp10fl/flLratCre mice also lost their identity and transdifferentiated into continuous ECs. Liver iron metabolism and metabolic zonation were also affected in these mice. In conclusion, this work supports that BMP9 and BMP10 produced by HSC play a central role in mediating liver cell-cell crosstalk and liver homeostasis.

      Strengths:

      This work further supports the role of BMP9 and BMP10 in liver homeostasis. Using a specific HSC-Cre recombinase, the authors show for the first time that it is the BMP9 and BMP10 produced by HSC that play a central role in mediating liver cell-cell crosstalk to maintain a healthy liver. Although the overall message of the key role of BMP9 in liver homeostasis has been described by several groups, the role of hepatic BMP10 has not been studied before. Thus, one of the novelties of this work is the use of liver cell-specific Cre recombinases to delete hepatic Bmp9 and Bmp10. The second novelty is the demonstration of the role of BMP9 and BMP10 in KC differentiation/homeostasis, which this group has already partly addressed by knocking out ALK1, the high affinity receptor of BMP9 and BMP10 (Zhao et al. JCI, 2022).

      Weaknesses:

      This work remains rather descriptive and the molecular mechanisms are barely touched upon and could have been more explored.

    1. eLife Assessment

      This study offers important insights into the generation and maintenance of monosomic yeast lines and is, to our knowledge, the first to evaluate gene expression in yeast monosomies. The research introduces an innovative method to assess epistasis between genes on the same chromosome, providing solid evidence for positive epistatic interactions affecting fitness. Although the authors have substantially improved the methodology and interpretation during revision, questions regarding the interpretation of the transcriptome data have not been completely addressed.

    2. Reviewer #1 (Public review):

      The study by Korona and colleagues presents a rigorous experimental strategy for generating and maintaining a nearly complete set of monosomic yeast lines, thereby establishing a new standard for studying monosomes. Their careful approach in generating and handling monosome yeast lines, coupled with their use of high-throughput DNA sequencing and RNA sequencing, addresses concerns related to genomic instability and is commendable. However, I would like to express my concerns regarding the second part of the study, particularly the calculation of epistasis and the conclusion that vast positive epistatic effects have been observed. I believe that the conclusion of positive epistasis for fitness might be premature due to potential errors in estimating the expected fitness.

      The method used to calculate fitness expectation (1 + sum(di), where di = rDRi - 1) may be inappropriate. The logarithm transformation mentioned by the authors is designed to transform the exponential growth curve into a linear relation for estimating doubling rate, and thus the fitness expectation should be calculated as the product of rDRi values. As an illustration, if gene A exhibits a 20% reduction in fitness when halved (A/-) and gene B exhibits a 30% reduction (B/-), the expected fitness of A/- B/- should be 56%, rather than the 50% estimated in the study. In other words, the formula used by the authors could underestimate the fitness expectation.

      This issue is evident in Figure 2b, where negative values were obtained due to the use of an incorrect formula for estimating fitness expectations. It is worth noting that Figure 2a shows rDR values around one, indicating that no further logarithmic transformation was applied.
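
      The difference between the two expectations can be checked with a short calculation. The sketch below is only an illustration of the reviewer's arithmetic, not the authors' analysis code: the function names are hypothetical, the ten-gene example uses invented values, and epistasis is assumed to be observed minus expected fitness.

```python
from math import prod


def expected_additive(rdr_values):
    """Additive expectation as quoted in the review: 1 + sum(d_i), d_i = rDR_i - 1."""
    return 1 + sum(r - 1 for r in rdr_values)


def expected_multiplicative(rdr_values):
    """Multiplicative expectation: product of the individual rDR_i values."""
    return prod(rdr_values)


def epistasis(observed, expected):
    """Epistasis taken here as observed minus expected fitness (an assumed convention)."""
    return observed - expected


# The reviewer's two-gene example: A/- at 0.8 and B/- at 0.7 of wild-type fitness.
two_genes = [0.8, 0.7]
add_two = expected_additive(two_genes)
mul_two = expected_multiplicative(two_genes)

# Hypothetical chromosome with ten haploinsufficient genes, each at 0.85.
ten_genes = [0.85] * 10
add_ten = expected_additive(ten_genes)
mul_ten = expected_multiplicative(ten_genes)

print(f"two genes: additive={add_two:.2f}, multiplicative={mul_two:.2f}")
print(f"ten genes: additive={add_ten:.2f}, multiplicative={mul_ten:.2f}")
```

      With many loci the additive expectation can fall below zero while the multiplicative one stays positive, so the same observed fitness yields a larger apparent positive epistasis under the additive formula.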

      While widespread positive epistasis in yeast has been reported by other studies (e.g., doi: 10.1038/ng.524, but not to the extent reported in this study), the conclusion of the current study might not be sufficiently supported. I recommend that the authors revisit their calculation methods to provide a more convincing conclusion on the presence of positive epistasis for fitness in their dataset. Overall, I appreciate the authors' efforts in this study but believe that addressing these concerns is essential for strengthening the validity of their findings.

      Comments on revised version:

      The authors have adequately addressed all my previous concerns during revision.

    3. Reviewer #2 (Public review):

      This study examines monosomies in yeast in comparison to synthetic lethals resulting from combinations of heterozygous gene deletions that individually have a detrimental effect. The survival of monosomies, albeit with detrimental growth defects, is interpreted as positive epistasis for fitness. Gene expression was examined in monosomies in an attempt to gain insight into why monosomies can survive when multiple heterozygous deletions on the respective chromosome do not. In the RNAseq experiments, many genes were interpreted to be increased in expression and some were interpreted as reduced. Those with the apparent strongest increase were the subunits of the ribosome and those with the apparent strongest decreases were subunits of the proteasome.

      The initiation and interpretation of the results were apparently performed in a vacuum of a century of work on genomic balance. Classical work in the flowering plant Datura and in Drosophila found that changes in chromosomal dosage would modulate phenotypes in a dosage sensitive manner (for references see Birchler and Veitia, 2021, Cytogenetics and Genome Research 161: 529-550). In terms of molecular studies, the most common modulation across the genome for monosomies is an upregulation (Guo and Birchler, Science 266: 1999-2002; Shi et al. 2021, The Plant Cell 33: 917-939).

      It was also apparently performed in a vacuum of results of evolutionary genomics that indicate the classes of genes for which dosage causes fitness consequences. It was from yeast genomics that it was realized that there is a difference in the fate of duplicate genes that are members of molecular complexes following whole genome duplications (WGD) versus small segmental duplications (SSD) with longer retention times from WGD than other genes and an underrepresentation in small scale duplications (e.g. Papp et al. 2003, Nature 424: 194-197; Hakes et al 2007, Genome Biol 8: R209). This pattern arises from negative fitness consequences of deletion of some but not all members of a complex after WGD or the overexpression of individual subunits after SSD (Defoort et al., 2019 Genome Biol Evol 11: 2292-2305; Shi et al., 2020, Mol Biol Evol 37: 2394-2413). In order for this pattern to occur, there must be a reasonably close relationship between mRNA and the respective protein levels. This pattern of retention and underrepresentation has been found throughout eukaryotes (e.g. Tasdighian et al 2017, Plant Cell 29: 2766-2785) indicating that yeast is not an outlier in its behavior.

      In the present yeast study, not only are there apparent increases for ribosomal subunits but also for many genes in the GAAC pathway, the NCR pathway, and Msn2p. The word "apparent" is used because RNAseq studies can only determine relative changes in gene expression (Loven et al., 2012, Cell 151: 476-482). Because aneuploidy can change the transcriptome size in general (Yang et al., 2021, The Plant Cell 33: 1016-1041), it is possible and maybe probable that this occurs in yeast monosomies as well. If there is an increase in the general transcriptome size, then there might not be as much reduction of the proteasome subunits as claimed and the increases might be somewhat less than indicated.

      Indeed, the authors claim that there is an increased cell volume in the monosomies. Given that cell volume correlates very well with the total transcriptome size (Loven et al., 2012, Cell 151: 476-482; Sun et al 2020, Current Biol 30: 1217-1230; Swaffer et al., 2023, Cell 186: 5254-5268), it could well be that there is an increased transcriptome size in the monosomies. Thus, the interpretation of the relative changes from RNAseq is compromised.

      It should be noted that contrary to the claims of the cited paper of Torres et al 2007 (Science 317: 916-924), a reanalysis of the data indicated that yeast disomies have many modulated genes in trans with downregulated genes being more common (Hou et al, 2018, PNAS 115: E11321-E11330). The claim of Torres et al that there are no global modulations in trans is counter to the knowledge that transcription factors are typically dosage sensitive and have multiple targets across the genome. The inverse effect trend is also true of maize disomies (Yang et al., 2021, The Plant Cell 33: 1016-1041), maize trisomies (Shi et al., 2021), Arabidopsis trisomies (Hou et al. 2018), Drosophila trisomies (Sun et al. 2013, PNAS 110: 7383-7388; Sun et al., 2013, PNAS 110: 16514-16519; Zhang et al., 2021, Scientific Reports 11: 19679; Zhang et al., genes 12: 1606) and human trisomies (Zhang et al., 2024, genes 15: 637). Taken as a whole it would seem to suggest that there are many inverse relationships of global gene expression with chromosomal dosage in both yeast disomies and monosomies.

      In a similar vein, the authors cite Muenzner et al 2024, Nature 630 149-157 that there is an attenuation of protein levels from aneuploid chromosomes while the mRNA levels correlate with gene dosage. This interpretation also seems to have been made in a vacuum of the evolutionary genomics data noted above and there was no consideration of transcriptome size change in the aneuploids. Also, Muenzner et al make the remarkable suggestion that there is degradation of overproduced proteins from hyperploidy, but for monosomies there is greater degradation of the proteins from the remainder of the genome.

      To clarify the claims of this study, it would be informative to produce distributions of the various ratios of individual gene expression in monosomy versus diploid as performed by Hou et al. 2018. This will better express the trends of up and down regulation across the genome and whether there are any genes on the varied chromosome that are dosage compensated. The authors claim in the Abstract that "There is no evidence of increased (compensatory) gene expression on the monosomic chromosomes", but then note after describing the increased cell volume of monosomies that this observation likely signals an increased transcriptome size: "Indeed, one explanation for the observed epistasis for viability could be an ample overproduction of all transcripts, so that even those halved by monosomy are sufficiently abundant". It is not clear to this reviewer what conclusions can be made from this work other than the empirical observation that monosomy does not reflect the cumulative effect of multiple haplo-insufficiencies of individual heterozygous deletions and that there are some RELATIVE changes in gene expression, but it is unclear what the ABSOLUTE PER CELL expression is across the whole genome. Clarifying this issue would be important for understanding the nature of any epistasis and fitness consequences.

    4. Reviewer #3 (Public review):

      The current study examined 13 monosomic yeast strains that lost different individual chromosomes. By comparing the fitness of monosomic strains and several heterozygous deletion strains, the authors observed strong positive epistasis for fitness. The transcriptomes of monosomic strains indicated that general gene-dose compensation is not the reason for fitness gains. On the other hand, gene expression of ribosomal proteins was up-regulated and proteasome subunit expression was down-regulated in all tested monosomic strains. The authors speculated that overexpression in combination with decreased degradation of the insufficient proteins might explain the positive epistasis observed in monosomic strains. This study investigates an important biological question and has some interesting results. However, I have some reservations about the data interpretations listed below.

      (1) In Figure 3b (and line 179), the authors stated that those haploinsufficient genes were not transcribed at elevated rates, but almost half of them are in reddish colors (indicating that the expression is higher than 1-fold). Obviously, many haploinsufficient genes are up-regulated in monosomic strains. What the data really show is that the level of overexpression is not correlated with the fitness effect of the deletion (since all the p values are not significant). The authors need to correct their conclusions.

      (2) Why are some monosomic strains removed from the transcriptomics analysis, especially when the chromosome IV and XV strains show very strong positive epistasis? The authors need to provide an explanation here.

      (3) The authors stated that diploidy observed in chromosome VII and XIII strains were due to endoreplication after losing the marked chromosomes (lines 97 and 117). Isn't chromosome missegregation an equally possible explanation? Since monosomic cells are generated by chromosome missegregation during mitosis, another chromosome missegregation event may occur to rescue the fitness (or viability) of monosomic cells in these strains.

      Comments for the revised version:

      The authors have addressed all my previous concerns and I have no further questions.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      Response to Reviewer #1 (Public Review):

      The reviewer is correct that the previous explanation of the fitness calculation could be considered insufficient as it was only briefly described in Results. In the revised manuscript, in the "Supplementary Materials" section and then in "Supplementary Text 1", we provide a full definition of the fitness of strains carrying single or multiple mutations and thus show how epistasis was calculated.

      Response to Reviewer #2 (Public Review):

      In our opinion, the reviewer's comments relate to three issues. First, our finding that the level of transcription of the monosomic chromosomes is not upregulated was not compared with the results of other studies, including those in other organisms. Indeed, we did not mention that the gene dosage distortions introduced by aneuploidy are frequently and profoundly compensated in multicellular organisms. We cite the suggested broad and recent review paper in the revised manuscript (line 247). We also removed the somewhat provocative sentence: “The relationship between transcriptome and proteome is generally fixed in yeast”. Regarding this organism, both data and opinions indeed remain conflicting when considering the work with many different yeast strains. But the standard laboratory strains stand out as those where dosage compensation is absent or weak. A paper published a year ago states flatly: "... at least in the strain background used here (authors: BY, the same we use), aneuploidies are transmitted to transcriptome and proteome with a minimum of gene-dosage buffering, rendering aneuploidies discoverable by proteomics" (Messner et al. 2023). A more recent paper reports: "In lab-generated aneuploids, some proteins - especially subunits of protein complexes - show reduced expression, but the overall protein levels correspond to the aneuploid gene dosage" (Muenzner et al. 2024). This "reduced expression" was seen in disomics and was achieved by upregulated proteolysis, whereas we have monosomics and downregulated proteolysis. In summary, we cannot back away from our claim that the biases introduced by monosomy were not compensated. (This claim is not critical to our paper; we could withdraw it and still leave our main claim about extraordinarily high positive epistasis intact.) Muenzner and colleagues do report compensation, but in "wild" strains. Our explanation would be that the existing yeast aneuploids are not a random sample of aneuploid mutations. In particular, they could be strains, perhaps relatively rare, in which the genetic background was permissive for aneuploidy from the start or allowed rapid evolution toward tolerance of aneuploidy. Strains with rigid gene-mRNA-protein relationships suffer so much that they perish unless they are shielded from selection, as is possible in the laboratory. The reviewer will know better whether this might also apply to multicellular organisms.

      The second concern is that we did not sufficiently report "... the trends of up- and downregulation across the genome and whether there are any genes on the varied chromosome that are dosage compensated". We believe we have indeed done this, albeit mostly in a simple graphical fashion. For the whole genomes, Datasheet 2 reports the extent of down- or up-regulation for each gene in each strain and highlights those that are statistically significant. We do not analyze the distributions of these deviations because they are relative. They represent individual gene down- and up-regulations within a monosomic transcriptome compared to the corresponding genes in the diploid transcriptome, with the total size of the transcriptomes set equal. Thus, the downs and ups cancel each other out, the left and right sides of the distribution would be equal in their totals, and we have no meaningful expectations about the possible variation in the shapes of the overall distributions or their opposite sides. As for the "varied chromosome", we show that there were extensive down- and up-regulations on the monosomic chromosomes, even though the mean expression for them was half that of the diploid chromosomes. This can be seen in Figure 3B as blue and red colored bars that are present on each monosomic chromosome and intermingled along its length. The purpose of these graphs is to show that even the genes in which the halving of the dose was most damaging to fitness (most negative values of rDR-1) did not tend to be upregulated on average (both blue and red colors are found among them). We consider this an important and original part of our data.

      Finally, the reviewer is concerned that we are only dealing with the relative abundance of mRNA species. He/she suggests that "... an experiment that would clarify the results would be to perform estimates of the total transcriptome size. If the general transcriptome size is indeed increased, the claims of reduced proteasome expression may need to be revised". We followed this advice and extracted transcriptomes from known numbers of yeast cells to which known amounts of a standard mRNA ("spike") had been added. We thus seriously considered the reviewer's suggestion, even though it was contrary to our intuition and, we believe, was not confirmed in the additional experiment. The results are reported in the last paragraph of Results and shown in Supplementary Figure S3. Our arguments are listed in that paragraph, so we will not repeat them here.
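
      The logic of a spike-in control can be sketched with a toy calculation. This is not our analysis pipeline and does not reproduce our data: the gene names, read counts, cell numbers and spike amounts below are invented, and the scenario shown (a larger transcriptome in the monosomic) is the reviewer's hypothetical, which our experiment did not support. The sketch only shows why normalizing to total reads and normalizing to a spike can give different answers.

```python
def relative_abundance(counts):
    """Share of each gene among all endogenous reads (totals set equal across samples)."""
    total = sum(counts.values())
    return {gene: c / total for gene, c in counts.items()}


def per_cell_abundance(counts, spike_reads, spike_molecules, n_cells):
    """Molecules per cell, using the spike to convert reads into molecules."""
    molecules_per_read = spike_molecules / spike_reads
    return {gene: c * molecules_per_read / n_cells for gene, c in counts.items()}


# Invented read counts; the same spike amount and cell number are assumed in both samples.
N_CELLS = 1_000_000
SPIKE_MOLECULES = 100_000_000

diploid_reads = {"A": 1000, "B": 1000, "C": 1000}
diploid_spike_reads = 1000

# Hypothetical monosomic sequenced at twice the depth: A and B doubled per cell, C unchanged.
monosomic_reads = {"A": 4000, "B": 4000, "C": 2000}
monosomic_spike_reads = 2000

rel_d = relative_abundance(diploid_reads)
rel_m = relative_abundance(monosomic_reads)
abs_d = per_cell_abundance(diploid_reads, diploid_spike_reads, SPIKE_MOLECULES, N_CELLS)
abs_m = per_cell_abundance(monosomic_reads, monosomic_spike_reads, SPIKE_MOLECULES, N_CELLS)

rel_fc = {g: rel_m[g] / rel_d[g] for g in diploid_reads}  # relative fold change
abs_fc = {g: abs_m[g] / abs_d[g] for g in diploid_reads}  # per-cell fold change

for g in diploid_reads:
    print(f"{g}: relative FC={rel_fc[g]:.2f}, per-cell FC={abs_fc[g]:.2f}")
```

      In this invented case gene C looks downregulated in relative terms although its per-cell level is unchanged, which is the situation the spike experiment was designed to detect or rule out.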

      Response to Reviewer #3 (Public Review):

      (1) Figure 3b – both its legend and reference to it in the main text are corrected in line with suggestions made by Reviewers #1 and #3.  

      (2) We had to restrict our mRNA analysis to about half of the strains and chose them purely at random. This selection left out M4 but nevertheless included M2, M10 and M16, which are among the strains with especially high levels of epistasis. See manuscript lines 161-162.

      (3) We agree, and say so in the article, that both the loss and the gain of a copy of a chromosome most likely result from errors in mitosis. By "endoreduplication" we mean any event resulting in two chromosomes instead of one, not necessarily additional DNA replication, as we now clarify. We also suggest that both loss and endoreduplication occurred in all strains, but in M7 and M13 they happened so close together that we could not isolate the rare monosomic cells from the rapidly spreading revertants (lines 86-91).

      Recommendations for the authors:

      Reply to Reviewer #1 (Recommendations for The Authors):

      The legend to Fig. 3b is hopefully clearer now.

      Reply to Reviewer #2 (Recommendations for The Authors):

      We understand that these points were raised also in the public review so the answer to the latter is also relevant to the recommendations for authors.

      Reply to Reviewer #3 (Recommendations for The Authors):

      (1) The first sentence of this comment may be based on a misinterpretation of our main argument. We believe that the upregulation of ribosomal protein (RP) coding genes was not helpful, but harmful. It was costly because RPs are a large part of the proteome, but it did not help translation because it did not restore the stoichiometry of RPs. This unproductive investment reduced the rate of the remaining metabolism, so that other impairments introduced by halving the doses of other genes were no longer critical, and this made them unobservable at the level of phenotype, i.e., produced epistasis. However, both this Reviewer and Reviewer 2 seem to suggest that the entire translational apparatus may have been expanded, compensating for its reduced efficiency (per transcript). Reviewer 2 suggested an mRNA spike as a standard, and we followed this approach as more accessible to us. (We reiterate our claim of good agreement between mRNAs and proteins in the BY strain, supported by two new important papers, lines 256-257). The results are reported in the last paragraph of Results. We believe that they indicate a reduction, not an increase, in the translational apparatus (including its parts encoded on the monosomic chromosomes), so that our explanation of positive epistasis remains unchallenged.

      (2) We re-examined the sequences and found that there were heterozygous SNPs in the same gene, RSP5, in several strains. One was a loss of a START codon (M3, M4, M6, M8, M9, M10, M14, M16), always the same. The other was a substitution, always the same, in M5, M11 and M15. There were no mutations in this gene in M1 and M2. We tested our stock haploid strains BY4741 and 4742 and found that they were not mutated. However, we also recovered the specific haploid strains used in the final crosses to construct the diploid strains used to obtain monosomics. Some had one of the two mutations, some were clean. All grew normally, the mutants were similar to the wild types, indicating that the fitness effect of the mutations, even in haploids, was at most partial, since the expected severe effects of RSP5 inactivation were not visible.

      Where do the mutations come from? In previous experiments, we subjected some BY strains to severe selection regimes. As we can now surmise, mutations in RSP5 helped to resist some of them, especially those involving overexpression of selected genes. (We do not summarize here the lengthy review of our notes and the literature that led us to consider this explanation the most plausible). Unfortunately, strains that had gone through that harsh selection were used in the crosses that produced another collection of strains, those used here.

      How critical is it? First, the mutations were heterozygous, which further reduced their apparently weak effects. Second, M1 and M2 were free of them. Third, we tried to get clean monosomics, i.e., homozygous wild type for RSP5. We obtained such strains, with monosomy as the only change, for M9, M10 and M16. The other three attempts did not yield correct M3, M5 and M6, but complex aneuploids. This is normal, as we explain (complain) in Results. We would have to isolate a large number of potential monosomics and then sequence them to show that all exact monosomies can be derived in the absence of mutations in RSP5. We believe that we could complete this after an effort comparable to that required to obtain the first set of monosomics. For financial and organizational reasons, this is not possible at this time, and we do not consider it necessary for completing the revision. Note that of the five mutation-free straight monosomics, M2, M10 and M16 are among the most affected and thus have the highest positive epistasis. Yes, the role of point mutations cannot be excluded for other monosomics, although we strongly believe it is unlikely. But we have removed all our previous claims that our monosomies are certainly not supported by other genetic changes. Most importantly, our main claim of positive epistasis in its purely descriptive genetic sense remains unaffected. The main functional argument also holds: the indiscriminate overproduction of unbalanced RP proteins was so costly that inefficiencies introduced in functional modules other than biosynthesis become much less relevant. Thus, the main message of our work does not depend on the conceivable, but in our view unlikely, role of mutations in RSP5.

      We provided this lengthy explanation to show that we cared about the reviewer's comment and tried to deal with it in an honest way. It was a lot of pain and no gain for us, but we are still grateful for the opportunity to re-examine our main claims.  

      (3) The 16 (non-essential) plus 16 (essential) strains were replicated 3 times each. In preliminary experiments, we confirmed that they were not statistically different (one-way ANOVA). We considered these 32 strains to have the same genetic background, and thus treated the 96 estimates as homogeneous, differing only through environmental variation or random error.

      (4) We changed the description of Figure 3b to explain that a particular color shows a range (not its boundary) of log2 fold change (FC) relative to the control.

      (5) Corrected.

      (6) Corrected.

    1. eLife Assessment

      In this valuable study, Huffer et al posit that non-cold sensing members of the TRPM subfamily of ion channels (e.g., TRPM2, TRPM4, TRPM5) contain a binding pocket for icilin that overlaps with the one found in the cold-activated TRPM8 channel. After examining a body of TRP channel cryo-EM structures to identify the conserved site, this study presents convincing electrophysiological evidence supporting the presence of an icilin binding pocket within TRPM4. This study shows that icilin has modulatory effects on the TRPM4 channel and will be of direct interest to those working in the TRP-channel field, but it also has implications for studies of somatosensation, taste, as well as pharmacological targeting of the TRPM subfamily.

    2. Reviewer #2 (Public review):

      Summary:

      The authors set out to study whether the cooling agent binding site in TRPM8, which is located between the S1-S4 and the TRP domain, is conserved within the TRPM family of ion channels. They specifically chose the TRPM4 channel as the model system, which is directly activated by intracellular Ca2+. Using electrophysiology, the authors characterized and compared the Ca2+ sensitivity and the voltage-dependence of TRPM4 channels in the absence and presence of synthetic cooling agonist icilin. They also analyzed the mutational effects of residues (A867G and R901H; equivalent mutations in TRPM8 were shown involved in icilin sensitivity) on Ca2+ sensitivity and voltage-dependence of TRPM4 in the absence and presence of Ca2+. Based on the results as well as structure/sequence alignment, the authors concluded that icilin likely binds to the same pocket in TRPM4 and suggested that this cooling agonist binding pocket is conserved in TRPM channels.

      Strengths:

      The authors gave a very thorough introduction of the TRPM channels. They have nicely characterized the Ca2+ sensitivity and the voltage-dependence of TRPM4 channels and demonstrated icilin potentiates the Ca2+ sensitivity and diminishes the outward rectification of TRPM4. These results indicate icilin modulates TRPM4 activation by Ca2+.

      The authors have incorporated additional data analysis and control experiments in the revised manuscript to strengthen their findings. They have well addressed the concerns raised by reviewers in the responses.

      Weaknesses:

      The study is conducted based on an assumption that TRPM4 activation is controlled by Ca2+ binding to a single site in the S1-S4 pocket in each subunit, and the second Ca2+ site in the cytoplasmic MHRs is simplified.

      Despite the technical reasons presented by the authors in the rebuttal, the conclusions of this study would be strengthened if more cooling compounds (the most well-studied natural cooling agonist, menthol, and/or other cooling agonists such as WS-12 and/or C3) were tested for their effects on TRPM4 and several other TRPM channels.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this important study, Huffer et al posit that non-cold sensing members of the TRPM subfamily of ion channels (e.g., TRPM2, TRPM4, TRPM5) contain a binding pocket for icilin which overlaps with the one found in the cold-activated TRPM8 channel.

      The authors identify the residues involved in icilin binding by analyzing the existing TRPM8-icilin complex structures and then use their previously published approach of structure-based sequence comparison to compare the icilin binding residues in TRPM8 to other TRPM channels. This approach uncovered that the residues are conserved in a number of TRPM members: TRPM2, TRPM4, and TRPM5. The authors focus on TRPM4, with the rationale that it has the simplest activation properties (a single Ca2+-binding site). Electrophysiological studies show that icilin by itself does not activate TRPM4, but it strongly potentiates the Ca2+ activation of TRPM4, and introducing the A867G mutation (the mutation that renders avian TRPM8 sensitive to icilin) further increases the potentiating effects of the compound. Conversely, the mutation of a residue that likely directly interacts with icilin in the binding pocket, R901H, results in channels whose Ca2+ sensitivity is not potentiated by icilin.

      The data indicate that, just like in TRPV channels, the binding pockets and allosteric networks might be conserved in the TRPM subfamily.

      The data are convincing, and the authors employ good experimental controls.

      We appreciate the supportive feedback of this reviewer.

      Reviewer #2 (Public Review):

      Summary:

      The authors set out to study whether the cooling agent binding site in TRPM8, which is located between the S1-S4 and the TRP domain, is conserved within the TRPM family of ion channels. They specifically chose the TRPM4 channel as the model system, which is directly activated by intracellular Ca2+. Using electrophysiology, the authors characterized and compared the Ca2+ sensitivity and the voltage-dependence of TRPM4 channels in the absence and presence of the synthetic cooling agonist icilin. They also analyzed the mutational effects of residues (A867G and R901H; equivalent mutations in TRPM8 were shown to be involved in icilin sensitivity) on Ca2+ sensitivity and voltage-dependence of TRPM4 in the absence and presence of Ca2+. Based on the results as well as structure/sequence alignment, the authors concluded that icilin likely binds to the same pocket in TRPM4 and suggested that this cooling agonist binding pocket is conserved in TRPM channels.

      Strengths:

      The authors gave a very thorough introduction to the TRPM channels. They have nicely characterized the Ca2+ sensitivity and the voltage-dependence of TRPM4 channels and demonstrated icilin potentiates the Ca2+ sensitivity and diminishes the outward rectification of TRPM4. These results indicate icilin modulates TRPM4 activation by Ca2+.

      We appreciate the supportive feedback of this reviewer.

      Weaknesses:

      The reviewer has a few concerns. First, icilin alone (at 25 µM) and in the absence of Ca2+ does not activate the TRPM4 channel. Have the authors titrated a wide range of icilin concentrations (without Ca2+ present) for TRPM4 activation? This raises the question of whether icilin is indeed an agonist for the TRPM4 channel. This has not been tested so it is unclear. One may argue that icilin needs Ca2+ as a co-factor for channel activation just like in the TRPM8 channel. This leads to the second concern, which is a complication in the experimental design and data interpretation. TRPM4 itself requires Ca2+ for activation to begin with, thus it is hard to dissect whether the current observed here for TRPM4 is activated by Ca2+ or by icilin plus its cofactor Ca2+. This is the difference between TRPM8 and TRPM4, as TRPM8 itself is not activated by Ca2+, thus TRPM8 activation is through icilin and Ca2+ acts as a prerequisite for icilin activation.

      We agree that the comparison between TRPM8 and TRPM4 is not perfect because TRPM4 requires Ca2+ for activation, but it is clear that the current activated by Ca2+ in the presence of icilin also involves icilin because it activates at lower Ca2+ concentrations and lower voltages. We have tested icilin at concentrations between 12.5 and 25 µM and at these concentrations icilin does not activate TRPM4 when applied alone, so we have no evidence that it is an agonist. Both of these concentrations are higher than those reported by Chuang et al. to be saturating for TRPM8 in the presence of Ca2+. We haven’t tested icilin at higher concentrations because we wanted to keep the final concentration of DMSO low enough to avoid any effects of the vehicle. We now emphasize this even more clearly in the revised manuscript.

      The results presented in this study are only sufficient to show that icilin modulates the Ca2+-dependent activation of TRPM4 and icilin at best may act as an allosteric modulator for TRPM4 function. One cannot conclude from the current work that icilin is an agonist or even specifically a cooling agonist for TRPM4. Icilin is a cooling agonist for TRPM8, but it does not mean that if icilin modulates TRPM4 activity then it serves as a cooling agonist for TRPM4.

      We agree with these comments, and we believe that the intent of our statements in the manuscript is completely in line with this perspective. We never refer to icilin as a cooling agent for TRPM4 but rather refer to the cooling agent binding pocket in TRPM8 and how that appears to be conserved and functions in TRPM4 to modulate opening of the channel. We have carefully gone through the manuscript to refer directly to icilin by name (rather than as a cooling agent) when referring to its actions on TRPM4 to make sure there is no confusion.

      For the mutation data on A867G, Figure 4A-B, left panels, it looks like A867G has stronger Ca2+ sensitivity compared to the WT in the absence of icilin and the onset of current activation is faster than the WT, or is this simply because the scales of the data figures differ between A867G and the WT? Overall the mutagenesis data are too weak to support the conclusion that icilin binds to the S1-S4 pocket. The authors need to mutate more residues that are involved in direct interaction with icilin based on the available structural information, including but not limited to residues equivalent to Y745 and H845 in human TRPM8.

      The A867G mutant does seem to promote opening by Ca2+ in the absence of icilin, and we now comment on this in the manuscript. Having said that, we have not carefully studied the concentration-dependence for activation by Ca2+ because at higher concentrations we see evidence of desensitization. We think Ca2+, icilin and depolarized voltages promote an open state of TRPM4 and the A867G does so as well.

      We respectfully disagree about the strength of the mutagenesis results presented in our manuscript. We present clear gain and loss of function for two mutants corresponding to influential residues within the cooling agent binding pocket of TRPM8. We agree that Y786 mutations would have been a valuable addition, and our plan was to include mutations of this residue. Unfortunately, both the Y786A and Y786H mutants exhibited rundown to repeated stimulation by Ca2+, making it challenging to obtain reliable results on their effects on modulation by icilin.

      The authors set out to study the conservation of the cooling agonist binding site in the TRPM family, but only tested a synthetic cooling agonist, icilin, on TRPM4. In order to draw a broad conclusion as the title and the discussion have claimed, the authors need to include more cooling compounds, including the most well-known natural cooling agonist menthol, and other cooling agonists such as WS-12 and/or C3, and test their effects on several TRPM channels, not just TRPM4. With the current data, the authors need to significantly tone down the claim of a conserved cooling agonist binding pocket in the TRPM family.

      We would have liked to broaden the scope to other ligands that modulate TRPM8 and we agree that including those data would certainly reinforce our conclusions. However, the first author recently moved on to a new faculty position and extending our findings would require enlisting another member of the lab and take away from their independent projects. We also do not agree that this is essential to support any of our conclusions. It is also important to keep in mind that icilin is a high-affinity ligand for TRPM8, such that weaker interactions with TRPM4 can still be readily observed. We think it is likely that lower affinity agonists like menthol might not have sufficient affinity to see activity in TRPM4. This scenario is not unlike our earlier experience with TRPV channels where we succeeded in engineering vanilloid sensitivity into TRPV2 and TRPV3 using the high affinity agonist resiniferatoxin (Zhang et al., 2016, eLife). In the case of TRPV2, another group had made the same quadruple mutant and failed to see activation by capsaicin even though resiniferatoxin also worked in their hands (see Fig. 2 in Yang et al., 2016, PNAS).

      On page 11, the authors suggest based on the current data, that TRPM2 and TRPM5 may also be sensitive to cooling agonists because the key residues are conserved. TRPM2 is the closest homolog to TRPM8 but is menthol-insensitive. There are studies that attempted to confer menthol sensitivity on TRPM2, for example, Bandell 2006 attempted to introduce S2 and TRP domains from TRPM8 into TRPM2 but failed to make TRPM2 a menthol-sensitive channel. The sequence conservation or structural similarity is not sufficient for the authors to suggest a shared cooling agonist sensitivity or even a common binding site in the TRPM2 and TRPM5 channels. Again, as pointed out above, the authors need to establish the actual activation of other TRPM channels by these agonists first, before proceeding to functionally probe whether other TRPM channels adopt a conserved agonist binding site.

      We are somewhat confused by these comments because we do not comment about whether cooling agents can activate TRPM2 or TRPM5. We simply analyzed the structures to make the point that the key residues in the cooling agent binding pocket of TRPM8 are conserved in these other TRPM channels. The Bandell paper is relevant, but it is also possible that they failed to uncover a relationship because they only used an agonist that has relatively low affinity for TRPM8. It would have been interesting to see what they might have found if they had used a high-affinity ligand like icilin instead of a low affinity ligand like menthol.

      Taken together, this current work presents data to show the modulatory effects of icilin on the Ca2+ dependent activation and voltage dependence of the TRPM4 channel.

      We agree.

      Reviewer #3 (Public Review):

      Summary:

      The family of transient receptor potential (TRP) channels are tetrameric cation selective channels that are modulated by a variety of stimuli, most notably temperature. In particular, the Transient receptor potential Melastatin subfamily member 8 (TRPM8) is activated by noxious cold and other cooling agents such as menthol and icilin and participates in cold somatosensation in humans. The abundance of TRP channel structural data that has been published in the past decade demonstrates clear architectural conservation within the ion channel family. This suggests the potential for unifying mechanisms of gating despite their varied modes of regulation, which are not yet understood. To address this question, the authors examine the 264 structures of TRP channels determined to date and observe a potential binding pocket for icilin in multiple members of the Melastatin subfamily, TRPM2, TRPM4, and TRPM5. Interestingly, none of the other Melastatin subfamily members had been shown to be sensitive to icilin apart from TRPM8. Each of these channels is activated by intracellular calcium (Ca2+) and a Ca2+ binding site neighbors the predicted pocket for icilin binding in all cryo-EM structures. The authors examined whether icilin could modulate the activation of TRPM4 in the presence of intracellular Ca2+. The addition of icilin enhances Ca2+-dependent activation of TRPM4, promotes channel opening at negative membrane potentials, and improves the kinetics of opening. Furthermore, mutagenesis of TRPM4 residues within the putative icilin binding pocket predicted to enhance or diminish TRPM4 activity elicit these behaviors. Overall, this study furthers our understanding of the Melastatin subfamily of TRP channel gating and demonstrates that a conserved binding pocket observed between TRPM4 and TRPM8 channel structures can function similarly to regulate channel gating.

      Strengths:

      This is a simple and elegant study capitalizing on a vast amount of high-resolution structural information from the TRP family of ion channels to identify a conserved binding pocket that was previously unknown in the Melastatin subfamily, which is interrogated by the authors through careful electrophysiology and mutagenesis studies.

      Weaknesses:

      No weaknesses were identified by this reviewer.

      We appreciate the supportive comments of the reviewer.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I don't have any major asks, but a few questions did arise while reading your work.

      (1) You refer multiple times to the VSLD pocket as being "open to the cytoplasm". It is not clear if you are implying that compounds such as icilin access the pocket via the cytoplasm (e.g., permeate the membrane to the cytosol, and then enter the binding site?) Is there data to support this? Some clarification here would be helpful, and perhaps explain if there is any distinction between how calcium might enter the VSLD binding site vs hydrophobic compounds like icilin.

      This is an excellent point. Our reference to “open to the cytoplasm” was for Ca2+ ions and we have no evidence for how icilin enters the cooling agent binding pocket. We had tried to look for evidence that Ca2+ might trap icilin in TRPM4 but at the end of the day the results were not convincing enough to include in the manuscript. We have added data showing that icilin slows deactivation of TRPM4 after removing Ca2+, which is particularly evident in the A867G mutant, but this doesn’t inform on whether Ca2+ can trap icilin. We have added a statement about not knowing how icilin enters or leaves the cooling agent binding pocket in TRPM channels.

      (2) Icilin is referred to as a "cooling compound", but its cooling effects are dependent on its interactions with TRPM8. This might be something to clarify, as it might otherwise be understood that other TRPM channels that interact with icilin also mediate the sensing of cool temperatures.

      This is another excellent point and we have no reason to believe that icilin interacting with any TRPM channel other than TRPM8 mediates cooling sensations. We have added a statement to this effect in the discussion when considering actions of icilin that might be mediated by TRPM4 channels.

      Reviewer #2 (Recommendations For The Authors):

      (1) The title and statements in the results/discussion refer to icilin as a cooling agonist of TRPM4 that binds to a conserved "cooling agonist binding pocket", and the authors suggest a similar role and binding site for icilin in the TRPM2 and TRPM5 channels. This conclusion is too broad and is not fully supported by the current experimental data, which only show that icilin works as a modulator, not an agonist, of the TRPM4 channel. The authors should change the usage of cooling agonist or conserved cooling agonist binding pocket and significantly tone down the conclusion of a conserved cooling agonist binding pocket, which is potentially misleading. Alternatively, if the authors insist on using cooling agonist in this context, they should establish the activation of TRPM4, TRPM2, and TRPM5 by icilin as the first step, because the current data only support icilin as a TRPM4 modulator but not an agonist.

      We respectfully don’t agree with this opinion. We show broad conservation of the cooling agent binding pocket in structures of many TRPM channels, and we chose one of them to test for a functional relationship. We think that the title accurately reflects the topic of the paper and does not specify the extent to which functional conservation has been demonstrated and we would like to keep it. The distinction between agonist and modulator is not even germane because icilin is not an agonist of TRPM8 either.

      (2) The manuscript will be strengthened if the authors test additional cooling compounds of TRPM8, including menthol, the menthol analog WS-12, and C3. More importantly, distinct from icilin, these three compounds do not depend on Ca2+ to activate the TRPM8 channel. Thus when testing these compounds on TRPM4, it may reduce the complication of the role of Ca2+, as TRPM4 channel itself requires Ca2+ for activation.

      We restate our response to this point in the public review…

      We would have liked to broaden the scope to other ligands that modulate TRPM8 and we agree that including those data would certainly reinforce our conclusions. However, the first author recently moved on to a new faculty position and extending our findings would require enlisting another member of the lab and taking away from their independent projects. We also do not agree that this is essential to support any of our conclusions. It is also important to keep in mind that icilin is a high-affinity ligand for TRPM8, such that weaker interactions with TRPM4 can still be readily observed. We think it is likely that lower affinity agonists like menthol might not have sufficient affinity to see activity in TRPM4. This scenario is not unlike our earlier experience with TRPV channels where we succeeded in engineering vanilloid sensitivity into TRPV2 and TRPV3 using the high affinity agonist resiniferatoxin (Zhang et al., 2016, eLife). In the case of TRPV2, another group had made the same quadruple mutant and failed to see activation by capsaicin even though resiniferatoxin also worked in their hands (see Fig. 2 in Yang et al., 2016, PNAS).

      (3) The manuscript will be strengthened if the authors test additional residues in the S1-S4 pocket that form direct interactions or are within interacting distances with icilin based on the cryo-EM structures.

      We restate our response to this point in the public review…

      We present clear gain and loss of function for two mutants corresponding to influential residues within the cooling agent binding pocket of TRPM8. We agree that Y786 mutations would have been a valuable addition and our plan was to include mutations of this residue. Unfortunately, both the Y786A and Y786H mutants exhibited rundown, making it challenging to obtain reliable results on their effects on modulation by icilin.

      Furthermore, the ambiguity in the icilin binding pose based on available TRPM8 structures complicates structure-based identification of the most important interacting residues in TRPM8, and we would have needed to functionally validate the effects of any novel mutations we identified in TRPM8 prior to testing them in TRPM4. Instead, we have based our mutagenesis on constructs that have been previously characterized to affect the sensitivity of TRPM8 to cooling agents. A systematic mutagenesis scan of TRPM8 residues predicted to interact differentially with icilin in the two different available binding poses would likely help clarify the true binding pose of icilin and would be an interesting future study.

      Reviewer #3 (Recommendations For The Authors):

      I enjoyed reading this manuscript. It was well-executed and written. It will be interesting to corroborate these findings with a cryo-EM structure of TRPM2, TRPM4, or TRPM5 in the presence of icilin.

      We agree and may pursue these in future studies. This would be particularly interesting given ambiguities in how icilin docks into TRPM8 in previously published structures.

      Minor comments/questions:

      Have the authors considered icilin accessibility to its binding pocket? In other words, could the presence of intracellular Ca2+ inhibit the accessibility of icilin to its binding pocket in TRPM4? It should be a straightforward experiment, I think it would be informative, and could further support the authors' conclusion of the location of the TRPM4 icilin binding pocket.

      We completely agree and we had tried to look for evidence that Ca2+ might trap icilin in TRPM4 but at the end of the day the results were not convincing enough to include in the manuscript. We have added data showing that icilin slows deactivation of TRPM4 after removing Ca2+, which is particularly evident in the A867G mutant, but this doesn’t inform on whether Ca2+ can trap icilin. We have added a statement about not knowing how icilin enters or leaves the cooling agent binding pocket in TRPM channels.

      Figures 7 and 8 are missing the 0 µM Ca2+ control trace in the presence of 25 µM icilin.

      All sample traces from Figures 7 and 8 are shown from a single cell for the sake of comparison (Likewise, the sample traces from Figures 3 and 4 come from a single cell, and the sample traces from Figures 5 and 6 come from a single cell). Unfortunately, we were unable to obtain data from an R901H mutant cell that contained all six conditions we wished to show, and there is no representative trace for 0 µM Ca2+ in the presence of 25 µM icilin for that cell.

      This is up to the discretion of the authors, but perhaps a better way to arrange the paper Figures would be to combine Figures 5-6 and Figures 7-8 and rearrange the data to place some in a supplementary figure (e.g. Figure 5-6 = Figure 5 and Figure 5 - Figure Supplement 1, Figure 7-8 = Figure 6 and Figure 6 - Figure Supplement 1).

      We carefully considered these suggestions and we appreciate the reviewers’ flexibility but would prefer to retain the original arrangement of data in the figures.

      Are there any mutations in the icilin binding pocket in TRPM4, and presumably TRPM2 and TRPM5, that are associated with human disease? This is a question that came to my mind and not one that needs to be addressed in the manuscript.

      This is an interesting point. There are quite a few disease-associated mutants within TRPM4 at positions corresponding to the cooling agent binding pocket in TRPM8. We could not see an appropriate place in the discussion where we could concisely bring this information in so we decided against commenting.

    1. eLife Assessment

      This study demonstrates a novel role for SIRT4, a mitochondrial deacetylase shown to translocate into nuclei, where it regulates RNA alternative splicing by modulating U2AF2 and the gene expression of CCN2 in tubular cells in response to TGF-β. This fundamental work substantially advances our understanding of kidney fibrosis development and offers a potential therapeutic approach. The evidence supporting the conclusions of a SIRT4-U2AF2-CCN2 axis activated by TGF-β is compelling and adds a new layer of complexity to the pathogenesis of chronic kidney disease.

    2. Reviewer #1 (Public review):

      In this manuscript, Yang et al report a novel regulatory role of SIRT4 in the progression of kidney fibrosis. The authors showed that in the fibrotic kidney, SIRT4 exhibited an increased nuclear localization. Deletion of Sirt4 in renal tubule epithelium attenuated the extent of kidney fibrosis following injury, while overexpression of SIRT4 aggravates kidney fibrosis. Employing a battery of in vitro and in vivo experiments, the authors demonstrated that SIRT4 interacts with U2AF2 in the nucleus upon TGF-β1 stimulation or kidney injury and deacetylates U2AF2 at K413, resulting in elevated CCN2 expression through alternative splicing of Ccn2 gene to promote kidney fibrosis. The authors further showed that the translocation of SIRT4 is through the BAX/BAK pore complex and is dependent on the ERK1/2-mediated phosphorylation of SIRT4 at S36, and consequently the binding of SIRT4 to importin α1. This fundamental work substantially advances our understanding of the progression of kidney fibrosis and uncovers a novel SIRT4-U2AF2-CCN2 axis as a potential therapeutic target for kidney fibrosis.

      Comment on revised version:

      In the new version of the manuscript, the authors have addressed most of my concerns. Overall, the authors have done an extensive, well-performed study. The results are convincing, and the conclusions are mostly well supported by the data. The message is interesting to a wider community working on kidney fibrosis, protein acetylation and SIRT4 biology. This work substantially advances our understanding of the mechanism of kidney fibrosis and uncovers a novel SIRT4-U2AF2-CCN2 axis as a potential therapeutic target for kidney fibrosis.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Yang et al. presents a novel and significant investigation into the role of SIRT4 in CCN2 expression in response to TGF-β, through modulation of U2AF2-mediated alternative splicing, and its impact on the development of kidney fibrosis.

      Strengths:

      The authors' main conclusion is that SIRT4 plays a role in kidney fibrosis by regulating CCN2 expression via pre-mRNA splicing. Additionally, the study reveals that SIRT4 translocates from the mitochondria to the cytoplasm through the BAX/BAK pore under TGF-β stimulation. In the cytoplasm, TGF-β activated the ERK pathway and induced the phosphorylation of SIRT4 at Ser36, further promoting its interaction with importin α1 and subsequent nuclear translocation. In the nucleus, SIRT4 was found to deacetylate U2AF2 at K413, facilitating the splicing of CCN2 pre-mRNA to promote CCN2 protein expression. Overall, the findings are fully convincing. The current study, to some extent, shows potential importance in this field.

    4. Reviewer #3 (Public review):

      Summary:

      Yang et al. report in this paper that TGF-beta induces SIRT4 activation; TGF-beta-activated SIRT4 then modulates U2AF2 alternative splicing, and U2AF2 in turn drives CCN2 expression. The mechanism is described as follows: mitochondrial SIRT4 is transported into the cytoplasm in response to TGF-β stimulation, is phosphorylated by ERK in the cytoplasm, and then undergoes nuclear translocation by forming a complex with importin α1. In the nucleus, SIRT4 can then deacetylate U2AF2 at K413 to facilitate the splicing of CCN2 pre-mRNA to promote CCN2 protein expression. Moreover, they used exosomes to deliver Sirt4 antibodies to mitigate renal fibrosis in a mouse model. TGF-beta has been widely reported for its role in fibrosis induction.

      Strengths:

      TGF-beta induction of SIRT4 translocation from mitochondria to nuclei for epigenetics or gene regulation remains largely unknown. The findings presented here that SIRT4 is involved in U2AF2 deacetylation and CCN2 expression are interesting.

      Comments on revised version:

      I went through the revised manuscript and the letter from the authors. I have no further concerns.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Yang et al report a novel regulatory role of SIRT4 in the progression of kidney fibrosis. The authors showed that in the fibrotic kidney, SIRT4 exhibited an increased nuclear localization. Deletion of Sirt4 in renal tubule epithelium attenuated the extent of kidney fibrosis following injury, while overexpression of SIRT4 aggravates kidney fibrosis. Employing a battery of in vitro and in vivo experiments, the authors demonstrated that SIRT4 interacts with U2AF2 in the nucleus upon TGF-β1 stimulation or kidney injury and deacetylates U2AF2 at K413, resulting in elevated CCN2 expression through alternative splicing of Ccn2 gene to promote kidney fibrosis. The authors further showed that the translocation of SIRT4 is through the BAX/BAK pore complex and is dependent on the ERK1/2-mediated phosphorylation of SIRT4 at S36, and consequently the binding of SIRT4 to importin α1. This fundamental work substantially advances our understanding of the progression of kidney fibrosis and uncovers a novel SIRT4-U2AF2-CCN2 axis as a potential therapeutic target for kidney fibrosis.

      Strengths:

      Overall, this is an extensive, well-performed study. The results are convincing, and the conclusions are mostly well supported by the data. The message is interesting to a wider community working on kidney fibrosis, protein acetylation, and SIRT4 biology.

      Weaknesses:

      The manuscript could be further strengthened if the authors could address a few points listed below:

      (1) In the results part 3.9, an in vitro deacetylation assay employing recombinant SIRT4 and U2AF2 should be included to support the conclusion that SIRT4 is a deacetylase of U2AF2. Similarly, an in vitro binding assay can be included to confirm whether SIRT4 and U2AF2 interact directly.

      Thank you for your insightful comments and suggestions for improving our manuscript. We appreciate your recommendation to include an in vitro deacetylation assay employing recombinant SIRT4 and U2AF2 to support our conclusion regarding the deacetylase activity of SIRT4 on U2AF2.

      We would like to clarify that the data demonstrating the effect of SIRT4 on U2AF2 acetylation were already included in our original submission. Specifically, Figure 5C illustrates that the TGF-β1-induced decrease in U2AF2 acetylation is attenuated by Sirt4 knockdown. Conversely, overexpression of SIRT4 (SIRT4 OE) enhances the deacetylation of U2AF2 in the presence of TGF-β1. These results support that SIRT4 is a deacetylase for U2AF2.

      Furthermore, we have already provided evidence of the direct interaction between SIRT4 and U2AF2 through a co-immunoprecipitation (CoIP) assay, which was shown in Figure 5B. This assay confirms the physical interaction between SIRT4 and U2AF2.

      We believe that the existing data sufficiently address the points raised in your comments. We are grateful for the opportunity to clarify these aspects of our study and hope that our response has adequately addressed your concerns.

      (2) In Figure 6D, the Western Blot data using U2AF2-K453Q is confusing and is quite disconnected from the rest of the data and not explained. This data can be removed or explained why U2AF2-K453Q is employed here.

      Thank you for your inquiry regarding the rationale behind the K453Q mutation in our study.

      In the study, we predicted several acetylation sites. U2AF2-K453Q is a mutation at another site that mimics a hyperacetylated state of U2AF2, and our results indicated that U2AF2 acetylation at K453 had little effect on CCN2 expression. We therefore concluded that only U2AF2 acetylation at K413, and not acetylation at other sites, regulates CCN2 expression. To avoid ambiguity, we have removed the U2AF2-K453Q results from our revised manuscript.

      (3) Although ERK inhibitor U0126 blocked the nuclear translocation of SIRT4 in vivo, have the authors checked whether treatment with U0126 could affect the expression of kidney fibrosis markers in UUO mice?

      Thank you for your insightful question regarding the effects of the ERK inhibitor U0126 on the expression of kidney fibrosis markers in UUO mice.

      In our study, we indeed conducted in vivo experiments using U0126 and observed that it effectively ameliorated kidney fibrosis markers, which is consistent with its established role in inhibiting the fibrotic process. Specifically, U0126 treatment significantly suppressed the SIRT4-mediated renal fibrosis, which was evidenced by the reduced expression of fibrosis markers (Author response image 1).

      Author response image 1.

      U0126 treatment alleviates renal fibrosis in UUO mice.

      However, we chose not to include these results in the main body of the initial submission for the following reasons: 1) highlighting the inhibitory effects of U0126 on ERK and its subsequent impact on kidney fibrosis might shift the focus of our study away from the central theme of SIRT4's role in renal fibrosis; 2) we aimed to maintain a clear narrative that emphasizes the novel findings related to SIRT4 and its regulation by the ERK pathway.

      Nonetheless, we recognize the importance of these findings and are willing to include the relevant data in the revised manuscript if it aligns with the journal's editorial direction and contributes to the broader understanding of renal fibrosis treatment strategies.

      We appreciate the opportunity to clarify this aspect of our research and are open to further suggestions from the editorial team.

      (4) The format of gene and protein abbreviations in the manuscript should be standardized.

      Thank you for your comment on the formatting of gene and protein abbreviations in our manuscript. We have carefully reviewed our formatting practices and confirmed that we have adhered to the standard conventions as follows:

      (1) Mouse gene names are presented with an initial capital letter and in italics.

      (2) Human gene names are written in uppercase and in italics.

      (3) Protein names are in all capital letters and not italicized.

      We understand the importance of consistency in scientific publications and have ensured that these standards are uniformly applied throughout the revised manuscript. Where there were discrepancies, we have corrected them to maintain clarity and professionalism.

      We appreciate the opportunity to refine our work and are committed to upholding the standards of scientific communication.

      (5) There are a few grammar issues throughout the manuscript. The English/grammar could be stronger, thus improving the overall accessibility of the science to readers.

      Thank you for bringing the grammar issues to our attention. We have made diligent efforts to revise and improve the manuscript's English and grammar throughout. We have also enlisted the support of a professional language editing service to ensure the clarity and accuracy of our scientific communication.

      We are confident that these revisions have significantly enhanced the manuscript's accessibility to a broader readership and have addressed the language concerns raised.

      We appreciate your guidance and are committed to delivering a manuscript of the highest quality.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript presents a novel and significant investigation into the role of SIRT4 in CCN2 expression in response to TGF-β, through modulation of U2AF2-mediated alternative splicing, and its impact on the development of kidney fibrosis.

      Strengths:

      The authors' main conclusion is that SIRT4 plays a role in kidney fibrosis by regulating CCN2 expression via pre-mRNA splicing. Additionally, the study reveals that SIRT4 translocates from the mitochondria to the cytoplasm through the BAX/BAK pore under TGF-β stimulation. In the cytoplasm, TGF-β activated the ERK pathway and induced the phosphorylation of SIRT4 at Ser36, further promoting its interaction with importin α1 and subsequent nuclear translocation. In the nucleus, SIRT4 was found to deacetylate U2AF2 at K413, facilitating the splicing of CCN2 pre-mRNA to promote CCN2 protein expression. Overall, the findings are fully convincing. The current study, to some extent, shows potential importance in this field.

      Weaknesses:

      (1) Exosomes containing anti-SIRT4 antibodies were found to effectively mitigate UUO-induced kidney fibrosis in mice. However, the protein loading capacity and loading methods were not mentioned.

      We appreciate your inquiry about the protein loading capacity and methods for the exosomes. As you have correctly noted, these details are essential for a comprehensive understanding of our experimental approach. We have provided this information in the electronic supplementary material, specifically in Section 2.17, where we describe the methodology used for loading the anti-SIRT4 antibodies into the exosomes and the capacity at which this was achieved.

      We hope that this additional detail in the supplementary material addresses your concerns and enhances the clarity of our study's methodology.

      (2) The method section is incomplete, and many methods like cell culture, cell transfection, gene expression profiling analysis, and splicing analysis, were not introduced in detail.

      Thank you for your meticulous review and the feedback provided on our manuscript. We acknowledge your concern regarding the completeness of the methods section.

      We would like to clarify that in our initial submission, all text and figures were compiled into a single document, with the supplementary methods detailed at the end, separate from the main text methods. This format was chosen to adhere to submission guidelines that prioritize the concise presentation of core methods in the main text while providing additional details in the supplementary material for comprehensiveness.

      The detailed methodologies for cell culture, cell transfection, gene expression profiling analysis, and splicing analysis, which you inquired about, are now indeed included in the revised electronic supplementary material.

      We apologize for any misunderstanding caused by the initial structure of our submission and appreciate the opportunity to clarify the comprehensive nature of our methodological reporting.

      (3) The authors should compare their results with previous studies and mention clearly how their work is important in comparison to what has already been reported in the Discussion section.

      We appreciate the opportunity to discuss the significance of our findings in the broader context of renal fibrosis research. In response to your suggestion, we have further refined our discussion to explicitly compare our results with those of previous studies and to clearly articulate the importance of our work.

      (1) Novelty of SIRT4's Role in Renal Fibrosis: Our study introduces a novel concept in the field by demonstrating the nuclear translocation of SIRT4 as a key initiator of kidney fibrosis. This finding diverges from previous studies that have primarily focused on SIRT4's mitochondrial roles, highlighting a new dimension of SIRT4's function in renal pathophysiology.

      (2) Mechanistic Insights: We provide a detailed mechanistic pathway, from the release of SIRT4 from mitochondria through the BAX/BAK pore to its subsequent nuclear translocation and impact on U2AF2 deacetylation. This pathway has not been previously described, offering a fresh perspective on the regulation of fibrogenic gene expression.

      (3) Implications for Therapy: Our findings suggest potential therapeutic interventions targeting SIRT4 nuclear translocation, which could be a significant advancement over existing treatments that have shown limited efficacy in addressing the root causes of renal fibrosis.

      (4) Epigenetic Regulation: By elucidating the role of SIRT4 in regulating alternative splicing of CCN2 pre-mRNA through U2AF2 deacetylation, our study contributes to the growing understanding of epigenetic mechanisms in renal fibrosis, a field that has been understudied compared to genetic factors.

      (5) Differential Cellular Roles of SIRT4: Our work indicates that SIRT4 may have distinct roles in different cell types, which is a complex and nuanced aspect of CKD pathophysiology that has not been fully explored in previous research.

      (6) Integration with Previous Research: We have compared our findings with existing literature, noting where our work aligns with and diverges from previous studies. This comparison underscores the value of our research in expanding the current paradigm of renal fibrosis.

      In conclusion, we believe that our study provides critical insights into the pathogenesis of renal fibrosis and offers a potential therapeutic target. We have clarified these points in the discussion section of our manuscript to ensure that the significance of our work is clearly communicated to the readers.

      Reviewer #3 (Public Review):

      Summary:

      Yang et al. report in this paper that TGF-beta induces SIRT4 activation, that TGF-beta-activated SIRT4 then modulates U2AF2-mediated alternative splicing, and that U2AF2 in turn drives CCN2 expression. The mechanism is described as follows: mitochondrial SIRT4 is transported into the cytoplasm in response to TGF-β stimulation, is phosphorylated by ERK in the cytoplasm, and then undergoes nuclear translocation by forming a complex with importin α1. In the nucleus, SIRT4 can then deacetylate U2AF2 at K413 to facilitate the splicing of CCN2 pre-mRNA and promote CCN2 protein expression. Moreover, they used exosomes to deliver SIRT4 antibodies to mitigate renal fibrosis in a mouse model. TGF-beta has been widely reported for its role in fibrosis induction.

      Strengths:

      TGF-beta induction of SIRT4 translocation from mitochondria to nuclei for epigenetics or gene regulation remains largely unknown. The findings presented here that SIRT4 is involved in U2AF2 deacetylation and CCN2 expression are interesting.

      Weaknesses:

      SIRT4 plays a critical role in mitochondria, where it is involved in the respiratory chain reaction. This role of SIRT4 is critically involved in many cell functions. It is hard to rule out such a mitochondrial activity of SIRT4 in renal fibrosis. Moreover, the major concern is what kind of message mitochondrial SIRT4 proteins receive from TGF-beta. Although nuclear SIRT4 is increased in response to TNF treatment, it is likely that de novo synthesized SIRT4 proteins can also undergo nuclear translocation upon cytokine stimulation. TGF-beta-induced mitochondrial calcium uptake and acetyl-CoA should be evaluated, since calcium and acetyl-CoA may contribute to the regulation of gene expression in nuclei.

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      (1) SIRT4 overall is a mitochondrial enzyme that indeed can undergo shuttling between mitochondria and cytoplasm. Renal fibrosis is a complex process; SIRT4 deacetylates U2AF2 at K413.

      Thank you for your comment highlighting the known mitochondrial localization of SIRT4 and its role in renal fibrosis.

      We concur with the literature that SIRT4 is predominantly a mitochondrial enzyme. However, our study expands upon this understanding by demonstrating a novel shuttling mechanism of SIRT4 between mitochondria and the nucleus in the context of renal fibrosis. Specifically, we observed that under conditions of obstructive nephropathy and renal ischemia reperfusion injury, SIRT4 significantly accumulates in the nucleus, which is a critical event in the fibrotic response.

      Our findings reveal that upon TGF-β stimulation, a known inducer of fibrosis, SIRT4 is released from the mitochondria through the BAX/BAK pore and subsequently translocates to the nucleus. This translocation is mediated by the ERK1/2-dependent phosphorylation of SIRT4 at serine 36, which enhances its interaction with importin α1, a key component in nuclear import processes.

      Once in the nucleus, SIRT4 exerts its effects on the alternative splicing of CCN2 pre-mRNA by deacetylating U2AF2 at lysine 413. This deacetylation event promotes the formation of the U2 small nuclear ribonucleoprotein (U2 snRNP) and facilitates the splicing of CCN2 pre-mRNA, leading to increased expression of the profibrotic protein CCN2.

      Our study, therefore, not only confirms the mitochondrial association of SIRT4 but also uncovers its nuclear function in the regulation of gene expression during renal fibrosis. These findings underscore the complexity of SIRT4's role in cellular processes and its potential as a therapeutic target for fibrotic diseases.

      (2) Figure 2 and Figure 3 should be combined.

      Thank you for your suggestion to combine Figures 2 and 3 for potential improvement in presentation.

      After careful consideration, we have found that merging these figures is not feasible due to space constraints on a standard A4 page, which is necessary to maintain the clarity and detail of the data presented in both figures. Each figure contains complex data that, when combined, would compromise the readability and the integrity of the individual elements.

      We believe that the current presentation of Figures 2 and 3 provides a clear and detailed visualization of the data, which is essential for the reader's understanding of our study's findings.

      (3) In Figure 4G, the mass spectrum of U2AF2 acetylation on K413 should be included rather than the alignment among species. Moreover, the effect of endogenous HAT1 on endogenous U2AF2, rather than on exogenous FLAG-U2AF2, should be examined.

      Thank you for your thoughtful comments and for the suggestion to include the mass spectrum of U2AF2 acetylation on K413 in Figure 4G.

      We appreciate the value that the mass spectrometry data would add to our study, providing a direct and definitive assessment of the acetylation status at this specific residue. However, we regret to inform you that our current facilities do not have access to the necessary mass spectrometry equipment to perform these analyses.

      While we are unable to include this data in the present manuscript, we concur with the importance of such evidence and plan to undertake these studies in the future. We are in the process of establishing collaborations with laboratories that have the required facilities to perform mass spectrometry. Our intention is to incorporate these data into a follow-up study, which will further validate and expand upon the findings presented in this manuscript.

      We believe that our current findings, although lacking the mass spectrometry confirmation, still provide valuable insights into the role of U2AF2 acetylation in [insert relevant biological process]. We have taken care to present our data rigorously and transparently, and we are committed to pursuing the highest standards of experimental validation in our future work.

      We hope you will consider the merits of our study in the context of the current limitations and appreciate the opportunity to clarify our position.

      Furthermore, regarding the examination of endogenous HAT1's effect on endogenous U2AF2 acetylation levels, we have conducted the necessary experiments. Our results demonstrate that overexpression of HAT1 leads to a significant increase in the acetylation of endogenous U2AF2 (Author response image 2). This new data set has been added to the revised manuscript and supports the role of HAT1 in the regulation of U2AF2 acetylation.

      We believe that these revisions address your concerns and provide a more comprehensive understanding of the molecular mechanisms underlying the regulation of U2AF2 acetylation.

      We appreciate the opportunity to improve our manuscript based on your constructive feedback and hope that our revisions meet with your satisfaction.

      Author response image 2.

      HAT1 OE reduces the acetylation of endogenous U2AF2

      (4) Figure 6F. Does portien mean protein?

      Thank you for your careful review and insightful comments on our manuscript. You are correct in pointing out the error regarding the term "portien" in Figure 6F. It was indeed a typographical oversight on our part, and we apologize for any confusion this may have caused.

      We have made the necessary correction to ensure that "protein" is accurately used in place of "portien" in Figure 6F. We appreciate the opportunity to enhance the clarity and accuracy of our presentation.

      (5) The authors should pay attention to their writing. There are many typos and other issues with the use of the English language and grammar.

      Thank you for bringing the grammar issues to our attention. We have made diligent efforts to revise and improve the manuscript's English and grammar throughout. We have also enlisted the support of a professional language editing service to ensure the clarity and accuracy of our scientific communication.

      We are confident that these revisions have significantly enhanced the manuscript's accessibility to a broader readership and have addressed the language concerns raised.

      We appreciate your guidance and are committed to delivering a manuscript of the highest quality.

    1. eLife Assessment

      This study uses ex vivo live imaging of uteri post-mating to test the role of the sperm hook in the house mouse sperm in sperm movement that would be interesting to evolutionary biologists. The significance of the work is useful as live imaging can reveal information not seen in fixed images. The strength of evidence is incomplete as they cannot directly test the role of the sperm hook in facilitating movement along the uterine wall.

    2. Reviewer #1 (Public review):

      Summary:

      The authors want to determine the role of the sperm hook of the house mouse sperm in movement through the uterus. They use transgenic lines with fluorescent labels to sperm proteins, and they cross these males to C57BL/6 females in pathogen-free conditions. They use 2-photon microscopy on ex vivo uteri within 3 hours of mating and the appearance of a copulation plug. There are a total of 10 post-mating uteri that were imaged with 3 different males. They provide 10 supplementary movies that form the basis for some of the quantitative analysis in the main body figures. Their data suggest that the role of the sperm hook is to facilitate movement along the uterine wall.

      Strengths:

      Ex vivo live imaging of fluorescently labeled sperm with 2-photon microscopy is a powerful tool for studying the behavior of sperm.

      Weaknesses:

      The paper is descriptive and the data are correlations.

      The authors cannot directly test their proposed function of the sperm hook in sliding and preventing backward slipping.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors want to determine the role of the sperm hook of the house mouse sperm in movement through the uterus. They use transgenic lines with fluorescent labels to sperm proteins, and they cross these males to C57BL/6 females in pathogen-free conditions. They use 2-photon microscopy on ex vivo uteri within 3 hours of mating and the appearance of a copulation plug. There are a total of 10 post-mating uteri that were imaged with 3 different males. They provide 10 supplementary movies that form the basis for some of the quantitative analysis in the main body figures. Their data suggest that the role of the sperm hook is to facilitate movement along the uterine wall.

      Strengths:

      Ex vivo live imaging of fluorescently labeled sperm with 2-photon microscopy is a powerful tool for studying the behavior of sperm.

      Weaknesses:

      The paper is descriptive and the data are correlations.

      The authors cannot directly test their proposed function of the sperm hook in sliding and preventing backward slipping.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I suggest that the authors clearly state and explain in the manuscript that this study is limited with respect to the ability to "directly test the role of the sperm hook in facilitating movement along the uterine wall". I think that if they make this statement in the manuscript, perhaps at the end of the abstract, then the strength of evidence for their claims could be deemed as solid after re-review.

      We thank the reviewer again for the review process. We believe that our manuscript has improved considerably during the review process. Regarding the limitations and future work, we have added the following to the discussion section.

      “Further investigation of sperm behaviour inside the female reproductive tract or tissue mimicking microfluidic devices with real-time deep tissue imaging as in the current study, will provide valuable opportunities for a more comprehensive examination of both sperm-sperm and sperm-epithelium interactions in the female reproductive tract. While we have focused on observing sperm interactions for only natural healthy mice in this study, future works employing specifically targeted genetically modified knockout animal models will further elucidate and confirm the exact genetic and functional mechanisms that guide these interactions.”

      The revised manuscript is an improvement over the initial submission. I suggest that the authors mark the oviduct explicitly in Fig. 1A.

      The oviduct includes the ampulla, isthmus, and UTJ. We have additionally marked the oviduct in Fig. 1A, with corresponding arrows and a box.

    1. eLife Assessment

      This useful manuscript presents an interesting multi-modal omics analysis of lung adenocarcinoma patients with distinct clinical clusters, mutation hotspots, and potential risk factors identified in cases linked to air pollution. The findings show potential for clinical and therapeutic impact. Some of the conclusions remain incomplete as they are based on correlative or suggestive findings, and would benefit from further functional investigation and validating approaches.

    2. Reviewer #1 (Public Review):

      Summary:

      This is a well-written and detailed manuscript showing important results on the molecular profile of 4 different cohorts of female patients with lung cancer.

      Strengths:

      The authors used several different methods to identify potential novel targets for therapeutic interventions.

      Weaknesses:

      Statistical test results need to be provided in comparisons between cohorts. This was addressed by the authors in the revisions.

    3. Reviewer #2 (Public Review):

      New comments are added after the authors' responses to my initial comments.

      Summary:

      Zhang et al. performed a proteogenomic analysis of lung adenocarcinoma (LUAD) in 169 female never-smokers from the Xuanwei area (XWLC) in China. These analyses reveal that XWLC is a distinct subtype of LUAD and that BaP is a major risk factor associated with EGFR G719X mutations found in the XWLC cohort. Four subtypes of XWLC were classified with unique features based on multi-omics data clustering.

      Strengths:

      The authors made great efforts in performing several large-scale proteogenomic analyses and characterizing molecular features of XWLCs. Datasets from this study will be a valuable resource to further explore the etiology and therapeutic strategies of air-pollution-associated lung cancers, particularly for XWLC.

      Weaknesses:

      [...]

      (2) Importantly, while providing the large datasets, validating key findings is minimally performed, and surprisingly there is no interrogation of XWLC drug response/efficacy based on their findings, which makes this manuscript descriptive and incomplete rather than conclusive. For example, testing the efficacy of XWLC response to afatinib combined with other drugs targeting activated kinases in EGFR G719X mutated XWLC tumors would be one way to validate their datasets and new therapeutic options.

      Response: We appreciate your suggestion. In reference to testing the efficacy of XWLC response to afatinib combined with drugs targeting kinases, we have planned to establish PDX and organoid models to validate the effectiveness of our therapeutic approach. Due to the extended timeframe required, we intend to present these results in a subsequent study.

      Comments: All conclusions in the manuscript made by authors are based on interpretations of large-scale multi-omics data, which should be properly validated by other approaches and methods. Without validation, these are all speculations and any conclusions without supporting evidence are not acceptable. This reviewer suggested an example of validation experiment, and Reviewer #3 also pointed out several data that need to be validated. However, authors do not agree to perform any of these validation experiments without reasonable justification.

      (3) The authors found that MAD1 and TPRN are novel therapeutic targets in XWLC. Are these two genes more frequently mutated in one subtype than in the other 3 XWLC subtypes? How could these mutations be targeted in patients?

      Response: Thank you for your question. We have investigated the TPRN and MAD1 mutations in our dataset, identifying five TPRN mutations and eight MAD1 mutations. Among the TPRN mutations, XWLC_0046 and XWLC_0017 belong to the MCII subtype, XWLC_0012 belongs to the MCI subtype, and the subtype of the other three samples is undetermined, resulting in mutation frequencies of 1/16, 2/24, 0/15, and 0/13, respectively. Similarly, for the MAD1 mutations, XWLC_0115, XWLC_0021, and XWLC_0047 belong to the MCII subtype, XWLC_0055 containing two mutations belongs to the MCI subtype, and the subtype of the other three samples is undetermined, resulting in mutation frequencies of 1/16, 3/24, 0/15, and 0/13 across subtypes, respectively. Fisher's test did not reveal significant differences between the subtypes. For targeting novel therapeutic targets such as MAD1 and TPRN, we propose a multi-step approach. Firstly, we advocate for conducting functional in vivo and in vitro experiments to verify their roles during cancer progression. Secondly, we suggest conducting small molecule drug screening based on the pharmacophore of these proteins, which may lead to the identification of potential therapeutic drugs. Lastly, we recommend testing the efficacy of these drugs to further validate their potential as effective treatments.
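
      The subtype comparison described above can be sketched numerically. The following is a minimal illustration only, not the authors' analysis: the response does not state how Fisher's test was applied across the four subtypes, so this sketch assumes a pairwise two-sided test between MCI and MCII, assumes the stated fractions (1/16 and 2/24 for TPRN, 1/16 and 3/24 for MAD1) are mutated samples over subtype size, and uses a hypothetical helper `fisher_exact_two_sided` written with the Python standard library.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper for illustration; sums the hypergeometric
    probabilities of all tables that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def compare_mci_vs_mcii(counts):
    """counts maps subtype name to (mutated samples, subtype size)."""
    (m1, n1), (m2, n2) = counts["MCI"], counts["MCII"]
    return fisher_exact_two_sided(m1, n1 - m1, m2, n2 - m2)


# Counts as stated in the response (assumed order: MCI, MCII).
tprn = {"MCI": (1, 16), "MCII": (2, 24)}
mad1 = {"MCI": (1, 16), "MCII": (3, 24)}

p_tprn = compare_mci_vs_mcii(tprn)
p_mad1 = compare_mci_vs_mcii(mad1)

print(f"TPRN, MCI vs MCII: p = {p_tprn:.3f}")
print(f"MAD1, MCI vs MCII: p = {p_mad1:.3f}")
```

      Under these assumptions both p-values are far above 0.05, which is consistent with the statement that no significant difference between subtypes was found. With so few mutated samples per subtype, the test has little power to detect a difference.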

      Comments: Please properly incorporate the above explanation into the main text.

      (4) In Figures 2a and b: while Figure 2a shows distinct genomic mutations among each LC cohort, Figure 2b shows similarity in affected oncogenic pathways (cell cycle, Hippo, NOTCH, PI3K, RTK-RAS, and WNT) between XWLC and TNLC/CNLC. Considering that different genomic mutations could converge into common pathways and biological processes, wouldn't these results indicate commonalities among XWLC, TNLC, and CNLC? How about other oncogenic pathways not shown in Figure 2b?

      Response: Thank you for your question. Based on the data presented in Fig. 2a, which encompasses all genomic mutations, it appears that the mutation landscape of XWLC bears the closest resemblance to TSLC (Fig. 2a). However, when considering oncogenic pathways (Fig. 2b) and genes (Fig. 2c), there is a notable disparity between the two cohorts. These findings suggest that while XWLC and TSLC exhibit similarities in terms of genomic mutations, they possess distinct characteristics in terms of oncogenic pathways and genes. Regarding the oncogenic signaling pathways, we referred to ten well-established pathways identified from TCGA cohorts. These members of oncogenic pathways are likely to serve as cancer drivers (functional contributors) or therapeutic targets, as highlighted by Sanchez-Vega et al. (Sanchez-Vega et al., 2018).

      Comments: It is unclear to this reviewer how authors defined "distinct characteristics" in terms of oncogenic pathways and genes. Would 10-20% differences in "Fraction of samples affected" in Fig2b be sufficient to claim significance? How could authors be sure whether mutations in genes involved in each oncogenic pathway are activating or inactivating mutations (rather than benign, thus non-affecting mutations)?

      [...]

      (6) Supplementary Table 11 shows a number of mutations at the interface and length of interface between a given protein-protein interaction pair. Such that, it does not provide what mutation(s) in a given PPI interface is found in each LC cohort. For example, it fails to provide whether MAD1 R558H and TPRN H550Q mutations are found significantly in each LC cohort.

      Response: We appreciate your careful review. In Supplementary Table 11, we have provided significant onco_PPI data for each LC cohort, focusing on enriched mutations at the interface of two proteins. Our emphasis lies on onco_PPI rather than individual mutations, as any mutation occurring at the interface could potentially influence the function of the protein complex. Thus, our Supplementary Table 11 exclusively displays the onco_PPI rather than mutations. MAD1 R558H and TPRN H550Q were identified through onco_PPI analysis, and subsequent extensive literature research led us to focus specifically on these mutations.

      Comments: Are authors referring to Table S9 (Onco_PPIs identified in four cohorts) instead of Supplementary Table 11? There is no Table 11 among submitted files. In Table S9, the Column N (length of protein product of gene1) does not make sense: MYO1C (8152), TP53 (3924), EGFR (12961). These should not be the number of amino acids residues of each protein. Then, what do these numbers mean?

      (7) Figure 7c and d are simulation data not from an actual binding assay. The authors should perform a biochemical binding assay with proteins or show that the mutation significantly alters the interaction to support the conclusion.

      Response: We appreciate your suggestion. The relevant experiments are currently in progress, and we anticipate presenting the corresponding data in a subsequent study.

      Comments: The suggested experiment is to support the simulated data. Again, without supporting experimental results, authors could not make a conclusion simply based on simulated data. Where else could the supporting experimental results be presented?

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is a well-written and detailed manuscript showing important results on the molecular profile of 4 different cohorts of female patients with lung cancer.

      The authors conducted comprehensive multi-omic profiling of air-pollution-associated LUAD to study the roles of the air pollutant BaP. Utilizing multi-omic clustering and mutation-informed interface analysis, potential novel therapeutic strategies were identified.

      Strengths:

      The authors used several different methods to identify potential novel targets for therapeutic interventions.

      Weaknesses:

      Statistical test results need to be provided in comparisons between cohorts.

      We appreciate your recognition and valuable suggestions. We have revised the statistical test results in panels including Fig. 3b, e, and g.

      Reviewer #2 (Public Review):

      Summary:

      Zhang et al. performed a proteogenomic analysis of lung adenocarcinoma (LUAD) in 169 female never-smokers from the Xuanwei area (XWLC) in China. These analyses reveal that XWLC is a distinct subtype of LUAD and that BaP is a major risk factor associated with EGFR G719X mutations found in the XWLC cohort. Four subtypes of XWLC were classified with unique features based on multi-omics data clustering.

      Strengths:

      The authors made great efforts in performing several large-scale proteogenomic analyses and characterizing molecular features of XWLCs. Datasets from this study will be a valuable resource to further explore the etiology and therapeutic strategies of air-pollution-associated lung cancers, particularly for XWLC.

      Weaknesses:

      (1) While analyzing and interpreting the datasets, however, this reviewer thinks that authors should provide more detailed procedures of (i) data processing, (ii) justification for choosing methods of various analyses, and (iii) justification of focusing on a few target gene/proteins in the datasets for further validation in the main text.

      We appreciate your valuable feedback. In response to the suggestions for enhancing the manuscript's clarity, we have provided more detailed procedures in the main text and methods sections.

      (2) Importantly, while providing the large datasets, validating key findings is minimally performed, and surprisingly there is no interrogation of XWLC drug response/efficacy based on their findings, which makes this manuscript descriptive and incomplete rather than conclusive. For example, testing the efficacy of XWLC response to afatinib combined with other drugs targeting activated kinases in EGFR G719X mutated XWLC tumors would be one way to validate their datasets and new therapeutic options.

      We appreciate your suggestion. In reference to testing the efficacy of XWLC response to afatinib combined with drugs targeting kinases, we have planned to establish PDX and organoid models to validate the effectiveness of our therapeutic approach. Due to the extended timeframe required, we intend to present these results in a subsequent study.

      (3) The authors found that MAD1 and TPRN are novel therapeutic targets in XWLC. Are these two genes more frequently mutated in one subtype than in the other three XWLC subtypes? How could these mutations be targeted in patients?

      Thank you for your question. We have investigated the TPRN and MAD1 mutations in our dataset, identifying five TPRN mutations and eight MAD1 mutations. Among the TPRN mutations, XWLC_0046 and XWLC_0017 belong to the MCII subtype, XWLC_0012 belongs to the MCI subtype, and the subtype of the other three samples is undetermined, resulting in mutation frequencies of 1/16, 2/24, 0/15, and 0/13, respectively. Similarly, for the MAD1 mutations, XWLC_0115, XWLC_0021, and XWLC_0047 belong to the MCII subtype, XWLC_0055 containing two mutations belongs to the MCI subtype, and the subtype of the other three samples is undetermined, resulting in mutation frequencies of 1/16, 3/24, 0/15, and 0/13 across subtypes, respectively. Fisher’s test did not reveal significant differences between the subtypes.
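
      As an illustration only, the subtype comparison described above can be sketched with a two-sided Fisher's exact test. The response does not state how the test was set up, so the one-subtype-versus-the-rest layout, the assumed subtype order of the reported frequencies (MC-I to MC-IV), and the function name `fisher_exact_two_sided` are assumptions of this sketch, not the authors' code. Only the Python standard library is used.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables that are no more
    likely than the observed one, keeping the margins fixed.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)

# TPRN-mutated samples / subtype size, as reported in the response
# (assumed order: MC-I, MC-II, MC-III, MC-IV).
tprn = {"MC-I": (1, 16), "MC-II": (2, 24), "MC-III": (0, 15), "MC-IV": (0, 13)}
total_mut = sum(m for m, _ in tprn.values())
total_n = sum(n for _, n in tprn.values())

p_values = {}
for subtype, (mut, size) in tprn.items():
    rest_mut = total_mut - mut
    rest_wt = (total_n - size) - rest_mut
    p_values[subtype] = fisher_exact_two_sided(mut, size - mut, rest_mut, rest_wt)

for subtype, p in p_values.items():
    print(f"{subtype}: p = {p:.3f}")
```

      With counts this small, none of the one-versus-rest comparisons approaches p < 0.05, which agrees with the statement that no significant difference was found between subtypes.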

      For targeting novel therapeutic targets such as MAD1 and TPRN, we propose a multi-step approach. Firstly, we advocate for conducting functional in vivo and in vitro experiments to verify their roles during cancer progression. Secondly, we suggest conducting small molecule drug screening based on the pharmacophore of these proteins, which may lead to the identification of potential therapeutic drugs. Lastly, we recommend testing the efficacy of these drugs to further validate their potential as effective treatments.

      (4) In Figures 2a and b: while Figure 2a shows distinct genomic mutations among each LC cohort, Figure 2b shows similarity in affected oncogenic pathways (cell cycle, Hippo, NOTCH, PI3K, RTK-RAS, and WNT) between XWLC and TNLC/CNLC. Considering that different genomic mutations could converge into common pathways and biological processes, wouldn't these results indicate commonalities among XWLC, TNLC, and CNLC? How about other oncogenic pathways not shown in Figure 2b?

      Thank you for your question. Based on the data presented in Fig. 2a, which encompasses all genomic mutations, it appears that the mutation landscape of XWLC bears the closest resemblance to TSLC (Fig. 2a). However, when considering oncogenic pathways (Fig. 2b) and genes (Fig. 2c), there is a notable disparity between the two cohorts. These findings suggest that while XWLC and TSLC exhibit similarities in terms of genomic mutations, they possess distinct characteristics in terms of oncogenic pathways and genes.

      Regarding the oncogenic signaling pathways, we referred to ten well-established pathways identified from TCGA cohorts. Members of these oncogenic pathways are likely to serve as cancer drivers (functional contributors) or therapeutic targets (Sanchez-Vega et al., 2018).

      (5) In Figure 2c, how and why were the four genes (EGFR, TP53, RBM10, KRAS) selected? What about other genes? In this regard, given tumor genome sequencing was done, it would be more informative to provide the oncoprints of XWLC, TSLC, TNLC, and CNLC for complete genomic alteration comparison.

      Thank you for your question and good suggestion. Building upon our previous study (Zhang et al., 2021), we found that EGFR, TP53, RBM10, and KRAS were the top mutated genes in Xuanwei lung cancer cohorts. Furthermore, we have included the mutation frequency of cancer driver genes (Bailey et al., 2018) across XWLC, TSLC, TNLC, and CNLC in Supplementary Table 2b.

      (6) Supplementary Table 11 shows the number of mutations at the interface and the length of the interface for a given protein-protein interaction pair. As such, it does not show which mutation(s) in a given PPI interface are found in each LC cohort. For example, it fails to show whether the MAD1 R558H and TPRN H550Q mutations are significantly enriched in each LC cohort.

      We appreciate your careful review. In Supplementary Table 11, we have provided significant onco_PPI data for each LC cohort, focusing on enriched mutations at the interface of two proteins. Our emphasis lies on onco_PPI rather than individual mutations, as any mutation occurring at the interface could potentially influence the function of the protein complex. Thus, our Supplementary Table 11 exclusively displays the onco_PPI rather than mutations. MAD1 R558H and TPRN H550Q were identified through onco_PPI analysis, and subsequent extensive literature research led us to focus specifically on these mutations.

      (7) Figure 7c and d are simulation data not from an actual binding assay. The authors should perform a biochemical binding assay with proteins or show that the mutation significantly alters the interaction to support the conclusion.

      We appreciate your suggestion. The relevant experiments are currently in progress, and we anticipate presenting the corresponding data in a subsequent study.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript from Zhang et al. utilizes a multi-omics approach to analyze lung adenocarcinoma cases in female never smokers from the Xuanwei area (XWLC cohort) compared with cases associated with smoking or other endogenous factors to identify mutational signatures and proteome changes in lung cancers associated with air pollution. Mutational signature analysis revealed a mutation hotspot, EGFR-G719X, potentially associated with BaP exposure, in 20% of the XWLC cohort. This correlated with predicted MAPK pathway activations and worse outcomes relative to other EGFR mutations. Multi-omics clustering, including RNA-seq, proteomics, and phosphoproteomics identified 4 clusters with the XWLC cohort, with additional feature analysis pathway activation, genetic differences, and radiomic features to investigate clinical diagnostic and therapeutic strategy potential for each subgroup. The study, which nicely combines multi-modal omics, presents potentially important findings, that could inform clinicians with enhanced diagnosis and therapeutic strategies for more personalized or targeted treatments in lung adenocarcinoma associated with air pollution. The authors successfully identify four distinct clusters with the XWLC cohort, with distinct diagnostic characteristics and potential targets. However, many validating experiments must be performed, and data supporting BaP exposure linkage to XWLC subtypes is suggestive but incomplete to conclusively support this claim. Thus, while the manuscript presents important findings with the potential for significant clinical impact, the data presented are incomplete in supporting some of the claims and would benefit from validation experiments.

      Strengths:

      Integration of omics data from multimodalities is a tremendous strength of the manuscript, allowing for cross-modal comparison/validation of results, functional pathway analysis, and a wealth of data to identify clinically relevant case clusters at the transcriptomic, translational, and post-translational levels. The inclusion of phosphoproteomics is an additional strength, as many pathways are functional and therefore biologically relevant actions center around activation of proteins and effectors via kinase and phosphatase activity without necessarily altering the expression of the genes or proteins.

      Clustering analysis provides clinically relevant information with strong therapeutic potential both from a diagnostic and treatment perspective. This is bolstered by the individual microbiota, radiographic, wound healing, outcomes, and other functional analyses to further characterize these distinct subtypes.

      Visually the figures are well-designed and presented and for the most part easy to follow. Summary figures/histograms of proteogenomic data, and specifically highlighted genes/proteins are well presented.

      Molecular dynamics simulations and 3D binding analysis are nice additions.

      While I don't necessarily agree with the authors' interpretation of the microbiota data, the experiment and results are very interesting, and clustering information can be gleaned from this data.

      Weaknesses:

      (1) Statistical methods for assessing significance may not always be appropriate.

      We appreciate your suggestion. We have revised the statistical test results in the panels Fig. 3b, e, and g.

      (2) Necessary validating experiments are lacking for some of the major conclusions of the paper.

      Thank you for raising this point. However, we respectfully choose not to comment on this matter at present.

      (3) Many of the conclusions are based on correlative or suggestive results, and the data is not always substantive to support them.

      Thank you for raising this point. However, we respectfully choose not to comment on this matter at present.

      (4) Experimental design is not always appropriate, sometimes lacking necessary controls or large disparity in sample sizes.

      Thank you for raising this point. However, we respectfully choose not to comment on this matter at present.

      (5) Conclusions are sometimes overstated without validating measures, such as in BaP exposure association with the identified hotspot, kinase activation analysis, or the EMT function.

      Thank you for raising this point. However, we respectfully choose not to comment on this matter at present.

      Reviewer #1 (Recommendations For The Authors):

      (1) Please provide a justification for why only females were included in the study. I am concerned that the results obtained in this study can not be generalized as only females were included.

      We appreciate your suggestion. Lung cancer in never smokers (LCINS) accounts for approximately 25% of lung cancer cases (15% of lung cancer in men and 53% in women) (Parkin et al., 2005). Currently, the etiology and mechanisms of LCINS are not clear. Globally, LCINS shows remarkable gender and geographic variations, occurring more frequently among Asian women (Bray et al., 2018). Indoor coal burning for heating and cooking has been implicated as a risk factor for Chinese women, as they spend more time indoors (Mumford et al., 1987). Among men, the proportion of never smokers is lower, with less regional variation, and lung cancer in males is frequently caused by smoking. Thus, to better reveal the etiology and molecular mechanisms of LCINS, we collected data exclusively from female LCINS patients in the Xuanwei area, excluding potential confounding factors such as hormonal or smoking status. Our study specifically aims to uncover the etiology and mechanisms of LCINS in female patients, with future research planned to verify whether our conclusions can be generalized to LCINS in male patients.

      (2) "Therefore, the XWLC and TSLC cohorts are more explicitly influenced by environmental carcinogens, while the TNLC and CNLC cohorts may be more affected by age or endogenous risk factors." This statement in the results (starting line 142) does not have adequate support from the results. First, the average age in the 4 cohorts does not seem to be very different to me based on Figure 1b. if they are different, please provide statistical test results. Please make sure this statement is supported by other results, otherwise, I would recommend excluding it from the manuscript.

      We appreciate your suggestion. To gain biological insights, we frequently associate mutational signatures with factors such as age, defective DNA mismatch repair, or environmental exposures. These remain associations rather than causation. Thus, we agree with the suggestion to weaken the conclusion as follows:

      “Generally, exposure to tobacco smoking carcinogens (COSMIC signature 4) and chemicals such as BaP (Kucab signatures 49 and 20) were identified as the most significant contributing factors in both the XWLC and TSLC cohorts (Fig. 1f and 1g). In contrast, defective DNA mismatch repair (COSMIC signature ID: SBS6) was identified as the major contributor in both the TNLC and CNLC cohorts (Fig. 1h and 1i), with no potential chemicals identified based on signature similarities. Therefore, the XWLC and TSLC cohorts appear to be more explicitly associated with environmental carcinogens, while the TNLC and CNLC cohorts may be more associated with defective DNA mismatch repair processes.”

      (3) Please provide statistical test results in this subsection "The EGFR-G719X mutation, which is a hotspot associated with BaP exposure, possesses distinctive biological features " (Line 203) showing that the number of G719X is significantly different in XWLC.

      We appreciate your suggestion. A two-sided Fisher’s exact test was used to calculate p-values, which are labeled in Figure 3b.

      (4) "Analysis of overall survival and progression-free interval (PFI) revealed that patients with the G719X mutation had worse outcomes compared to other EGFR mutation subtypes " This statement (starting Line 232) should be supported by literature data.

      We appreciate your suggestion.

      In the Watanabe et al. post-hoc analysis, patients with the G719 mutation had significantly shorter OS with gefitinib compared to patients with the common mutations (Watanabe et al., 2014). We revised the sentence as follows:

      “Analysis of overall survival and progression-free interval (PFI) revealed that patients with the G719X mutation had worse outcomes compared to other EGFR mutation subtypes (Fig. 3j and 3k), which was consistent with a previous study (Watanabe et al., 2014).”

      (5) I would suggest changing this statement to a "suggestion" as there is no experimental support for this, and mentioning that this requires further experimental validation with the suggested drugs "Therefore, a promising approach to overcome resistance in tumors with this mutation could involve combining afatinib, which targets activated EGFR, with FDA-approved drugs that specifically target the activated kinases associated with G719X. " (Line 260).

      We appreciate your suggestion. We changed the sentence as follows:

      "Therefore, we propose a potential approach to overcoming resistance in tumors with this mutation, which could involve combining afatinib, targeting activated EGFR, with FDA-approved drugs that specifically target the activated kinases associated with G719X. "

      (6) It is not clear to me how PPIs were integrated with missense. Please clarify the method.

      We appreciate your suggestion. To identify interactions enriched with missense mutations, we constructed mutation-associated protein–protein interactomes (PPIs). First, we downloaded protein–protein interactomes from Interactome INSIDER (v.2018.2) (Meyer et al., 2018). Next, we identified interfaces carrying missense mutations by mapping mutation sites to PPI interface genomic coordinates using bedtools (v2.25.0) (Quinlan and Hall, 2010). Finally, we defined an oncoPPI as a PPI significantly enriched in interface mutations in either of the two protein-binding partners across individuals. For more details, please refer to the methods sections “Building mutation-associated protein–protein interactomes” and “Significance test of PPI interface mutations.”
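
      The two steps described above (mapping mutation sites onto interface intervals, then testing for enrichment) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the response does not name the enrichment test, so a one-sided binomial test with a length-proportional null is assumed here, and the function names, the half-open BED-style intervals, and the toy coordinates are all hypothetical.

```python
from math import comb

def in_interface(position, interfaces):
    """True if position falls in any half-open [start, end) interval."""
    return any(start <= position < end for start, end in interfaces)

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def interface_enrichment(mutation_positions, interfaces, protein_length):
    """Count interface mutations and test for enrichment.

    Assumed null: each mutation lands in the interface with probability
    equal to the fraction of the protein covered by interface intervals
    (intervals are assumed not to overlap).
    """
    n = len(mutation_positions)
    k = sum(in_interface(pos, interfaces) for pos in mutation_positions)
    interface_len = sum(end - start for start, end in interfaces)
    p_null = interface_len / protein_length
    return k, binomial_upper_tail(k, n, p_null)

# Toy example: one 10-position interface in a 100-position protein,
# with 3 of 4 mutations falling inside it.
k, p_value = interface_enrichment([12, 15, 18, 50], [(10, 20)], 100)
print(k, p_value)
```

      In practice the interval overlap would be computed on genomic coordinates (as the response states, with bedtools), and p-values across many PPIs would need multiple-testing correction.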

      Reviewer #2 (Recommendations For The Authors):

      Regarding the tumor microbiota composition, it is not clear what the significance of these results would be. Are the specific microbiota associated with MC-IV more pathogenic than other species found in other subtypes? What are the unique features of these MC-IV microbiota? If these are difficult to address, this section could be removed from the manuscript.

      We appreciate your suggestion. This section is removed from the manuscript.

      Regarding the radiomic data section (Figure 6d and Extended Figure 6d), more description about the eight and five features (that are different between MC-II and others) would be helpful to better understand the importance and significance of these data.

      We appreciate your suggestion. We have added the description as follows: “Features such as median and mean reflect average gray level intensity and Idmn and Gray Level Non-Uniformity measure the variability of gray-level intensity values in the image, with a higher value indicating greater heterogeneity in intensity values. These results suggest a denser and more heterogeneous image in the MC-II subtype.”

      Other minor comments:

      (1) If EGFR G719X is a known hotspot mutation associated with BaP, please cite previous literature.

      We appreciate your suggestion. Upon careful retrieval using "G719X" and "BaP" as keywords, we did not find previous literature discussing G719X as a known hotspot mutation associated with BaP.

      (2) In Figure 1d, it should be clearly written in the legend that tumor (T) and normal (N) tissue were analyzed.

      We appreciate your suggestion. We have clarified the figure legend of Figure 1d.

      (3) In Figure 1m, it is not obvious that EGFR pY1173 and pY1068 are more abundant in the Bap+S9 sample. Total EGFR bands are very faint. These western blots should be repeated and quantified.

      We appreciate your suggestion. We have removed Fig. 1m. After identifying the antibody with satisfactory performance, we will provide the revised results.

      (4) In Figure 2d, aren't the EGFR E746_A750del mutations more frequently found in CNLC, TSLC, and TNLC? (which is opposite to what the authors wrote in the text).

      We appreciate your suggestion. This mistake has been corrected.

      (5) In Figure 7f-i and Ext Figure 8, does "CK" mean empty vector control? If so, it should be changed to "EV".

      We appreciate your suggestion. This mistake has been corrected.

      Reviewer #3 (Recommendations For The Authors):

      Methods:

      While previous work was referenced, a description of proteomics methods should still include: instrumentation, acquisition method, all software packages used, method for protein identification, method for protein quantification, how FDR was maintained for identification/quantification, definition of differentially expressed proteins, whether multiple testing correction was performed and if so what method.

      We appreciate your suggestion. We revised the description of label-free mass spectrometry methods accordingly.

      The paper would greatly benefit from brief methodological explanations throughout, as all methods are currently exclusively found in the supplementary information. This severely hampers the readability of the manuscript.

      Thank you for raising this point. However, we respectfully choose not to comment on this matter at present.

      Suggestions Throughout

      The paper would greatly benefit from proofreading/editing.

      Line 157-158/Figure 1J for CYP1A1 displays protein concentrations while Figure 1K for AhR shows mRNA. Why this discrepancy? It would be preferable to show both mRNA and protein levels for both CYP1A1 and AhR. Also, there is a large discrepancy in the "n" between the normal and tumor groups, which makes the statistical comparison challenging. The AhR data is therefore unconvincing, and additional protein data is suggested. Thus the claim of significantly elevated AhR and CYP1A1 levels in tumors is not sufficiently supported and requires further investigation, both mRNA and protein, and with similarly sized sample groups.

      We appreciate your suggestion. We have thoroughly edited the revised manuscript, with all changes marked accordingly. Compared to mRNA level assessment, protein abundance is a better indicator of gene expression. Therefore, we reanalyzed the protein level of AhR for comparison and found no significant differences (Figure 1k). Additionally, the samples sequenced by mRNA-seq were not entirely consistent with those sequenced by label-free proteomics. The samples analyzed by different methods are shown in Figure 1d.

      Line 159 Figure 1I There is no control for the serum data presented here. What are the serum levels for individuals not residing in Xuanwei? It is unclear whether this represents elevated BPDE serum levels without appropriate controls. Thus nothing insightful can be derived from this data.

      We appreciate your suggestion. We have deleted the results concerning BPDE serum detection in the revised manuscript.

      Line 164 The statement "sites such as Y1173 and Y1068 of EGFR were more phosphorylated in BaP treated cells" is not sufficiently supported by the presented data and cannot be made. Figure 1M has no quantitation, no indication of "n" or whether this represents a single experiment or one validated with repeating. The western blot is also cropped with no indication of molecular weight or antibody specificity. This data is NOT convincing. The antibody signal is very weak, and not convincing with cropped blots. An updated figure, with an uncropped blot, and quantitation with multiple n's and statistical comparison is required. I am not sure the Wilcoxon rank sum test is appropriate to test significance in j-l. The null hypothesis should not be equal medians but equal means based on the experimental design.

      We appreciate your suggestion. We have removed Fig. 1m. After identifying the antibody with satisfactory performance, we will provide the revised results.

      Line 181 phrase "significant differences" should not be used unless making a claim about statistical significance.

      We appreciate your suggestion. We have changed “significant differences” to “noticeable differences”.

      Line 197: "The blood serum assay provided support..." As noted above this claim is not sufficiently supported by the presented data and requires more complete investigation.

      We appreciate your suggestion. This conclusion has been deleted in the revised manuscript.

      Line 219: Requires proofreading/editing.

      We appreciate your suggestion. We have thoroughly edited the revised manuscript, with all changes marked accordingly.

      Line 220: appears to have a typo and should read GGGC>GTGC

      We appreciate your suggestion. This mistake has been corrected in the revised manuscript.

      Line 223/224 Figure 3e-h. Again there is a large disparity between the n's of each group. Despite the WT having the highest frequency in the XWLC study population, it has only n=5 when comparing the protein and phosphosite for MAPKs. There is also no explanation for what the graph symbols indicate, what statistical test was performed to determine the statistical significance of the presented differences, and between which specific groups that significance exists. Thus, it is challenging to ascertain whether there are relevant differences in the MAPK signaling components.

      We appreciate your suggestion. We added the description of “N, number of tumor samples containing corresponding EGFR mutation” to the figure legend. p-values were calculated with a two-tailed Wilcoxon rank sum test, and p<0.05 was labeled on Figures 3e-i.
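
      Because some groups are small (for example n=5), an exact version of the two-tailed rank sum test may be more informative than a normal approximation. The sketch below is an illustration under that assumption and is not the authors' analysis code; the function names are hypothetical, and the approach enumerates every possible assignment of ranks to the smaller group, so it is only practical for small samples.

```python
from itertools import combinations

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_exact(a, b):
    """Exact two-sided Wilcoxon rank sum p-value by full enumeration."""
    ranks = average_ranks(list(a) + list(b))
    n_a, n = len(a), len(a) + len(b)
    expected = n_a * (n + 1) / 2  # rank sum of group a under the null
    observed = abs(sum(ranks[:n_a]) - expected)
    hits = total = 0
    for combo in combinations(range(n), n_a):
        total += 1
        if abs(sum(ranks[i] for i in combo) - expected) >= observed - 1e-9:
            hits += 1
    return hits / total

# Toy data: complete separation of two groups of three.
print(rank_sum_exact([1, 2, 3], [4, 5, 6]))
```

      Note that with three samples per group even complete separation cannot reach p < 0.05, which illustrates why very small groups limit what such comparisons can show.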

      Figure 3I Good figure. However, it would be beneficial to provide validation with Western Blotting for a few of these substrates using phospho-specific antibodies. It is suggested that this experiment be added.

      We appreciate your suggestion. Figure 3I shows the comparison of patients’ ages among subtypes, so we assume you are referring to Figure 3g and Figure 3h. The relevant experiments are currently underway, and we will provide the corresponding data in the next revised version.

      Figure 4b. Very compelling figure.

      We appreciate your suggestion.

      Line 276: The AhR and CYP1A1 data presented earlier was not convincing, and CYP1A1 and AhR cannot be responsibly used as indicators of BaP activity based on potential. This is not an appropriate application.

      We appreciate your suggestion. CYP1A1 and AhR are two key regulators involved in BaP metabolism and signaling transduction, respectively. However, after examining the protein expression of AhR between tumor and normal tissues, we found no significant differences (Fig. 1k) and CYP1A1 has been proven to be highly expressed in tumor samples (Fig. 1j). Thus, we mainly examined the expression of CYP1A1 among the four subgroups. We changed our description as follows:

      “As CYP1A1 is a key regulator involved in BaP metabolism and has been proven to be highly expressed in tumor samples (Fig. 1j), we next examined the expression of CYP1A1 among the four subgroups to evaluate their associations with air pollution.”

      Figure 4d. Here it is AhR protein used rather than mRNA measured earlier. What is the explanation for this change?

      We appreciate your suggestion. As there were no significant differences in AhR protein expression between tumor and normal tissues (Fig. 1k), we deleted the expression comparison of AhR among subtypes.

      Line 281 "Moderately elevated expression level of AhR" is not supported by the presented data and should be removed.

      We appreciate your suggestion. We have deleted the comparison of AhR expression among subtypes.

      Figure 4: There is no indication or explanation of how the protein abundance is being measured. Is this from the proteomics (MS) approaches, by ELISA or by Western? If it is simply by MS then validation by another method is preferable. The data presented in Figure 4 do not adequately support the claim that MC-II subtype is more strongly associated with BaP exposure. What statistical test is used in 4F? Why is the n in the MC-II group, which is the highlighted group of interest nearly double the other groups?

      We appreciate your suggestion. Fig. 4e is derived from the proteomics data. The two-tailed Wilcoxon rank sum test was used to calculate p-values in panels c and e.

      Figure 4g: At least one or two of these should be validated by Western Blot or targeted MS.

      We appreciate your suggestion. The relevant experiments are currently underway, and we will provide the corresponding data in the next revised version.

      Figure 5a: Assuming these were also measured via proteomic analysis, how do their expression patterns compare across the different omics modes?

      Thank you for your suggestion. Figure 5 integrates transcriptomics (19182 genes), proteomics (9152 genes), and phosphoproteomics (5733 genes) data. In general, we utilized transcriptomics data to identify unique or distinct pathways among subgroups. Furthermore, proteomics and phosphoproteomics data were employed to validate key gene expressions, as they encompass fewer genes compared to transcriptomics data.

      For instance, in Fig. 5a-d, we observed higher expression levels of mesenchymal markers such as VIM, FN1, TWIST2, SNAI2, ZEB1, ZEB2, and others in the MC-IV subtype using transcriptomics data (Fig. 5a). Additionally, we calculated epithelial-mesenchymal transition (EMT) scores using the ssGSEA enrichment method based on protein levels and conducted GSEA analysis using transcriptomics data (Fig. 5b). Furthermore, using proteomics data, we evaluated Fibronectin (FN1), an EMT marker that promotes the dissociation, migration, and invasion of epithelial cells, at the protein level (Fig. 5c), and β-Catenin, a key regulator in initiating EMT, also at the protein level (Fig. 5d). Overall, our findings indicate that the MC-IV subtype exhibits an enhanced EMT capability, which may contribute to the high malignancy observed in this subtype.

      Line 314: Not compared with MCI, which appeared to be much lower at the mRNA level. Is there an explanation for this difference?

      We appreciate your suggestion. FN1 expression is lowest in MCI at the protein level (Fig. 5c), whereas at the transcriptome level it is lowest in the MCIII subtype (Fig. 5a). Discrepancies between mRNA and protein expression levels are common: a previous study showed that about 20% of genes had a statistically significant correlation between protein and mRNA expression in lung adenocarcinomas (Chen et al., 2002). Post-transcriptional mechanisms, including protein translation, post-translational modification, and degradation, may influence the level of a protein present in a given cell or tissue. We therefore focused on identifying distinct biological pathways in each subgroup, supported by multi-omics data.

      Line 321: MC-IV *potentially* possesses an enhanced EMT capability. This statement cannot be conclusively made.

      We appreciate your suggestion. We changed our description to: “Collectively, our findings demonstrate that the MC-IV subtype is associated with enhanced EMT capability, which may contribute to the high malignancy observed in this subtype.”

      Lines 325 and 327 indicated dysregulation of cell cycle processes and activation of CDK1 and CDK2 pathways based on KSEA analysis which is closely linked to cell cycle regulation as two separate pieces of evidence. However, these are both drawn from the phosphoproteomics, and likely indicate conclusions drawn from the same phosphosite data. Said another way, if phosphosite data indicates differences in kinases linked to cell cycle regulation then you would also expect phosphosite data to indicate dysregulation of cell cycle.

      We appreciate your suggestion. You mentioned that Fig. 4f and Fig. 5e redundantly prove that the CDK1 and CDK2 pathways are dysregulated. However, KSEA analysis in Fig. 4f estimates changes in kinase activity based on the collective phosphorylation changes of its identified substrates (Wiredja et al., 2017). In contrast, Fig. 5e directly evaluates the abundance of protein and phosphosite levels of CDK1 and CDK2 across subtypes. These analyses mutually confirm each other rather than being redundant.
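
      To make the distinction concrete, the KSEA idea of inferring kinase activity from the collective change of its substrates can be sketched as a z-score. This is a simplified illustration under stated assumptions, not the authors' code or the KSEAapp implementation itself: the function name `ksea_z_score` and the toy log2 fold changes are hypothetical, and the sample standard deviation is assumed for the background spread.

```python
from math import sqrt
from statistics import mean, stdev

def ksea_z_score(substrate_log2fc, all_log2fc):
    """Kinase activity z-score from substrate phosphosite fold changes.

    z = (mean of substrates - mean of all sites) * sqrt(m) / sd of all sites,
    where m is the number of substrate phosphosites for the kinase.
    """
    m = len(substrate_log2fc)
    return (mean(substrate_log2fc) - mean(all_log2fc)) * sqrt(m) / stdev(all_log2fc)

# Toy example: five quantified phosphosites, two of which are annotated
# substrates of a hypothetical kinase.
all_sites = [-1.0, 0.0, 1.0, 2.0, -2.0]
kinase_substrates = [1.0, 2.0]
z = ksea_z_score(kinase_substrates, all_sites)
print(round(z, 4))
```

      A positive score suggests increased activity of the kinase relative to the phosphoproteome background. It is computed from substrate sites only, so it is a different quantity from the abundance of the kinase protein or of its own phosphosites.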

      Line 413/Figure 6b: While there may be a trend displayed by the figure, it is not convincing enough to state that MC-IV shows a conclusively distinguishable bacterial composition. Too much variability exists within groups MC-II and MC-III. However, it does show that MC-IV and MC-II have consistent composition within their groups, and that is interesting.

      We appreciate your suggestion. We have deleted the analysis of bacterial composition across subtypes.

      Figure 6: Overall very nice figure, with intriguing diagnostic potential. See the above note on 6a-b interpretation.

      We appreciate your suggestion. We have deleted the analysis of bacterial composition across subtypes, including Fig. 6a-6c.

      Figure 7c-f better labeling of the panels will aid reader comprehension.

      We appreciate your suggestion. Necessary labeling has been added to Fig. 7c-f to enhance comprehension.

      Figure 7 panel order is confusing, switching from right to left to vertical. Rearranging to either left to right or vertical would help orient readers.

      We appreciate your suggestion. We have adjusted the panel order of Fig. 7 and of Extended Fig. 8.

      Figure 7 legend i: should read Cell colony* assay

      We appreciate your suggestion. We have corrected this mistake in the revised manuscript.

      The Discussion is very brief. While it includes a discussion of the potential impact of the study, it does not include an analysis of the caveats/drawbacks of the study. A more thorough discussion of other studies focusing on the impacts of BaP exposure is also suggested as this was a highlighted point by the authors.

      We appreciate your suggestion. We have added a discussion of the association between BaP exposure and lung cancer and have also addressed the shortcomings of our study, as follows:

      “Mechanistically, Qing Wang and colleagues showed that BaP induces lung carcinogenesis, characterized by increased inflammatory cytokines and cell proliferative markers and by decreased antioxidant levels and apoptotic protein expression (Wang et al., 2020). In our study, we used clinical samples and linked the mutational signatures of XWLC to the chemical compound BaP, which advanced the etiology and mechanism of air-pollution-induced lung cancer. Several limitations of our study must be acknowledged. Firstly, although our multi-omics approach provided a comprehensive analysis of the subtypes and their unique biological pathways, the sample size for each subtype was relatively small. This limitation may affect the robustness of the clustering results and the identified subtype-specific pathways. Larger cohort studies are necessary to confirm these findings and refine the subtype classifications. Secondly, although our study advanced the understanding of air-pollution-induced lung cancer by using clinical samples, the reliance on epidemiological data in previous studies introduces potential confounding factors. Our findings should be interpreted with caution, and further mechanistic studies are warranted to establish causal relationships more definitively. Thirdly, our in silico analysis suggested a potential approach to drug resistance in G719X mutations. However, these predictions need to be validated through extensive in vitro and in vivo experiments. The reliance on computational models without experimental confirmation may limit the clinical applicability of these findings.”

      References:

      Bailey, M. H., Tokheim, C., Porta-Pardo, E., Sengupta, S., Bertrand, D., Weerasinghe, A., Colaprico, A., Wendl, M. C., Kim, J., Reardon, B., et al. (2018). Comprehensive Characterization of Cancer Driver Genes and Mutations. Cell 173, 371-385 e318.

      Bray, F., Ferlay, J., Soerjomataram, I., Siegel, R. L., Torre, L. A., and Jemal, A. (2018). Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 68, 394-424.

      Chen, G., Gharib, T. G., Huang, C. C., Taylor, J. M., Misek, D. E., Kardia, S. L., Giordano, T. J., Iannettoni, M. D., Orringer, M. B., Hanash, S. M., and Beer, D. G. (2002). Discordant protein and mRNA expression in lung adenocarcinomas. Mol Cell Proteomics 1, 304-313.

      Meyer, M. J., Beltran, J. F., Liang, S., Fragoza, R., Rumack, A., Liang, J., Wei, X., and Yu, H. (2018). Interactome INSIDER: a structural interactome browser for genomic studies. Nat Methods 15, 107-114.

      Mumford, J. L., He, X. Z., Chapman, R. S., Cao, S. R., Harris, D. B., Li, X. M., Xian, Y. L., Jiang, W. Z., Xu, C. W., Chuang, J. C., et al. (1987). Lung cancer and indoor air pollution in Xuan Wei, China. Science 235, 217-220.

      Parkin, D. M., Bray, F., Ferlay, J., and Pisani, P. (2005). Global cancer statistics, 2002. CA Cancer J Clin 55, 74-108.

      Quinlan, A. R., and Hall, I. M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841-842.

      Sanchez-Vega, F., Mina, M., Armenia, J., Chatila, W. K., Luna, A., La, K. C., Dimitriadoy, S., Liu, D. L., Kantheti, H. S., Saghafinia, S., et al. (2018). Oncogenic Signaling Pathways in The Cancer Genome Atlas. Cell 173, 321-337 e310.

      Wang, Q., Zhang, L., Huang, M., Zheng, Y., and Zheng, K. (2020). Immunomodulatory Effect of Eriocitrin in Experimental Animals with Benzo(a)Pyrene-induced Lung Carcinogenesis. J Environ Pathol Toxicol Oncol 39, 137-147.

      Watanabe, S., Minegishi, Y., Yoshizawa, H., Maemondo, M., Inoue, A., Sugawara, S., Isobe, H., Harada, M., Ishii, Y., Gemma, A., et al. (2014). Effectiveness of gefitinib against non-small-cell lung cancer with the uncommon EGFR mutations G719X and L861Q. J Thorac Oncol 9, 189-194.

      Wiredja, D. D., Koyuturk, M., and Chance, M. R. (2017). The KSEA App: a web-based tool for kinase activity inference from quantitative phosphoproteomics. Bioinformatics 33, 3489-3491.

      Zhang, H., Liu, C., Li, L., Feng, X., Wang, Q., Li, J., Xu, S., Wang, S., Yang, Q., Shen, Z., et al. (2021). Genomic evidence of lung carcinogenesis associated with coal smoke in Xuanwei area, China. Natl Sci Rev 8, nwab152.

    1. eLife Assessment

      The study elucidates a detailed molecular mechanism of the initial stages of transport in the medically relevant Na+-coupled GABA neurotransmitter transporter GAT1 and thus generates important new insights into this protein family. In particular, it presents convincing evidence for the presence of a "staging binding site" that locally concentrates Na+ ions to increase transport activity, whilst the evidence for how Na+ binding influences larger-scale dynamics is solid.

    2. Reviewer #1 (Public review):

      The authors have tried to identify the plausible Na+ entry pathway in an important SLC6 member, GAT1, using computational approaches to assess residence times of the ions as they enter the vestibule of GAT1. The authors identify a patch of negative residues in TM6a and implicate them as important for attracting the Na+ ions during their movement towards the binding sites Na1 and Na2. They also suggest that sodium binding at site 1 is flexible and that the ion can at times occupy the primary binding site when the substrate is not available. The Na2 site, as other literature also suggests, is demonstrated to be vital for the stability of the outward-open state.

      Studies of ion permeation are challenging given that the states are challenging to trap through structural studies and computational methods are vital for understanding these steps. The authors suggest that two negatively charged residues are vital to attract Na+ ions to the vestibule. Using a combination of simulations and PCA analysis the authors identify the importance of Na+ binding at site 2 that stabilises the outward-open state and the flexibility observed in Na1 site for ion binding which happens alongside substrate in the GABA bound state. The study reconfirms earlier observations in the SLC6 family that Na2 site is critical for conformational transitions and Na1 site is substrate dependent in amino acid transporters.

      One of the challenges in such studies is to conclusively establish the presence of additional Na+ sites or regions of ion-binding with experimental structures as they are nearly impossible to trap. Such studies using simulations therefore become the only resort to understand such phenomena.

      The work is likely to further provide insights into the transport mechanism of GAT1 and lends credence to some structural studies where the sodium at site1 is displaced but the ion remains proximal to the bound substrate.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript authored by Stockner and colleagues delves into the molecular simulations of Na+ binding pathway and the ionic interactions at the two known sodium binding sites site 1 and site 2. They further identify a patch of two acidic residues in TM6 that seemingly populate the Na+ ions prior to entry into the vestibule. These results highlight the importance of studying the ion-entry pathways through computational approaches and the authors also validate some of their findings through experimental work. They observe that sodium site 1 binding is stabilized by the presence of the substrate in the s1 site and this is particularly vital as the GABA carboxylate is involved in coordinating the Na+ ion unlike other monoamine transporters and binding of sodium to the Na2 site stabilizes the conformation of the GAT1 by reducing flexibility among the helical bundles involved in alternating access.

      Strengths:

      The study displays results that are generally consistent with available information from experiments on SLC6 transporters particularly GAT1 and puts forth the importance of this added patch of residues in the extracellular vestibule that could be of importance to the ion permeation in SLC6 transporters. This is a nicely performed study and could be improved if the authors could comment on and fix the following queries.

      We thank the reviewer for the overall positive assessment of our work.

      Comments on revised version:

      The authors have satisfactorily addressed my comments and this has significantly improved the clarity of the manuscript.

      The only point that I would like to inquire about is the role of EL4 in modulating Na+ entry.

      In the simulations do the authors see no role of EL4 in controlling Na+ entry. It is particularly intriguing as some studies in the recent past displayed charged mutations in EL4 of dDAT, SERT and GAT1 as being detrimental for substrate entry/uptake. It would therefore be nice to add a small discussion if there is any role for EL4 in Na+ entry.

      In this study, we focused on sodium binding to the sodium binding sites NA1 and NA2 and discovered the contribution of negatively charged residues at the beginning of TM6 to sodium binding. Our data show less-than-average interactions of sodium ions with EL4. In particular, we also do not observe any prominent role for D355, which is the only negatively charged residue in EL4a. We attribute this effect to the presence of four positively charged residues (R69, Y76, K350, R351) surrounding D355 and to electrostatic repulsion by a local positive field, which is also visible in Figure 1k. Following the suggestion of the reviewer, we added a short statement to the last paragraph of the discussion.

      Reviewer #2 (Public Review):

      Summary

      Starting from an AlphaFold2 model of the outward-facing conformation of the GAT1 transporter, the authors primarily use state-of-the-art MD simulations to dissect the role of the two Na+ ions that are known to be co-transported with the substrate, GABA (and a cotransported Cl- ion). The simulations indicated that Na+ binding to OF GAT depends on the electrostatic environment. The authors identify an extracellular recruiting site including residues D281 and E283 which they hypothesized to increase transport by locally increasing the available Na+ concentration and thus increasing binding of Na+ to the canonical binding sites NA1 and NA2. The charge-neutralizing double mutant D281AE283A showed decreased binding in simulations. The authors performed GABA uptake experiments and whole-cell patch clamp experiments that taken together validated the hypothesis that the Na+ staging site is important for transport due to its role in pulling in Na+.

      Detailed analysis of the MD simulations indicated that Na+ binding to NA2 has multiple structural effects: The binding site becomes more compact (reminiscent of induced fit binding) and there is some evidence that it stabilizes the outward-facing conformation.

      Binding to NA1 appears to require the presence of the substrate, GABA, whose carboxylate moiety participates in Na+ binding; thus the simulations predict cooperativity between binding of GABA and Na+ binding to NA1.

      Strengths

      - MD simulations were used to propose a hypothesis (the existence of the staging Na+ site) and then tested with a mutant in simulations AND in experiments. This is an excellent use of simulations in combination with experiments.

      - A large number of repeat MD simulations are generally able to provide a consistent picture of Na+ binding. Simulations are performed according to current best practices and different analyses illuminate the details of the molecular process from different angles.

      - The role of GABA in cooperatively stabilizing Na+ binding to the NA1 site looks convincing and intriguing.

      We thank the reviewer for the overall positive assessment of our work.

      Weaknesses

      - Assessing the effects of Na+ binding on the large-scale motions of the transporter is more speculative because the PCA does not clearly cover all of the conformational space and the use of an AlphaFold2 model may have introduced structural inconsistencies. For example, it is not clear if movements of the inner gate are due to an AF2 model that's not well packed or really a feature of the open outward conformation.

      We do not think that the results of the manuscript, and in particular the large-scale motions, are speculative or depend too much on the limitations of PCA. We only use PCA for Figure 6a-d, 6g,h. Motions of SLC6 transporters (and of any other transporter) are much more complex than a single 2D PCA plot could ever capture. We therefore used PCA here only to identify the two motions with the largest amplitude, shown in Figure 6a-d, 6g,h.

      Given that all the ~13000 degrees of freedom of GAT1 contribute to conformational differences, a dimensionality reduction method like PCA can be very helpful for extracting dominant motions. Structure comparison showed that motions observed in PC1 captured a large portion of the motions of occlusion (Figure 6c,d) when compared to the full transition observed in the unfiltered trajectories (see Figure 6e,f). PCA therefore helps to extract these main motions.
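      As a hedged illustration of how PCA isolates the largest-amplitude motion from many degrees of freedom, here is a minimal pure-Python sketch. It uses power iteration on a small synthetic "trajectory" and is not the analysis pipeline used in the manuscript. The function names, the synthetic data, and the choice of power iteration are all assumptions for illustration.

```python
import math
import random


def first_principal_component(frames, n_iter=100):
    """Return (axis, explained_fraction, scores) for the first PC.

    frames: list of frames, each a list of already-aligned coordinates.
    Power iteration is applied to X^T X without forming the matrix.
    """
    n_frames, n_coords = len(frames), len(frames[0])
    means = [sum(f[j] for f in frames) / n_frames for j in range(n_coords)]
    centered = [[f[j] - means[j] for j in range(n_coords)] for f in frames]

    def project(axis):
        return [sum(r[j] * axis[j] for j in range(n_coords)) for r in centered]

    axis = [1.0 / math.sqrt(n_coords)] * n_coords  # arbitrary start vector
    for _ in range(n_iter):
        scores = project(axis)
        updated = [sum(scores[i] * centered[i][j] for i in range(n_frames))
                   for j in range(n_coords)]
        norm = math.sqrt(sum(x * x for x in updated))
        axis = [x / norm for x in updated]

    scores = project(axis)
    pc1_variance = sum(s * s for s in scores)
    total_variance = sum(x * x for r in centered for x in r)
    return axis, pc1_variance / total_variance, scores


def synthetic_trajectory(n_frames=200, n_coords=12, seed=0):
    """One collective oscillation along `mode` plus small random noise."""
    rng = random.Random(seed)
    mode = [0.5] * 4 + [0.0] * (n_coords - 4)  # unit-length collective mode
    frames = []
    for i in range(n_frames):
        amplitude = 5.0 * math.sin(2.0 * math.pi * i / 50.0)
        frames.append([amplitude * m + rng.gauss(0.0, 0.1) for m in mode])
    return frames, mode


frames, mode = synthetic_trajectory()
axis, explained, scores = first_principal_component(frames)
overlap = abs(sum(a * m for a, m in zip(axis, mode)))
```

      In this toy case the first component recovers the planted collective mode and carries most of the variance, while the remaining coordinates contribute only noise. A real trajectory has many competing motions, so the first components describe dominant motions rather than the full conformational space.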

      For completeness, we show a series of structures from the unfiltered trajectories in Figure 6e,f. In the overlay, the motion of occlusion is more difficult to observe, because it is convoluted with all other degrees of freedom. In Figure 6e,f, the structures are aligned with the maximum likelihood method THESEUS, while the coloring is based on the amplitudes measured by PCA to visualize the regions moving relative to each other with the largest amplitude. All other structural measures, including the opening of the inner gate (Figure 6i-k), are direct measures of the raw trajectories.

      With respect to the question of the instability of the inner gate, we made similar observations for hSERT (please see DOI: 10.1038/s41467-023-44637-6) using the experimentally determined structure as the starting point. We find a weakening of the inner gate for sodium-free SERT and at intermediate or full occlusion of sodium- and serotonin-bound SERT. These previous data on SERT corroborate our finding and indicate that the effect could be a general feature of the SLC6 transporter family.

      Unfortunately, no outward-open structure of GAT1 was available for this study. AlphaFold2 models have limitations and we are well aware of these limitations, but AlphaFold2 can also make high-quality models, including small adjustments of backbone positions, if the sequence identity is high, as in the current project (43% sequence identity for the transmembrane region). For GAT1 (as described in the manuscript) we initially tested an hSERT-based model created with MODELLER. MODELLER assumes that the protein backbone does not change, or changes only very little, between the template protein and the target protein. These MODELLER-created models did not perform well because of a slight shift in the position of the backbone, which is a consequence of consistently smaller side chains in the bundle domain-scaffold domain interface of GAT1 as compared to SERT.

      In the simulations described in the manuscript (using the AlphaFold-created model) we observed that the overall structural and dynamic parameters, and in particular also the observations at the inner gate, are very similar to the results described in our papers on sodium binding to SERT using experimental SERT structures. The differences in Na1 binding are explained in the manuscript and are contingent on the residue difference between D98 in SERT and the corresponding residue G65 in GAT1. This makes us confident about the quality of the obtained data. Please see DOI: 10.3390/cells11020255; DOI: 10.3389/fncel.2021.673782.

      - Quantitative analyses are difficult with the existing data; for example, the tICA "free energy" landscape is probably not converged because unbinding events haven't been observed.

      The tICA analysis is a Markov State Model approach, which relies on the convergence of transitions between a large number of microstates. Trajectories showing full sodium unbinding, even in limited number, are not obligatory for a converged dataset, but the transitions between the microstates must be converged. Within the S1 we have many transitions and very good convergence of the transition probabilities. We limit interpretation of free energy data and discussion to this part of the free energy surface. The supporting information (Figure S5) reports on the quality of the tICA analysis. Flat lines at time lags larger than 40 ns are consistent with a converged model based on the data of the trajectories used for the analysis, and, consistently, the Chapman-Kolmogorov tests also show minimal difference between estimates and predictions.
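      The idea behind the Chapman-Kolmogorov test mentioned above can be sketched in a few lines: a transition matrix estimated at lag τ, raised to the power k, should agree with the matrix estimated directly at lag kτ. The sketch below is a toy illustration under stated assumptions and not the software used in the manuscript. It uses a simple row-normalized count estimator, a simulated two-state chain, and hypothetical function names.

```python
import random


def count_transition_matrix(dtraj, n_states, lag):
    """Row-normalized transition counts at the given lag (no reversibility)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(dtraj[:-lag], dtraj[lag:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matrix_power(m, k):
    result = m
    for _ in range(k - 1):
        result = matmul(result, m)
    return result


def ck_max_deviation(dtraj, n_states, lag, k):
    """Largest |T(lag)^k - T(k*lag)| element: small if the model is Markovian."""
    predicted = matrix_power(count_transition_matrix(dtraj, n_states, lag), k)
    estimated = count_transition_matrix(dtraj, n_states, k * lag)
    return max(abs(p - e)
               for p_row, e_row in zip(predicted, estimated)
               for p, e in zip(p_row, e_row))


def simulate_chain(transition, n_steps, seed=0):
    """Generate a discrete trajectory from a known transition matrix."""
    rng = random.Random(seed)
    state, traj = 0, [0]
    for _ in range(n_steps):
        state = rng.choices(range(len(transition)), weights=transition[state])[0]
        traj.append(state)
    return traj


true_matrix = [[0.9, 0.1], [0.2, 0.8]]
toy_traj = simulate_chain(true_matrix, 20000)
deviation = ck_max_deviation(toy_traj, n_states=2, lag=1, k=3)
```

      A small deviation only says that the transitions sampled in the data are self-consistent at the chosen lag time. It does not show that rarely sampled events, such as full unbinding, are quantified reliably.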

      We see about 40 binding events from the extracellular side to the S1, which seems insufficient for a converged quantification of sodium transiting from the extracellular side to the S1. We state this limitation of the dataset in the results section of the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In their paper, Kang et al. investigate rigidity sensing in amoeboid cells, showing that, despite their lack of proper focal adhesions, amoeboid migration of single cells is impacted by substrate rigidity. In fact, many different amoeboid cell types can durotax, meaning that they preferentially move towards the stiffer side of a rigidity gradient.

      The authors observed that NMIIA is required for durotaxis and, building on this observation, they generated a model to explain how durotaxis could be achieved in the absence of strong adhesions. According to the model, substrate stiffness alters the diffusion rate of NMIIA, with softer substrates allowing for faster diffusion. This allows for NMIIA accumulation at the back, which, in turn, results in durotaxis.

      The experiments support the main message of the paper regarding durotaxis by amoeboid cells. In my opinion, a few clarifications on the mechanism proposed to explain this phenomenon could strengthen this research:

      (1) According to your model, the rear end of the cell, which is in contact with softer substrates, will have slower diffusion rates of NMIIA. Does this mean that bigger cells will durotax better than smaller cells because the stiffness difference between front and rear is higher? Is it conceivable to attenuate the slope of the durotactic gradient to a degree where smaller cells lose their ability to durotax, while longer cells retain their capacity for directional movement?

      We thank the reviewer for this comment. In fact, it is not always the case that bigger cells will durotax better than smaller cells. Although bigger cells will sense a larger stiffness difference between the front and rear, cells placed on different regions of the underlying substrate may respond differently. This is because, in our theoretical model, the difference in diffusion coefficient is not proportional to the stiffness difference. Therefore, when cells are placed on a very stiff substrate, they may not durotax. When cells are placed on a region with suitable stiffness, where cells are sensitive to the stiffness gradient, bigger cells will durotax better than smaller cells. In this situation, as you mentioned, lowering the stiffness gradient will make smaller cells become adurotactic while longer cells still durotax.

      We tried to further address this question with our durotaxis assay but there was a challenge: the amoeboid cells we use, including CD4+ Naïve T cells, neutrophils, dHL-60 cells and Dictyostelium, frequently protrude, retract and alter their contact area with the substrate, which makes it difficult for us to distinguish between bigger and smaller cells in a particular cell type. Previously reported durotactic cell lines, such as MDA-MB-231 and HT1080 cells, are bigger than the amoeboid cells we use but they are mesenchymal cells and adopt distinct mechanisms which always involve stable focal adhesions. Because of this, although we are eager to answer this question experimentally and the stiffness gradient is tunable in our system, we have not found an appropriate approach and experimental setup.

      (2) Where did you place the threshold for soft, middle, and stiff regions (Figure 6)? Is it possible that you only have a linear rigidity gradient in the center of your gel and the more you approach the borders, the flatter the gradient gets? In this case, cells would migrate randomly on uniform substrates. Did you perform AFM over the whole length of the gel or just in the central part?

      We thank the reviewer for this comment. We have performed AFM over the whole length of our gradient gel (Fig. S1A). We divide the gel into three equal parts (stiff: 1-4 mm; middle: 4-7 mm; soft: 7-10 mm) and the stiffness gradient is almost linear within each part as shown in Fig. S1A.

      (3) In which region (soft, middle, stiff) did you perform all the cell tracking of the previous figures?

      We thank the reviewer for this question. We performed the cell tracking in the soft region of the gradient gel.

      (4) What is the level of confinement experienced by the cells? Is it possible that cells on the soft side of the gels experience less confinement due to a "spring effect" whereby the coverslips descending onto the cells might exert diminished pressure because the soft hydrogels act as buffers, akin to springs? If this were the case, cells could migrate following a confinement gradient.

      We thank the reviewer for this comment. Although the possibility that our thin hydrogel layers act as buffers cannot be completely excluded, we have performed the durotaxis assay without the upper gradient gel that provides confinement (Author response image 1A). In this case, CD4+ Naïve T cells, neutrophils, dHL-60 cells and Dictyostelium can still durotax (Author response image 1B-E), indicating that the stiffness gradient itself is sufficient to direct amoeboid cell migration.

      Author response image 1.

      Illustration of the durotaxis system without confinement (A) and y-FMI of CD4+ Naïve T cells (B), neutrophils (C), dHL-60 cells (D) and Dictyostelium (E) cultured on uniform substrate or gradient substrate (n ≥ 30 tracks were analyzed for each experiment, N = 3 independent experiments for each condition, replicates are biological). All error bars are SEM. ****, P < 0.0001, by Student’s t-test.
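      The y-FMI reported in this legend is not defined in the response itself. A minimal sketch follows, assuming the commonly used definition of the forward migration index (net displacement along the gradient axis divided by the accumulated path length) and assuming that the y axis points along the stiffness gradient. The function name and the example tracks are hypothetical.

```python
import math


def y_fmi(track):
    """Forward migration index along y for one cell track.

    track: sequence of (x, y) positions in time order.
    Returns net y displacement divided by total path length, in [-1, 1].
    """
    if len(track) < 2:
        raise ValueError("a track needs at least two positions")
    path_length = sum(math.dist(p, q) for p, q in zip(track, track[1:]))
    if path_length == 0:
        return 0.0  # the cell did not move
    return (track[-1][1] - track[0][1]) / path_length


# Hypothetical tracks, in arbitrary length units.
straight_up = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
with_detour = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
sideways = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
```

      Under this definition a value near 1 means persistent movement along the positive y direction, a value near 0 means no net bias along that axis, and negative values mean movement in the opposite direction.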

      Reviewer #2 (Public Review):

      Summary:

      The authors developed an imaging-based device that provides both spatial confinement and stiffness gradient to investigate if and how amoeboid cells, including T cells, neutrophils, and Dictyostelium, can durotax. Furthermore, the authors showed that the mechanism for the directional migration of T cells and neutrophils depends on non-muscle myosin IIA (NMIIA) polarized towards the soft-matrix-side. Finally, they developed a mathematical model of an active gel that captures the behavior of the cells described in vitro.

      Strengths:

      The topic is intriguing as durotaxis is essentially thought to be a direct consequence of mechanosensing at focal adhesions. To the best of my knowledge, this is the first report on amoeboid cells that do not depend on FAs to exert durotaxis. The authors developed an imaging-based durotaxis device that provides both spatial confinement and stiffness gradient and they also utilized several techniques such as quantitative fluorescent speckle microscopy and expansion microscopy. The results of this study have well-designed control experiments and are therefore convincing.

      Weaknesses:

      Overall this study is well performed but there are still some minor issues I recommend the authors address:

      (1) When using NMIIA/NMIIB knockdown cell lines to distinguish the role of NMIIA and NMIIB in amoeboid durotaxis, it would be better if the authors took compensatory effects into account.

      We thank the reviewer for this suggestion. We have investigated the compensation of myosin in NMIIA and NMIIB KD HL-60 cells using Western blot and added this result in our updated manuscript (Fig. S4B, C). The results showed that the level of NMIIB protein in NMIIA KD cells doubled while there was no compensatory upregulation of NMIIA in NMIIB KD cells. This is consistent with our conclusion that NMIIA rather than NMIIB is responsible for amoeboid durotaxis since in NMIIA KD cells, compensatory upregulation of NMIIB did not rescue the durotaxis-deficient phenotype.

      (2) The expansion microscopy assay is not clearly described and some details are missed such as how the assay is performed on cells under confinement.

      We thank the reviewer for this comment. We have updated the details of the expansion microscopy assay in our revised manuscript in lines 481-485, including how the assay is performed on cells under confinement:

      Briefly, CD4+ Naïve T cells were seeded on a gradient PA gel with another upper gel providing confinement. 4% PFA was used to fix cells for 15 min at room temperature. After fixation, the upper gradient PA gel was carefully removed and the bottom gradient PA gel with seeded cells was immersed in an anchoring solution containing 1% acrylamide and 0.7% formaldehyde (Sigma, F8775) for 5 h at 37 °C.

      (3) In this study, an active gel model was employed to capture experimental observations. Previously, some active nematic models were also considered to describe cell migration, which is controlled by filament contraction. I suggest the authors provide a short discussion on the comparison between the present theory and those prior models.

      We thank the reviewer for this suggestion. Active nematic models have been employed to recapitulate many phenomena during cell migration (Nat Commun., 2018, doi: 10.1038/s41467-018-05666-8). The active nematic model describes the motion of cells using the orientation field, Q, and the velocity field, u. The director field n (with n = −n) is employed to represent the nematic state, which has head-tail symmetry. However, in our experiments, actin filaments are clearly polarized: they polymerize and flow towards the direction of cell migration. Therefore, we chose an active gel model, which describes the polarized actin field during cell migration. In the discussion, we have provided a comparison between the active gel model and the motor-clutch model. We have also added a short discussion comparing the present model and the active nematic model in the main text, lines 345-347:

      The active nematic model employs active extensile or contractile agents to push or pull the fluid along their elongation axis to simulate cells flowing (61).

      (4) In the present model, actin flow contributes to cell migration while myosin distribution determines cell polarity. How does this model couple actin and myosin together?

      We thank the reviewer for this question. In our model, the polarization field P(r,t) is employed to couple actin and myosin together. Actin accumulates at the front while myosin diffuses in the opposite direction. Therefore, we propose that actin and myosin flow in opposite directions, which is captured in the convection terms of the actin (∇[c(v+wP)]) and myosin (∇[m(-wP)]) density fields.

      Reviewing Editor (Recommendations For The Authors):

      We suggest that you cite the publication about confinement force microscopy from the Betz lab (https://doi.org/10.1101/2023.08.22.554088).

      We thank the editor for this suggestion. We have cited this publication in line 89 in our updated manuscript.

      Reviewer #1 (Recommendations For The Authors):

      Minor points and text corrections:

      - In line 288 you state that NMIIA basal diffusion rate is larger on softer substrates, while in line 315 you say that NMIIA is more diffusive on stiff. The two sentences seem to contradict each other.

      We thank the reviewer for pointing out this mistake. In our active gel model, the basal diffusion rate of NMIIA is larger on stiffer substrates. We have corrected this mistake in line 288 (line 283 in the updated manuscript) of our revised manuscript.

      - How were the non-muscle myosin images (Figure 3F) collected?

      We thank the reviewer for this question. The non-muscle myosin images in Fig. 3F are single planes collected by epifluorescence-confocal microscopy. We have updated the related method in our revised manuscript in lines 477-478:

      After the mounting medium had solidified, single-plane images were captured using a 63×1.4 NA objective lens on an Andor Dragonfly epi-fluorescence confocal imaging system.

      - Is there a quantification of NMIIA accumulation at the back?

      We thank the reviewer for this question. A quantification of NMIIA distribution is shown in Fig. 3G. We measured the fluorescence intensity of NMIIA and NMIIB in the soft and stiff regions of cells and found that the soft/stiff fluorescence ratio of NMIIB is about 0.95 and the ratio of NMIIA is about 1.82, indicating that NMIIA tends to be localized at the back while NMIIB is evenly distributed between the soft and stiff regions of cells.

      - At which frequency were images acquired for Fluorescent Speckle Microscopy? Overall, I think it would help to state the length and frequency of videos in the legends.

      We thank the reviewer for this comment. We have added the length (10 min for movies 6-10 and 80 sec for movie 11) and frequency (15 sec intervals for movies 6-10 and 2 sec intervals for movie 11) of the Fluorescent Speckle Microscopy videos to our revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      The cell contour of Figure S5C is not very clear.

      We thank the reviewer for this comment. We have marked the outline of the cell in Fig. S5C in our updated manuscript.

    2. eLife Assessment

      This study presents an important finding on durotaxis in various amoeboid cells that is independent of focal adhesions. The evidence supporting the authors' claims is compelling. The work will be of interest to cell biologists and biophysicists working on rigidity sensing, the cytoskeleton, and cell migration.

    3. Reviewer #1 (Public review):

      In their paper, Kang et al. investigate rigidity sensing in amoeboid cells, showing that, despite their lack of proper focal adhesions, amoeboid migration of single cells is impacted by substrate rigidity. In fact, many different amoeboid cell types can durotax, meaning that they preferentially move towards the stiffer side of a rigidity gradient.

      The authors observed that NMIIA is required for durotaxis and, building on this observation, they generated a model to explain how durotaxis could be achieved in the absence of strong adhesions. According to the model, substrate stiffness alters the diffusion rate of NMIIA, with softer substrates allowing for faster diffusion. This allows for NMIIA accumulation at the back, which, in turn, results in durotaxis.

      The authors responded to all my comments and I have nothing to add. The evidence provided for durotaxis of non-adherent (or low-adhering) cells is strong. I am particularly impressed by the fact that amoeboid cells can durotax even when not confined. I wish to congratulate the authors for the excellent work, which will fuel discussion in the field of cell adhesion and migration.

    4. Reviewer #2 (Public review):

      Summary:

      The authors developed an imaging-based device that provides both spatial confinement and a stiffness gradient to investigate whether and how amoeboid cells, including T cells, neutrophils, and Dictyostelium, can durotax. Furthermore, the authors showed that the mechanism for the directional migration of T cells and neutrophils depends on non-muscle myosin IIA (NMIIA) polarized towards the soft-matrix side. Finally, they developed a mathematical model of an active gel that captures the behavior of the cells described in vitro.

      Strengths:

      The topic is intriguing, as durotaxis is essentially thought to be a direct consequence of mechanosensing at focal adhesions (FAs). To the best of my knowledge, this is the first report of amoeboid cells that do not depend on FAs to exert durotaxis. The authors developed an imaging-based durotaxis device that provides both spatial confinement and a stiffness gradient, and they also utilized several techniques such as quantitative fluorescent speckle microscopy and expansion microscopy. The study includes well-designed control experiments, and its results are therefore convincing.

      Weaknesses:

      Overall, this study is well performed, but there are still some minor issues I recommend the authors address:

      (1) When using NMIIA/NMIIB knockdown cell lines to distinguish the roles of NMIIA and NMIIB in amoeboid durotaxis, it would be better if the authors took compensatory effects into account.

      (2) The expansion microscopy assay is not clearly described, and some details are missing, such as how the assay is performed on cells under confinement.

      (3) In this study, an active gel model was employed to capture experimental observations. Previously, some active nematic models were also considered to describe cell migration, which is controlled by filament contraction. I suggest the authors provide a short discussion comparing the present theory with those prior models.

      (4) In the present model, actin flow contributes to cell migration while myosin distribution determines cell polarity. How does this model couple actin and myosin together?