5,953 Matching Annotations
  1. Aug 2024
    1. Author response:

      Reviewer #2 (Public Review):

      This is, to my knowledge, the most scalable method for phylogenetic placement that uses likelihoods. The tool has an interesting and innovative means of using gaps, which I haven’t seen before. In the validation the authors demonstrate superior performance to existing tools for taxonomic annotation (though there are questions about the setup of the validation as described below).

      The program is written in C with no library dependencies. This is great. However, I wasn’t able to try out the software because the linking failed on Debian 11, and the binary artifact made by the GitHub Actions pipeline was too recent for my GLIBC/kernel. It’d be nice to provide a binary for people stuck on older kernels (our cluster is still on Ubuntu 18.04). Also, would it be hard to publish your .zipped binaries as packages?

      We have provided a binary (and zipped package) that supports Ubuntu 18.04 in GitHub Actions (https://github.com/lpipes/tronko/actions/runs/9947708087). This should facilitate the use of our software on older systems like yours. However, we were not able to test the binary, since GitHub did not seem to find any nodes with Ubuntu 18.04. It is important to note that Ubuntu 18.04 is deprecated. The latest version of Ubuntu is 24.04, and we recommend that users upgrade to newer, supported versions of their operating systems to benefit from the latest security updates and features.

      Thank you for publishing your source files for the validation on zenodo. Please provide a script that would enable the user to rerun the analysis using those files, either on zenodo or on GitHub somewhere.

      We have posted all datasets as well as scripts to Zenodo.

      The validations need further attention as follows.

      First, the authors have chosen data sets that are not well aligned with real-world use cases for this software, and as a result, its applicability is difficult to determine. Specifically, the leave-one-species-out experiment made use of COI gene sequences representing 253 species from the order Charadriiformes, which includes bird species such as gulls and terns. What is the reasoning for selecting this data set given the objective of demonstrating the utility of Tronko for large scale community profiling experiments which by their nature tend to include microorganisms as subjects? If the authors are interested in evaluating COI (or another gene target) as a marker for characterizing the composition of eukaryotic populations, is the heterogeneity and species distribution of bird species within order Charadriiformes comparable to what one would expect in populations of organisms that might actually be the target of a metagenomic analysis?

      Our reasoning for selecting Charadriiformes is that these species are often mistaken for one another and there is a heavy reliance on COI for their species identification. This choice allows us to demonstrate Tronko’s ability to handle difficult and realistic identification challenges. Additionally, we aimed to simulate a challenging dataset to effectively differentiate between the methods used, showcasing Tronko’s robustness. Including more distantly related bird species would have simplified the identification process, which would not serve our objective of demonstrating the utility of Tronko for distinguishing closely related species. It is also important to note that all methods used the exact same reference database, which is not always the case in other comparative studies of species assignment.

      Furthermore, while our study uses bird species, the principles and techniques applied are broadly applicable to other taxa, including microorganisms. By selecting a dataset known for its identification difficulties, we underscore Tronko’s potential utility in a wide range of taxonomic profiling scenarios, including those involving high heterogeneity and closely related species, such as in microbial communities.

      Second, it appears that experiments evaluating performance for 16S were limited to reclassification of sequencing data from mock communities described in two publications, Schirmer (2015, 49 bacteria and 10 archaea, all environmental), and Gohl (2016; 20 bacteria - this is the widely used commercial mock community from BEI, all well-known human pathogens or commensals). The authors performed a comparison with kraken2, metaphlan2, and MEGAN using both the default database for each as well as the same database used for Tronko (kudos for including the latter). This pair of experiments provides a reasonable high-level indication of Tronko’s performance relative to other tools, but the total number of organisms is very limited, and particularly limited with respect to the human microbiome. It is also important to point out that these mock communities are composed primarily of type strains and provide limited species-level heterogeneity. The performance of these classification tools on type strains may not be representative of what one would find in natural samples. Thus, the leave-one-individual-out and leave-one-species-out experiments would have been more useful and informative had they been applied to extended 16S data sets representing more ecologically realistic populations.

      We thank the reviewer for this comment and we have included both an additional bacterial mock community dataset from Lluch et al. (2015) and an additional leave-one-species-out experiment. We describe how this leave-one-species-out dataset was constructed in our previous response to ’Essential Revisions’ #1. We also added Figure 5, S5, and S6.
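
      For readers unfamiliar with the design, a leave-one-species-out split can be sketched as follows. This is a minimal illustration in Python and not the pipeline used in the manuscript; the function name `leave_one_species_out` and the `(sequence_id, species)` record layout are assumptions made for the example.

```python
from collections import defaultdict


def leave_one_species_out(records):
    """Yield (held_out_species, queries, reference) splits.

    records: list of (sequence_id, species) pairs (hypothetical layout).
    For each species in turn, all of its sequences become queries and are
    removed from the reference, so a classifier can at best assign the
    queries at a higher rank (for example, genus).
    """
    by_species = defaultdict(list)
    for seq_id, species in records:
        by_species[species].append(seq_id)

    for species in sorted(by_species):
        queries = by_species[species]
        # Reference keeps every sequence that does not belong to the held-out species.
        reference = [seq_id for seq_id, sp in records if sp != species]
        yield species, queries, reference


if __name__ == "__main__":
    toy = [("s1", "A"), ("s2", "A"), ("s3", "B")]
    for held_out, queries, reference in leave_one_species_out(toy):
        print(held_out, queries, reference)
```

      A leave-one-individual-out split would follow the same pattern but hold out a single sequence at a time, leaving other sequences of the same species in the reference.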

      Finally, the authors should describe the composition of the databases used for classification as well as the strategy (and toolchain) used to select reference sequences. What databases were the reference sequences drawn from and by what criteria? Were the reference databases designed to reflect the composition of the mock communities (and if so, are they limited to species in those communities, or are additional related species included), or have the authors constructed general purpose reference databases? How many representatives of each species were included (on average), and were there efforts to represent a diversity of strains for each species? The methods should include a section detailing the construction of the data sets: as illustrated in this very study, the choice of reference database influences the quality of classification results, and the authors should explain the process and design considerations for database construction.

      To construct our databases, we used CRUX (Curd et al., 2018). This is described in the Methods section under ’Custom 16S and COI Tronko-build reference database construction’. All leave-out tests used downsampled versions of these two databases. It is beyond the scope of the manuscript to discuss how CRUX works. Additionally, we added the following text:

      To compare the new method (Tronko) to previous methods, we constructed reference databases for COI and 16S for common amplicon primer sets using CRUX (See Methods for exact primers used).

    1. Author response:

      Reviewer #1 (Public Review):

      In this manuscript, Perez-Lopez et al. examine the function of the chemokine CCL28, which is expressed highly in mucosal tissues during infection, but whose role during infection is poorly understood. They find that CCL28 promotes neutrophil accumulation in the intestines of mice infected with Salmonella and in the lungs of mice infected with Acinetobacter. They find that Ccl28-/- mice are highly susceptible to Salmonella infection, and highly resistant and protected from lethality following Acinetobacter infection. They find that neutrophils express the CCL28 receptors CCR3 and CCR10. CCR3 was pre-formed and intracellular and translocated to the cell surface following phagocytosis or inflammatory stimuli. They also find that CCL28 stimulation of CCR3 promoted neutrophil antimicrobial activity, ROS production, and NET formation, using a combination of primary mouse and human neutrophils for their studies. Overall, the authors' findings provide new and fundamental insight into the role of the CCL28:CCR3 chemokine:chemokine receptor pair in regulating neutrophil recruitment and effector function during infection with the intestinal pathogen Salmonella Typhimurium and the lung pathogen Acinetobacter baumannii.

      We would like to thank the reviewer for their positive assessment of our work and for providing us with constructive comments that have helped us to improve the manuscript.

      Reviewer #2 (Public Review):

      In this manuscript by Perez-Lopez et al., the authors investigate the role of the chemokine CCL28 during bacterial infections in mucosal tissues. This is a well-written study with exciting results. They show a role for CCL28 in promoting neutrophil accumulation to the guts of Salmonella-infected mice and to the lung of mice infected with Acinetobacter. Interestingly, the functional consequences of CCL28 deficiency differ between infections with the two different pathogens, with CCL28-deficiency increasing susceptibility to Salmonella, but increasing resistance to Acinetobacter. The underlying mechanistic reasons for this suggest roles for CCL28 in enhanced neutrophil antimicrobial activity, production of reactive oxygen species, and formation of extracellular traps. However, additional experiments are required to shore up these mechanisms, including addressing the role of other CCL28-dependent cell types and further characterization of neutrophils from CCL28-deficient mice.

      We would like to thank the reviewer for the positive assessment of our work and for providing us with constructive comments that have helped us to improve the manuscript.

      Reviewer #3 (Public Review):

      The manuscript by Perez-Lopez and colleagues uses a combination of in vivo studies using knockout mice and elegant in vitro studies to explore the role of the chemokine CCL28 during bacterial infection on mucosal surfaces. Using the streptomycin model of Salmonella Typhimurium (S. Tm) infection, the authors demonstrate that CCL28 is required for neutrophil influx in the intestinal mucosa to control pathogen burden both locally and systemically. Interestingly, CCL28 plays the opposite role in a model lung infection by Acinetobacter baumannii, as Ccl28-/- mice are protected from Acinetobacter infection. Authors suggest that the mechanism by which CCL28 plays a role during bacterial infection is due to its role in modulating neutrophil recruitment and function.

      We would like to thank the reviewer for the positive assessment of our work and for providing us with constructive comments that have helped us to improve the manuscript.

      The major strengths of the manuscript are:

      The novelty of the findings that are described in the manuscript. The role of the chemokine CCL28 in modulating neutrophil function and recruitment in mucosal surfaces is intriguing and novel.

      Authors use Ccl28-/- mice in their studies, a mouse strain that has only recently been available. To assess the impact of CCL28 on mucosal surfaces during pathogen-induced inflammation, the authors choose not one but two models of bacterial infection (S. Tm and A. baumannii). This approach increases the rigor and impact of the data presented.

      Authors combine the elegant in vivo studies using Ccl28 -/- with in vitro experiments that explore the mechanisms by which CCL28 affects neutrophil function.

      The major weaknesses of the manuscript in its present form are:

      Authors use different time points in the S. Tm model to characterize the influx of immune cells and pathology. They do not provide a clear justification as to why distinct time points were chosen for their analysis.

      The reviewer raises a good point. As discussed in the detailed response to the reviewers, we have now generated extensive results at different time points and included these in the revised manuscript.

      Authors provide puzzling data that Ccl28-/- mice have the same numbers of CCR3- and CCR10-expressing neutrophils in the mucosa during infection. It is unclear why the lack of CCL28 expression would not affect the recruitment of neutrophils that express the receptors (CCR3 and CCR10) for this chemokine. Thus, these results need to be better explained.

      As discussed in the detailed response to the reviewers, we clarified that Ccl28-/- mice have reduced numbers of neutrophils in the mucosa during infection, but the percentage of CCR3+ and CCR10+ neutrophils does not change. We provide additional discussion of this point in the manuscript and in the response to the reviewers.

      The in vitro studies focus primarily on characterizing how CCL28 affects the function of neutrophils in response to S. Tm infection. There is a lack of data to demonstrate whether Acinetobacter affects CCR3 and CCR10 expression and recruitment to the cell surface and whether CCL28 plays any role in this process.

      We agree and have performed additional studies with Acinetobacter and CCL28, which we discuss in greater detail below in the response to the reviewers.

    1. Author response:

      We appreciate the time of the reviewers and their detailed comments, which will help to improve the manuscript.

      We are sorry that at least one reviewer seems to have had the impression that we have conflated issues about gonadal and non-gonadal sex phenotypes. This referee suggests that we should use Sharpe et al. (2023) to develop our concepts. However, what is discussed in Sharpe et al. was already the guiding principle for our study (without knowing this paper before). In our paper, we introduce the gonadal binary sex (which is self-evidently also the basis for creating the dataset in the first place, because we needed to separate males from females) and go then on to the question of (adult) sex phenotypes for the rest of the paper. The gonadal data are included only as comparison for contrasting the patterns in the non-gonadal tissues.

      Our study presents the largest systematic dataset so far on the evolution of sex-biased gene expression. It is also the first that explores the patterns of individual variation in sex-biased gene expression, and the SBI is an entirely new procedure to directly visualize these variance patterns in an intuitive way (note that the relative position of the distributions along the X-axis is indeed not relevant). The results are actually quite nuanced (e.g. the rather dynamic changes seen in mouse kidney and liver comparisons) and certainly go beyond what would have been predictable based on the current literature.

      Also, we should like to point out that our study contradicts recent conclusions, published in high-profile journals, which had suggested that a substantial set of sex-biased genes has conserved functions between humans and mice and that mice can therefore be informative for gender-specific medicine studies. Our data suggest that only a very small set of genes are conserved in their sex-biased expression. These are epigenetic regulator genes and it will therefore be interesting in the future to focus on their roles in generating the differences between sexual phenotypes in given species.

      We will be happy to use the referee comments to clarify all of these points in a revised version. But we do not think that our "evidence is incomplete" and that there are several "overstated key conclusions". We have used all canonical statistical analyses that are typically used in papers of sex-biased gene expression, as acknowledged by reviewers 1 and 2. The additional statistical analyses that are requested are not within the scope of such papers, but could be subject to separate general studies, independent of the sex-bias analysis (e.g. the role of highly expressed genes versus low expressed genes, or the analysis of the fraction of neutrally evolving loci).

      Finally, it is unclear why the overall rating of the paper is at the lowest possible category ("useful study"), given that it adds a substantial amount of data and new insights into the exploration of the non-binary nature of sexual phenotypes.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Revisions Round 1

      Reviewer #1

      We thank the reviewer for their careful reading of our manuscript and have taken all of their grammatical corrections into account.

      Reviewer #2 (Public Review): 

      Weaknesses: 

      The paper contains multiple instances of non-scientific language, as indicated below. It would also benefit from additional details on the cryo-EM structure determination in the Methods and inclusion of commonly accepted requirements for cryo-EM structures, like examples of 2D class averages, raw micrographs, and FSC curves (between half-maps as well as between rigid-body fitted (or refined) atomic models of the different polymorphs and their corresponding maps). In addition, cryo-EM maps for the control experiments F1 and F2 should be presented in Figure 9.

      We tried to correct the non-scientific language and have included the suggested data on the cryo-EM analyses, including new Figures 11-17. We did not collect data on the sample used for the seeds in the cross-seeding experiments because we had already confirmed in multiple datasets that the conditions in F1 and F2 reproducibly produce fibrils of Type 1 and Type 3, respectively. We have now analyzed cryo-EM data for 6 more samples at pH 7.0 and found that several kinds of polymorphs (Types 1A, 1M, 2A, 2B and 5) are accessible at this pH; however, the Type 3 polymorphs are not formed at pH 7.0 under the conditions that we used for aggregation.

      Reviewer #2 (Recommendations For The Authors): 

      - remove unscientific language: "it seems that there are about as many unique atomic-resolution structures of these aggregates as there are publications describing them"   

      We have rephrased this sentence.

      - for same reason, remove "Obviously, " 

      Done

      - What does this mean? “polymorph-unspecific” 

      Rephrased as non-polymorph-specific

      - What does this mean? "shallow amyloid energy hypersurface"  

      By “shallow hypersurface” we mean that the minimum of the multi-dimensional function that describes the energy of the amyloid is not so deep that subtle changes to the environment will not favor another fold/energy minimum. We have left the sentence because while it may not be perfect, it is concise and seems to get the point across.

      - "The results also confirm the possibility of producing disease-relevant structure in vitro." -> This is incorrect as no disease-relevant structure was replicated in this work. Use another word like “suggest”.

      We have changed to “suggest” as suggested.

      - Remove "historically" 

      Done

      - Rephrase “It has long been understood that all amyloids contain a common structural scaffold” 

      Changed to “It has long been established that all amyloids contain a common structural scaffold..” 

      - "Amyloid polymorphs whose differences lie in both their tertiary structure (the arrangement of the beta-strands) and the quaternary structure (protofilament-protofilament assembly) have been found to display distinct biological activities [8]" -> I don't think this is true, different biological activities of amyloids have never been linked to their distinct structures.

      We have added 5 new references (8-12) to support this sentence.

      - Reference 10 is a comment on reference 9; it should be removed. Instead, as for alpha-synuclein, all papers describing the tau structures should be included.  

      We have removed the reference, but feel that the addition of all Tau structure references is not merited in this manuscript since we are not comparing them.

      - Rephrase: "is not always 100% faithful"

      Removed “100%”

      - What is pseudo-C2 symmetry? Do the authors mean pseudo 2_1 symmetry (ie a 2-start helical symmetry)?

      Thank you for pointing this out. We did indeed mean pseudo-2_1 helical symmetry.

      - Re-phrase: "alpha-Syn's chameleon-like behavior" 

      We have removed this phrase.

      - "In the case of alpha-Syn, the secondary nucleation mechanism is based on the interaction of the positively charged N-terminal region of monomeric alpha-Syn and the disordered, negatively charged C-terminal region of the alpha-Syn amyloid fibrils [54]" -> I would say the mechanisms of secondary nucleation are not that well understood yet, so one may want to tune this down a bit. 

      We have changed this to “mechanism has been proposed to be”

      - The paragraphs describing experiments by others are better suited for a Discussion rather than a Results section. Perhaps re-organize this part? 

      We have left the text intact as we are using a Results and Discussion format.

      - A lot of information about Image processing seems to be missing: what steps were performed after initial model generation? 

      We have added more details in the methods section on the EM data processing and model analysis.

      - Figure 1: Where is Type 4 on the pH scale?

      We have adjusted the Fig 1 legend to clarify that the pH scale is only applicable to the structures presented in this manuscript.

      - Figure 2: This might be better incorporated as a subpanel of Figure 1.

      We agree that this figure is somewhat of a loner on its own and we only added it in order to avoid confusion with the somewhat inconsistent naming scheme used for the Type 1B structure. However, we prefer to leave it as a separate figure so that it does not dilute the impact of Figure 1.

      - Figure 3: What is the extra density at the bottom of Type 3B from pH 5.8 samples 1 and 2. pH 5.8 + 50mM NaCl (but not pH 5.8 + 100 mM NaCl)? Could this be an indication of a local minimum and the pH 5.8 + 100 mM NaCl structure is correct? Or is this a real difference between 0/50mM NaCl and 100 mM NaCl? 

      We did not see the extra density to which the reviewer is referring; however, the images used in this panel are based on the output of 3D classification, which is more likely to produce artifacts than a 3D refinement. With this in mind, we did not see any significant differences in the refined structures and therefore only deposited the better-quality map and model for each of the polymorph types.

      - Figure 3: To what extent is Type 3B of pH 6.5 still a mixture of different types? The density looks poor. In general, in the absence of more details about the cryo-EM maps, it is hard to assess the quality of the structures presented.  

      In order to improve the quality of the images in this panel, a more complete separation of the particles from each polymorph was achieved via the filament subset selection tool in RELION 5. In each case, an unbiased initial model could be created from the 2D classes via the relion_helix_inimodel2D program, further supporting the coexistence of 4 polymorphs in the pH 6.5 sample. The particles were individually refined to produce the respective maps that are now used in this figure.

      - Many references are incorrect, containing "Preprint at (20xx)" statements.

      This has been corrected.  

      Reviewer #3 (Public Review): 

      Weaknesses: 

      (1) The authors reveal that both the Type 1 monofilament fibril polymorph (reminiscent of the JOS-like polymorph) and the Type 5 polymorph (akin to the tissue-amplified-like polymorph) can form under the same condition. Additionally, this condition also fosters the formation of flat ribbon-like fibrils across different batches. Notably, at pH 5.8, variations in experimental groups yield disparate abundance ratios between polymorph 3B and 3C, indicating a degree of instability in fibrillar formation. The variability would potentially pose challenges for replicability in subsequent research. In light of these situations, I propose the following recommendations:

      (a) An explicit elucidation of the factors contributing to these divergent outcomes under similar experimental conditions is warranted. This should include an exploration of whether variations in purified protein batches are contributing factors to the observed heterogeneity.

      We are in complete agreement that understanding the factors that lead to polymorph variability is of utmost importance (and was the impetus for the manuscript itself). However, the number of variables to explore is overwhelming and we will continue to investigate this in our future research. Regarding the variability between batches of purified protein, we also think that this could be a factor in the polymorph variability observed for otherwise “identical” aggregation conditions, particularly at pH 7 where the largest variety of polymorphs has been observed. However, even identical replicates (samples created from the same protein solution and simply aggregated simultaneously in separate tubes) can lead to different outcomes (see datasets 15 and 16 in the revised Table 1), suggesting that there are stochastic processes that can determine the outcome of an individual aggregation experiment. While our data still indicate that Type 1, 2 and 3 polymorphs are strongly selected by pH, the selection between interface variants 3B vs. 3C and 2A vs. 2B might also be affected by protein purity. Our standard purification protocol produces a single band by Coomassie-stained SDS-PAGE; however, minor truncations and other impurities below a few percent would go undetected and, given the proposed roles of the N- and C-termini in secondary nucleation, could have a large effect on polymorph selection and seeding. In line with the reviewer’s comments, we now include a batch number for each EM dataset. While no new conclusions can be drawn from the inclusion of this additional data, we feel that it is important to acknowledge the possible role of batch-to-batch variability.

      (b) To enhance the robustness of the conclusions, additional replicates of the experiments under the same condition should be conducted, ideally a minimum of three times.  

      The pH 5.8 conditions that yield Type 3 fibrils have already been repeated several times in the original manuscript. Since the pH 7.4 conditions produce the most common alpha-Syn polymorph (Type 1A) and were produced twice in this manuscript (once as an unseeded and once as a cross-seeded fibrilization), we decided to focus on the intermediate condition where the most variability had been seen (pH 7.0). The revised Table 1 now has 6 new datasets (11-16) representing 6 independent aggregations at pH 7.0 starting from two different protein purification batches. The result is that we now produce the Type 2A/B polymorphs in three samples, and in two of these samples we once again observed the Type 1M polymorph. The other samples produced Type 1A or non-twisted fibrils.

      (c) Further investigation into whether different polymorphs formed under the same buffer condition could lead to distinct toxicological and pathology effects would be a valuable addition to the study.  

      The correlation of toxicity with structure would in principle be interesting. However, the Type 1 and Type 3 polymorphs formed at pH 5.8 and 7.4 are not likely to be biologically relevant. The pH 7 polymorphs (Type 5 and 1M) would be more interesting because they form under the same conditions and might be related to some disease-relevant structures. Still, it is rare that a single polymorph appears at pH 7.0 (the Type 5 represented only 10-20% of the fibrils in the sample and the Type 1M also had unidentified double-filament fibrils in the sample). We plan to pursue this line of research and hope to include it in a future publication.

      (2) The cross-seeding study presented in the manuscript demonstrates the pivotal role of pH conditions in dictating conformation. However, an intriguing aspect that emerges is the potential role of seed concentration in determining the resultant product structure. This raises a critical question: at what specific seed concentration does the determining factor for polymorph selection shift from pH condition to seed concentration? A methodologically robust approach to address this should be conducted through a series of experiments across a range of seed concentrations. Such an approach could delineate a clear boundary at which seed concentration begins to predominantly dictate the conformation, as opposed to pH conditions. Incorporating this aspect into the study would not only clarify the interplay between seed concentration and pH conditions, but also add a fascinating dimension to the understanding of polymorph selection mechanisms.

      A more complete analysis of the mechanisms of aggregation, including the effect of seed concentration and the resulting polymorph specificity of the process, are all very important for our understanding of the aggregation pathways of alpha-synuclein and are currently the topic of ongoing investigations in our lab.

      Furthermore, the study prompts additional queries regarding the behavior of cross-seeding production under the same pH conditions when employing seeds of distinct conformation. Evidence from various studies, such as those involving E46K and G51D cross-seeding, suggests that seed structure plays a crucial role in dictating polymorph selection. A key question is whether these products consistently mirror the structure of their respective seeds. 

      We thank the reviewer for reminding us to cite these studies as a clear example of polymorph selection by cross-seeding. Unfortunately, it is not 100% clear from the G51D cross seeding manuscript (https://doi.org/10.1038/s41467-021-26433-2) what conditions were used in the cross-seeding since different conditions were used for the seedless wild-type and mutant aggregations… however it appears that the wildtype without seeds was Tris pH 7.5 (although at 37C the pH could have dropped to 7ish) and the cross-seeded wild-type was in Phosphate buffer at pH 7.0. In the E46K cross-seeding manuscript, it appears that pH 7.5 Tris was used for all fibrilizations (https://doi.org/10.1073/pnas.2012435118).  In any event, both results point to the fact that at pH 7.0-7.5 under low-seed conditions (0.5%) the Type 4 polymorph can propagate in a seed specific manner.

      (3) In the Results section of "The buffer environment can dictate polymorph during seeded nucleation", the authors reference previous cell biological and biochemical assays to support the polymorph-specific seeding of MSA and PD patients under the same buffer conditions. This discussion is juxtaposed with recent research that compares the in vivo biological activities of hPFF, ampLB as well as LB, particularly in terms of seeding activity and pathology. Notably, this research suggests that ampLB, rather than hPFF, can accurately model the key aspects of Lewy Body Diseases (LBD) (refer to: https://doi.org/10.1038/s41467-023-42705-5). The critical issue here is the need to reconcile the phenomena observed in vitro with those in in-vivo or in-cell models. Given the low seed concentration reported in these studies, it is imperative for the authors to provide a more detailed explanation as to why the possible similar conformation could lead to divergent pathologies, including differences in cell-type preference and seeding capability.  

      We thank the reviewer for bringing this recent report to our attention. The findings that ampLB and hPFF have different PK digestion patterns and that only the former is able to model key aspects of Lewy Body disease are in support of the seed-specific nature of some types of alpha-synuclein aggregation. We have added this to the discussion regarding the significant role that seed type and seed conditions likely play in polymorph selection.

      (4) In the Method section of "Image processing", the authors describe the helical reconstruction procedure, without mentioning much detail about the 3D reconstruction and refinement process. For the benefit of reproducibility and to facilitate a deeper understanding among readers, the authors should enrich this part to include more comprehensive information, akin to the level of detail found in similar studies (refer to: https://doi.org/10.1038/nature23002).

      As also suggested by reviewer #2, we have now added more comprehensive information on the 3D reconstruction and refinement process.

      (5) The abbreviation of amino acids should be unified. In the Results section "On the structural heterogeneity of Type 1 polymorphs", the amino acids are denoted using three-letter abbreviation. Conversely, in the same section under "On the structural heterogeneity of Type 2 and 3 structures", amino acids are abbreviated using the one-letter format. For clarity and consistency, it is essential that a standardized format for amino acid abbreviations be adopted throughout the manuscript.

      That makes perfect sense and has been corrected.

      Reviewing Editor: 

      After discussion among the reviewers, it was decided that point 2 in Reviewer #3's Public Review (about the experiments with different concentrations of seeds) would probably lie outside the scope of a reasonable revision for this work. 

      We agree as stated above and will continue to work on this important point.

      Revisions Round 2

      Reviewer #2 (Public Review): 

      I do worry that the FSC values of model-vs-map appear to be higher than expected from the corresponding FSCs between the half-maps (e.g. see Fig 13). The implication of this observation is that the atomic models may have been overfitted in the maps, which would have led to a deterioration of their geometry. A table with rmsd on bond lengths, angles, etc would probably show this. In addition, to check for overfitting, the atomic model for each data set could be refined in one of the half-maps, and then that same model could be used to calculate 2 FSC model-vs-map curves: one against the half-map it was refined in and one against the other half-map. Deviations between these two curves are an indication of overfitting. 

      Thank you for the recommendations for model validation. We have added the suggested statistics to Table 2. We also performed the suggested model fitting to one of the half-maps and plotted three FSC model-vs-map curves: one for each half-map versus the model fit against only one half-map, and one for the model fit against the full map. We feel that the degree of overfitting is reasonable and does not significantly impact the quality of the models.
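
      The half-map check described above can be sketched in a few lines. This is a minimal illustration on synthetic volumes, assuming NumPy; it is not the authors' pipeline. The function name `fsc_curve`, the random "maps", and the hypothetical overfitted `model` (which absorbs half of the noise in `half1`) are all assumptions made for the example.

```python
import numpy as np

def fsc_curve(vol_a, vol_b):
    """Fourier shell correlation between two cubic volumes of equal size."""
    assert vol_a.shape == vol_b.shape and len(set(vol_a.shape)) == 1
    n = vol_a.shape[0]
    fa, fb = np.fft.fftn(vol_a), np.fft.fftn(vol_b)
    # Assign every Fourier voxel to an integer-radius shell.
    freq = np.fft.fftfreq(n) * n
    kz, ky, kx = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)
    n_shells = n // 2  # ignore shells beyond the Nyquist radius
    keep = shell < n_shells
    idx = shell[keep]
    # Per-shell sums of the cross term and of each volume's power.
    cross = np.bincount(idx, weights=(fa * np.conj(fb)).real[keep], minlength=n_shells)
    pow_a = np.bincount(idx, weights=(np.abs(fa) ** 2)[keep], minlength=n_shells)
    pow_b = np.bincount(idx, weights=(np.abs(fb) ** 2)[keep], minlength=n_shells)
    return cross / np.sqrt(pow_a * pow_b)

# Synthetic stand-ins (assumption): a shared signal plus independent noise per half-map.
rng = np.random.default_rng(0)
signal = rng.normal(size=(16, 16, 16))
half1 = signal + rng.normal(size=signal.shape)
half2 = signal + rng.normal(size=signal.shape)
# Hypothetical overfitted model: it has absorbed half of the noise in half1.
model = signal + 0.5 * (half1 - signal)

fsc_work = fsc_curve(model, half1)  # against the half-map used for fitting
fsc_free = fsc_curve(model, half2)  # against the independent half-map
# Skip shell 0 (the single DC term) when summarizing the curves.
print("mean FSC_work:", fsc_work[1:].mean())
print("mean FSC_free:", fsc_free[1:].mean())
```

      A persistent gap between the two curves, with the "work" curve higher, is the signature of overfitting that the reviewer describes; curves that track each other closely suggest the model has not fitted noise.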

      In addition, the sudden drop in the FSC curves in Figure 16 shows that something unexpected has happened to this refinement. Are the authors sure that only the procedures outlined in the Methods were used to create these curves? The unexpected nature of the FSC curve for this type (2A) raises doubts about the correctness of the reconstruction. 

      We thank the reviewer for the attention to detail. We should have caught this mistake. It turns out that in the last round of 3D refinement, the two half-maps became shifted with respect to each other in the z direction. We realigned the two maps using Chimera and then re-ran the postprocessing. The new maps have been deposited in EMD-50850. This mistake motivated us to inspect all of the maps and we found the same problem had occurred in the Type 3B maps. This was not noticed by the reviewer because we accidentally plotted the FSC curves from postprocessing from one refinement round before the one deposited in the EMD. We performed the same half-map shifting procedure for the Type 3B data and performed a final round of real-space refinement to produce new maps and models that have been deposited as EMD-50888 and 9FYP (superseding the previous entries).
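
      A relative shift between half-maps of the kind described above can be detected with a cross-correlation. The sketch below is only an illustration, assuming NumPy, integer voxel shifts, periodic boundaries, and volumes indexed as (z, y, x); the authors realigned their maps in Chimera, and `estimate_shift` is a hypothetical helper, not part of any package they used.

```python
import numpy as np

def estimate_shift(ref, mov):
    """Integer (z, y, x) shift that, applied to `mov` with np.roll, best aligns it to `ref`."""
    # Circular cross-correlation via the Fourier transform.
    cc = np.fft.ifftn(np.fft.fftn(ref) * np.conj(np.fft.fftn(mov))).real
    peak = np.unravel_index(np.argmax(cc), cc.shape)
    # Convert wrapped peak indices to signed shifts.
    return tuple(int(p) if p <= s // 2 else int(p) - s for p, s in zip(peak, cc.shape))

# Synthetic stand-ins (assumption): the second half-map is the first one
# displaced by 3 voxels along z, plus a little independent noise.
rng = np.random.default_rng(1)
half1 = rng.normal(size=(16, 16, 16))
half2 = np.roll(half1, -3, axis=0) + 0.1 * rng.normal(size=half1.shape)

shift = estimate_shift(half1, half2)
realigned = np.roll(half2, shift, axis=(0, 1, 2))
print("estimated shift (z, y, x):", shift)
```

      A nonzero estimate along z for two half-maps that should be in register would flag the problem before postprocessing; real maps would also need sub-voxel handling, which this sketch omits.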

      Reviewer #3 (Public Review): 

      There are two minor points I recommend the authors to address: 

      (1) In the response to Weakness 1, point (3), the authors state that "the Type 5 represented only 10-20% of the fibrils in the sample." However, this information is not labeled in the corresponding Figure 4. I suggest the authors verify and label all relevant percentages in the figures to prevent misunderstandings. 

      We aim to be as transparent as possible, and this information was included in the main text. However, we did not label the percentage of Type 5 fibrils in Figure 4 because that would make the other percentages ambiguous. The percentages in Figure 4 represent the ratio of helical segments used for each type of refined structure in the dataset (always adding up to 100%), not the percent of all fibrils in the dataset. That is, there are sometimes untwisted or unidentifiable fibrils in datasets and these were not accounted for in the listed percentages. We have added a sentence to the Figure 4 legend to explain what the percentages refer to.
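
      The distinction between the two kinds of percentage can be made concrete with a small calculation. The counts below are hypothetical and chosen only for illustration (they are not the authors' data), and the "unclassified" bucket stands in for untwisted or unidentifiable fibrils.

```python
# Hypothetical segment counts, for illustration only (not the authors' data).
segments = {"Type 1": 6000, "Type 5": 1500, "unclassified": 2500}

def percentages(counts):
    """Express each count as a percentage of the total of `counts`."""
    total = sum(counts.values())
    return {name: 100.0 * value / total for name, value in counts.items()}

# Percentages among segments that went into a refined structure (sum to 100).
refined = {name: n for name, n in segments.items() if name != "unclassified"}
among_refined = percentages(refined)

# Percentages among everything in the dataset, including unclassified segments.
among_all = percentages(segments)

print(among_refined)
print(among_all)
```

      With these made-up numbers the same polymorph is 20% of the refined segments but 15% of the whole dataset, which is why a single figure label cannot carry both meanings.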

      (2) While the authors have detailed the helical reconstruction procedure in the Methods section, it is necessary to indicate the scale bar or box size in the figure legend of the 2D representative classes to ensure clarity and reproducibility. 

      Thank you for reminding us to add the scale bars. This is now done for the 2D classes in Figures 11-17.

      Recommendations for the authors: 

      Reviewer #2 (Recommendations For The Authors): 

      A critical look at the maps and models of the various structures at this stage may prevent the authors from entering suboptimal structures into the databases.  

      We agree. Thank you for suggesting this.

      Reviewer #3 (Recommendations For The Authors): 

      The authors have responded adequately to these critiques in the revised version of the manuscript. There are two minor points. 

      (1) The authors state that "the Type 5 represented only 10-20% of the fibrils in the sample." However, this information is not labeled in the corresponding Figure 4. I suggest the authors verify and label all relevant percentages in the figures to prevent misunderstandings. 

      (2) While the authors have detailed the helical reconstruction procedure in the Methods section, it is necessary to indicate the scale bar or box size in the figure legend of the 2D representative classes to ensure clarity and reproducibility. 

      Answered in public comments

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Little is known about the local circuit mechanisms in the preoptic area (POA) that regulate body temperature. This carefully executed study investigates the role of GABAergic interneurons in the POA that express neurotensin (NTS). The principal finding is that GABA-release from these cells inhibits neighboring neurons, including warm-activated PACAP neurons, thereby promoting hyperthermia, whereas NTS released from these cells has the opposite effect, causing a delayed activation and hypothermia. This is shown through an elegant series of experiments that include slice recordings alongside matched in vivo functional manipulations. The roles of the two neurotransmitters are distinguished using a cell-type-specific knockout of Vgat as well as pharmacology to block GABA and NTS receptors. Overall, this is an excellent study that is noteworthy for revealing local circuit mechanisms in the POA that control body temperature and also for highlighting how amino acid neurotransmitters and neuropeptides released from the same cell can have opposing physiologic effects. I have only minor suggestions for revision.

      Reviewer #2 (Public Review):

      Summary:

      The study has demonstrated how two neurotransmitters and neuromodulators from the same neurons can be regulated and utilized in thermoregulation.

      The study utilized electrophysiological methods to examine the characteristics and thermoregulation of Neurotensin (Nts)-expressing neurons in the medial preoptic area (MPO). It was discovered that GABA and Nts may be co-released by neurons in MPO when communicating with their target neurons.

      Strengths:

      The study has leveraged optogenetic, chemogenetic, knockout, and pharmacological inhibitors to investigate the release process of Nts and GABA in controlling body temperature.

      The findings are relevant to those interested in the various functions of specific neuron populations and their distinct regulatory mechanisms on neurotransmitter/neuromodulator activities.

      Weaknesses:

      Key points for consideration include:

      (1) The co-release of GABA and Nts is primarily inferred rather than directly proven. Providing more direct evidence for the release of GABA and the co-release of GABA and Nts would strengthen the argument. Further in vitro analysis could strengthen the conclusion regarding this co-releasing process.

      Measurement of Nts concentrations in various brain regions during thermoregulatory responses is part of a future study.

      (2) The differences between optogenetic and chemogenetic methods were not thoroughly investigated. A comparison of in vitro results and direct observation of release patterns could clarify the mechanisms of GABA release alone or in conjunction with Nts under different stimulation techniques.

      A comparison of chemogenetic and optogenetic stimulation methods is not within the scope of this study.

      (3) Neuronal transcripts were mainly identified through PCR, and alternative methods like single-cell sequencing could be explored.

      Single cell transcriptomics of preoptic neurotensinergic neurons will be part of a different study.

      (4) In Figure 6, the impact of GABA released from Nts neurons in MPO on CBT regulation appears to vary with ambient temperatures, requiring a more detailed explanation for better comprehension.

      The different possible roles of GABA in different thermoregulatory circumstances are discussed on lines 555-581.

      (5) The model should emphasize the key findings of the study.

      The model is presented in Fig 8.

      Reviewer #3 (Public Review):

      Summary:

      Understanding the central neural circuits regulating body temperature is critical for improving health outcomes in many disease conditions and in combating heat stress in an ever-warming environment. The authors present important and detailed new data that characterizes a specific population of POA neurons with a relationship to thermoregulation. The new insights provided in this manuscript are exactly what is needed to assemble a neural network model of the central thermoregulatory circuitry that will contribute significantly to our understanding of regulating the critical homeostatic variable of body temperature. These experiments were conducted with the expertise of an investigator with career-long experience in intracellular recordings from POA neurons. They were interpreted conservatively in the appropriate context of current literature.

      The Introduction begins with "Homeotherms, including mammals, maintain core body temperature (CBT) within a narrow range", but this ignores the frequent hypothermic episodes of torpor that mice undergo triggered by cold exposure. Although the author does mention torpor briefly in the Discussion, since these experiments were carried out exclusively in mice, greater consideration (albeit speculative) of the potential for a role of MPO Nts neurons in torpor initiation or recovery is warranted. This is especially the case since some 'torpor neurons' have been characterized as PACAP-expressing and a population of PACAP neurons represent the target of MPO Nts neurons.

      Additional discussion of a possible role of neurotensinergic neurons in the initiation or recovery from torpor is included (lines 593-597).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary:

      The authors profile gene expression, chromatin accessibility, and chromosomal architecture (by Hi-C) in activated CD4 T cells and use this information to link non-coding variants associated with autoimmune diseases with putative target genes. They find over 1000 genes physically linked with autoimmune disease loci in these cells, many of which are upregulated upon T cell activation. Focusing on IL2, they dissect the regulatory architecture of this locus, including the allelic effects of GWAS variants. They also intersect their variant-to-gene lists with data from CRISPR screens for genes involved in CD4 T cell activation and expression of inflammatory genes, finding enrichments for regulators. Finally, they showed that pharmacological inhibition of some of these genes impacts T-cell activation. 

      This is a solid study that follows a well-established canvas for variant-to-gene prioritisation using 3D genomics, applying it to activated T cells. The authors go some way in validating the lists of candidate genes, as well as exploring the regulatory architecture of a candidate GWAS locus. Jointly with data from previous studies performing variant-to-gene assignment in activated CD4 T cells (and other immune cells), this work provides a useful additional resource for interpreting autoimmune disease-associated genetic variation. 

      Suggestions for improvement:

      Autoimmune disease variants were already linked with genes in CD28-stimulated CD4 T cells using chromosome conformation capture, specifically Promoter CHi-C and the COGS pipeline (Javierre et al., Cell 2016; Burren et al., Genome Biol 2017; Yang et al., Nat Comms 2020). The authors cite these papers and present a comparative analysis of their variant-to-gene assignments (in addition to scRNA-seq eQTL-based assignments). Furthermore, they find that the Burren analysis yields a higher enrichment for gold standard genes. 

      The obvious question that the authors don't venture into is why the results are quite different. In principle, this could be due to the differences between: 

      (a) the cell stimulation procedure 

      (b) the GWAS datasets used 

      (c)  the types of assay (Hi-C vs Capture Hi-C) 

      (d) approaches for defining gene-linked regions (loops vs neighbourhoods) 

      (e) how the GWAS signals at gene-linked regions are aggregated (e.g., the flavours of COGS in Javierre and Burren vs the authors' approach)

      Re (a), I'm not sure the authors make it explicitly clear in the main text that the Capture Hi-C-based studies also use *stimulated* CD4 T cells, particularly in the section "Comparative predictive power...". So the cells used are pretty much the same, and the differences likely arise from points (b) to (e).

      It would be useful for the community to understand more clearly what is driving these differences, ideally with some added data. Could the authors, for example, take the PCHi-C data from Javierre/Burren and use their GWAS data and variant-to-gene assignment algorithms? 

      We greatly appreciate the referee’s expert assessment of our work and its value to the field, and we are glad that the referee was enthused by our comparison of the predictive power of the various V2G approaches. A point not emphasized enough in the original version of the manuscript is that we actually did harmonize the various datasets in the way the referee suggests for the precision/recall analysis. We took the contact maps presented from each paper, mapped genes using the same set of GWAS SNPs, and defined all gene-linked regions using our loop calling approach. This has been clarified in the revised version of the manuscript. We have now included a more thoughtful discussion of the possible sources of discrepancy between the different studies included in the comparison, and our thoughts on the potential sources raised by the referee are outlined below:

      (a) The modes of stimulation used are similar between studies, but timepoints and donors did vary, and ours was the only study that sorted naïve CD4+ T cells before stimulation. These aspects could represent a source of variability. 

      (b) The GWAS is not a source of variability because we re-ran the raw data from all the orthogonal studies through our V2G pipeline using the same GWAS as in the current manuscript. 

      (c) The use of HiC vs. Capture HiC is a likely source of variability. The Capture-HiC datasets included in our comparison are lower resolution (i.e. HindIII) but focus higher sequencing depth at promoters compared to our HiC datasets – i.e., Capture-HiC may mis-call loops to the wrong promoters due to lower resolution as we have shown in our previous study [Su, Human Genetics, 2021], and will miss distal SNP interactions at promoters not included in the capture set. While HiC is unbiased in this regard, HiC will fail to call some SNP-promoter loops called by CaptureHiC because the sequencing power is not specifically focused at promoters. 

      (d) For studies using neighborhood approaches, we re-ran the raw data through our loop calling algorithm to connect distal SNP to gene promoters, and regarding (e) above, we ran the raw data through our V2G pipeline to allow a better comparison.

      In addition, given that the authors use Hi-C, a popular method for V2G prioritisation for this type of data is currently ABC (Nasser et al, Nature 2021). Could the authors provide a comparative analysis with respect to the V2G assignments in the paper and, if they see it appropriate, also run ABC-based GWAS integration on their own Hi-C data?

      This is an excellent suggestion, which we have followed in the revised version of our manuscript. It should be noted (and we do so in the text of the revision) that there is an important caveat to bringing in the ABC model. Chromosome conformation-based approaches are biologically constrained (i.e., informed) by the natural structure of chromatin in the nucleus that controls how gene transcription is regulated in cis, and it does so in a way that brings value to GWAS data. However, the ABC model further constrains the input data by imposing non-biological filters that allow the algorithm to be applied, but impose artifactual limitations that may negatively impact interpretation and discovery. In addition to filtering out pseudogenes, bidirectional RNA, antisense RNAs, and small RNAs, the ABC model gene set eliminates genes ubiquitously expressed across tissues (based on the assumption that these genes are driven primarily by elements adjacent to their promoters) and only allows annotation of one promoter per gene, even though the median number of promoters per gene in the human genome is three. In contrast, our chromatin-based V2G removes pseudogenes, but includes lincRNA and small RNAs, and includes all alternative transcription start sites annotated by gencode. 

      To apply the ABC GWAS gene nomination model to our CD4+ T cell chromatin-based V2G data, we used our ATAC-seq data and publicly available CD4+ T cell H3K27ac ChIP-seq data as input, and integrated this with GWAS and the average ENCODE-derived HiC dataset from the original ABC paper. The activity-by-contact model nominated 650 genes, compared to 1836 genes when using our cell type-matched HiC data and analysis pipeline. Only 357 of these genes were nominated by both approaches; 1479 genes nominated by our approach were not nominated by ABC, while 293 genes not implicated by our approach were newly implicated by ABC. To determine how the ABC-constrained approach performs against the HIEI gold standard set, we subjected all datasets used for the comparison depicted in the new Figure 5D to the same promoter filter used by the ABC model as part of the precision-recall re-analysis. First, we found that applying the restricted ABC model promoter annotation to all datasets did not have a large effect on recall; however, the precision of several of the datasets was affected. For example, using the restricted promoter set reduced the precision of our (Pahl) V2G approach and inflated the precision of the nearest gene to SNP metric. Second, the new precision-recall analysis shows that the ABC score-based approach is only half as sensitive at predicting HIEI genes as the chromatin-based V2G approaches. This indicates that constraining GWAS data with cell type- and state-specific 3D chromatin-based data brings more GWAS target gene predictive power than application of the multi-tissue-averaged HiC used by the ABC model. We thank the reviewer for helpful suggestions that have improved the quality of our study.
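
      The overlap counts and the precision-recall comparison described above reduce to simple set arithmetic. The sketch below is illustrative only: the gene identifiers are placeholders sized to match the counts reported in the response, the toy gold standard is invented for the example, and the helper names `overlap_summary` and `precision_recall` are hypothetical rather than taken from the authors' pipeline.

```python
def overlap_summary(a, b):
    """Sizes of the intersection and of each set's private part."""
    return {"shared": len(a & b), "only_a": len(a - b), "only_b": len(b - a)}

def precision_recall(nominated, gold):
    """Precision and recall of a nominated gene set against a gold standard set."""
    hits = len(nominated & gold)
    precision = hits / len(nominated) if nominated else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall

# Placeholder IDs sized to the reported counts: 357 shared genes,
# 1479 nominated only by the chromatin-based V2G, 293 only by ABC.
shared = {f"shared_{i}" for i in range(357)}
chromatin_v2g = shared | {f"v2g_{i}" for i in range(1479)}
abc = shared | {f"abc_{i}" for i in range(293)}

summary = overlap_summary(chromatin_v2g, abc)
print(len(chromatin_v2g), len(abc), summary)

# Toy gold standard (invented) to show how precision and recall are computed.
toy_gold = {"A", "B", "C", "D"}
toy_nominated = {"A", "B", "X"}
toy_precision, toy_recall = precision_recall(toy_nominated, toy_gold)
print(toy_precision, toy_recall)
```

      The reported totals are internally consistent under this arithmetic: 357 + 1479 gives the 1836 genes from the chromatin-based approach, and 357 + 293 gives the 650 genes from ABC.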

      Reviewer #2 (Public Review): 

      Summary:

      There is significant interest in characterizing the mechanisms by which genetic mutations linked to autoimmunity perturb immune processes. Pahl et al. collect information on dynamic accessible regions, genes, and 3D contacts in primary CD4+ T cell samples that have been stimulated ex vivo. The study includes a variety of analyses characterizing these dynamic changes. With TF footprinting they propose factors linked to active regulatory elements. They compare the performance of their variant mapping pipeline that uses their data versus existing datasets. Most compelling there was a deep dive into additional study of regulatory elements nearby the IL2 gene. Finally, they perform a pharmacological screen targeting several genes they suggest are involved in T cell proliferation. 

      Strengths:

      The work done characterizing elements at the IL2 locus is impressive. 

      Weaknesses:

      Missing critical context to evaluate claims. There are extensive studies performed on resting and activated immune cell states (CD4+ T cells and other cell types) and some at multiple time points or concentrations of stimuli that collect ATAC-seq and/or RNA-seq that have been ignored by this study. How do conclusions from previous studies compare to what the authors conclude here? It is impossible to evaluate the claims without this additional context. These are a few studies I am familiar with (the authors should perform a more comprehensive search to be sure they're not ignoring existing observations) that would be important to compare/contrast conclusions:

      - Alasoo, K. et al. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat. Genet. 50, 424-431 (2018).

      - Calderon, D., Nguyen, M.L.T., Mezger, A. et al. Landscape of stimulation-responsive chromatin across diverse human immune cells. Nat Genet 51, 1494-1505 (2019).

      - Gate, R.E., Cheng, C.S., Aiden, A.P. et al. Genetic determinants of co-accessible chromatin regions in activated T cells across humans. Nat Genet 50, 1140-1150 (2018).

      - Glinos, D.A., Soskic, B., Williams, C. et al. Genomic profiling of T-cell activation suggests increased sensitivity of memory T cells to CD28 costimulation. Genes Immun 21, 390-408 (2020).

      - Gutierrez-Arcelus, M., Baglaenko, Y., Arora, J. et al. Allele-specific expression changes dynamically during T cell activation in HLA and other autoimmune loci. Nat Genet 52, 247-253 (2020).

      - Kim-Hellmuth, S. et al. Genetic regulatory effects modified by immune activation contribute to autoimmune disease associations. Nat. Commun. 8, 266 (2017).

      - Ye, C. J. et al. Intersection of population variation and autoimmunity genetics in human T cell activation. Science 345, 1254665 (2014).

      - As a general point, I appreciate it when each claim includes a corresponding effect size and p-value, which helps me evaluate the strength of significance of supporting evidence. 

      We greatly appreciate the referee’s expert assessment of our work and emphasis on the value of our functional follow-up studies. Our precision-recall analyses were not meant to represent an exhaustive comparison of all prior GWAS gene nomination studies, although we agree that this could (and should) be done as part of a separate study in a future manuscript. Instead, we focused on gene nomination studies that 1) analyzed resting and activated human CD4+ T cells, 2) whose experimental design was most comparable to our own studies, and 3) had raw data readily available in the appropriate formats to allow re-analysis and harmonization before comparison. This is a point we did not make sufficiently clear in the original version of the manuscript, but have clarified in the revision. 

      Based on this rationale, we agree that the studies by Gate et al. and Ye et al. should be included in our comparative precision-recall analysis, and we have done so in the revised manuscript. The Gate study reported ATAC-seq peak co-accessibility, caQTL, eQTL, and HiC data, and we now include the resulting gene nominations from these datasets in the precision-recall analysis. These datasets performed poorly with respect to nomination of HIEI genes, likely due to small sample numbers and low sequencing depth compared to the other eQTL and chromatin capture-based studies. The eQTL reported by Ye et al. nominated 15 genes for autoimmune traits, two of which were in the ‘truth’ HIEI set (IL7R and IL2RB). This resulted in low predictive power but a high precision due to the low number of nominated genes compared to the other V2G datasets. As suggested by referee 1, we have also subjected our data to the ‘activity-by-contact’ (ABC) algorithm and have included this dataset in the comparison as well. Please see Figure 5 in the revised manuscript.

      We have elected not to include data from the other studies suggested by the referee for the following reasons: The stimulation paradigm used in the Glinos study is very different from that used in other studies. Also, this study and the study by Calderon did not nominate genes. The studies by Alasoo et al. and Kim-Hellmuth et al. analyzed macrophages, which are not a comparable cell type to CD4+ T cells. The allele-specific eQTL study by Gutierrez-Arcelus et al. included the relevant cell type and activation states, but included a relatively small number of samples (24) and variants (561), and the raw data in dbGAP does not readily allow for re-analysis and harmonization with the other studies. We thank the reviewer for helpful suggestions that have improved the quality of our study.

      Reviewer #3 (Public Review): 

      Summary:

      This paper used RNAseq, ATACseq, and Hi-C to assess gene expression, chromatin accessibility, and chromatin physical associations for native CD4+ T cells as they respond to stimulation through TCR and CD28. With these data in hand, the authors identified 423 GWAS signals to their respective target genes, where most of these were not in the proximal promoter, but rather distal enhancers. The IL-2 gene was used as an example to identify new distal cis-regulatory regions required for optimal IL-2 gene transcription. These distal elements interact with the proximal IL2 promoter region. When the distal enhancer contained an autoimmune SNP, it affected IL-2 gene transcription. The authors also identified genetic risk variants that were associated with genes upon activation. Some of these regulate proliferation and cytokine production, but others are novel.

      Strengths:

      This paper provides a wealth of data related to gene expression after CD4 T cells are activated through the TCR and CD28. An important strength of this paper is that these data were intensively analyzed to uncover autoimmune disease SNPs in cis-acting regions. Many of these could be assigned to likely target genes even though they often are in distal enhancers. These findings help to provide a better understanding concerning the mechanism by which GWAS risk elements impact gene expression. 

      Another strength of this study was the proof-of-principle studies examining the IL-2 gene. Not only were new cis-acting enhancers discovered, but they were functionally shown to be important in regulating IL-2 expression, including susceptibility to colitis. Their importance was also established with respect to such distal enhancers harboring disease-relevant SNPs, which were shown to affect IL-2 transcription. 

      The data from this study were also mined against past CRISPR screens that identified genes that control aspects of CD4 T cell activation. From these comparisons, novel genes were identified that function during T cell activation. 

      Weaknesses:

      A weakness of this study is that few individuals were analyzed, i.e., RNAseq and ATACseq (n=3) and HiC (n=2). Thus, the authors may have underestimated potentially relevant risk associations by their chromatin capture-based methodology. This might account for the low overlap of their data with the eQTL-based approach or the HIEI truth set. 

      Impact:

      This study indicates that defining distal chromatin interacting regions helps to identify distal genetic elements, including relevant variants, that contribute to gene activation. 

      We greatly appreciate the referee’s expert assessment of our work and emphasis on the value of our functional follow-up studies. We have ensured that all sample sizes, effect sizes, p values and FDR statistics are included in the figures and figure legends. We agree that including more donors for the HiC studies would increase the number of implicated variants and genes. However, all the chromatin-based V2G approaches described in our manuscript use relatively small sample sizes, yet implicate more variants and genes than the comparable eQTL studies. That is, the low overlap is not driven by a paucity of GWAS-chromatin-based associations. An alternative explanation for the low overlap between GWAS-chromatin-based approaches and eQTL approaches was recently proposed by Pritchard and colleagues, who reported that GWAS and eQTL studies systematically implicate different types of variants (Mostafavi et al., Nature Genetics 2023). Among other differences, eQTL tend to implicate nearby genes while GWAS variants implicate distant genes, and our results support this contention. We referred to this study in the original version of the manuscript, but have included a more extensive discussion of potential explanations in the revised version. We thank the reviewer for helpful suggestions that have improved the quality of our study.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Ever-improving techniques allow the detailed capture of brain morphology and function to the point where individual brain anatomy becomes an important factor. This study investigated detailed sulcal morphology in the parieto-occipital junction. Using cutting-edge methods, it provides important insights into local anatomy, individual variability, and local brain function. The presented work advances the field and will stimulate future research into this important area.

      Strengths:

      Detailed, very thorough methodology. Multiple raters mapped detailed sulci in a large cohort. The identified sulcal features and their functional and behavioural relevance are then studied using various complementary methods. The results provide compelling evidence for the importance of the described sulcal features and their proposed relationship to cortical brain function.

      We thank the Reviewer for highlighting the strengths of our methods and findings.

      Weaknesses:

      A detailed description/depiction of the various sulcal patterns is missing.

      We agree that adding these details for the newly described sulci is necessary and have now done so. These details are included in the Results (Page 6):

      “Beyond characterizing the incidence of sulci, it is also common in the neuroanatomical literature to qualitatively characterize sulci on the basis of fractionation and intersection with surrounding sulci (termed “sulcal types”; for examples in other cortical expanses, see Chiavaras & Petrides, 2000; Drudik et al., 2023; Miller et al., 2021; Paus et al., 1996; Weiner et al., 2014; Willbrand, Parker, et al., 2022). All four sulci most commonly did not intersect with other sulci (see Supplementary Tables 1-4 for a summary of the sulcal types of the slocs and pAngs dorsal and ventral components). The sulcal types were also highly comparable between hemispheres (rs > .99, ps < .001).”

      And in four new Supplementary Tables.

      A possible relationship between sulcal morphology and individual demographics might provide more insight into anatomical variability.

      We have conducted additional analyses to relate sulcal incidence to demographic features (age and gender). These results are included on Pages 5-6:

      “Given that sulcal incidence and patterning is also sometimes related to demographic features (Cachia et al., 2021; Leonard et al., 2009; Wei et al., 2017), subsequent GLMs relating the incidence and patterning of the three more variable sulci (slocs-d, pAngs-v, and pAngs-d) to demographic features (age and gender) revealed no associations for any sulcus (ps > .05).”

      The unique dataset offers an opportunity to provide insights into laterality effects that should be explored.

      We included hemisphere as a factor in all models for this exact reason. Throughout the paper, we have edited the text to ensure that these laterality effects are more apparent to readers.

      Further, we have a Supplementary Results section on hemispheric effects regarding the slocs-v, cSTS3, and lTOS:

      “Hemispheric asymmetries in morphological, architectural, and functional features with regards to the slocs-v, cSTS3, and lTOS comparison

      We observed a sulcus x metric x hemisphere interaction on the morphological and architectural features of the slocs-v (F(4.20, 289.81) = 4.16, η2 = 0.01, p = .002; the cSTS3 is discussed in the next section). Post hoc tests showed that this interaction was driven by the slocs-v being cortically thinner in the left than the right hemisphere (p < .001; Fig. 2a).

      There was also a sulcus x network x hemisphere interaction on the functional connectivity profiles (using functional connectivity parcellations from Kong et al., 2019) of the slocs-v and lTOS (F(32, 2144) = 3.99, η2 = 0.06, p < .001; the cSTS3 is discussed in the next section). Post hoc tests showed that this interaction was driven by three effects: (i) the slocs-v overlapped more with the Default C subnetwork in the left than the right hemisphere (p = .013), (ii) the lTOS overlapped more with the Visual A subnetwork in the right than the left hemisphere (p = .002), and (iii) the lTOS overlapped more with the Visual B subnetwork in the left than the right hemisphere (p = .002; Fig. 2b).”

      As well as the other STS rami on morphology:

      “It is also worth noting that there was a sulcus x metric x hemisphere interaction (F(4, 284.12) = 6.60, η2 = 0.08, p < .001). Post hoc tests showed that: (i) the cSTS3 was smaller (p < .001) and thinner (p = .025) in the left than the right hemisphere (Supplementary Fig. 8a), (ii) the cSTS2 was shallower (p = .004) and thicker (p < .001) in the right than left hemisphere (Supplementary Fig. 8a), and (iii) the cSTS1 was shallower (p < .001), smaller (p = .002), thinner (p = .001), and less myelinated (p < .001) in the left than the right hemisphere (Supplementary Fig. 8a).”

      And functional connectivity of the STS rami:

      “There was also a sulcus x network x hemisphere interaction (F(32, 2208) = 12.26, η2 = 0.15, p < .001). Post hoc tests showed differences for each cSTS component. Here, the cSTS1 overlapped more with the Auditory network (p < .001), less with the Control B subnetwork (p < .001), more with the Control C subnetwork (p < .001), less with the Default B subnetwork (p < .001), more with the Default C subnetwork (p < .001), more with the Ventral Attention B subnetwork (p < .001), and more with the Visual A subnetwork (p = .024) in the right than in the left hemisphere (Supplementary Fig. 8b). In addition, the cSTS2 overlapped more with the Control B subnetwork (p < .001), more with the Control C subnetwork (p < .001), less with the Default B subnetwork (p < .001), and less with the Temporal-Parietal network (p = .011) in the right than in the left hemisphere (Supplementary Fig. 8b). Finally, the cSTS3 overlapped more with the Control B subnetwork (p = .002), less with the Default B subnetwork (p = .014), more with the Default C subnetwork (p = .022), and less with the Ventral Attention B subnetwork (p = .029) in the right than in the left hemisphere (Supplementary Fig. 8b).”

      Reviewer #2 (Public Review):

      Summary: After manually labeling 144 human adult hemispheres in the lateral parieto-occipital junction (LPOJ), the authors 1) propose a nomenclature for 4 previously unnamed highly variable sulci located between the temporal and parietal or occipital lobes, 2) focus on one of these newly named sulci, namely the ventral supralateral occipital sulcus (slocs-v) and compare it to neighboring sulci to demonstrate its specificity (in terms of depth, surface area, gray matter thickness, myelination, and connectivity), 3) relate the morphology of a subgroup of sulci from the region including the slocs-v to the performance in a spatial orientation task, demonstrating behavioral and morphological specificity. In addition to these results, the authors propose an extended reflection on the relationship between these newly named landmarks and previous anatomical studies, a reflection about the slocs-v related to functional and cytoarchitectonic parcellations as well as anatomic connectivity and an insight about potential anatomical mechanisms relating sulcation and behavior.

      Strengths:

      - To my knowledge, this is the first study addressing the variable tertiary sulci located between the superior temporal sulcus (STS) and intraparietal sulcus (IPS).

      - This is a very comprehensive study addressing altogether anatomical, architectural, functional and cognitive aspects.

      - The definition of highly variable yet highly reproducible sulci such as the slocs-v feeds the community with new anatomo-functional landmarks (which is emphasized by the provision of a probability map in supp. mat., which in my opinion should be proposed in the main body).

      - The comparison of different features between the slocs-v and similar sulci is useful to demonstrate their difference.

      - The detailed comparison of the present study with state of the art contextualizes and strengthens the novel findings.

      - The functional study complements the anatomical description and points towards cognitive specificity related to a subset of sulci from the LPOJ

      - The discussion offers a proposition of theoretical interpretation of the findings

      - The data and code are mostly available online (raw data made available upon request).

      We thank the Reviewer for highlighting the strengths of our methods, analyses, and applications of our findings.

      Weaknesses:

      - While three independent raters labeled all hemispheres, one single expert finalized the decision. Because no information is reported on the inter-rater variability, this somehow equates to a single expert labeling the whole cohort, which could result in biased labellings and therefore affect the reproducibility of the new labels.

      To expedite the process of defining thousands of sulci at the individual level in multiple regions, our group does not use an approach amenable to calculating inter-rater agreement. Our method consists of a two-tiered procedure. Here, authors YT and TG defined sulci, which were then checked by a trained expert (EHW). These were then checked again by the senior author (KSW). We emphasize that this process has produced reproducible anatomical results in other regions such as posteromedial cortex (Willbrand et al., 2023 Science Advances; Willbrand et al., 2023 Communications Biology; Maboudian et al., 2024 The Journal of Neuroscience), ventral temporal cortex (Weiner et al., 2014 NeuroImage; Miller et al., 2020 Scientific Reports; Parker et al., 2023 Brain Structure and Function), and lateral prefrontal cortex (Miller et al., 2021 The Journal of Neuroscience; Voorhies et al., 2021 Nature Communications; Yao et al., 2022 Cerebral Cortex; Willbrand et al., 2022 Brain Structure and Function; Willbrand et al., 2023 The Journal of Neuroscience) across age groups, species, and clinical populations. Further, in the Supplemental Materials we provide post mortem images showing that these sulci exist outside of cortical reconstructions, supporting this updated sulcal schematic of the lateral parieto-occipital junction. We emphasize that, for the present study, only a very small percentage (~2%) of sulcal definitions were actually modified by the time the final tier of our method was reached. We will include an exact percentage in future publications on LPC/LPOJ.

      - 3 out of the 4 newly labeled sulci are only described in the very first part and never reused. This should be emphasized as it is far from obvious at first glance of the article.

      We have edited the Abstract (shown below, on Page 1) and the paper throughout to emphasize our focus on the slocs-v over the other three sulci.

      “After defining thousands of sulci in a young adult cohort, we revised the previous LPC/LPOJ sulcal landscape to include four previously overlooked, small, shallow, and variable sulci. One of these sulci (ventral supralateral occipital sulcus, slocs-v) is present in nearly every hemisphere and is morphologically, architecturally, and functionally dissociable from neighboring sulci. A data-driven, model-based approach, relating sulcal depth to behavior further revealed that the morphology of only a subset of LPC/LPOJ sulci, including the slocs-v, is related to performance on a spatial orientation task.”

      It is worth noting that we have added additional analyses that include the other three newly-characterized sulci in response to Reviewer 1. We first described the relationship between these sulci and demographic features, alongside analyses on the patterning of these sulci, which are included in the Results (Page 6):

      “Beyond characterizing the incidence of sulci, it is also common in the neuroanatomical literature to qualitatively characterize sulci on the basis of fractionation and intersection with surrounding sulci (termed “sulcal types”; for examples in other cortical expanses, see Chiavaras & Petrides, 2000; Drudik et al., 2023; Miller et al., 2021; Paus et al., 1996; Weiner et al., 2014; Willbrand, Parker, et al., 2022). All four sulci most commonly did not intersect with other sulci (see Supplementary Tables 1-4 for a summary of the sulcal types of the slocs and pAngs dorsal and ventral components). The sulcal types were also highly comparable between hemispheres (rs > .99, ps < .001). Though we characterize these sulci in this paper for the first time, the location of these four sulci is consistent with the presence of variable “accessory sulci” in this cortical expanse mentioned in prior modern and classic studies (Supplementary Methods). We could also identify these sulci in post-mortem hemispheres (Supplementary Figs. 2, 3), ensuring that these sulci were not an artifact of the cortical reconstruction process.

      Given that sulcal incidence and patterning is also sometimes related to demographic features (Cachia et al., 2021; Leonard et al., 2009; Wei et al., 2017), subsequent GLMs relating the incidence and patterning of the three more variable sulci (slocs-d, pAngs-v, and pAngs-d) to demographic features (age and gender) revealed no associations for any sulcus (ps > .05). Finally, to help guide future research on these newly- and previously-classified LPC/LPOJ sulci, we generated probabilistic maps of each of these 17 sulci and share them with the field with the publication of this paper (Supplementary Fig. 6; Data availability).”

      - The tone of the article suggests a discovery of these 4 sulci when some of them have already been reported (as rightfully highlighted in the article), though not named nor studied specifically. This is slightly misleading as I interpret the first part of the article as a proposition of nomenclature rather than a discovery of sulci.

      We have toned down our language throughout the paper, emphasizing that this paper updates the sulcal landscape of LPC/LPOJ by taking into account these sulci that have not been comprehensively described previously. For example, in the Abstract (Page 1), we now write:

      “After defining thousands of sulci in a young adult cohort, we revised the previous LPC/LPOJ sulcal landscape to include four previously overlooked, small, shallow, and variable sulci. One of these sulci (ventral supralateral occipital sulcus, slocs-v) is present in nearly every hemisphere and is morphologically, architecturally, and functionally dissociable from neighboring sulci. A data-driven, model-based approach, relating sulcal depth to behavior further revealed that the morphology of only a subset of LPC/LPOJ sulci, including the slocs-v, is related to performance on a spatial orientation task.”

      - The article never mentions the concept of merging of sulcal elements and the potential effect it could have on the labeling of the newly named variable sulci.

      We emphasize that we use multiple surfaces (pial, inflated, smoothwm) to help distinguish intersecting sulci from one another. We include extra text in the Methods (Page 21):

      “We defined LPC/LPOJ sulci for each participant based on the most recent schematics of sulcal patterning by Petrides (2019) as well as pial, inflated, and smoothed white matter (smoothwm) FreeSurfer cortical surface reconstructions of each individual. In some cases, the precise start or end point of a sulcus can be difficult to determine on a surface (Borne et al., 2020); however, examining consensus across multiple surfaces allowed us to clearly determine each sulcal boundary in each individual.”

      Further, upon quantifying the patterning of these variable sulci, we found that they are independent a majority of the time (described in the Results on Page 6):

      “Beyond characterizing the incidence of sulci, it is also common in the neuroanatomical literature to qualitatively characterize sulci on the basis of fractionation and intersection with surrounding sulci (termed “sulcal types”; for examples in other cortical expanses, see Chiavaras & Petrides, 2000; Drudik et al., 2023; Miller et al., 2021; Paus et al., 1996; Weiner et al., 2014; Willbrand, Parker, et al., 2022). All four sulci most commonly did not intersect with other sulci (see Supplementary Tables 1-4 for a summary of the sulcal types of the slocs and pAngs dorsal and ventral components). The sulcal types were also highly comparable between hemispheres (rs > .99, ps < .001).”

      Thus, merging sulcal elements likely had a minimal impact on the present definitions.

      - The definition of the new sulci is solely based on their localization relative to other sulci which are themselves variable (e.g. the 3rd branch of the STS can show different locations and different orientation, potentially affecting the definition of the slocs-v). This is not addressed in the discussion.

      As displayed in our probabilistic maps of these sulci (Supplementary Fig. 6), the cSTS components (2-4) are actually relatively consistent between individuals, and thus, future investigators can utilize these maps to help define these sulci in new hemispheres.
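
      To illustrate the general idea behind such maps, the following is a minimal sketch, not the pipeline used in the study. It assumes that each participant's manual label for one sulcus has already been resampled to a common template surface and stored as a list of 0/1 values per vertex. The function names, the toy data, and the 0.5 threshold are hypothetical and chosen only for illustration.

```python
def probability_map(labels):
    """Per-vertex fraction of participants whose label covers that vertex.

    `labels` is a list of per-participant lists of 0/1 values, all defined
    on the same (assumed) template surface with the same vertex ordering.
    """
    n_participants = len(labels)
    n_vertices = len(labels[0])
    return [
        sum(participant[v] for participant in labels) / n_participants
        for v in range(n_vertices)
    ]


def threshold_map(prob, min_overlap):
    """Keep only vertices where at least `min_overlap` of participants agree."""
    return [p >= min_overlap for p in prob]


# Toy example: 4 participants, 5 vertices (hypothetical data).
toy_labels = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
]

toy_prob = probability_map(toy_labels)      # overlap fraction at each vertex
toy_mask = threshold_map(toy_prob, 0.5)     # vertices with >= 50% overlap
```

      A thresholded map of this kind could then be projected onto a new participant's surface as a starting estimate of where a sulcus is likely to be found, to be refined manually.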

      Nevertheless, there is, of course, individual variability in the location of these sulci, and we do agree that this point brought up by the Reviewer is important. We have now added text to the Limitations section of the Discussion (Pages 15-16):

      “The main limitation of our study is that presently, the most accurate methodology to define sulci, especially the small, shallow, and variable PTS, requires researchers to manually trace each structure on the cortical surface reconstructions. This method is limited due to the individual variability of cortical sulcal patterning (Fig. 1, Supplementary Fig. 5), which makes it challenging to identify sulci, let alone PTS, without extensive experience and practice. However, we anticipate that our probabilistic maps will provide a starting point and, hopefully, expedite the identification of these sulci in new participants. This method is also arduous and time-consuming, which, on the one hand, limits the sample size in terms of number of participants, while on the other, results in thousands of precisely defined sulci. This push-pull relationship reflects a broader conversation in the human brain mapping and cognitive neuroscience fields between a balance of large N studies and “precision imaging” studies in individual participants (Allen et al., 2022; Gratton et al., 2022; Naselaris et al., 2021; Rosenberg and Finn, 2022).”

      - The new sulci are only defined in terms of localization relative to other sulci, and no other property is described (general length, depth, orientation, shape...), making it hard for a new observer to take labeling decisions in case of conflict.

      To help guide future investigators, we now show these metrics for all sulci in Supplemental Figure 7 so that other groups can use general morphology to assist in identifying these sulci.

      - The very assertive tone of the article conveys the idea that these sulci are identifiable certainly in most cases, when by definition these highly variable tertiary sulci are sometimes very difficult to take decisions on.

      The highly variable nature of three of the four putative tertiary sulci (slocs-v, slocs-d, pAngs-v, pAngs-d) described here is why we focused on the slocs-v (as it is identifiable in nearly all hemispheres). However, we have edited our language throughout the text to also emphasize the variability of these sulci. For example, in the Results (Page 5), we now write:

      “In previous research in small sample sizes, neuroanatomists noticed shallow sulci in this cortical expanse (Supplementary Methods and Supplementary Figs. 1-4 for historical details). In the present study, we fully update this sulcal landscape considering these overlooked indentations. In addition to defining the 13 sulci previously described within the LPC/LPOJ, as well as the posterior superior temporal cortex (Methods) (Petrides, 2019) in individual participants, we could also identify as many as four small and shallow PTS situated within the LPC/LPOJ that were highly variable across individuals and uncharted until now (Supplementary Methods and Supplementary Figs. 1-4). Macroanatomically, we could identify two sulci between the cSTS3 and the IPS-PO/lTOS ventrally and two sulci between the cSTS2 and the pips/IPS dorsally. We focus our analyses on the slocs-v since it was identifiable in nearly every hemisphere.”

      - I am not absolutely convinced with the labeling proposed of a previously reported sulcus, namely the posterior intermediate parietal sulcus.

      In defining previously-identified LPC sulci, we followed the previous labeling procedure by Petrides (2019) alongside historical definitions (detailed in Supplementary Figures 1-4). Nevertheless, future deep learning algorithms using these and other data can be used to rectify discrepancies in labeling (e.g., Borne et al., 2020 Medical Image Analysis; Lyu et al., 2021 NeuroImage). We discuss these points in the Limitations section of the Discussion (Pages 16-17):

      “The main limitation of our study is that presently, the most accurate methodology to define sulci, especially the small, shallow, and variable PTS, requires researchers to manually trace each structure on the cortical surface reconstructions. This method is limited due to the individual variability of cortical sulcal patterning (Fig. 1, Supplementary Fig. 5), which makes it challenging to identify sulci without extensive experience and practice. However, we anticipate that our probabilistic maps will provide a starting point and, hopefully, expedite the identification of these sulci in new participants. This should accelerate the process of subsequent studies confirming the accuracy of our updated schematic of LPC/LPOJ. This manual method is also arduous and time-consuming, which, on the one hand, limits the sample size in terms of number of participants, while on the other, results in thousands of precisely defined sulci. This push-pull relationship reflects a broader conversation in the human brain mapping and cognitive neuroscience fields between a balance of large N studies and “precision imaging” studies in individual participants (Allen et al., 2022; Gratton et al., 2022; Naselaris et al., 2021; Rosenberg & Finn, 2022). Though our sample size is comparable to other studies that produced reliable results relating sulcal morphology to brain function and cognition (e.g., Cachia et al., 2021; Garrison et al., 2015; Lopez-Persem et al., 2019; Miller et al., 2021; Roell et al., 2021; Voorhies et al., 2021; Weiner, 2019; Willbrand, Parker, et al., 2022; Willbrand, Voorhies, et al., 2022; Yao et al., 2022), ongoing work that uses deep learning algorithms to automatically define sulci should result in much larger sample sizes in future studies (Borne et al., 2020; Lyu et al., 2021). Finally, the time-consuming manual definitions of primary, secondary, and PTS also limit the cortical expanse explored in each study, thus restricting the present study to LPC/LPOJ.”

      Assuming that the labelling of all sulci reported in the article is reproducible, the different results are convincing and in general, this study achieves its aims in defining more precisely the sulcation of the LPOJ and looking into its functional/cognitive value. This work clearly offers a finer understanding of sulcal pattern in this region, and lacks only little for the new markers to be convincingly demonstrated. An overall coherence of the labelling can still be inferred from the supplementary material which support the results and therefore the conclusions, yet, addressing some of the weaknesses listed above would greatly enhance the impact of this work. This work is important to the understanding of sulcal variability and its implications on functional and cognitive aspects.

      We thank the Reviewer for their positive remarks on the implications of this work.

      Reviewer #3 (Public Review):

      Summary: 72 subjects, and 144 hemispheres, from the Human Connectome Project had their parietal sulci manually traced. This identified the presence of previously undescribed shallow sulci. One of these sulci, the ventral supralateral occipital sulcus (slocs-v), was then demonstrated to have functional specificity in spatial orientation. The discussion furthermore provides an eloquent overview of our understanding of the anatomy of the parietal cortex, situating their new work into the broader field. Finally, this paper stimulates further debate about the relative value of detailed manual anatomy, inherently limited in participant numbers and areas of the brain covered, against fully automated processing that can cover thousands of participants but easily misses the kinds of anatomical details described here.

      Strengths:

      - This is the first paper describing the tertiary sulci of the parietal cortex with this level of detail, identifying novel shallow sulci and mapping them to behaviour and function.

      - It is a very elegantly written paper, situating the current work into the broader field.

      - The combination of detailed anatomy and function and behaviour is superb.

      We thank the Reviewer for their positive remarks on the paper and our findings.

      Weaknesses:

      - The numbers of subjects are inherently limited both in number as well as in typically developing young adults.

      We emphasize that the sample size is limited due to the arduous nature of manually defining sulci; however, we provide probabilistic maps with the publication of this work to help expedite this process for future investigators. Further, with improved deep learning algorithms, the sample sizes in future neuroanatomical studies should be enhanced. We discuss these points in the Limitations section of the Discussion (Pages 16-17):

      “The main limitation of our study is that presently, the most accurate methodology to define sulci, especially the small, shallow, and variable PTS, requires researchers to manually trace each structure on the cortical surface reconstructions. This method is limited due to the individual variability of cortical sulcal patterning (Fig. 1, Supplementary Fig. 5), which makes it challenging to identify sulci without extensive experience and practice. However, we anticipate that our probabilistic maps will provide a starting point and, hopefully, expedite the identification of these sulci in new participants. This should accelerate the process of subsequent studies confirming the accuracy of our updated schematic of LPC/LPOJ. This manual method is also arduous and time-consuming, which, on the one hand, limits the sample size in terms of number of participants, while on the other, results in thousands of precisely defined sulci. This push-pull relationship reflects a broader conversation in the human brain mapping and cognitive neuroscience fields between a balance of large N studies and “precision imaging” studies in individual participants (Allen et al., 2022; Gratton et al., 2022; Naselaris et al., 2021; Rosenberg & Finn, 2022). Though our sample size is comparable to other studies that produced reliable results relating sulcal morphology to brain function and cognition (e.g., Cachia et al., 2021; Garrison et al., 2015; Lopez-Persem et al., 2019; Miller et al., 2021; Roell et al., 2021; Voorhies et al., 2021; Weiner, 2019; Willbrand, Parker, et al., 2022; Willbrand, Voorhies, et al., 2022; Yao et al., 2022), ongoing work that uses deep learning algorithms to automatically define sulci should result in much larger sample sizes in future studies (Borne et al., 2020; Lyu et al., 2021). The time-consuming manual definitions of primary, secondary, and PTS also limit the cortical expanse explored in each study, thus restricting the present study to LPC/LPOJ.”

      - While the paper begins by describing four new sulci, only one is explored further in greater detail.

      Due to the increased variability of three of the four newly-classified sulci, we chose to only focus on the slocs-v given that it was present in nearly all hemispheres. In response to other reviewers, we have conducted additional analyses that also describe these new sulci and potential factors related to their incidence (Page 6):

      “Given that sulcal incidence and patterning is also sometimes related to demographic features (Cachia et al., 2021; Leonard et al., 2009; Wei et al., 2017), subsequent GLMs relating the incidence and patterning of the three more variable sulci (slocs-d, pAngs-v, and pAngs-d) to demographic features (age and gender) revealed no associations for any sulcus (ps > .05).”

      In addition, given that sulcal variability is cognitively (e.g., Amiez et al., 2018 Scientific Reports; Cachia et al., 2021 Frontiers in Neuroanatomy; Garrison et al., 2015 Nature Communications; Willbrand et al., 2022, 2023 Brain Structure & Function), anatomically (e.g., Amiez et al., 2021 Communications Biology; Vogt et al., 1995 Journal of Comparative Neurology), functionally (e.g., Lopez Persem et al., 2019 The Journal of Neuroscience), and translationally (e.g., Yucel et al., 2002 Biological Psychiatry) relevant, future research can investigate these relationships regarding the slocs-d and pAngs components. We have added text to the Limitations section of the Discussion (Pages 17-18) to discuss this:

      “Finally, although we did not focus on the relationship between the other three PTS (slocs-d, pAngs-v, and pAngs-d) to anatomical and functional features of LPC and cognition, given that variability in sulcal incidence is cognitively (Amiez et al., 2018; Cachia et al., 2021; Garrison et al., 2015; Willbrand, Jackson, et al., 2023; Willbrand, Voorhies, et al., 2022), anatomically (Amiez et al., 2021; Vogt et al., 1995), functionally (Lopez-Persem et al., 2019), and translationally (Clark et al., 2010; Le Provost et al., 2003; Meredith et al., 2012; Nakamura et al., 2020; Yücel et al., 2002, 2003) relevant, future work can also examine the relationship between the more variable slocs-d, pAngs-v, and pAngs-d and these features.”

      - There is some tension between calling the discovered sulci new vs acknowledging they have already been reported, but not named.

      We have edited the manuscript throughout to emphasize our primary focus on revising the LPC/LPOJ sulcal landscape to include these often overlooked small, shallow, and variable putative tertiary sulci, rather than using the terms “discovered sulci” and “new.”

      - The anatomy of the sulci, as opposed to their relation to other sulci, could be described in greater detail.

      Beyond the radar plots in the main text which compare specific groupings of sulci, we now show the morphological metrics for all sulci investigated in the present work in Supplemental Figure 7.

      Overall, to summarize, I greatly enjoyed this paper and believe it to be a highly valued contribution to the field.

      We are glad the Reviewer enjoyed reading our paper and thank them for their positive thoughts on the potential impact of this work on the field.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The slocs-v is found in 71 subjects left and right. Is that the same subject?

      No, these are different subjects.

      (2) How were the 72 subjects chosen?

      The subjects were randomly selected from the HCP database, as described in the Methods (Page 18):

      “Here, we used 72 randomly-selected participants, balanced for gender (following the terminology of the HCP data dictionary), from the HCP database (50% female, 22-36 years old, and 90% right-handed; there was no effect of handedness on our behavioral tasks; Supplementary materials) that were also analyzed in several prior studies (Hathaway et al., 2023; Miller et al., 2021, 2020; Willbrand et al., 2023b, 2023c, 2022a).”

      (3) Are there effects of laterality on sulcal pattern? Table?

      We now include sulcal pattern results in the Results section and Supplementary Materials; however, there were no laterality effects on sulcal pattern.

      (4) Depiction/description of common sulcal patterns

      We now include sulcal pattern results in the Results section and Supplementary Materials.

      (5) Is there a relationship between sulcal patterns and demographic features?

      We now include analyses on this in the Results section. There is no relationship between sulcal patterns and demographic features.

      (6) Just for clarity, the sulcal features are studied and extracted in native space?

      Yes, sulcal features are studied and extracted in native space, as described in the Methods section (Page 19):

      “Anatomical T1-weighted (T1-w) MRI scans (0.8 mm voxel resolution) were obtained in native space from the HCP database. Reconstructions of the cortical surfaces of each participant were generated using FreeSurfer (v6.0.0), a software package used for processing and analyzing human brain MRI images (surfer.nmr.mgh.harvard.edu) (Dale et al., 1999; Fischl et al., 1999). All subsequent sulcal labeling and extraction of anatomical metrics were calculated from these native space reconstructions generated through the HCP’s version of the FreeSurfer pipeline (Glasser et al., 2013).”

      (7) The authors use "Gender". Are they referring to biological sex (female/male) or socially defined characteristics (man/woman etc.)?

      The term gender refers to socially defined characteristics, as used by the HCP data dictionary (Methods, Page 18):

      “Here, we used 72 randomly-selected participants, balanced for gender (following the terminology of the HCP data dictionary), from the HCP database (50% female, 22-36 years old, and 90% right-handed; there was no effect of handedness on our behavioral tasks; Supplementary materials) that were also analyzed in several prior studies (Hathaway et al., 2023; Miller et al., 2021, 2020; Willbrand et al., 2023b, 2023c, 2022a).”

      (8) Fig 2. Grey is poorly visible compared to green and blue.

      The shade of gray has been edited to be more distinguishable.

      (9) The relationship between behavior and sulcal features is significant but weak.

      We acknowledge that the morphological-behavioral relationship identified in the present study explains a modest amount of variance; however, the more important aspect of the finding is that multiple sulci identified in the model are recently-characterized sulci in LPC/LPOJ identified by our group and others (Petrides, 2019), and thus, the relationship would have been overlooked or lost if these sulci were not identified. We have added text to the Limitations section of the Discussion (Pages 17-18) to emphasize this point:

      “It is also worth noting that the morphological-behavioral relationship identified in the present study explains a modest amount of variance; however, the more important aspect of our findings is that multiple sulci identified in our model-based approach are recently-characterized sulci in LPC/LPOJ identified by our group and others (Petrides, 2019), and thus, the relationship would have been overlooked or lost if these sulci were not identified.”

      (10) The Limitation section could be expanded.

      We have added additional text to flesh out the Limitations section of the Discussion (Pages 17-18):

      “It is also worth noting that the morphological-behavioral relationship identified in the present study explains a modest amount of variance; however, the more important aspect of our findings is that multiple sulci identified in our model-based approach are recently-characterized sulci in LPC/LPOJ identified by our group and others (Petrides, 2019), and thus, the relationship would have been overlooked or lost if these sulci were not identified. Finally, although we did not focus on the relationship between the other three PTS (slocs-d, pAngs-v, and pAngs-d) to anatomical and functional features of LPC and cognition, given that variability in sulcal incidence is cognitively (Amiez et al., 2018; Cachia et al., 2021; Garrison et al., 2015; Willbrand, Jackson, et al., 2023; Willbrand, Voorhies, et al., 2022), anatomically (Amiez et al., 2021; Vogt et al., 1995), functionally (Lopez-Persem et al., 2019), and translationally (Clark et al., 2010; Le Provost et al., 2003; Meredith et al., 2012; Nakamura et al., 2020; Yücel et al., 2002, 2003) relevant, future work can also examine the relationship between the more variable slocs-d, pAngs-v, and pAngs-d and these features.”

      Reviewer #2 (Recommendations For The Authors):

      First, I would like to thank the authors for their important contribution to the field of sulcal studies and anatomo-functional correlates. My main comments about the work are treated in the public review, and I will only address details in this section. I have detected a number of typos which are harder to report from a document in which lines are not numbered. Could you please submit a numbered document for the next iteration?

      - p2. "hominoid-specific, shallow indentations, or sulci" - can lead to misunderstanding that sulci are hominoid-specific and shallow

      Sentence has been rewritten:

      “Of all the neuroanatomical features to target, recent work shows that morphological features of the shallower, later developing, hominoid-specific indentations of the cerebral cortex (also known as putative tertiary sulci, PTS) are not only functionally and cognitively meaningful, but also are particularly impacted by multiple brain-related disorders and aging (Amiez et al., 2019, 2018; Ammons et al., 2021; Cachia et al., 2021; Fornito et al., 2004; Garrison et al., 2015; Harper et al., 2022; Hathaway et al., 2023; Lopez-Persem et al., 2019; Miller et al., 2021, 2020; Nakamura et al., 2020; Parker et al., 2023; Voorhies et al., 2021; Weiner, 2019; Willbrand et al., 2023b, 2023c, 2022a, 2022b; Yao et al., 2022).”

      - p2. next sentence (starting with "The combination [...]": not clear that you are addressing tertiary sulci here, maybe introduce the concept beforehand?

      The previous sentence (just above) has been edited to introduce putative tertiary sulci beforehand.

      - p5. error in numbering of sulci relative to Fig1. (5,6,7,8 -> 6,7,8,9)

      Sulcal numbering has been fixed.

      -p5. reference to supp mat -> I would have expected the nomenclature used in Borne et al. 2020 to be discussed alongside with the state of the art. How would you relate F.I.P.r.int.1 and F.I.P.r.int.2 to the sulci you describe?

      We thank the Reviewer for bringing up this relevant literature. The F.I.P.r.int.1 and F.I.P.r.int.2 are described as rami of the IPS, whereas the slocs and pAngs are independent, small indentations near the IPS, but not part of the complex. Nevertheless, future work should integrate these two schemes to establish the most comprehensive sulcal map of LPC/LPOJ. We have added text to the Supplementary Methods detailing the differences between the F.I.P.r.int.1 and F.I.P.r.int.2 and the slocs/pAngs:

      “slocs/pAng vs. F.I.P.r.int.1 and F.I.P.r.int.2

      Recent work (Borne et al., 2020; Perrot et al., 2011) identified two intermediate rami of the IPS (F.I.P.r.int.1 and F.I.P.r.int.2) that were not defined in the present investigation. Crucially, the newly classified sulci here (slocs and pAngs) are distinguishable from the two F.I.P.r.int. in that the F.I.P.r.int. are branches coming off the main body of the IPS (Borne et al., 2020; Perrot et al., 2011), whereas the slocs/pAngs are predominantly non-intersecting (“free”) structures that never intersected with the IPS (Supplementary Tables 1-4).”

      - p6. Fig 1.a. labelling discrepancy between line 1 and 2, column 4: the labels 10 and 11 from the inflated hemisphere do not match the labels 10 and 11 in the pial surface. Fig 1.b. swapped label 2 and 3 in the 4th hemisphere

      These aspects of Figure 1 have been edited accordingly.

      - p7. "(iii) the slocs-v was thicker than both the cSTS3 and lTOS" -> the slocs-v showed thicker gray matter?

      The sentence has been adjusted (Page 7):

      “(iii) the slocs-v showed thicker gray matter than both the cSTS3 and lTOS (ps < .001),”

      - p9. Six left hemisphere LPC/LPOJ sulci were related to spatial orientation task performance -> missing

      Fixed (Page 9):

      “Six left hemisphere LPC/LPOJ sulci were related to spatial orientation task performance (Fig. 3a, b).”

      - p14. "Steel and colleagues" -> missing space

      Fixed (Page 14):

      “Furthermore, the slocs-v appears to lie at the junction of scene-perception and place-memory activity (a transition that also consistently co-localizes with the HCP-MMP area PGp) as identified by Steel and colleagues (2021).”

      - p20. Probability maps "we share these maps with the field" -> specify link to data availability

      The link to data availability has been added (Page 21):

      “To aid future studies interested in investigating LPC/LPOJ sulci, we share these maps with the field (Data availability).”

      Reviewer #3 (Recommendations For The Authors):

      No detailed recommendations not already present in the rest of the review.

    1. Author response:

      eLife assessment

      Cav2 voltage-gated calcium channels play key roles in regulating synaptic strength and plasticity. In contrast to mammals, invertebrates like Drosophila encode a single Cav2 channel, raising questions on how diversity in Cav2 is achieved from a single gene. Here, the authors present convincing evidence that two alternatively spliced isoforms of the Cac gene (cacophony, also known as Dmca1A and nightblindA) enable diverse changes in Cav2 expression, localization, and function in synaptic transmission and plasticity. These valuable findings will be of interest to a variety of researchers.

      We suggest replacing “two alternatively spliced isoforms of the Cac gene” by “two alternatively spliced mutually exclusive exon pairs of the Cac gene”. 

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript by Bell et al. describes an analysis of the effects of removing one of two mutually exclusive splice exons at two distinct sites in the Drosophila CaV2 calcium channel Cacophony (Cac). The authors perform imaging and electrophysiology, along with some behavioral analysis of larval locomotion, to determine whether these alternatively spliced variants have the potential to diversify Cac function in presynaptic output at larval neuromuscular junctions. The authors provide valuable insights into how alternative splicing at two sites in the calcium channel alters its function.

      Strengths:

      The authors find that both of the second alternatively spliced exons (I-IIA and I-IIB) that are found in the intracellular loop between the 1st and 2nd set of transmembrane domains can support Cac function. However, loss of the I-IIB isoform (predicted to alter potential beta subunit interactions) results in 50% fewer channels at active zones and a decrease in neurotransmitter release and the ability to support presynaptic homeostatic potentiation. Overall, the study provides new insights into Cac diversity at two alternatively spliced sites within the protein, adding to our understanding of how presynaptic calcium channel function can be regulated by splicing.

      Weaknesses:

      The authors find that one splice isoform (IS4B) in the first S4 voltage sensor is essential for the protein's function in promoting neurotransmitter release, while the other isoform (IS4A) is dispensable. The authors conclude that IS4B is required to localize Cac channels to active zones. However, I find it more likely that IS4B is required for channel stability and leads to the protein being degraded, rather than any effect on active zone localization. More analysis would be required to establish that as the mechanism for the unique requirement for IS4B.

      We agree that we need to explain more clearly why IS4B is unlikely required for channel stability, but instead, likely has a unique function at the presynaptic active zone of fast synapses. We will address this by revising text and by providing additional data. If IS4B was required for evoked release because it supported channel protein stability, then the removal of IS4B should cause protein degradation throughout all sub-neuronal compartments and throughout the CNS, but this is not the case. First, upon removal of IS4B in adult motoneurons (which use cac channels at the presynapse and somatodendritically, Ryglewski et al., 2012) evoked release from axon terminals is abolished (as at the larval NMJ), but somatodendritic cac inward current is present. If IS4B was required for cac channel stability, somatodendritic current should also be abolished. We will add these data to the ms. Second, immunohistochemistry for tagged IS4B channels reveals that these are present not only at presynaptic active zones at the NMJ but also throughout the VNC motor neuropils. Excision of IS4B causes the absence of cac channels from the presynaptic active zones at the NMJ and throughout the VNC neuropils (and accordingly this is lethal). By contrast, tagged IS4A channels (with IS4B excised) are not found at the presynaptic terminals of fast synapses, but instead, in other distinct parts of the CNS. We will also provide data to show this. Together these data are in line with a unique requirement of IS4B at presynaptic active zones (not excluding additional functions of IS4B), whereas IS4A containing cac isoforms mediate different functions.

      We appreciate the additional reviewer suggestions to the authors that we will address point by point when revising the ms. 

      Reviewer #2 (Public Review):

      This study by Bell et al. focuses on understanding the roles of two alternatively spliced exons in the single Drosophila Cav2 gene cac. The authors generate a series of cac alleles in which one or the other mutually exclusive exons are deleted to determine the functional consequences at the neuromuscular junction. They find alternative splicing at one exon encoding part of the voltage sensor impacts the activation voltage as well as localization to the active zone. In contrast, splicing at the second exon pair does not impact Cav2 channel localization, but it appears to determine the abundance of the channel at active zones. Together, the authors propose that alternative splicing at the Cac locus enables diversity in Cav2 function generated through isoform diversity generated at the single Cav2 alpha subunit gene encoded in Drosophila.

      Overall this is an excellent, rigorously validated study that defines unanticipated functions for alternative splicing in Cav2 channels. The authors have generated an important toolkit of mutually exclusive Cac splice isoforms that will be of broad utility for the field, and show convincing evidence for distinct consequences of alternative splicing of this single Cav2 channel at synapses. Importantly, the authors use electrophysiology and quantitative live sptPALM imaging to determine the impacts of Cac alternative splicing on synaptic function. There are some outstanding questions regarding the mechanisms underlying the changes in Cac localization and function, and some additional suggestions are listed below for the authors to consider in strengthening this study. Nonetheless, this is a compelling investigation of alternative splicing in Cav2 channels that should be of interest to many researchers.

      We agree that some additional information on cac isoform localization (in particular for splicing at the IS4 site) will strengthen the manuscript. We will address this by providing additional data and revising text (see responses to reviewers 1 and 3). We are also grateful for the additional reviewer suggestions which we will address point by point when revising the ms.  

      Reviewer #3 (Public Review):

      Summary:

      Bell and colleagues studied how different splice isoforms of voltage-gated CaV2 calcium channels affect channel expression, localization, function, synaptic transmission, and locomotor behavior at the larval Drosophila neuromuscular junction. They reveal that one mutually exclusive exon located in the fourth transmembrane domain encoding the voltage sensor is essential for calcium channel expression, function, active zone localization, and synaptic transmission. Furthermore, a second mutually exclusive exon residing in an intracellular loop containing the binding sites for Caβ and G-protein βγ subunits promotes the expression and synaptic localization of ~50% of CaV2 channels, thereby contributing to ~50% of synaptic transmission. This isoform enhances release probability, as evident from increased short-term depression, is vital for homeostatic potentiation of neurotransmitter release induced by glutamate receptor impairment, and promotes locomotion. The roles of the two other tested isoforms remain less clear.

      Strengths:

      The study is based on solid data that was obtained with a diverse set of approaches. Moreover, it generated valuable transgenic flies that will facilitate future research on the role of calcium channel splice isoforms in neural function.

      Weaknesses:

      (1) Based on the data shown in Figures 2A-C, and 2H, it is difficult to judge the localization of the cac isoforms. Could they analyze cac localization with regard to Brp localization (similar to Figure 3; the term "co-localization" should be avoided for confocal data), as well as cac and Brp fluorescence intensity in the different genotypes for the experiments shown in Figure 2 and 3 (Brp intensity appears lower in the dI-IIA example shown in Figure 3G)? Furthermore, heterozygous dIS4B imaging data (Figure 2C) should be quantified and compared to heterozygous cacsfGFP/+.

      We understand the reviewer’s comment and will do the following to convincingly demonstrate absence of cac from presynaptic active zones upon IS4B excision. First, we will show selective enlargements of IS4A and IS4B with Brp in presynaptic active zones to show distinct cac label in active zones following excision of IS4A but not following excision of IS4B. Second, we will provide Pearson’s co-localization coefficients of Brp with IS4B and with IS4A, respectively. Third, we will reduce the intensity of the green channels in figures 2C and 2H to the same levels as in 2A and B, and H control to allow a fair comparison of cac intensities following excision of IS4B versus excision of IS4A and control. We had increased intensity to show that following excision of IS4B, no distinct cac label is found in active zones, even at high exaggerated image brightness. However, we agree with the reviewer that the bright background hampers interpretation and thus will show the same intensity in all images that need to be compared.
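
      As a rough illustration of the Pearson coefficient mentioned above, the following minimal sketch computes it for paired pixel intensities from two imaging channels. The function name and the intensity values are hypothetical and chosen only for illustration; they are not data or code from the study, and a real analysis would be run on segmented active zone regions with an image analysis package.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long intensity lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("coefficient is undefined for a constant channel")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical pixel intensities along a line through two active zones.
brp = [10, 50, 200, 180, 20, 15, 190, 30]
cac_punctate = [8, 40, 150, 160, 12, 10, 170, 25]   # label follows Brp
cac_diffuse = [60, 58, 61, 59, 62, 60, 58, 61]      # background only

r_punctate = pearson(brp, cac_punctate)
r_diffuse = pearson(brp, cac_diffuse)
```

      A coefficient close to 1 indicates that the two labels rise and fall together across pixels, whereas a diffuse background signal gives a coefficient much closer to 0, independent of how bright the channel is displayed.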

      (2) They conclude that I-II splicing is not required for cac localization (p. 13). However, cac channel number is reduced in dI-IIB. Could the channels be mis-localized (e.g., in the soma/axon)? What is their definition of localization? Could cac be also mis-localized in dIS4B? Furthermore, the Western Blots indicate a prominent decrease in cac levels in dIS4B/+ and dI-IIB (Figure 1D). How do the decreased protein levels seen in both genotypes fit to a "localization" defect? Could decreased cac expression levels explain the phenotypes alone?

      We will precisely define channel localization, and we will explain why it is highly unlikely that the absence of IS4B channels as well as the lower number of I-IIA channels are simply a consequence of reduced expression, but instead of splice variant specific channel function and localization. For example, upon excision of IS4B no cac channels are found at the presynaptic active zones and these synapses are thus non-functional. The isoforms containing the mutually exclusive IS4A exon are expressed and mediate other functions (see also response to reviewer 1) but cannot substitute IS4B containing isoforms at the presynapse. In fact, our Western blots are in line with reduced cac expression if all isoforms that mediate evoked release are missing, again indicating that the presynapse specific cac isoforms cannot be replaced by other cac isoforms (see also below, response to (3)). Feedback mechanisms that regulate cac expression in the absence of presynapse specific cac isoforms are beyond the scope of this study.

      (3) Cac-IS4B is required for Cav2 expression, active zone localization, and synaptic transmission. Similarly, loss of cac-I-IIB reduces calcium channel expression and number. Hence, the major phenotype of the tested splice isoforms is the loss of/a reduction in Cav2 channel number. What is the physiological role of these isoforms? Is the idea that channel numbers can be regulated by splicing? Is there any data from other systems relating channel number regulation to splicing (vs. transcription or post-transcriptional regulation)?

      We will provide additional evidence that mutually exclusive splicing at the IS4 site results in cac channels that localize to the presynaptic active zone (IS4B) versus cac channels that localize to other brain parts and/or other subneuronal compartments (see response to reviewer 1). In addition, we already show in figure 2J that IS4B is required for normal cac HVA current, and we can add data showing that IS4A is not essential for cac HVA current. Similarly, for I-II we find it unlikely that differential splicing regulates channel numbers, but rather splice variant specific functions in different brain parts and different sub-neuronal compartments. To substantiate this interpretation, we will add data from developing adult motoneurons showing that excision of I-IIA causes reduced activity induced calcium influx into dendrites (new data), but it does not reduce channel number at the larval NMJ (figure 4). In our opinion these data are not in line with the idea that splicing regulates cac expression levels, and this in turn, results in specific defects in distinct neuronal compartments. However, we agree that the lack of isoforms with specific functions results in altered overall cac expression levels as indicated by our Western data. If isoforms normally abundantly expressed throughout most neuropils are missing due to exon excision, we indeed find less cac protein in Westerns. By contrast, the lack of isoforms with little abundance has little effect on cac expression levels. This may be the result of unknown feedback mechanisms which are beyond the scope of this study.

      (4) Although not supported by statistics, and as appreciated by the authors (p. 14), there is a slight increase in PSC amplitude in dIS4A mutants (Figure 2). Similarly, PSC amplitudes appear slightly larger (Figure 3J), and cac fluorescence intensity is slightly higher (Figure 3H) in dI-IIA mutants. Furthermore, cac intensity and PSC amplitude distributions appear larger in dI-IIA mutants (Figures 3H, J), suggesting a correlation between cac levels and release. Can they exclude that IS4A and/or I-IIA negatively regulate release? I suggest increasing the sample size for Canton S to assess whether dIS4A mutant PSCs differ from controls (Figure 2E). Experiments at lower extracellular calcium may help reveal potential increases in PSC amplitude in the two genotypes (but are not required). A potential increase in PSC amplitude in either isoform would be very interesting because it would suggest that cac splicing could negatively regulate release.

      There are several possibilities to explain this, but as none of the effects are statistically significant, we prefer to not investigate this in depth. However, given that we cannot find IS4A at the presynaptic active zone, IS4A is unlikely to have a direct negative effect on release probability. Nonetheless, given that IS4A containing cac isoforms mediate functions in other neuronal compartments it may regulate release indirectly by affecting action potential shape. We will provide data in response to the more detailed suggestions to authors that will provide additional insight.

      (5) They provide compelling evidence that IS4A is required for the amplitude of somatic sustained HVA calcium currents. However, the evidence for effects on biophysical properties and activation voltage (p. 13) is less convincing. Is the phenotype confined to the sustained phase, or are other aspects of the current also affected (Figure 2J)? Could they also show the quantification of further parameters, such as CaV2 peak current density, charge density, as well as inactivation kinetics for the two genotypes? I also suggest plotting peak-normalized HVA current density and conductance (G/Gmax) as a function of Vm. Could a decrease in current density due to decreased channel expression be the only phenotype? How would changes in the sustained phase translate into altered synaptic transmission in response to AP stimulation?

      Most importantly, HVA current is mostly abolished upon excision of IS4B (not IS4A, we think the reviewer accidentally mixed up the genotype). This indicates that the cac isoforms that mediate evoked release encode HVA channels. However, the somatodendritic current shown in figure 2J that remains upon excision of IS4B is mediated by IS4A containing cac isoforms. Please note that these never localize to the presynaptic active zone, thus the small inactivating HVA that remains in figure 2J does normally not mediate evoked release. Therefore, the interpretation is that specifically HVA current encoded by IS4B cac isoforms is required for synaptic transmission. Reduced cac current density is not the cause for this phenotype because a specific current component is absent. 

      We agree with the reviewer that a deeper electrophysiological analysis of cac currents mediated by IS4B containing isoforms will be instructive. However, a precise analysis of activation and inactivation voltages and kinetics suffers from space clamp issues in recordings from the soma of such complex neurons (DLM motoneurons of the adult fly). Therefore, we will analyze the currents in a heterologous expression system and present these data to the scientific community as a separate study at a later time point.

      (6) Why was the STED data analysis confined to the same optical section, and not to max. intensity z-projections? How many and which optical sections were considered for each active zone? What were the criteria for choosing the optical sections? Was synapse orientation considered for the nearest neighbor Cac - Brp cluster distance analysis? How do the nearest-neighbor distances compare between "planar" and "side-view" Brp puncta?

      Max. z-projections would be imprecise because they can artificially suggest close proximity of label that is close in x and y but far away in z. Therefore, the analysis was executed in xy-direction of various planes of entire 3D image stacks. We considered active zones of different orientations (Fig. 4C, D). In fact, we searched the entire z-stacks until we found active zones of all orientations shown in figures 4C1-C6 within the same boutons. The same active zone orientations were analyzed for all exon-out mutants with cac localization in active zones. The distance between cac and brp did not change if viewed from the side.
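
      The projection argument can be made concrete with a small sketch. The coordinates below are hypothetical puncta positions in micrometers, not measurements from the study; the point is only that two puncta can appear almost superimposed after a maximum intensity z-projection while being far apart in the image stack.

```python
import math

def distance_xy(p, q):
    """Distance after a z-projection: only x and y are retained."""
    return math.dist(p[:2], q[:2])

def distance_3d(p, q):
    """Distance in the full image stack, using x, y and z."""
    return math.dist(p, q)

# Hypothetical puncta centroids as (x, y, z) in micrometers.
cac_punctum = (1.00, 1.00, 0.20)
brp_punctum = (1.05, 1.00, 1.40)

projected = distance_xy(cac_punctum, brp_punctum)   # about 0.05 um
actual = distance_3d(cac_punctum, brp_punctum)      # about 1.20 um
```

      In this example the projected distance is roughly 24 times smaller than the distance in the stack, which is why a nearest-neighbor analysis on projections can suggest proximity that does not exist.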

      (7) Cac clusters localize to the Brp center (e.g., Liu et al., 2011). They conclude that Cav2 localization within Brp is not affected in the cac variants (p. 8). However, their analysis is not informative regarding a potential offset between the central cac cluster and the Brp "ring". Did they/could they analyze cac localization with regard to Brp ring center localization of planar synapses, as well as Brp-ring dimensions?

      In the top views (planar) we did not find any clear offset in cac orientation to brp between genotypes. This study focuses on cac splice isoform specific localization and function. Possible effects of different cac isoforms on Brp-ring dimensions or other aspects of scaffold structure are not central to our study, in particular given that Brp puncta are clearly present even if cac is absent from the synapse (Fig. 2H), indicating that cac is not instructive for the formation of the Brp scaffold.  

      (8) Given the accelerated PSC decay/ decreased half width in dI-IIA (Fig. 5Q), I recommend reporting PSC charge in Figure 3, and PPR charge in Figures 5A-D. The charge-based PPRs of dI-IIA mutants likely resemble WT more closely than the amplitude-based PPR. In addition, miniature PSC decay kinetics should be reported, as they may contribute to altered decay kinetics. How could faster cac inactivation kinetics in response to single AP stimulation result in a decreased PSC half-width? Is there any evidence for an effect of calcium current inactivation on PSC kinetics? On a similar note, is there any evidence that AP waveform changes accelerate PSC kinetics? PSC decay kinetics are mainly determined by GluR decay kinetics/desensitization. The arguments supporting the role of cac splice isoforms in PSC kinetics outlined in the discussion section are not convincing and should be revised.

      We agree that reporting charge in figure 3 will be informative and will do so. We also understand the reviewer’s concern attributing altered PSC kinetics to presynaptic cac channel properties. We will tone down our interpretation in the discussion and list possible alterations in presynaptic AP shape or Cav2 channel kinetics as alternative explanations (not conclusions). Moreover, we will quantify postsynaptic GluRIIA abundance to test whether altered PSC kinetics are caused by altered GluRIIA expression. In our opinion, the latter is more instructive than mini decay kinetic analysis because this depends strongly on the distance of the recording electrode to the actual site of transmission in these large muscle cells.

      (9) Paired-pulse ratios (PPRs): On how many sweeps are the PPRs based? In which sequence were the intervals applied? Are PPR values based on the average of the second over the first PSC amplitudes of all sweeps, or on the PPRs of each sweep and then averaged? The latter calculation may result in spurious facilitation, and thus to the large PPRs seen in dI-IIB mutants (Kim & Alger, 2001; doi: 10.1523/JNEUROSCI.21-24-09608.2001).

      We agree that the PP protocol and analyses have to be described more precisely in the methods, and we will do so. PPR values are based on the PPRs of each sweep and then averaged. We are aware of the study of Kim and Alger 2001, but it does not affect our data interpretation because all genotypes were analyzed identically, but only the I-IIB excision resulted in the large data spread shown in figure 5.
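
      The two ways of computing a paired-pulse ratio that the reviewer distinguishes can be sketched as follows. The sweep amplitudes are hypothetical and deliberately constructed so that the mean first and second amplitudes are equal; function names are illustrative only.

```python
def ppr_mean_of_ratios(sweeps):
    """Compute the ratio A2/A1 for every sweep, then average the ratios."""
    return sum(a2 / a1 for a1, a2 in sweeps) / len(sweeps)

def ppr_ratio_of_means(sweeps):
    """Average A1 and A2 across sweeps first, then take a single ratio."""
    mean_a1 = sum(a1 for a1, _ in sweeps) / len(sweeps)
    mean_a2 = sum(a2 for _, a2 in sweeps) / len(sweeps)
    return mean_a2 / mean_a1

# Hypothetical (first PSC, second PSC) amplitudes in nA for three sweeps.
# Both pulses have the same mean amplitude (10 nA), but vary between sweeps.
sweeps = [(5.0, 15.0), (10.0, 10.0), (15.0, 5.0)]

per_sweep = ppr_mean_of_ratios(sweeps)    # (3 + 1 + 1/3) / 3, about 1.44
from_means = ppr_ratio_of_means(sweeps)   # 10 / 10 = 1.0
```

      With variable first amplitudes, averaging per-sweep ratios yields a value above 1 even though the mean amplitudes are identical, which is the spurious facilitation described by Kim and Alger (2001). Because the bias grows with the sweep-to-sweep variability of the first PSC, reporting both measures can help a reader judge how much of a large PPR reflects variability rather than facilitation.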

      (10) Could the dI-IIB phenotype be simply explained by a decrease in channel number/ release probability? To test this, I propose investigating PPRs and short-term dynamics during train stimulation at lower extracellular Ca2+ concentration in WT. The Ca2+ concentration could be titrated such that the first PSC amplitude is similar between WT and dI-IIB mutants. This experiment would test if the increased PPR/depression variability is a secondary consequence of a decrease in Ca2+ influx, or specific to the splice isoform.

      In fact, our interpretation is precisely that decreased PSC amplitude upon I-IIB excision is caused mainly by reduced channel number (see discussion page 14, last paragraph to page 15, first paragraph). In addition, we are grateful for the reviewer’s suggestion to titrate the external calcium such that the first PSC amplitude matches the one in ΔI-IIB to test whether altered short term plasticity is solely a function of altered channel number or whether additional causes, such as altered channel properties, also play into this. We will conduct these experiments and include them in the revised manuscript.

      (11) How were the depression kinetics analyzed? How many trains were used for each cell, and how do the tau values depend on the first PSC amplitude? Time constants in the range of a few (5-10) milliseconds are not informative for train stimulations with a frequency of 1 or 10 Hz (the unit is missing in Figure 5H). Also, the data shown in Figures 5E-K suggest slower time constants than 5-10 ms. Together, are the data indeed consistent with the idea that dI-IIB does not only affect cac channel number, but also PPR/depression variability (p. 9)?

      For each animal, the amplitudes of each PSC were plotted over time and fitted with a single exponential. For depression at 1 and 10 Hz, we used one train per animal, and 5-6 animals per genotype (as reflected in the data points in Figs 5H and 5L). Given that the tau values are highly similar between control and excision of I-IIA, but ΔI-IIA tends to have larger single PSC amplitudes, differences in first PSC amplitude do not seem to skew the data (but see also response to comment 10 above). We thank the reviewer for pointing out that tau values in the range of ms are not informative at 1 and 10 Hz stimulations (Figs 5H and 5L). We mis-labeled (or did not label) the axes. The label should read seconds, not milliseconds. We apologize, and this will be corrected accordingly.

      In sum, pending the outcome of additional important control experiments for GluRIIA abundance (see response to comment 8) and titration of control PSC amplitude for the first pulse of paired pulses in ΔI-IIB (see response to comment 10) we will either modify or further support that interpretation.
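
      The single-exponential fit of PSC amplitudes over time described in the response to comment 11 can be sketched as below. This is a minimal, dependency-free illustration that assumes a model of the form A·exp(−t/τ) + C (a steady-state plateau C is our assumption, not stated in the response), scans candidate τ values, and solves A and C by linear least squares. The train parameters and amplitudes are synthetic, and the authors' actual fitting procedure may differ.

```python
import math

def fit_single_exponential(times, amps, tau_grid):
    """Fit amps ~ A * exp(-t / tau) + C.

    For each candidate tau the model is linear in A and C, so both are
    obtained from the 2x2 normal equations; the tau with the smallest
    sum of squared errors is returned together with its A and C.
    """
    n = len(times)
    best = None
    for tau in tau_grid:
        e = [math.exp(-t / tau) for t in times]
        s_e = sum(e)
        s_ee = sum(v * v for v in e)
        s_y = sum(amps)
        s_ey = sum(v * y for v, y in zip(e, amps))
        det = n * s_ee - s_e * s_e
        if abs(det) < 1e-12:
            continue  # regressors are (numerically) collinear for this tau
        a = (n * s_ey - s_e * s_y) / det
        c = (s_y - a * s_e) / n
        sse = sum((a * v + c - y) ** 2 for v, y in zip(e, amps))
        if best is None or sse < best[0]:
            best = (sse, tau, a, c)
    if best is None:
        raise ValueError("no usable tau in tau_grid")
    _, tau, a, c = best
    return tau, a, c

# Synthetic 10 Hz train: 20 stimuli, 0.1 s apart; time axis is in seconds.
times = [i * 0.1 for i in range(20)]
true_tau, true_a, true_c = 0.5, 30.0, 50.0   # hypothetical values (s, nA, nA)
amps = [true_a * math.exp(-t / true_tau) + true_c for t in times]

tau_grid = [0.05 * k for k in range(1, 101)]  # 0.05 s to 5 s
tau, a, c = fit_single_exponential(times, amps, tau_grid)
```

      Keeping the time axis in seconds makes the unit of τ explicit: at 10 Hz the stimuli are 0.1 s apart, so a time constant of a few milliseconds could not be resolved by such a train, whereas a time constant of a few hundred milliseconds to seconds can.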

      (12) The GFP-tagged I-IIA and mEOS4b-tagged I-IIB cac puncta shown in Figure 6N appear larger than the Brp puncta. Endogenously tagged cac puncta are typically smaller than Brp puncta (Gratz et al., 2019). Also, the I-IIA and I-IIB fluorescence sometimes appear to be partially non-overlapping. First, I suggest adding panels that show all three channels merged. Second, could they analyze the area and area overlap of I-IIA and I-IIB with regard to each other and to Brp, and compare it to cac-GFP? Any speculation as to how the different tags could affect localization? Finally, I recommend moving the dI-IIA and dI-IIB localization data shown in Figure 6N to an earlier figure (Figure 1 or Figure 3).

      We will show panels with all three labels merged, as suggested by the reviewer. The apparent size of the puncta could reflect different numbers and types of fluorophores on the different antibodies used, and thus different point spread, chromatic aberration, different laser and detector intensities, etc. We will re-analyze the data to test whether there are systematic differences in size. We do not want to speculate on whether the different tags have any effect on localization precision, both for the reasons mentioned above and because different antibodies can suggest artificial differences in localization precision. We prefer not to move the figure because we believe it is informative to show our finding that active zones usually contain both splice variants together with the finding that only one splice variant is required for PHP.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      In this useful study, the authors report the efficacy, hematological effects, and inflammatory response of the BPaL regimen (containing bedaquiline, pretomanid, and linezolid) compared to a variation in which Linezolid is replaced with the preclinical development candidate spectinamide 1599, administered by inhalation in tuberculosis-infected mice. The authors provide convincing evidence that supports the replacement of Linezolid in the current standard of care for drug-resistant tuberculosis. However, a limitation of the work is the lack of control experiments with bedaquiline and pretomanid only, to further dissect the relevant contributions of linezolid and spectinamide in efficacy and adverse effects.

      We acknowledge a limitation in our study due to the lack of groups receiving bedaquiline or pretomanid monotherapy; however, similar studies examining the contribution of bedaquiline and pretomanid to BPaL have already been published (references #4 and #60 in revised manuscript). Our goal was to compare BPaS with BPaL, with the understanding that TB treatment requires multidrug therapy. We omitted monotherapy groups to reduce the complexity of the studies, because the multidrug groups require very large numbers of animals with very intensive and complex dosing schedules. Even if B or Pa by themselves had better efficacy than the BPa or BPaL combination, patients would not be treated with only B or Pa because of the very high risk of developing drug resistance to B and/or Pa. If drug resistance develops to B or Pa, the field will lose very effective drugs against TB.

      Although the manuscript is well written overall, a re-formulation of some of the stated hypotheses and conclusions, as well as the addition of text to contextualize translatability, would improve value.

      The manuscript has been edited to address these critiques. Answers to individual critiques are below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript is an extension of previous studies by this group looking at the new drug spectinamide 1599. The authors directly compare therapy with BPaL (bedaquiline, pretomanid, linezolid) to a therapy that substitutes spectinamide for linezolid (BPaS). The Spectinamide is given by aerosol exposure and the BPaS therapy is shown to be as effective as BPaL without adverse effects. The work is rigorously performed and analyses of the immune responses are consistent with curative therapy.

      Strengths:

      (1) This group uses 2 different mouse models to show the effectiveness of the BPaS treatment.

      (2) Impressively the group demonstrates immunological correlates associated with Mtb cure with the BPaS therapy.

      (3) Linezolid is known to inhibit ribosomes and mitochondria whereas spectinamide does not. The authors clearly demonstrate the lack of adverse effects of BPaS compared to BPaL.

      Weaknesses:

      (1) Although this is not a weakness of this paper, a sentence describing how the spectinamide would be administered by aerosolization in humans would be welcomed.

      We already reported on the aerodynamic properties of dry powder spectinamide 1599 within #3 HPMC capsules and its delivery from a RS01 Plastiape inhaler device (reference #59 in revised manuscript). To address this critique, we added a last paragraph to the Discussion: “It is proposed that human use of spectinamides 1599 will be administered using a dry powder formulation delivered by the RS01 Plastiape dry powder inhaler” (reference #59 in revised manuscript).

      Reviewer #2 (Public Review):

      Summary:

      Replacing linezolid (L) with the preclinical development candidate spectinamide 1599, administered by inhalation, in the BPaL standard of care regimen achieves similar efficacy, and reduces hematological changes and proinflammatory responses.

      Strengths:

      The authors not only measure efficacy but also quantify histological changes, hematological responses, and immune responses, to provide a comprehensive picture of treatment response and the benefits of the L to S substitution.

      The authors generate all data in two mouse models of TB infection, each reproducing different aspects of human histopathology.

      Extensive supplementary figures ensure transparency. 

      Weaknesses:

      The articulation of objectives and hypotheses could be improved.

      We edited the text to: "The AEs were associated with the long-term administration of the protein synthesis inhibitor linezolid. Spectinamide 1599 (S) is also a protein synthesis inhibitor of Mycobacterium tuberculosis with an excellent safety profile, but which lacks oral bioavailability. Here, we propose to replace L in the BPaL regimen with spectinamide administered via inhalation and we demonstrate that inhaled spectinamide 1599, combined with BPa ––BPaS regimen––has similar efficacy to that of BPaL regimen while simultaneously avoiding the L-associated AEs."

      Reviewer #3 (Public Review):

      Summary:

      In this paper, the authors sought to evaluate whether the novel TB drug candidate, spectinamide 1599 (S), given via inhalation to mouse TB models, and combined with the drugs B (bedaquiline) and Pa (pretomanid), would demonstrate similar efficacy to that of BPaL regimen (where L is linezolid). Because L is associated with adverse events when given to patients long-term, and one of those is associated with myelosuppression (bone marrow toxicity) the authors also sought to assess blood parameters, effects on bone marrow, immune parameters/cell effects following treatment of mice with BPaS and BPaL. They conclude that BPaL and BPaS have equivalent efficacy in both TB models used and that BPaL resulted in weight loss and anemia (whereas BPaS did not) under the conditions tested, as well as effects on bone marrow.

      Strengths:

      The authors used two mouse models of TB that are representative of different aspects of TB in patients (which they describe well), intending to present a fuller picture of the activity of the tested drug combinations. They conducted a large body of work in these infected mice to evaluate efficacy and also to survey a wide range of parameters that could inform the effect of the treatments on bone marrow and on the immune system. The inclusion of BPa controls (in most studies) and also untreated groups led to a large amount of useful data that has been collected for the mouse models per se (untreated) as well as for BPa - in addition to the BPaS and BPaL combinations which are of particular interest to the authors. Many of these findings related to BPa, BPaL, untreated groups, etc corroborate earlier findings and the authors point this out effectively and clearly in their manuscript. To go further, in general, it is a well-written and cited article with an informative introduction.

      Weaknesses:

      The authors performed a large amount of work with the drugs given at the doses and dosing intervals stated, but at present, there is no exposure data available in the paper. It would be of great value to understand the exposures achieved in plasma at least (and in the lung if more relevant for S) in order to better understand how these relate to clinical exposures that are observed at marketed doses for B, Pa, and L as well as to understand the exposure achieved at the doses being evaluated for S. If available as historical data this could be included/cited. Considering the great attempts made to evaluate parameters that are relevant to clinical adverse events, it would add value to understand at what exposures drug effects such as anemia, weight loss, and bone marrow effects are being observed. It would also be of value to add an assessment of whether the weight loss, anemia, or bone marrow effects observed for BPaL are considered adverse, and the extent to which we can translate these effects from mouse to patient (i.e. what are the limitations of these assessments made in a mouse study?). For example, is the small weight loss seen as significant, or is it reversible? Is the magnitude of the changes in blood parameters similar to the parameters seen in patients given L? In addition, it is always challenging to interpret findings for combinations of drugs, so the addition of language to explain this would add value: for example, how confident can we be that the weight loss seen for only the BPaL group is due to L as opposed to a PK interaction leading to an elevated exposure and weight loss due to B or Pa?

      We totally agree with this critique, but the studies suggested by the reviewer are very expensive and logistically/resource intensive. The data reported in this manuscript were used as preliminary data in an R01 application to NIH-NIAID that included the studies proposed above by this reviewer. The authors are glad to report that the application received a fundable score and is currently under consideration for funding by NIH-NIAID. The summary of proposed future studies is included in the last paragraph of the discussion in this revised manuscript.

      Turning to the evaluations of activity in mouse TB models, unfortunately, the evaluations of activity in the BALB/c mouse model as well as the spleens of the Kramnik model resulted in CFU below/at the limit of detection and so, to this reviewer's understanding of the data, comparisons between BPaL and BPaS cannot be made and so the conclusion of equivalent efficacy in BALB/c is not supported with the data shown. There is no BPa control in the BALB/c study, therefore it is not possible to discern whether L or S contributed to the activity of BPaL or BPaS; it is possible that BPa would have shown the same efficacy as the 3 drug combinations. It would be valuable to conduct a study including a BPa control and with a shorter treatment time to allow comparison of BPa, BPaS, and BPaL. 

      We agree with the reviewer that these studies need to be done. Some of them were recently published by our colleague Dr. Lyons (reference #60 in revised manuscript). The studies proposed by the reviewer will be performed under a new award under consideration for funding by the NIH-NIAID; the summary of future studies is included in the last paragraph of the discussion in this revised manuscript.

      In the Kramnik lungs, as the authors rightly note, the studies do not support any contribution of S or L to BPa - i.e. the activity observed for BPa, BPaL, and BPaS did not significantly differ. Although the conclusions note equivalency of BPaL and BPaS, which is correct, it would be helpful to also include BPa in this statement;

      We edited the text as requested; this is now included at line #191.

      It would be useful to conduct a study dosing for a longer period of time or assessing a relapse endpoint, where it is possible that a contribution of L and/or S may be seen - thus making a stronger argument for S contributing an equivalent efficacy to L. The same is true for the assessment of lesions - unfortunately, there was no BPa control meaning that even where equivalency is seen for BPaL and BPaS, the reader is unable to deduce whether L or S made a contribution to this activity.

      Added to the future plans in the last paragraph of the discussion:

      “Future studies are already under consideration for funding by NIH-NIAID to understand the pharmacokinetics of mono, binary and ternary combinations of BPaS. These studies also aim to identify the optimal dose level and dosing frequency of each regimen along with their efficacy and relapse free-sterilization potential. Studies are also planned using a model-based pharmacokinetic-pharmacodynamic (PKPD) framework, guided by an existing human BPa PKPD model (reference #61 in revised manuscript), to find allometric human dose levels, dosing frequencies and treatment durations that will inform the experimental design of future clinical studies.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Although this is not a weakness of this paper, a sentence describing how the spectinamide would be administered by aerosolization in humans would be welcomed.

      A last paragraph was added to the discussion: “It is proposed that human use of spectinamides 1599 will be administered using a dry powder formulation delivered by the RS01 Plastiape dry powder inhaler”. We already reported on the aerodynamic properties of dry powder spectinamide 1599 within #3 HPMC capsules and its delivery from a RS01 Plastiape inhaler device (reference #59 in revised manuscript).

      Reviewer #2 (Recommendations For The Authors):

      Major comments

      The Abstract lacks focus and could more clearly convey the key messages.

      Edited as requested 

      The two mouse models and why they were chosen need to be described earlier. Currently, it's covered in the first section of the Discussion, but the reader needs to understand the utility of each model in answering the questions at hand before the first results are described, either in the introduction or in the opening section of the results.

      Thank you for the suggestion; we agree. We moved the first paragraph of the Discussion to become the last paragraph of the Introduction.

      Line 130: Please justify the doses and dosing frequency for S. A reference to a published manuscript could suffice if compelling.

      The dosing and regimens were previously reported by our groups in refs 21 and 22 in revised manuscript.

      (21) Robertson GT, Scherman MS, Bruhn DF, Liu J, Hastings C, McNeil MR, et al. Spectinamides are effective partner agents for the treatment of tuberculosis in multiple mouse infection models. J Antimicrob Chemother. 2017;72(3):770–7.

      (22) Gonzalez-Juarrero M, Lukka PB, Wagh S, Walz A, Arab J, Pearce C, et al. Preclinical Evaluation of Inhalational Spectinamide-1599 Therapy against Tuberculosis. ACS Infect Dis. 2021;7(10):2850–63. 

      Figures 1 E to H: several "ns" are missing, please add them.

      Edited as requested 

      Line 184 to 190: suggest moving the body weight plots to a Supplemental Figure, and at least double the size of the histology images to convey the message of lines 192-203.

      Please include higher magnification insets to illustrate the histopathological findings. In that same section, please add a sentence or two describing the lesion scoring concept/method. It is a nice added feature, not widespread in the field, and deserves a brief description.

      Edited as requested.  We added detailed description for scoring method in M&M under histopathology and lesion scoring

      Line 206: please add an introductory sentence explaining why one would expect S to cause (or not) hematological disruption, and why MCHC and RDW were chosen initially (they are markers of xyz). The first part of Figure 3 legend belongs to the Methods.

      To address this critique, we added in lines #225-226 “The effect of L in the blood profile of humans and mouse has been reported (references #38-42 in revised manuscript) but the same has not been reported for S”. In lines #229-230 we added “Of 20-blood parameters evaluated, two blood parameters were affected during treatment”.

      The first part of Figure 3 legend belongs to the Methods.

      We edited the Figure 3 legend to “During therapy of mice in Figure 1, the blood was collected at 1, 2- and 4-weeks posttreatment. The complete blood count was collected in VETSCAN® HM5 hematology analyzer (Zoetis)”.

      Line 218: please explain why the 4 blood parameters that are shown were selected, out of the 20 parameters surveyed.

      We added an explanation in lines 239-240: “out 20-blood parameters evaluated, a total of four blood parameters were affected at 2 and 4-weeks-of treatment”.

      Line 243 and again Line 262 (similar to comment Line 206): please add an introductory paragraph explaining the motivation to conduct this analysis and the objective. Can the authors put the experiment in the context of their hypothesis?

      To address this critique, we added in lines #235-237 “The Nix-TB trial associated the long-term administration of L within the BPaL regimen as the causative agent resulting in anemia in patients treated with the BPaL regimen (5).”

      Figure 4C (and the plasma and lung equivalent in the SI). This figure needs adequate labeling of axes: X axis = LOG CFU? Please add tick marks for all plots since log CFU is only shown for the bottom line. Y axes have no units: pg/mL as in B?

      The figure legend was edited to add (Y axis: pg/ml) and (X axis: log10 CFU).

      Line 255-256: please remove "pronounced" and "profound". There is a range of CFU reduction and cytokine reduction, from minor to major. The correlation trend is clear and those words are not needed.

      Edited as requested 

      Line 277-289, Figure 6: given the heterogeneity of a C3HeB/FeJ mouse lung (TB infected), and the very heterogeneous cell population distribution in these lungs (Fig. 6A), the validity of whole lung analysis on 2 or 3 mice (the legend should state what 1, 2 and 3 means, individual mice?) is put into question. "F4/80+ cells were observed significantly higher in BPaS compared to UnRx control": Figure S14 suggests a statistically significant difference, but nothing is said about the other cell type, which appears just as much reduced in BPaS compared to UnRx as F4/80+. Overall, sampling the whole lung for these analyses should be mentioned as a limitation in the Discussion.

      We agree with the reviewer that "visually" it appears as if other populations in addition to F4/80 reach statistical significance. We re-ran the two-way ANOVA with Tukey's test, and only the BPaS versus UnRx comparison for F4/80 is significant.

      We edited Figure S16 (previously S14) to add ns for every comparison.

      Figure 6A was edited: n=2 mice for UnRx and n=3 mice each for BPaL and BPaS.

      Line 355-360: "The BPa and BPaL regimens altered M:E in the C3HeB/FeJ TB model by suppressing myeloid and inducing erythroid lineages" This suggests that altered M:E is not associated with L, putting into question the comparison between BPaS, BPaL, and UnRx. Can the authors comment on how M:E is altered in BPa and not in BPaS?

      Our interpretation of this result was that the addition of S in our BPaS regimen was capable of restoring the M:E ratio altered by BPa and BPaL. This interpretation was included in the main text in lines #263-264 and is now also added to the abstract.

      Line 379: discuss the limitations of working with whole lungs.

      We are sorry, but we do not understand this request. In our studies we always work with whole lungs if the expected course of histopathology/infection among lung lobes is very variable (as is the case for the C3HeB/FeJ TB model).

      Concluding paragraph: "Here we present initial results that are in line with these goals." If such a bold claim is made, there needs to be a discussion on the translatability of the route of administration and the dose of S. Otherwise, please rephrase.

      We added the following last paragraph to discussion:

      To conclude, the TB drug development field is working towards developing shorter and safer therapies with a common goal of developing new multidrug regimens of low pill burden that are accessible to patients, of short duration (ideally 2-3 months) and consist of 3-4 drugs of novel mode-of-action with proven efficacy, safety, and limited toxicity. Here we present initial results for new multidrug regimens containing inhaled spectinamide 1599 that are in line with these goals. It is proposed that human use of spectinamides 1599 will be administered using a dry powder formulation delivered by the RS01 Plastiape dry powder inhaler. We already reported on the aerodynamic properties of dry powder spectinamide 1599 within #3 HPMC capsules and delivered from a RS01 Plastiape inhaler device (reference #59 in revised manuscript). Future studies are already under consideration for funding by NIH-NIAID to understand the pharmacokinetics of mono, binary and ternary combinations of BPaS. These studies also aim to identify the optimal dose level and dosing frequency of each regimen along with their efficacy and relapse free-sterilization potential. Studies are also planned using a model-based pharmacokinetic-pharmacodynamic (PKPD) framework, guided by an existing human BPa PKPD model (references #60 and 61 in revised manuscript), to find allometric human dose levels, dosing frequencies and treatment durations that will inform the experimental design of future clinical studies.

      Minor edits

      Adverse events, not adverse effects (side effects)

      Edited as requested

      BALB/c (not Balb/c, please change throughout).

      Edited as requested

      Line 92: replace 'efficacy' with potency or activity.

      Edited as requested

      "Live" body weight: how is that different from "body weight"? Suggest deleting "live" throughout, or replace with "longitudinally recorded" if that's what is meant, although this is generally implied.

      Edited as requested

      The last line of Figure 2 legend is disconnected. 

      Line 331: delete "human".

      Edited as requested

      Reviewer #3 (Recommendations For The Authors):

      We thank the reviewer for these suggestions. The data presented in this manuscript, with 4 weeks of treatment along with monitoring of the effects of therapy on blood, bone marrow and immunity, have been submitted in an R01 application to NIH-NIAID, which has received a fundable score and is under funding consideration. All the points suggested by the reviewer(s) here are included in the research proposed in the R01 application, including manufacturing and physico-chemical characterization of larger-scale batches of dry powder spectinamides and evaluation of their aerodynamic performance for human or animal use, as well as pharmacokinetics and efficacy studies to determine the optimal dose level and dosing frequency for new multidrug regimens containing spectinamides. These studies include mono, binary and ternary combinations of each multidrug regimen along with their efficacy and relapse-free sterilization potential. These studies will also develop PK/PD simulation-based allometric scaling to aid in human dose projections for inhalation. We hope the reviewer will understand that, altogether, these studies will last 4-5 years.

      Although I truly appreciate the great efforts of the authors, I suggest that in order to better evaluate the contribution of S versus L to BPa in these models, repeat studies be run that:

      (a) include BPa groups to allow the contribution of S and L to be assessed. Included in the research proposed in the R01 application mentioned above.

      (b) use shorter treatment times in BALB/c to allow comparisons at end of Tx CFU above the LOD. We have added new data for 2 weeks of treatment with BPaL and BPaS in BALB/c mice infected with Mtb; these data had been removed from the previous submission of this manuscript.

      (c) use longer treatment times and ideally a relapse endpoint in Kramnik to allow assessment of L and S as contributors to BPa (i.e. give a chance to see better efficacy of BPaL or BPaS versus BPa) and also measure plasma exposures of all drugs (or lung levels if this is the translatable parameter for S) to allow detection of any large DDI and also understand the translation to the clinic. Related to the safety parameters, it would be really great to understand whether or not the observations for BPaL would be labeled adverse in a toxicology study/in a clinical study, and it would be useful to include information on the magnitude of observations seen here versus in the clinic (eg for the hematological parameters).

      The research proposed in the R01 application mentioned above includes extensive PK, extended periods of treatment beyond 1 month (2-5 months, as needed until no culturable bacteria are recovered from organs) and, of course, relapse studies.

      Minor point: I suggest rewording "high safety profile" when describing spectinamides in the intro - or perhaps qualify the length of dosing where the drug is well tolerated

      "high safety profile" was replaced by “an acceptable safety profile”

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment  

      This important study builds on a previous publication (with partially overlapping authors), demonstrating that T. brucei has a continuous endomembrane system, which probably facilitates high rates of endocytosis. Using a range of cutting-edge approaches, the authors present compelling evidence that an actomyosin system, with the myosin TbMyo1 as the molecular motor, is localized close to the endosomal system in the bloodstream form (BSF) of Trypanosoma brucei. It shows convincingly that actin is important for the organization and integrity of the endosomal system, and that the trypanosome Myo1 is an active motor that interacts with actin and transiently associates with endosomes, but a role of Myo1 in endomembrane function in vivo was not directly demonstrated. This work should be of interest to cell biologists and microbiologists working on the cytoskeleton, and unicellular eukaryotes.

      We were delighted at the editors’ positive assessment and the reviewers’ rigorous, courteous, and constructive responses to the paper. We agree that a direct functional role for TbMyo1 in endomembrane activity was not demonstrated in the original submission, but have incorporated some new data (see new supplemental Figure S5) using the TbMyo1 RNAi cell line which are consistent with our earlier observations and interpretations.  

      Public Reviews:   

      Reviewer #1 (Public Review):  

      Using a combination of cutting-edge high-resolution approaches (expansion microscopy, SIM, and CLEM) and biochemical approaches (in vitro translocation of actin filaments, cargo uptake assays, and drug treatment), the authors revisit previous results about TbMyo1 and TbACT in the bloodstream form (BSF) of Trypanosoma brucei. They show that a great part of the myosin motor is cytoplasmic but the fraction associated with organelles is in proximity to the endosomal system. In addition, they show that TbMyo1 can move actin filaments in vitro and visualize for the first time this actomyosin system using specific antibodies, a "classical" antibody for TbMyo1, and a chromobody for actin. Finally, using latrunculin A, which sequesters G-actin and prevents F-actin assembly, the authors show the delocalization and eventually the loss of the filamentous actin signal as well as the concomitant loss of the endosomal system integrity. However, they do not assess the localization of TbMyo1 in the same conditions.  

      Overall the work is well conducted and convincing. The conclusions are not over-interpreted and are supported by the experimental results. 

      We are very grateful to Reviewer 1 for their balanced assessment. The reviewer is correct that we did not assess the localisation of TbMyo1 following latrunculin A treatment, but it is worth noting that Spitznagel et al. carried out this exact experiment in the earlier 2010 paper; we have mentioned this in the revised manuscript.

      Reviewer #2 (Public Review):  

      Summary:  

      The study by Link et al. advances our understanding of the actomyosin system in T. brucei, focusing on the role of TbMyo1, a class I myosin, within the parasite's endosomal system. Using a combination of biochemical fractionation, in vitro motility assays, and advanced imaging techniques such as correlative light and electron microscopy (CLEM), this paper demonstrates that TbMyo1 is dynamically distributed across early and late endosomes, the cytosol, is associated with the cytoskeleton, and a fraction has an unexpected association with glycosomes. Notably, the study shows that TbMyo1 can translocate actin filaments at velocities suggesting an active role in intracellular trafficking, potentially higher than those observed for similar myosins in other cell types. This work not only elucidates the spatial dynamics of TbMyo1 within T. brucei but also suggests its broader involvement in maintaining the complex architecture of the endosomal network, underscoring the critical role of the actomyosin system in a parasite that relies on high rates of endocytosis for immune evasion.

      Strengths:  

      A key strength of the study is its exceptional rigor and successful integration of a wide array of sophisticated techniques, such as in vitro motility assays, and advanced imaging methods, including correlative light and electron microscopy (CLEM) and immuno-electron microscopy. This combination of approaches underscores the study's comprehensive approach to examining the ultrastructural organization of the trypanosome endomembrane system. The application of functional data using inhibitors, such as latrunculin A for actin depolymerization, further strengthens the study by providing insights into the dynamics and regulatory mechanisms of the endomembrane system. This demonstrates how the actomyosin system contributes to cellular morphology and trafficking processes. Furthermore, the discovery of TbMyo1 localization to glycosomes introduces a novel aspect to the potential roles of myosin I proteins within the cell, particularly in the context of organelles analogous to peroxisomes. This observation not only broadens our understanding of myosin I functionality but also opens up new avenues for research into the cellular biology of trypanosomatids, marking a significant contribution to the field. 

      We are very pleased that the Reviewer felt the work is a significant contribution to the state of the art.  

      Weaknesses:  

      Certain limitations inherent in the study's design and scope render the narrative incomplete and make it challenging to reach definitive conclusions. One significant limitation is the reliance on spatial association data, such as colocalization of TbMyo1 with various cellular components-or the absence thereof-to infer functional relationships. Although these data suggest potential interactions, the authors do not confirm functional or direct physical interactions.  

      While TbMyo1's localization is informative, the authors do not directly demonstrate its biochemical or mechanical activities in vivo, leaving its precise role in cellular processes speculative. Direct assays that manipulate TbMyo1 levels, activity, and/or function, coupled with observations of the outcomes on cellular processes, would provide more definitive evidence of the protein's specific roles in T. brucei. A multifaceted approach, including genetic manipulations, uptake assays, kinetic trafficking experiments, and imaging, would offer a more robust framework for understanding TbMyo1's roles. This comprehensive approach would elucidate not just the "what" and "where" of TbMyo1's function but also the "how" and "why," thereby deepening our mechanistic insights into T. brucei's biology.  

      The reviewer is absolutely correct that the study lacks data on direct or indirect interactions between TbMyo1 and its intracellular partners, and this is an obvious area for future investigation. Given the generally low affinities of motor-cargo interactions, a proximity labelling approach (such as has already been successfully used in studies of other myosins) would probably be the best way to proceed. 

      The reviewer is also right to highlight that a detailed mechanistic understanding of TbMyo1 function in vivo is currently lacking. We feel that this would be beyond the scope of the present work, but have included some new data using the TbMyo1 RNAi cell line (Figure S5), which are consistent with our previous findings.  

      Reviewer #3 (Public Review):  

      Summary:  

      In this work, Link and colleagues have investigated the localization and function of the actomyosin system in the parasite Trypanosoma brucei, which represents a highly divergent and streamlined version of this important cytoskeletal pathway. Using a variety of cutting-edge methods, the authors have shown that the T. brucei Myo1 homolog is a dynamic motor that can translocate actin, suggesting that it may not function as a more passive crosslinker. Using expansion microscopy, iEM, and CLEM, the authors show that MyoI localizes to the endosomal pathway, specifically the portion tasked with internalizing and targeting cargo for degradation, not the recycling endosomes. The glycosomes also appear to be associated with MyoI, which was previously not known. An actin chromobody was employed to determine the localization of filamentous actin in cells, which was correlated with the localization of Myo1. Interestingly, the pool of actomyosin was not always closely associated with the flagellar pocket region, suggesting that portions of the endolysosomal system may remain at a distance from the sole site of parasite endocytosis. Lastly, the authors used actin-perturbing drugs to show that disrupting actin causes a collapse of the endosomal system in T. brucei, which they have shown recently does not comprise distinct compartments but instead a single continuous membrane system with subdomains containing distinct Rab markers.  

      Strengths:  

      Overall, the quality of the work is extremely high. It contains a wide variety of methods, including biochemistry, biophysics, and advanced microscopy that are all well-deployed to answer the central question. The data is also well-quantitated to provide additional rigor to the results. The main premise, that actomyosin is essential for the overall structure of the T. brucei endocytic system, is well supported and is of general interest, considering how uniquely configured this pathway is in this divergent eukaryote and how important it is to the elevated rates of endocytosis that are necessary for this parasite to inhabit its host.  

      We are very pleased that the Reviewer formed such a positive impression of the work. 

      Weaknesses:  

      (1) Did the authors observe any negative effects on parasite growth or phenotypes like BigEye upon expression of the actin chromobody?  

      Excellent question! There did appear to be detrimental effects on cell morphology in some cells, and it would definitely be worth doing a time course of induction to determine how quickly chromobody levels reach their maximum. The overnight inductions used here are almost certainly excessive, and shorter induction times would be expected to minimise any detrimental effects. We have noted these points in the Discussion.  

      (2) The Garcia-Salcedo EMBO paper cited included the production of anti-actin polyclonal antibodies that appeared to work quite well. The localization pattern produced by the anti-actin polyclonals looks similar to the chromobody, with perhaps a slightly larger labeling profile that could be due to differences in imaging conditions. I feel that the anti-actin antibody labeling should be expressly mentioned in this manuscript, and perhaps could reflect differences in the F-actin vs total actin pool within cells.  

      Implemented. We have explicitly mentioned the use of the anti-actin antibody in the Garcia-Salcedo paper in the revised Results and Discussion sections.  

      (3) The authors showed that disruption of F-actin with LatA leads to disruption of the endomembrane system, which suggests that the unique configuration of this compartment in T. brucei relies on actin dynamics. What happens under conditions where endocytosis and endocytic traffic are blocked, such as at 4 °C? Are there changes to the localization of the actomyosin components? 

      Another excellent question! We did not analyse the localisation of TbMyo1 and actin under temperature block conditions, but this would definitely be a key experiment to do in follow-up work.

      (4) Along these lines, the authors suggest that their LatA treatments were able to disrupt the endosomal pathway without disrupting clathrin-mediated endocytosis at the flagellar pocket. Do they believe that actin is dispensable in this process? That seems like an important point that should be stated clearly or put in greater context.  

      Whether actin plays a direct or indirect role in endocytosis would be another fascinating question for future enquiry, and we do not have the data to do more than speculate on this point. Recent work in mammalian cells (Jin et al., 2022) has suggested that actin is primarily recruited when endocytosis stalls, and it could be that a similar role is at play here. We have noted this point in the Discussion. The observation of clathrin vesicles close to the flagellar pocket membrane and clathrin patches on the flagellar pocket membrane itself in the LatA-treated cells might suggest that some endocytic activity can occur in the absence of filamentous actin. 

      Recommendations for the authors:

      Note from the Reviewing Editor:  

      During discussion, all reviewers agreed that the role of TbMyo1 in vivo in endomembrane function had not been directly demonstrated. This could be done by testing the endocytic trafficking of (for example) fluorophore-conjugated TfR and BSA in the existing Myo1 RNAi line, using wide-field microscopy. Examining the endosomes/lysosomes' organization by thin-section EM would be even better. The actin signal detected by the chromobody tends to occupy a larger region than the MyoI. It's therefore conceivable that actin filamentation and stabilization via other actin-interacting proteins create the continuous endosomal structure, while MyoI is necessary for transport or other related processes. 

      These are all excellent points and very good suggestions. We have now incorporated new data (supplemental Figure S5) that includes BSA uptake assays in the TbMyo1 RNAi cell line and electron microscopy imaging after TbMyo1 depletion – the results are consistent with our earlier observations.   

      Reviewer #1 (Recommendations For The Authors):  

      -  Figure S2E. This panel is supposed to show the downregulation of TbMyo1 in the PCF compared to BSF but there is no loading control to support this claim. This is important because the authors mention in lines 381-383 that this finding conflicts with the previous study (Spitznagel et al., 2010). The authors also indicate in the figure legend that there is 50% less signal but there is no explanation about this quantification.   

      Good point. Equal numbers of cells were loaded in each lane, but we did not have an antibody against a protein known to be expressed at the same level in both PCF and BSF cells to use as a loading control. Using a total protein stain would have been similarly unhelpful in this context, as the proteomes of PCF and BSF cells are dissimilar. The quantification was made by direct measurement after background subtraction, but without normalisation owing to the lack of a loading control. This makes the conclusion somewhat tentative, but given the large difference in signal observed between the two samples (and the fact that this is consistent with the proteomic data obtained by Tinti and Ferguson) we feel that the conclusion is valid. We have clarified these points in the figure legend and Discussion.  

      -  It is mentioned in the discussion, as unpublished observations, that the predicted FYVE motif of TbMyo1 can bind specifically PI(3)P lipids. This is a very interesting point that would be new and would strengthen the suggested association with the endosomal system mainly based on imaging data. 

      We agree that this is – potentially – a very exciting observation and it is an obvious direction for future enquiry.  

      The data are preliminary at this stage and will form the basis of a future publication. Given that the predicted FYVE domain of TbMyo1 and the known lipid-binding activity of other class I myosins make this activity not wholly unexpected, we feel that it is acceptable at this stage to highlight these preliminary findings.  

      -  The authors use the correlation coefficient to estimate the colocalization (lines 223-226). Although they clearly explain the difference between the correlation coefficient and the co-occurrence of two signals, I wonder if it would not be clearer for the audience to have quantification of the overlapping signals. Also, it is not mentioned on which images the correlation coefficient was measured. It seems that it is from widefield images (Figures 3E and 6E), and likely from SIM images for Figure 3C but the resolution is different. Are widefield images sufficient to assess these measurements? 

      With hindsight, and given the different topological locations of TbMyo1 and the cargo proteins (cytosolic and lumenal, respectively) it would probably have been wiser to measure co-occurrence rather than correlation, but we would prefer not to repeat the entire analysis at this stage. The correlations were measured from widefield images using the procedure described in the Materials & Methods. These are obviously lower resolution than confocal or SIM images would be, but are still of value, we believe. One further point – upon re-examination of some of the TbMyo1 transferrin (Tf) and BSA data, we noticed that there are many pixels with a value of 0 for Tf/BSA and a nonzero value for TbMyo1 and vice-versa. The incidence of zero-versus-nonzero values in the two channels will have lowered the correlation coefficient, and in this sense, the correlation coefficients are giving us a hint of what the immuno-EM images later confirm: that the TbMyo1 and cargo are present in the same locations, but in different proportions. We have added this point to the discussion.  
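
      The distinction drawn above between correlation and co-occurrence can be illustrated with a minimal sketch. It is not the analysis pipeline used in the manuscript: the pixel values are invented toy data, and the function names `pearson` and `manders_fraction` are hypothetical helpers introduced only for this illustration. It shows how pixels that are zero in one channel and nonzero in the other pull the Pearson coefficient down, even when a co-occurrence measure still reports substantial overlap.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length pixel lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def manders_fraction(x, y, threshold=0):
    """Co-occurrence: fraction of total x intensity in pixels where y > threshold."""
    return sum(a for a, b in zip(x, y) if b > threshold) / sum(x)

# Toy data (assumed, not measured): four pixels where both signals are present
# in a fixed proportion, so the two channels are perfectly correlated.
myo_overlap = [10, 20, 30, 40]
cargo_overlap = [5, 10, 15, 20]

# The same pixels plus four pixels where one channel is zero and the other is not.
myo_mixed = myo_overlap + [0, 0, 25, 35]
cargo_mixed = cargo_overlap + [12, 18, 0, 0]

r_overlap = pearson(myo_overlap, cargo_overlap)
r_mixed = pearson(myo_mixed, cargo_mixed)
m_mixed = manders_fraction(myo_mixed, cargo_mixed)

print(f"Pearson, overlapping pixels only: {r_overlap:.3f}")
print(f"Pearson, with zero-versus-nonzero pixels: {r_mixed:.3f}")
print(f"Co-occurrence fraction of channel 1: {m_mixed:.3f}")
```

      In this toy case the co-occurrence fraction stays well above zero while the correlation coefficient drops sharply, which is the pattern the response describes for signals that share locations but differ in proportion.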

      -  It would be good to know if the loss of the endosomal system integrity (using EBI) is the same upon TbMyo1 depletion as in the latrunculin A treated parasites. 

      We agree! We have now included new data (Figure S5) that suggests endosomal system morphology is altered upon TbMyo1 depletion. We would predict that the effect upon TbMyo1 depletion is slower or less dramatic than upon LatA treatment (as LatA affects both actin and TbMyo1, given that TbMyo1 depends upon actin for its localisation).

      -  Conversely, it would be of interest to see how the localization of TbMyo1 changes upon latrunculin A treatment.

      This experiment was done in 2010 by Spitznagel et al., who observed a delocalisation of the TbMyo1 signal after LatA treatment. We have noted this in the Results and Discussion.

      Minor corrections:  

      -  Line 374: Figure S1 should be Figure S2. 

      Implemented (many thanks!).  

      -  Panel E of Figure S2 refers to TbMyo1 and should therefore be included in Figure S1 and not S2. 

      We would prefer not to implement this suggestion. We did struggle over the placing of this panel for exactly this reason, but as the samples were obtained as part of the experiments described in Figure S2, we felt that its placement here worked best in terms of the narrative of the manuscript.    

      -  Figure S2F: the population of TbMyo21 +Tet seems lost after 48 h although the authors mention that there is no growth defect. 

      Good eyes! We have re-added the panel, which shows that there was no growth defect in the tetracycline-treated population.  

      Reviewer #2 (Recommendations For The Authors):  

      Fig 1 vs. Figure 3: The biochemical fractionation experiments have been well-controlled, showing that 40% of TbMyo1 is found in both the cytosolic and cytoskeletal fractions, with only 20% in the organelle-associated fraction. The conclusion is supported by the experimental design, which includes controls to rule out cross-contamination between fractions. However, does this contrast with the widefield microscopy experiments, where the vast majority of the signal is in endocytic compartments and nowhere else? 

      This is a good point. There are three factors that probably explain this. First, given that the actin cytoskeleton is associated with the endosomal system, a large proportion of the material partitioning into the cytoskeleton (P2) fraction is probably localised to the endosomal system (a fun experiment would be to repeat the fractionation with addition of ATP to the extraction buffer to make the myosin dissociate and see whether more appeared in the SN2 fraction as a result). Second, the 40% of the TbMyo1 that is cytosolic is distributed throughout the entire cellular volume, whereas the material localised to the endosomes is concentrated in a much smaller space and therefore produces a stronger signal. Third, the widefield microscopy images have had brightness and contrast adjusted in order to reduce “background” signal, though this will also include cytosolic molecules. We hope these explanations are satisfactory, but would welcome any additional thoughts from either the reviewer or the community.  

      The section title 'TbMyo1 translocates filamentous actin at 130 nm/s' could mislead readers by not specifying that the findings are from an in vitro experiment with a recombinant protein, which may not fully reflect the cell's complex context. Although this detail is noted in the figure legend, incorporating it into the main text and considering a title revision would ensure clarity and accuracy.  

      Good point. Implemented – we have amended the section title to “TbMyo1 translocates filamentous actin at 130 nm/s in vitro” and the figure legend title to “TbMyo1 translocates filamentous actin in vitro”.  

      The discussion of the translocation experiment could be better phrased addressing certain limitations. The in vitro conditions might not fully capture the complexity and dynamic nature of cellular environments where multiple regulatory mechanisms, interacting partners, and cellular compartments come into play. 

      Good point, implemented. We have added a note on this to the Discussion.  

      It is puzzling that RNAi, which is widely used in T. brucei, was not used to further investigate the functional roles of TbMyo1 in Trypanosoma brucei, given that the authors already had the cell line and used it to validate the specificity of the anti-TbMyo1 antibody. RNAi could have been employed to knock down TbMyo1 expression and observe the resultant effects on actin filament dynamics and organization within the cell. This would have directly tested TbMyo1's contribution to actin translocation observed in the in vitro experiments. 

      It would obviously be interesting to carry out an in-depth characterisation of the phenotype following TbMyo1 depletion and whether this has an effect on actin dynamics. We have now included additional data (supplemental Figure S5) using the TbMyo1 RNAi cells and the results are consistent with our earlier observations and interpretations. It is worth noting too that at least for electron microscopy studies of intracellular morphology, the slower onset of an RNAi phenotype and the asynchronous replication of T. brucei populations make observation of direct (early) effects of depletion challenging – hence the preferential use of LatA here to depolymerise actin and trigger a faster phenotype.  

      I found that several declarative statements within the main text may not be fully supported by the overall evidence. I suggest modifications to present a more balanced view:  

      Line 227: "The results here suggest that although the TbMyo1 distribution overlaps with that of endocytic cargo, the signals are not strongly correlated." This conclusion about the lack of strong correlation might mislead readers about the functional relationship between TbMyo1 and endocytic cargo, as colocalization does not directly imply functional interaction. 

      We would prefer not to alter this statement. It was our intention to phrase this cautiously, as we have not directly investigated the functional interplay between TbMyo1 and endocytic cargo and the subsequent sentence directs the reader to the Discussion for more consideration of this issue.    

      Line 397: "This relatively high velocity might indicate that TbMyo1 is participating in intracellular trafficking of BSF T. brucei and functioning as an active motor rather than a static tether." The statement directly infers TbMyo1's functional role from in vitro motility assay velocities without in vivo corroboration.

      We have amended the sentence in the Discussion to make it clear that it is speculative.  

      The hypothesis that cytosolic TbMyo1 adopts an auto-inhibited "foldback" configuration, drawn by analogy with findings from other studies, is intriguing. Yet, direct evidence linking this configuration to TbMyo1's function in T. brucei is absent from the data presented. 

      We have amended the sentence in the Discussion to make it clear that it is speculative. Future in vitro experiments will test this hypothesis directly.  

      The suggestion that a large cytosolic fraction of TbMyo1 indicates dynamic behavior, high turnover on organelles, and a low duty ratio is plausible but remains speculative without direct experimental evidence. Measurements of TbMyo1 turnover rates or duty ratios in T. brucei through kinetic studies would substantiate this claim with the necessary evidence.  

      We have amended the sentence in the Discussion to make it clear that it is speculative, and deleted the reference to a possible low duty ratio. Again, future in vitro experiments will measure the duty ratio of TbMyo1 using stopped-flow. 

      Reviewer #3 (Recommendations For The Authors):  

      Lines 171-172: The authors mention that MyoI could be functioning as a motor rather than a tether. The differences in myosin function have not been introduced prior to this. I would recommend explaining these differences and what it could mean for the function of the motor in the introduction to help a non-expert audience.

      Good point. Implemented.  

      Line 94-95: This phenotype only holds for the bloodstream form; the procyclic form is quite resistant to actin RNAi and MyoI RNAi. I would clarify. 

      Good point. Implemented.  

      Line 142-146: did the authors attempt to knock out the Myo21? 

      Good point. No, this was not attempted. Given the extremely low expression levels of TbMyo21 in the BSF cells we would not expect a strong phenotype, but this assumption would be worth testing. 

      Figure 3D: is there a reason why the authors chose to show the single-channel images in monochrome in this case?  

      Not especially. These panels are the only ones that show a significant overlap in the signals between the two channels (unlike the colabelling experiments with ER, Golgi), so greyscale images were used because of their higher contrast. 

      Line 397-398: I'm struggling a bit to understand how MyoI could be involved in intracellular trafficking in the endosomal compartments if the idea is that we have a continuous membrane? Some more detail as to the authors' thinking here would be useful. 

      Implemented. We have noted that this statement is speculative, and emphasised that being an active motor does not automatically mean that it is involved in intracellular traffic – it could instead be involved in manipulating endosomal membranes. We have noted too that the close proximity between TbMyo1 and the lysosome (Figures 3-5) could be important in this regard. The lysosome is not contiguous with the endosomal system, and it is possible that TbMyo1 is working as a motor to transport material (class II clathrin-coated vesicles) from the endosomal system to the lysosome.  

      Line 493-496: Does this mean that endocytosis from the FP does not require actin? This would be hard to explain considering the phenotypes observed in the original actin RNAi work. Is the BigEye phenotype observed in BSF actin RNAi and Myo1 RNAi cells due to some indirect effect? 

      It seems possible that actin is not directly or essentially involved in endocytosis, and the characterisation of the actin RNAi phenotype would be worth revisiting in this respect – we have noted this in the Discussion. Although RNAi of actin was lethal, the phenotype appears less penetrant than that seen following depletion of the essential endocytic cofactor clathrin (based on the descriptions in Garcia-Salcedo et al., 2004 and Allen et al., 2003). BigEye phenotypes occur in BSF cells whenever there is some perturbation of endomembrane trafficking and are not necessarily a direct consequence of depletion – this is why careful investigation of early timepoints following RNAi induction is critical.

    1. Author response:

      We are very appreciative of the reviewers’ assessment that we used “solid and creative” methods to provide a “convincing demonstration” of “compelling theoretical results” on a “crucial but less-explored issue” in cognitive neuroscience. We are also grateful for their thoughtful suggestions for analyses and for pointing out areas where our analysis descriptions need more clarity. While we will respond to all comments in a future response and revision, here we provide information and clarification on a few central points.

      Localization of semantic content:

      Regarding our semantic analysis, one reviewer rightly pointed out that items with a high degree of semantic association, as captured by word2vec, tend to occur in the same images, and they expressed concern that this could drive our similarity results. We wish to clarify here (and will revise the manuscript accordingly) that we excluded all pairs of co-occurring items in our word2vec semantic analysis in order to avoid this issue. Thus, our results cannot be driven by the number of images within which items co-occurred. We also agree with the reviewer who stated that “semantic information” is a nebulous term in the cognitive neurosciences, and it appears to have led to some confusion as to the nature of our claims. We take a broad view of this term, with the perspective that visual features (e.g., color, shape) can contribute to semantic content rather than necessarily competing with it. In our work, we use word2vec to identify neural representations that reflect the kind of semantic content present in word embedding models—but the conclusions we draw do not depend on these representations being devoid of visual content. That is, we do not use word2vec to examine semantic versus visual representations, but rather to narrow down the set of representations to be considered in subsequent analyses. While there are a range of legitimate views on what should be considered a “semantic” representation, our broad view, which is inclusive of visual content, along with our strategy for localizing semantic content are both standardly used in the visual neuroscience literature. 
Prior work in this literature has compared the ability of word2vec and low-level visual models to predict neural responses to natural images and found that the brain regions in which activity is accurately predicted by the models are considerably distinct: whereas a low-level visual model best predicts activity in V1, V2, and V4, word2vec performs better in more anterior regions, including in visual areas such as lateral occipital cortex (Güçlü & van Gerven, 2015, arXiv). This suggests that our effects are unlikely to be explained by overlap in the kinds of low-level visual features mentioned by the reviewers. However, the semantic content we localize and the representation of high-level visual features may indeed overlap, and this is compatible with our claims. We will do more in our revision to be explicit about our intended meaning in our use of the word “semantic” and how our approach relates to and builds on prior work in this literature.
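
      The exclusion step described above (dropping every pair of items that appears together in any image before computing semantic similarity) can be sketched as follows. This is an illustrative sketch only, not the authors' code: the three-dimensional vectors stand in for real word2vec embeddings, the image lists are invented, and `non_cooccurring_pairs` is a hypothetical helper name.

```python
from itertools import combinations
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def non_cooccurring_pairs(items, images):
    """Return item pairs (alphabetically ordered) that never share an image."""
    cooccurring = set()
    for image_items in images:
        for pair in combinations(sorted(set(image_items)), 2):
            cooccurring.add(pair)
    return [pair for pair in combinations(sorted(items), 2)
            if pair not in cooccurring]

# Toy stand-ins for word embeddings (assumed values, not real word2vec output).
embeddings = {
    "dog": (0.9, 0.1, 0.0),
    "leash": (0.8, 0.3, 0.1),
    "cat": (0.85, 0.2, 0.05),
    "car": (0.0, 0.2, 0.9),
}

# Toy image annotations: each inner list names the items present in one image.
images = [["dog", "leash"], ["cat"], ["car", "dog"]]

pairs = non_cooccurring_pairs(embeddings, images)
semantic_similarity = {
    (a, b): cosine(embeddings[a], embeddings[b]) for a, b in pairs
}

for pair, value in semantic_similarity.items():
    print(pair, round(value, 3))
```

      Because the pairs ("dog", "leash") and ("car", "dog") share an image in the toy data, they never enter the similarity comparison, so the number of shared images cannot drive the result.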

      Long-term representational drift:

      We want to clarify our claims regarding the representational drift analysis. One reviewer stated that, while we show evidence of representational drift, we “provide no evidence suggesting that this long-term neural representational drift reflects a drift in semantic representation.” Another reviewer said: “The inference is that this [drift] is due to an updating of knowledge about the associations each item has had with other items,” and that our finding that semantic structure remains stable within these regions seems “to contradict the claims about semantic plasticity.” The claim we intended to make, which will be unpacked more clearly in our revision, is that the neural representations underlying semantic content drift over time, even if the semantic content itself is unchanging. In other words, we do not claim that our across-session drift analyses show changes in knowledge about object associations. Indeed, one of the reasons that representational drift has recently captured the attention of neuroscientists is that the neural representations underlying certain behaviors or cognitive content appear to drift over time even when the behaviors or cognitive content remain fixed. The relational structure of the neural representations can remain stable, even if the particular neurons recruited to represent each stimulus change over time (see, e.g., the T-maze in Rule, O’Leary, & Harvey, 2019, Curr Opin Neurobiol). Here we are translating these ideas, which were developed using animal models and/or primarily focused on low-level vision, to the semantic system in humans. The neural representations we identify in our paper capture semantic information because they share a similarity structure with word2vec, and the level of similarity to word2vec remains stable over time. 
Thus, our findings provide a simple demonstration of long-term representational drift in the human semantic system akin to that reported in animals—drift in the neural semantic representations of items even as the relations between these item representations appear stable.
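
      The idea that item representations can drift while their relational structure stays fixed can be made concrete with a toy geometric sketch. It is an assumption-laden illustration rather than the analysis in the paper: the two-dimensional "activity patterns" are invented, drift is modelled as a pure rotation of every pattern by the same angle, and the helper names are hypothetical.

```python
from math import cos, sin, sqrt, radians

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def rotate(v, degrees):
    """Rotate a 2-D vector counter-clockwise by the given angle."""
    t = radians(degrees)
    x, y = v
    return (x * cos(t) - y * sin(t), x * sin(t) + y * cos(t))

def similarity_structure(patterns):
    """Pairwise cosine similarity between the patterns of all item pairs."""
    names = sorted(patterns)
    return {
        (a, b): cosine(patterns[a], patterns[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Toy activity patterns in session 1 (assumed values).
session1 = {"dog": (1.0, 0.2), "cat": (0.9, 0.4), "car": (-0.3, 1.0)}

# Model drift as the same 60-degree rotation applied to every pattern.
session2 = {item: rotate(vec, 60) for item, vec in session1.items()}

structure1 = similarity_structure(session1)
structure2 = similarity_structure(session2)

# Largest change in any item-to-item similarity between sessions.
structure_change = max(abs(structure1[p] - structure2[p]) for p in structure1)

# Similarity of each item's pattern to its own pattern in the other session.
same_item_similarity = {
    item: cosine(session1[item], session2[item]) for item in session1
}

print("max change in relational structure:", structure_change)
print("same-item similarity across sessions:", same_item_similarity)
```

      Under this simplified model each item's pattern changes between sessions, yet the similarity between any two items within a session is preserved, which mirrors the distinction the response draws between drifting representations and stable relations.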

      Signal-to-noise variability across the MTL:

      A reviewer raised the possibility that differences between our ROIs could be driven by variability in signal-to-noise ratio (SNR) across regions, particularly within the medial temporal lobe (MTL). We looked at noise ceiling SNR brain maps for each participant, which reflect the reliability of neural responses across repetitions of the same image. Preliminary analyses indicate that SNR differences do not account for our object encoding, semantic content, representational drift, or short-term plasticity measures across the MTL.

    1. Author response:

      The following is the authors’ response to the original reviews.

      (1) Please provide more background about Rpgrip1l in the introduction, particularly the past studies of the mammalian homolog of Rpgrip1l, if any. Is there any human disease associated with Rpgrip1l? Do these patients have a scoliosis phenotype? 

      • We have added more background on the human ciliopathies caused by RPGRIP1L mutations and on their occasional association with early onset scoliosis (lines 45-54 page 2 in the introduction, see cited references). 

      (2) The allele is a large deficiency of most of the coding region of rpgrip1l, can you give details in the Supplementary data of how you show this by genotyping? It would be good to explain that this mutation is most likely behaving as a null, if you have RNAseq data that supports this please note that. Otherwise, it may be incorrect to assume it is a null allele as your shorthand nomenclature states. If you do not have stronger evidence that the deficiency allele is behaving as a null allele, then please think about using an allele nomenclature as outlined at ZFIN:  

      • We now describe in the results section (Lines 72-76, page 3) the extent of the deletion of rpgrip1l ∆/∆ (22 exons out of 26) that creates an early stop at position 88 of 1256 amino acids. We have submitted to ZFIN our two novel mutant lines: rpgrip1l∆  is recorded as rpgrip1l bps1 and rpgrip1l ex4 as rpgrip1l bps2 , and we provide this information in the text. Transcriptomics data confirmed that this allele is behaving as a null, as the most down-regulated transcript found in the brain of rpgrip1l ∆/∆ is the rpgrip1l transcript itself (volcano plot in Fig 5A, described in the results, Line 270-71, page 9).

      • We have also provided in Supplementary Figure 1 A’ a picture of a typical genotyping gel for the rpgrip1l∆ allele. Sequences of both CRISPR guide RNAs and genotyping primers are provided in the Materials & Methods section. 

      (3) Throughout the manuscript, the authors refer to zebrafish mutant phenotypes as "juvenile scoliosis". However, scoliosis may not appear until 11 weeks post-fertilization in some animals. After 6-8 weeks of age, it would be more appropriate to describe the phenotype as "late-onset or adult scoliosis" to differentiate between other reported scoliosis mutants (such as hypomorphic or dominant negative alleles of scospondin) that start body curvatures at 3-5 dpf.

      • We think we can really qualify rpgrip1l-/- scoliosis as being a “juvenile scoliosis” as shown by the time course displayed in Fig 1B: rpgrip1l-/- scoliosis develops asynchronously between 4 weeks and 9 weeks (from 0.8 cm/1 cm to 1.6 cm, corresponding to juvenile stages according to Parichy et al, 2009 PMID: 19891001), after which it reaches a plateau. Half of the mutants are already scoliotic by 5 weeks and no scoliosis develops at adult stage, i.e. from 10 weeks on. We have acknowledged the late-onset scoliosis on page 3, line 93.

      (4) A more careful demonstration of the individual vertebrae, using magnified high-resolution pictures in Figures 1D-G, should be made to more clearly show no obvious vertebral malformations are present. 

      • We now provide a movie in Sup Data that presents 3D views of controls and mutant spines, which show the intervertebral spaces as well as vertebral shape and size. With these images we could exclude vertebral fusion and the presence of dysmorphic vertebrae.

      (5) On page 5: the authors comment on transgenic expression of RPGRIP1L in foxj1a-lineages as "rescuing" scoliosis. This terminology is confusing, as rescuing a condition could be interpreted as inducing it where it was once absent. "Suppressing" scoliosis may be a more appropriate term. 

      • We agree with the reviewers that the term “rescue” is confusing. We have changed it to “suppress” in the title of the paragraph (line 95 page 3) and within the text (line 115 page 3).

      (6) On page 5, lines 155-156: the authors state that "Indeed, no tissue-specific rescue has been performed yet in zebrafish ciliary gene mutants". This is misleading, as ptk7a and katnb1 mutations both disrupt cilia, and transgenic reintroduction of both ptk7a and katnb1 in foxj1a- expressing lineages has previously been shown to suppress cilia defects as well as scoliosis in these models. The statement should be removed for accuracy. 

      • We agree that we were not precise enough in our sentence: when we mentioned “ciliary gene” mutants, we were referring to genes whose products are enriched within cilia and directly affecting ciliogenesis, cilia content and maintenance such as TZ or BBS genes, without encompassing genes like ptk7 and katnb1 whose products perform multiple functions on top of cilia maintenance such as Wnt signalling and remodelling of the whole microtubule network respectively. We have therefore modified our sentence by adding zebrafish ciliary “TZ and BBS” genes (line 104, page 4).

      (7) Figure 2: panels A-B: In the text (line 196) you state that cilia length was increased and that Arl13b content was severely reduced. However, Panel B shows no significant length difference between scoliotic mutants and controls. This statement and graph should be corrected for accuracy. Also, the Arl13b staining is difficult to see in panel A - can channels be split, and/or quantified? 

      • We have now split the Arl13b and glutamylated tubulin channels (Fig 2 A-C”). We think that the reduction of Arl13b staining intensity is now obvious in both straight and scoliotic mutants (compare 2A” with 2B” and 2C”). We were not able to quantify Arl13b staining using ciliary masks from glutamylated tubulin staining because the two stainings only partially overlap along the length of the cilium, Arl13b being more distal than glutamylated tubulin (Fig 2A’).

      • Ciliary length was significantly increased (from 3.4 to 5.3 µm) in straight rpgrip1l-/-, while the values for scoliotic rpgrip1l-/- were heterogeneous (mean 4.1 µm) and therefore not significantly different from controls. This heterogeneity stems from the combined presence of both shorter and longer cilia in scoliotic fish, a finding we interpret as the potential breakage over time of the extra-long and thin cilia observed in scoliotic fish (as in Sup figure 1 H’’’, Sup Fig 2M’ and 2O’).

      • We changed the text to be more accurate: we now state that cilia length increased in straight mutants, and became more heterogeneous than in controls in scoliotic mutants (line 143-144, page 5).

      (8) Figure 3: Page 7, line 206: authors state that SCO-spondin secreting cells varied in number along SCO length. What is the evidence that these cells secrete SCO-spondin? The staining shown in Figure 3L-O appears to demonstrate extracellular accumulation of sspo:GFP. What is the evidence that this staining originated from cells in proximity to it? 

      The claim of SCO-secreting cells in Figure 2E-J is confusing. I assume you are using anatomy to infer the SCO is captured in these sections. This should be done in sspo-GFP animals (as in Figure 3) and/or dual anti-body labeling can be done to show SCO-secreting cells and cilia. 

      • We now show in Supplementary Figure 2 A-D a double staining for Sco-spondin-GFP and cilia (Ac-tub, Glu-Tub). Analyzing GFP staining along SCO length on successive sections, we identified the SCO producing cells on the diencephalic dorsal midline by their position under the posterior commissure (PC, which forms an Acetylated Tubulin positive arch), and counted the nuclei surrounded by cytoplasmic GFP from the most anterior region (24 cells wide, Sup Fig 2A-A’) to the most posterior region (4-8 cells wide, Sup Fig 2 C).

      • Furthermore, the close-ups presented on Fig 2A’ and 2B’ make it possible to detect the cytoplasmic Sspo-GFP staining around SCO nuclei, above the region presenting primary cilia pointing towards the diencephalic ventricle, both in controls and in mutants at scoliosis onset (tail-up mutants), showing that the extracellular staining in B’ very likely originates from these cells. In these tail-up mutants, extracellular Sspo aggregates have not yet filled the whole diencephalic ventricle as in Fig 3 N and Q.

      (9) Figure 5: Is the transcriptome data and proteomic data consistent for any transcripts and encoded protein products? Please highlight those consistent targets in both analyses. 

      • We would like to emphasize that the transcriptomic study was performed at scoliosis onset, at 5 weeks, while the proteomics analysis was performed at adult stage (3 months) so they cannot be directly compared.

      Moreover, low abundance proteins (such as centrosomal proteins and transcription factors like Foxj1a) are not detected by label-free proteomics without a prior subcellular fractionation procedure (Lindemann et al, 2017 PMID: 28282288). The extraction protocol also does not allow the purification of short neuropeptides such as Urp1-2.

      Nevertheless, we found four targets in common, now highlighted in red in Fig 5, Panel E: Anxa2, complement proteins C4 and C7a, and Stat3, all related to immune response, a GO term enriched in both studies as explained in the text (Lines 308-311, page 10).

      The absence of many inflammation markers or immune response proteins at adult stage in scoliotic mutants most probably indicates a transient inflammatory episode at scoliosis onset, while astrogliosis, as detected by GFAP staining, increases with scoliosis severity. Along the same lines, the two-fold increase of Lcp1 cells within the tectum is present before axis curvature (in straight mutants) and disappears in scoliotic fish (Graph G in Sup Figure S5), as explained in the text (Lines 378-381, page 12).

      (10) Supplementary Figure 1 F-H: What stage/age samples were used for SEM? It is only stated that they were 'adults'. It is also stated that cilia tufts in straight rpgrip1l-/- fish were morphologically normal but 'less dense'- this was not obvious from the figure. Can density be quantified? (otherwise, data does not support the statement). Similarly, can the statement that "cilia of mono-ciliated ependymal cells showed abnormal irregular structures compared to controls, with either bulged or thinner parts" be supported with measurements/quantification? 

      • The SEM study was performed on 3-month-old fish, 3 controls and 5 mutants. We added this information in the figure legend. We could not quantify the number of ciliary tufts in the brain ventricle of the sole straight mutant that was analyzed. We therefore removed the statement that cilia were less dense in the straight mutant. Along the same lines, we now mention that we could find mutant cilia of irregular shape, as shown in Supplementary Figure S1 (F”, G’’, H’’ and H’’’) (page 4, lines 124-129).

      (11) Supplementary Figure 1D-E is never mentioned in the text. The Supplemental Figure legend also refers to a graph of cilia length that is not in the figure itself. As a result, many of the subsequent panel references are out of register. 

      • We now provide the correct version of the legend and refer to Sup Fig 1D-E in the text (page 3, lines 79-81) and its legend, page 53, lines 1616-1620.

      (12) Supplementary Figure 2A-F: Of interest, in panels C and F, it looks as though sspo:GFP is accumulating on cilia within the ventricles of rpgrip1l mutants. Can this be explored? Is it possible that abnormal aggregation of SSPO on cilia is ultimately leading to cilia loss, as you report for multi-ciliated cells surrounding the subcommissural organ? This could be a very interesting finding and possible mechanism for cilia loss.

      • Our observation of all brain sections led us to conclude that the majority of Sspo-GFP aggregates were floating within the brain ventricles of rpgrip1l-/- fish while a portion of aggregates were stuck on ventricle walls, in close contact with cilia as now shown on Supplementary figure S2 B’, outlined in legend page 54, lines 1634-1637. We agree that the contact between Sspo aggregates and cilia might have damaging consequences, either on cilia maintenance or on immune reaction induction and we now mention these possibilities in the discussion page16, lines 524-526. These research lines will be explored in the near future.

      (13) Supplementary Figure 5A-F is not mentioned in the manuscript. Please clarify the role of Anxa2 in neuroinflammation. Is increased Anxa2 expression in rpgrip1l mutant zebrafish reduced after anti-inflammatory drug treatment? What is the expression level of anxa2 in cep290 mutant zebrafish? 

      • We have now added mention to Supplementary Figure 5A-F in the text page 10 lines 328-331. 

      • Unfortunately, after performing GFAP and Lcp1 staining, we did not have enough histological material left to test Anxa2 staining on NACET treated fish, nor to measure dilatation or quantify multiciliated cells. We agree this would have helped to better define which defect might be an indirect consequence of an inflammatory environment.

      • We tested the expression level of Anxa2 in cep290-/- fish. No labelling above control level was detected on cep290-/- brain sections that were positive for GFAP (N = 5). As GFAP staining in 3-4-week-old cep290-/- fish was not as intense and widespread as in adult rpgrip1l-/- fish (50% of GFAP + cells compared to 100% in the SCO for example), we concluded that Anxa2 expression may be upregulated after widespread or long-term astrogliosis/inflammation. Alternatively, Anxa2 overexpression could be specific to rpgrip1l-/- fish.

      (14) A summary diagram at the end would be helpful for understanding the main findings. 

      We added a Graphical Abstract summarizing the main conclusions and hypotheses of this study. It is mentioned and explained in the Discussion section, p. 16 lines 504-508 and 516-529. 

      (15) The sspo-GFP zebrafish line should be listed in the STAR methods section: 

      The sspo-GFP line is now listed in the STAR methods, Scospondin-GFPut24, (Troutwine et al., 2020 PMID: 32386529), p.43, last line.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Comment 1) The CRISPR screen designed to detect regulators of damage-induced transcriptional repression is based on EU incorporation following a 7-day selection of stable knockout cells. As the authors point out, cell cycle arrest reduces rDNA transcription on its own. The screen, which assesses changes in sgRNA distribution in EU high cells, is thus likely to be dominated by factors that affect cell cycle progression. This is exemplified in the analyses of top hits related to neddylation. The screen's limitations in terms of identifying DDR effectors of damage-induced silencing need to be clearly stated. 

      Notably, our screen did identify known DNA damage response effectors of damage-induced silencing; for example, ATM was a top hit, as discussed in the paper and shown in Fig. 5B. We consider that our unbiased approach had advantages because, in addition to finding known DDR effectors, we uncovered novel requirements for transcriptional silencing in response to DNA damage, such as the need for cells to be cycling. We did not find the canonical key cell cycle regulators in our screen. One possibility is that cell cycle arrest or cell death upon their knockdown leads to out-competition during the seven-day treatment with doxycycline, so that the gRNAs targeting them are depleted from, rather than enriched in, the cells that maintain transcription 7 days after DNA damage.

      Comment 2) The authors confirm previous findings of DNA damage-induced repression of rDNA and histone gene transcription. The authors propose that these highly transcribed genes are more susceptible to silencing than the bulk of protein-coding genes and propose a global damage-induced signaling event that is independent of DNA breaks in cis. While this is possible, it is not demonstrated in this manuscript, and the authors should acknowledge alternative explanations. For example, the loci found to be repressed by bulk IR are highly repetitive gene arrays that tend to form nuclear sub-compartments (nucleoli, histone bodies). As such, their likelihood of being in the vicinity of DNA damage is high, at least for a fraction of gene copies. The findings, therefore, remain consistent with cis-induced silencing. Moreover, silencing may spread through the relevant nuclear sub-compartments, consistent with the formation of DNA damage compartments described recently (PMID: 37853125). 

      Our reason for “suggest(ing) that the reduced bulk abundance of nascent transcripts after IR may occur in trans as a programmed event” was the gene length-independent and IR dose-independent nature of the gene silencing (shown in Fig. 2D and Fig. 4C), not that rDNA and histone gene expression went down the most after IR. Indeed, we stated that “Those genes that were normally most highly transcribed were repressed after IR, while genes that were normally expressed at intermediate or low levels tended to be induced after IR (Fig. 4A). The mechanistic reason for this is unclear.” We thank the reviewer for the suggestion that this may be due to these genes existing in nuclear sub-compartments. We have now incorporated this possibility into the discussion.

      Other comments: 

      (1) The statement that silencing is due to transcription initiation rather than elongation is not sufficiently supported by the data. Could equivalent nascent transcript reduction not be the result of the suppression of elongating RNA PolII? To draw the proposed conclusion, the authors would need to demonstrate that RNA PolII initiation is altered, using RNA PolII ChIP and/or analysis of relevant RNA PolII phosphorylation patterns. 

      Figure 4F shows the distribution of nascent transcript reads throughout the open reading frame of the repressed genes. It shows that the transcript abundance throughout the ORF, including at the 5’ end, is reduced. This pattern is consistent with a defect in initiation. We have now clarified the description of these results to state that: “Our data is consistent with the possibility that the major mechanism for the repression of the ~1,000 protein coding genes after IR is at the transcriptional initiation stage. However, our data do not rule out that elongation may be additionally repressed after IR, as this would not be observed in our analyses due to concomitant repression of transcriptional initiation.” 

      (2) The lack of rDNA silencing in arrested cells is interesting, though the underlying mechanism remains unclear. To further corroborate the proposed defect in ATM-mediated signaling, the authors should look directly at ATM and Treacle phosphorylation upstream of TOPBP1. 

      We would love to have shown that ATM dependent phosphorylation does not occur upon IR. We had attempted this multiple times but unfortunately the available phospho Treacle antibodies were not suitable for rigorous analyses in our hands.

      (3) The "change in relative heights of the EU low (G1) and EU high (S/G2) peaks" in Figures 5D, 5E, and 6B is central to the proposed model of transcriptional changes being affected by cell cycle arrest. These differences should be visualized more clearly and quantified across independent experiments. Ideally, the cell cycle stage should be dissected as in Figure 2B. How do the authors envision cell cycle arrest triggers the defect in transcriptional silencing? 

      In the previous version, the last paragraph described one possibility for how rDNA may fail to be repressed in arrested cells after IR, based on the results shown in Fig. 7F and G.  We have now added a paragraph in the discussion section beginning “Why would cell cycle arrest in G1 or G2 phases of the cell cycle prevent transcriptional repression of rDNA and histone genes after IR?”

      Reviewer #2:

      (1) Define ERCC normalization. 

      We apologize for this omission. We have now explained ERCC normalization and have added a citation to a commentary on spike-in controls that we wrote in 2015 for further explanation.
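
      As a rough illustration of the idea behind spike-in normalization (a minimal sketch, not the analysis code used in the manuscript), the example below rescales each sample so that the total ERCC spike-in signal is equal across samples. It assumes that the same amount of spike-in RNA was added per cell to every sample; the sample names, gene name, and counts are hypothetical.

```python
# Hypothetical sketch of spike-in (ERCC) normalization.
# Assumption: an equal amount of ERCC spike-in RNA was added per cell,
# so differences in spike-in read totals reflect technical scaling only.

def ercc_scale_factors(ercc_totals):
    """Return one factor per sample that equalizes the spike-in totals."""
    reference = sum(ercc_totals.values()) / len(ercc_totals)
    return {sample: reference / total for sample, total in ercc_totals.items()}

def normalize_counts(gene_counts, ercc_totals):
    """Rescale every gene count by its sample's spike-in factor."""
    factors = ercc_scale_factors(ercc_totals)
    return {
        sample: {gene: count * factors[sample] for gene, count in counts.items()}
        for sample, counts in gene_counts.items()
    }

# Illustrative numbers only: same raw count for geneA in both samples,
# but the irradiated sample has twice as many spike-in reads.
ercc_totals = {"control": 1000, "irradiated": 2000}
gene_counts = {"control": {"geneA": 500}, "irradiated": {"geneA": 500}}
normalized = normalize_counts(gene_counts, ercc_totals)
```

      In this toy case the raw counts for geneA are identical, yet after scaling the irradiated value is half the control value, which is the kind of global change that normalization to total endogenous reads would hide.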

      (2) On page 8, the authors speculate that genes involved in immune response after IR were activated due to cytoplasmic DNA in pre-B cells. Where are these cytoplasmic DNAs from? Is there any literature indicating that a 30-minute IR treatment can induce cytoplasmic DNA? 

      We have removed this speculation, as there is no evidence currently to support it.

      (3) Related to the points above, are ERVs or repetitive DNA elements up-regulated upon IR treatment, which in turn results in increased expression of genes involved in immune response? 

      The induction of cytokines as a rapid response to irradiation is a major part of the immediate early gene program induced in response to ROS (and now is explained in the manuscript).

      (4) Please explain in the results section how overall levels of transcription determined by EU are reduced after IR, and yet the number of genes with increased expression upon IR treatment is much greater than that of genes with reduced expression. 

      We have explained that while fewer genes have reduced expression after IR than have increased expression, the genes with reduced expression are extremely highly expressed to start with. As a result, the bulk amount of transcripts is reduced after IR.
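
      The arithmetic behind this point can be sketched with made-up numbers (arbitrary units, not data from the study): two very highly expressed loci are halved while one hundred lowly expressed genes double, and the bulk signal still falls.

```python
# Hypothetical expression values in arbitrary units; not data from the study.
before = {"rDNA": 10000.0, "histone": 5000.0}
before.update({f"gene{i}": 10.0 for i in range(100)})

after = dict(before)
after["rDNA"] = before["rDNA"] * 0.5        # highly expressed locus repressed
after["histone"] = before["histone"] * 0.5  # highly expressed locus repressed
for i in range(100):
    after[f"gene{i}"] = before[f"gene{i}"] * 2.0  # lowly expressed genes induced

n_down = sum(after[g] < before[g] for g in before)  # number of repressed genes
n_up = sum(after[g] > before[g] for g in before)    # number of induced genes
bulk_before = sum(before.values())
bulk_after = sum(after.values())
```

      Here only 2 genes go down and 100 go up, yet the summed signal drops from 16000 to 9500 units because the repressed loci dominate the total.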

      (5) Do cells treated with MLN4924 block the down-regulation of histone genes and ribosomal genes? 

      We have not addressed this directly. However, given that the reduction of gene expression that occurs after IR is largely due to repression of histone and rDNA genes, it is safe to speculate that these are the genes that are no longer repressed during cell cycle arrest.

      (6) Is IR-induced down-regulation of histone genes due to cell cycle changes? 

      We do not know for sure if this is the case. It is relevant to note that even without IR, histone expression per se is regulated by cell cycle changes, being lower outside of S phase – and the majority of  non-arrested cells in our study are in S phase (Fig. 2B). As such, arrest of cells per se outside of S phase would be sufficient to reduce histone expression level.

      We would like to thank the reviewers again for their insightful suggestions and comments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This manuscript dissects the contribution of the CaBP 1 and 2 on the calcium current in the cochlear inner hair cells. The authors measured the calcium current inactivation from the double knock-out CaBP1 and 2 and showed that both proteins contribute to voltage-dependent and calcium-dependent inactivation. Synaptic release was reduced in the double KO. As a consequence, the authors observed a depressed activity within the auditory nerve. Taken together, this study identifies a new player that regulates the stimulation-secretion coupling in the auditory sensory cells. 

      Strengths: 

      In this study, the authors bring compelling evidence that CaBP 1 and 2 are both involved in the inactivation of the calcium current, from cellular up to system level, and by taking care to probe different experimental conditions such as different holding potentials and by rescuing the phenotype with the re-expression of CaBP2. Indeed, while changing the holding potential worsens the secretion, it completely changes the kinetics of the inactivation recovery. It alerts the reader that probing different experimental conditions that may be closer to physiology is better suited to uncovering any deleterious phenotype. This gave pretty solid results. 

      Weaknesses: 

      Although this study clearly points out that CaBP1 is involved in the calcium current inactivation, it is not clear how CaBP1 and CaBP2 act together (but this is probably beyond the scope of the study). Another point is that the authors re-express CaBP2 to largely rescue the phenotype in the double KO but no data are available to know whether the re-expression of both CaBP1 and CaBP2 would achieve a full recovery and what would be the effect of the sole re-expression of CaBP1 in the double KO.

      We would like to thank the reviewer for the appreciation of our work. We agree that the effect of the sole re-expression of CaBP1 in the double KO remains elusive and have planned to address this question in a follow-up study. 

      Reviewer #2 (Public Review): 

      Summary: 

      In the manuscript by Oestreicher et al, the authors use patch-clamp electrophysiology, immunofluorescent imaging of the cochlea, auditory function tests, and single-unit recordings of auditory afferent neurons to probe the unique properties of calcium signaling in cochlear hair cells that allow rapid and sustained neurotransmitter release. The calcium-binding proteins (CaBPs) are thought to modify the inactivation of the Cav1.3 calcium channels in IHCs that initiate vesicle fusion, reducing the calcium-dependent inactivation (CDI) of the channels to allow sustained calcium influx to support neurotransmitter release. The authors use knockout mice of Cabp1 and Cabp2 in a double knockout (Cabp1/2 DKO) to show that these molecules are required for enabling sustained calcium currents by reducing CDI and enabling proper IHC neurotransmitter release. They further support their evidence by re-introducing Cabp2 using an injection of AAV containing the Cabp2 sequence into the cochlea, which restores some of the auditory function and reduces CDI in patch-clamp recordings. 

      Strengths: 

      Overall the data is convincing that Cabp1/2 is required for reducing CDI in cochlear hair cells, allowing their sustained neurotransmitter release and sound encoding. Figures are well-prepared, recordings are careful and stats are appropriate, and the manuscript is well-written. The discussion appropriately considers aspects of the data that are not yet explained and await further experimentation.

      Weaknesses: 

      There are some sections of the manuscript that pool data from different experiments with slightly different conditions (wt data from a previous paper, different calcium concentrations, different holding voltages, tones vs clicks, etc). This makes the work harder to follow and more complicated to explain. However, the major conclusion, that cabp1 and 2 work together to reduce calcium-dependent inactivation of L-type calcium channels in cochlear inner hair cells, still holds. 

      Another weakness is that the authors used injections of AAV-containing sequences for Cabp2, but do not present data from sham surgeries. In most cases, the improvement of hearing function with AAV injection is believable and should be attributed to the cabp2 function. However, in at least one instance (Figure 4B), the results of the AAV injection experiments may be overinterpreted - the authors show that upon AAV injection, the hair cells have a much longer calcium current recovery following a large, long depolarization to inactivate the calcium channels. Without comparison to sham surgery, it is not known if this result could be a subtle result of the surgery or indeed due to the Cabp2 expression.  It would be great to see the auditory nerve recordings in AAV-injected animals that have a recovery of ABRs. However, this is a challenging experiment that requires considerable time and resources, so is not required.

      We would like to thank the reviewer for the appreciation of our work. We agree with the reviewer that sham surgery may convey more information that might benefit the interpretation of our data. The recovery experiments were very tedious and these long patch-clamp paradigms required extremely stable recordings. Based on our observations, we plan to address the recovery kinetics in more detail in the follow-up study. We consider side effects of the surgery (as it may mainly affect middle ear function) and of the empty AAV-vector on inner hair cell calcium current recovery rather unlikely, but we cannot exclude them. We thus added a sentence in the discussion to alert the reader to this. Based on previously published data on the effect of PHP.eB-Cabp2eGFP in WT animals, we expect some (mild) adverse effects on hearing from overexpression of CaBP2 and/or eGFP in the inner ear. In the future, we thus plan to further optimize the treatment. As for the in vivo recordings from the auditory nerve fibers of the rescued mice, we could not agree more. These are planned for the follow-up study.

      Reviewer #3 (Public Review): 

      Summary: 

      The authors attempted to unravel the role of the Ca2+-binding proteins CaBP1 and CaBP2 for the hitherto enigmatic lack of Ca2+-dependent inactivation of Ca2+ currents in sensory inner hair cells (IHCs). As Ca2+ currents through Cav1.3 channels are crucial for exocytosis, the lack of inactivation of those Ca2+ currents is essential for the indefatigable sound encoding by IHCs. Using a deaf mouse model lacking both CaBP1 and CaBP2, the authors convincingly demonstrate that both CaBP1 and CaBP2 together confer a lack of inactivation, with CaBP2 being far more effective. This is surprising given the mild phenotype of the single knockouts, which has been published by the authors before. Readmission of CaBP2 through viral gene transfer into the inner ear of double-knockout mice largely restored hearing function, normal Ca2+ current properties, and exocytosis. 

      Strengths: 

      (1) In vitro electrophysiology: perforated patch-clamp recordings of Ca2+/Ba2+ currents of inner hair cells (IHCs) from 3-4 week-old mice - very difficult recordings - necessary to not interfere with intracellular Ca2+ buffers, including CaBP1 and CaBP2. 

      (2) Capacitance (exocytosis) recordings from IHCs in perforated patch mode. 

      (3) The insight that a negative holding potential might underestimate the impact of lack of CaBP1/2 on the inactivation of ICa in IHCs. As the physiological holding potential is much more positive than a preferred holding potential in patch clamp experiments it has a strong impact on inactivation in the pauses between depolarization mimicking receptor potentials. This truly advances our thinking about the stimulation of IHCs and accumulating inactivation of the Cav1.3 channels. 

      (4) Insight that the voltage sine method with usual voltage excursions (35 mV) to determine the membrane capacitance (for exocytosis measurements) also favors the inactivated state of Cav1.3 channels 

      (5) Use of double ko mice (for both CaBP1 and CaBP2, DKO) and use of DKO with virally injected CaBP2eGFP into the inner ear. 

      (6) Use of DKO animals/IHCs/SGNs after virus-mediated CaBP2 gene transfer shows a great amount of rescue of the normal ICa inactivation phenotype.

      (7) In vivo measurements of SGN AP responses to sound, which is highly demanding. 

      (8) In vivo measurements of hearing thresholds, DPOAE characteristics, and ABR wave I amplitudes/latencies of DKO mice and DKO+injected mice compared to WT mice. 

      Very thorough analysis and presentation of the data, excellent statistical analysis.

      The authors achieved their aims. Their results fully support their conclusions. The methods used by the authors are state-of-the-art. 

      The impacts on the field are the following:

      Regulation of inactivation of Cav1.3 currents is crucial for the persistent functioning of Cav1.3 channels in sensory transduction. 

      The findings of the authors better explain the phenotype of the human autosomal recessive DFNB93, which is based on the malfunction of CaBP2. 

      Future work - by the authors or others - should address the molecular mechanisms of the interaction of CaBP1 and 2 in regulating Cav1.3 inactivation. 

      Weaknesses: 

      I do not see weaknesses. 

      What is not explained (but was not the aim of the authors) is how the CaBPs 1 and 2 interact with the Cav1.3 channels and with each other to reduce CDI. Also unexplained is why DFNB93, which is based on mutation of the CaBP2 gene, leads to a severe phenotype in humans in contrast to the phenotype of the CaBP2 ko mouse.

      We would like to thank the reviewer for the appreciation of our work and the amount of effort that went into these experiments. These are the questions that we are posing ourselves as well and would like to address them in the future.   

      Recommendations for the authors:

      Reviewing editor: 

      In the Introduction, the authors may also mention that Ca2+-dependent and voltage-dependent inactivation of L-type Ca channels has been reported at ribbon synapses of retinal bipolar cells (see von Gersdorff & Matthews, J Neurosci. 1996, 16(1):115-122). These are critical retinal interneurons involved in the continuous exocytosis of synaptic vesicles onto retinal ganglion cells. 

      We would like to thank the reviewing editor for pointing that out, we have added the reference in the revised version of the manuscript.

      Reviewer #1 (Recommendations For The Authors): 

      Conditions worsen with age but no numbers regarding the threshold shift are provided. 

      For better readability, we now included click threshold values for both genotypes and age groups in the MS text, results section.   

      Do the authors correlate the re-expression level of CaBP2 using GFP to the rescuing phenotype (for exocytosis or BK channels immunostaining)?

      The restoration of BK expression in the virus-treated IHC was a side observation of our study, which was not performed in sufficient replicates for proper quantification. In the future, we will address this question in greater detail, possibly with improved viral constructs. In a previous study, we attempted to correlate eGFP fluorescence intensity with residual depolarization-evoked calcium current in CaBP2-injected IHC of Cabp2 single KO animals. At that time, we were unable to establish a convincing correlation. This could be related to (i) large variability in the data, possibly requiring much larger datasets to observe a potential correlation above the noise, (ii) variable imaging conditions from prep to prep, or (iii) additional parameters that could influence the outcome of the current rescue, e.g. uncontrolled expression of the transgene. However, we did analyse the correlation between ABR click thresholds and mean IHC eGFP fluorescence in another, preliminary set of data that included different viruses at different titres. There, we were able to observe a relatively good correlation. Interestingly, some of the highest expression levels resulted in poorer threshold recovery, which could indicate harmful overexpression. Moreover, the correlation was only detected when the difference in the mean eGFP expression levels per organ was large. Furthermore, significantly less efficient ABR threshold recovery was observed in the non-injected contralateral ears, which showed a significantly lower viral expression of the transgene. In our follow-up study, we will investigate the question of dose dependence of rescue in more detail.

      Reviewer #2 (Recommendations For The Authors): 

      -  There are two paragraphs in the results text about supplemental figure #2, which suggests that it should be moved to the main figures. 

      We would like to thank the reviewer for this suggestion. Figure S2 has now been moved to the main figures (as current Figure 5) and has been modified to accommodate the BK cluster analysis panel. The histogram with the number of ribbon synapses was removed as the data was redundant with the numbers given in the MS text.  

      -  Overall it is hard to distinguish between dark blue and black in many figures, including the dual-color asterisks.

      To improve the readability and clarity of the figures, we exchanged dark blue with magenta. Dual-color asterisks in Fig. 3 were changed to single-color asterisks, and what they refer to is explained in the figure legend.

      -  Figure 4 legend - there is a mis-spelling of cabp in the fourth line from the bottom. 

      -  Figure 4 legend - the last line does not make sense - describes recovery as being both 'much faster' and 'slowest'.

      -  Figure 6 title - consider removing 'nearly blocked' and replacing it with 'impaired'.

      We would like to thank the reviewer for noticing these mistakes that have been corrected in the revised version, as suggested.

      -  The calculations of VDI and CDI could be better explained, specifically detailing that VDI is calculated first from currents using barium as a divalent, followed by the calculation of CDI. 

      We included an explanatory sentence in the results section as suggested and are additionally referring the readers to the methods section for the mathematical formulas.

      -  Why were two different tests (one parametric and one non-parametric) used for the Figure 3B data? 

      We performed a point-by-point comparison of the data. The choice of test was originally based on the distribution and the variance of the data points. We have now opted for a unified test, the t test with Welch correction, which assumes that samples come from normally distributed populations but does not assume equal variances. The outcomes of the tests were similar.
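
      As a minimal illustration of the Welch correction mentioned above (a sketch, not the analysis code used for Figure 3B), the statistic and its Welch-Satterthwaite degrees of freedom can be computed as follows. The function name `welch_t` and the two example samples are hypothetical and chosen only for illustration; obtaining a p-value additionally requires the t distribution, which is not included here.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike Student's t test, the two sample variances are not pooled,
    so equal population variances are not assumed.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (unbiased sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Made-up example values with clearly unequal variances.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

      Because the degrees of freedom shrink when the variances differ, the test becomes more conservative than the pooled-variance t test in exactly the situation where pooling would be unjustified.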

      -  The much broader tuning of the auditory nerve fibers is interesting, consider including this in a figure. 

      For recording tuning curves, we use an automated algorithm which adapts the tone burst intensity and frequency depending on the preceding results. The threshold criterion is an increase of spiking by 20 Hz above spontaneous rate. This routine works fairly well in wild-type animals. However, DKO SGNs typically had very high thresholds at >80 dB across all frequencies, which can partly be explained by the fact that they had very low spike rates and did not reach that criterion. Besides tuning curve runs, we also tried systematic frequency sweeps and manual frequency control to determine a best frequency, followed by a rate intensity function at that frequency to determine “best threshold”.

      All this was difficult, because in the DKO SGNs, sound threshold detection was challenged by the strong dependence of spiking on the duration of the preceding silent interval. A preceding stimulus outside the frequency response area or below the activation threshold of the SGN would thus improve spiking by allowing for longer recovery, while a preceding efficient stimulus would reduce it. Thus, the sound threshold determined in a rate level sweep varied depending on the interstimulus interval and possibly even on the (randomized) order at which the intensities were played. 

      A meaningful threshold measure would require long silent interstimulus intervals, i.e. a long recording time. As tuning curves require multiple threshold measures, it seemed impossible to obtain a useful dataset at high quality. As we deemed the spike rate dependence on interstimulus intervals more important than the tuning, we rather focused on tone burst responses acquired at frequency/intensity combinations at which the hair cells and their synapses were maximally activated. In wild-types, these would be tone bursts at characteristic frequency or noise bursts in the saturated part of the rate intensity function, which typically has a dynamic range of 10-25 dB. As we assume (based on DPOAE) that cochlear micromechanics and amplification are mostly normal in the DKOs, we hypothesize that the sensitivity and dynamic range of basilar membrane motion and inner hair cell transduction are normal and that the increase in single unit thresholds and loss of sharp tuning are another readout of synaptic dysfunction.

      - Figure S2 - please show separate panels for each channel, it is very difficult to make out the changes by eye in the merged panels. 

      Done.  

      - Figure S2 G - the results text stated that the BK channel clusters 'appeared' smaller - why was this not measured? 

      We have performed additional experiments to enable proper analysis of the BK channel clusters. The analysed data shows that the BK clusters are considerably larger and more abundant in the WT as compared to CaBP1/2-deficient IHCs of approx. 4-week-old mice. The results of the analysis are included in the immunohistochemistry figure (now Fig. 5) and are further commented in the results section.  

      Reviewer #3 (Recommendations For The Authors): 

      I have only a few minor points on the MS: 

      (1) Some labels in Figure 1 are too small and hard to read, e.g. y-axis in B-F. Wherever you use subscripts on the axes, the labeling needs to be larger.

      (2) Fig. 1A: the colors for CaM and CaBP1.2 are too similar, at least on my printout. Please use more distant colors.

      (3) Reference 24 should be corrected (no longer in press).

      These points have been addressed in the revised version of the MS.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Response to Reviewer #1:

      Comment 1:

      Summary:

      The authors sought to investigate the associations of age at breast cancer onset with the incidence of myocardial infarction (MI) and heart failure (HF). They employed a secondary data analysis of the UK Biobank. They used descriptive and inferential analysis including Cox proportional hazards models to investigate the associations. Propensity score matching was also used. They found that, among participants with breast cancer, younger onset age was significantly associated with elevated risks of MI (HR=1.36, 95% CI: 1.19 to 1.56, P<0.001) and HF (HR=1.31, 95% CI: 1.18 to 1.46, P<0.001). They reported similar findings after propensity matching.
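
      As a reading aid, the reported hazard ratios, confidence intervals, and P values can be checked for internal consistency. The sketch below assumes that the intervals are Wald intervals, symmetric on the log hazard scale (the usual convention for Cox models, but an assumption here), and it works from the rounded published values, so the results are approximate. The function name `wald_check` is hypothetical.

```python
import math
from statistics import NormalDist


def wald_check(hr, ci_low, ci_high, level=0.95):
    """Recover SE(log HR), the z statistic and a two-sided p-value
    from a hazard ratio and its confidence interval.

    Assumes a Wald interval: log(HR) +/- z_crit * SE.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p


# Values quoted in the summary above (MI, then HF).
for label, hr, lo, hi in [("MI", 1.36, 1.19, 1.56), ("HF", 1.31, 1.18, 1.46)]:
    se, z, p = wald_check(hr, lo, hi)
    print(f"{label}: SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

      Under these assumptions both z statistics exceed 4, which agrees with the reported P<0.001.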

      Strengths:

      The use of a large dataset is a strength of the study as the study is well-powered to detect differences. Reporting both the unmatched and the propensity-matched estimates was also important for statistical inference.

      Weaknesses:

      Despite the merits of the paper, readers may get confused as to whether authors are referring to “age at breast cancer onset” or “age at breast cancer diagnosis”. I suppose the title refers to the latter, in which case it will be best to be consistent in using “age at breast cancer diagnosis” throughout the manuscripts. I would recommend a revision to the title to make it explicit that the authors are referring to “age at breast cancer diagnosis”.

      Thank you for your nice comments and suggestions. Yes, as you mentioned, in this study, we focused on age at breast cancer diagnosis, which was obtained from the cancer registry data in the UK Biobank and was used in all the analyses. We agree with you that it would be better to consistently use “age at diagnosis of breast cancer” throughout the manuscripts for a better understanding; therefore, we have replaced “age at breast cancer onset” with “age at diagnosis of breast cancer”.

      Change in the manuscript:

      “Age at breast cancer onset” was replaced with “age at diagnosis of breast cancer” in the title and throughout the manuscripts.

      Recommendations For The Authors:

      Kindly review the references for the location of the full stop. Putting the full stop at the end of the parenthesis makes reading smoother than its current form, as it is difficult to know when the new sentence begins.

      Thank you for your suggestion. We have made revisions to the location of the full stop next to a reference.

      Change in the manuscript:

      The full stop was put at the end of the parenthesis of a reference throughout the manuscripts.

      Response to Reviewer #2:

      Comment 1:

      This is a well-presented large analysis from the UK Biobank of nearly 250,000 female adults. The authors examined the associations of breast cancer diagnosis with incident myocardial infarction and heart failure by different onset age groups. Based on results from a series of statistical analyses, the authors concluded that younger onset age of breast cancer was associated with myocardial infarction and heart failure, highlighting the necessity of careful monitoring of cardiovascular status in women diagnosed with breast cancer, especially those younger ones.

      Comments to consider:

      It’s thoughtful for the authors to have included and adjusted for menopausal status, breast cancer surgery, and hormone replacement therapy in their sensitivity analysis. It would be informative if the authors presented the number and percentages of menopause and cancer treatments.

      Thank you for your comments. As suggested, we have provided more detailed information on the number and percentage of menopausal status and breast cancer treatments.

      Change in the manuscript:

      Page 11, Lines 208 to 211: added “Among participants with breast cancer, 11 460 (70.6%) participants were postmenopausal, 14 255 (87.6%) participants had undergone breast cancer surgery, and 6 784 (41.8%) participants had received hormone replacement therapy.”

      Change in the supplementary material:

      The number and percentage of menopausal status, breast cancer surgery, and hormone replacement therapy were added to Table S13.

      a Adjusted for age, ethnicity, education, current smoking, current drinking, obesity, exercise, low-density lipoprotein cholesterol, depressed mood, hypertension, diabetes, antihypertensive drug use, antidiabetic drug use, statin use, menopausal status, breast cancer surgery, and hormone replacement therapy.

      HR, hazard ratio; CI, confidence interval.

      Comment 2:

      The analytical baseline used for follow-up should be pointed out in the methods section. It’s confusing whether the analytic baseline was defined as the study baseline or the time at breast cancer diagnosis.

      We apologize for the confusion. In this study, the analytical baseline used for follow-up was defined as the baseline of UK Biobank (2006-2010) and we have pointed it out in the methods section as suggested.

      Change in the manuscript:

      Page 9, Lines 165 to 166: added: “The analytical baseline used for follow-up was defined as the baseline of UK Biobank (2006-2010).”

      Comment 3:

      Did the older onset age group have a longer follow-up duration? Could the authors provide information on the length of follow-up by age of onset in Supplementary Table S4? It would give the readers more information regarding different age groups.

      Thank you for your question. We compared the time of follow-up among the three diagnosis age groups and found that although the durations of follow-up among the three groups were quite similar (as shown in Table S4), statistical analysis revealed a significant difference with the older diagnosis age group demonstrating a longer follow-up duration (P for Kruskal-Wallis test <0.001). This is understandable as with large sample sizes, even a slight difference could lead to statistical significance. According to your suggestion, we have added information on the length of follow-up by age of diagnosis in Supplementary Table S4.
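
      For readers unfamiliar with the test named above, the following is a minimal sketch of the Kruskal-Wallis H statistic for three groups; it is not the authors' analysis code. It assumes there are no tied values (no tie correction is applied) and uses the fact that, for three groups, H is referred to a chi-square distribution with 2 degrees of freedom, whose upper tail probability is exp(-H/2). The function names and the toy data are hypothetical.

```python
import math


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, assuming no tied values."""
    pooled = sorted((value, index) for index, values in enumerate(groups) for value in values)
    n_total = len(pooled)
    if len({value for value, _ in pooled}) != n_total:
        raise ValueError("ties present: this sketch omits the tie correction")
    # Sum the pooled ranks (1..N) within each group.
    rank_sums = [0.0] * len(groups)
    for rank, (_, index) in enumerate(pooled, start=1):
        rank_sums[index] += rank
    weighted = sum(r * r / len(values) for r, values in zip(rank_sums, groups))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


def p_value_three_groups(h):
    """Upper tail of the chi-square distribution with 2 degrees of freedom."""
    return math.exp(-h / 2)


# Toy data: three fully separated groups of three values each.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"H = {h:.2f}, p = {p_value_three_groups(h):.4f}")
```

      For a fixed shift between groups, H grows roughly in proportion to the total sample size, which is why a small difference in follow-up time can reach statistical significance in a cohort of this size.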

      Change in the supplementary material:

      Added the median and interquartile range of follow-up in Supplementary Table S4.

      The results are presented as the mean ± standard deviation, or No. (%).

      a The effect sizes are standardized mean differences for continuous outcomes and the Phi coefficient for dichotomous outcomes.

      LDL-C, low-density lipoprotein cholesterol.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      This study addresses the temporal patterning of a specific Drosophila CNS neuroblast lineage, focusing on its larval development. They find that a temporal cascade, involving the Imp and Syp genes, changes the fate of one daughter cell/branch, from glioblast (GB) to programmed cell death (PCD), as well as gates the decommissioning of the NB at the end of neurogenesis.

      I believe there are some inaccuracies in this summary. We address temporal patterning during larval and pupal stages until the adult stage. The Imp and Syp genes change the fate of one daughter cell/branch from survival to programmed cell death (PCD). The change from glioblast (GB) to PCD, which occurs at an early time point, is not addressed here. The main point of the paper is missing:

      • Last-born MNs undergo apoptosis due to their failure to express a functional TF code, and this code is post-transcriptionally regulated by the opposite expression of Imp and Syp in immature MNs.

      Reviewer #2 (Public Review):

      Summary:

      Guan and colleagues address the question of how a single neuroblast produces a defined number of progeny, and what influences its decommissioning. The focus of the experiments are two well-studied RNA-binding proteins: Imp and Syp. The Authors find that these factors play an important role in determining the number of neurons in their preferred model system of VNC motor neurons coming from a single lineage (LinA/15) by separate functions taking place at specific stages of development of this lineage: influencing the life-span of the LinA neuroblast to control its timely decommissioning and functioning in the Late-born post-mitotic neurons to influence cell death after the appropriate number of progeny is generated. The post-mitotic role of Imp/Syp in regulating programmed-cell death (PCD) is also correlated with a specific code of key transcription factors that are suspected to influence neuronal identity, linking the fate of neuronal survival with its specification. This paper addresses a wide scope of phenotypes related to the same factors, thus providing an intriguing demonstration of how the nervous system is constructed by context-specific changes in key developmental regulators.

      The bulk of conclusions drawn by the authors are supported by careful experimental evidence, and the findings are a useful addition to an important topic in developmental neuroscience.

      I could not summarize the paper better.

      Strengths:

      A major strength is the use of a genetic labeling tool that allows the authors to specifically analyze and manipulate one neuronal lineage. This allows for simultaneous study of both the progenitors and post-mitotic progeny. As a result, the paper conveys a lot of useful information for this particular neuronal lineage. Furthermore, addressing the association of cell fate specification, taking advantage of this lab's extensive prior work in the system, with developmentally-regulated programmed cell death is an important contribution to the field.

      Beyond Imp/Syp, additional characterization of this model system is provided in characterizing a previously unrecognized death of a hemilineage in early-born neurons.

      Thanks!

      Weaknesses:

      The main observation that distinguishes this study from others that have investigated Imp/Syp in the fly nervous system is the role played in late-born post-mitotic neurons to regulate programmed cell death. This is an important and plausible (based on the presented findings) newly discovered role for these proteins. However, the precision of experiments is not particularly strong, which limits the authors' claims. The genetic strategy used to manipulate Imp/Syp or the TF code appears to be done throughout the entire lineage, or all neuronal progeny, and not restricted to only the late-born cells. Can the authors rule out survival of the early-born hemi-lineage normally fated to die? Therefore statements such as this:

      “To further investigate this possibility, we used the MARCM technique to change the TF code of last-born MNs without affecting the expression of Imp and Syp” should be qualified to specify that the result is obtained by misexpressing these factors throughout the entire lineage.

      We agree that our genetic manipulations affect the entire lineage or all neuronal progeny. We do not have genetic tools to gain such precision. We have changed our descriptions to specify the entire lineage or all neuronal progeny. As the reviewer raised, we were also concerned about the possibility that the overexpression of Imp or knockdown of Syp could induce the survival of the early-born hemilineage. We have two experiments that rule out this possibility:

      (1) In late LL3 larvae, Imp OE or syp MARCM clones do not change the number of cells (see Guan et al., 2022), indicating that the hemilineage that died by PCD is not affected. If Imp or Syp played a role in the survival of the hemilineage, we would see at least a 50% increase in the number of MNs at this stage.

      (2) The MARCM experiment using the VGlut driver to overexpress P35 or Imp allows us to manipulate only elav+ VGlut+ neurons. The hemilineage removed by PCD is elav- VGlut- and is not affected by this experiment. Consequently, the increase in MNs in adults with genetic manipulation can only be the result of the survival of the other hemilineage (elav+, VGlut+). Moreover, this experiment shows an increase in the number of neurons in the adult but not in LL3, demonstrating that the hemilineage (elav- VGlut-) is still removed by PCD with this genetic manipulation.

      The authors make an observation that differs from other systems in which Imp/Syp have been studied: that the expression of the two proteins appears to be independent and not influenced by cross-regulation. However there is a lack of investigation as to what effect this may have on how Imp/Syp regulate temporal identity. A key implication of the previously observed cross-regulation in the fly mushroom body is that the ratio of Imp/Syp could change over the life of the NB which would permit different neuronal identities. Without cross-regulation, do the authors still observe a gradient in the expression pattern of time? Because the data is presented with Imp and Syp stained in different brain samples, and without quantification across different stages, this is unclear. The authors use the term 'gradient' but changes in levels of these factors are not evident from the presented data.

      We have now quantified the transcriptional activity of Imp and Syp in the NB over time using smFISH. We have also quantified the relative expression of Imp and Syp protein in the NB over time by co-immunostaining. Additionally, we quantified the relative expression of Imp and Syp protein in postmitotic neurons as a function of their birth order in late LL3 larvae. All these data show an opposite temporal gradient of Imp and Syp in the NB and an opposite spatial gradient in immature neurons according to their birth order (Figure 4). How these gradients are established in our system remains to be elucidated.

      Reviewer #3 (Public Review):

      This study by Guan and co-workers focuses on a model neuronal lineage in the developing Drosophila nervous system, revealing interesting aspects about: a) the generation of supernumerary cells, later destined for apoptosis; and, b) new insights into the mechanisms that regulate this process. The two RNA-binding proteins, Imp and Syp, are shown to be expressed in temporally largely complementary patterns, their expression defining early vs later born neurons in this lineage, and thus also regulating the apoptotic elimination. Moreover, neuronal 'fate' transcription factors that are downstream of Imp and signatures of early-born neurons, can also be sufficient to convert later born cells to an earlier 'fate', including survival.

      The authors provide solid evidence for most of their statements, including the temporal windows during which the early and the later-born motoneurons are generated by this model lineage, how this relates to patterns of cell death by apoptosis and that mis-expression of early-born transcription factors in later-born cells can be sufficient to block apoptosis (part of, and perhaps indicative of the late-born identity).

      Other studies have previously outlined analogous, mutually antagonistic roles for Imp and Syp during nervous system development in Drosophila, in different parts and at different stages, with which the working model of this study aligns.

      Overall, this study adds to and extends current working models and evidence on the developmental mechanisms that underlie temporal cell fate decisions.

      I could not summarize the paper better.

      Reviewer #1 (Recommendations For The Authors):

      While this is an interesting topic, I raised two issues in my original review.

      (1) Against the backdrop of numerous previous studies linking many developmental regulators, including tTFs, to programmed cell death in the developing CNS, which in several cases have involved identifying key PCD genes and decoding the molecular regulatory interplay between regulators and PCD genes, this study does not provide any new insight into the regulation of developmental PCD in the CNS.

      The authors have not added any new data to address this shortcoming.

      I agree with the reviewer that we did not attempt to link Imp/Syp with the temporal transcription factor (tTF) cascade or spatial selectors such as Hox genes. However, this decision was intentional as our primary focus was on studying immature MNs. It is worth noting that the decommissioning of NBs by autophagic cell death or terminal differentiation, which is mediated by Imp/Syp in other lineages, has not been correlated with tTFs or spatial selectors. Although we have not directly examined the involvement of the hb + sv > kr > pdm > cas > cas-svp > Grh cascade in the decommissioning of the Lin A neuroblast, our preliminary data indicate that Hb, Sv, Pdm, and Cas are not expressed in the Lin A NB, while Grh is consistently expressed in the NB (Wenyue et al., 2022). Thus, it is unlikely that this particular tTF cascade is implicated in Lin A neuroblast decommissioning. In contrast, spatial selectors, such as the Hox gene Antp, play an opposing role compared to HOX transcription factors in abdominal NBs. In the Lin A lineage, Antp promotes survival (Baek, Enriquez, & Mann, 2013). Here, to avoid repeating what has already been described in the literature, we focused on the role of Imp/Syp in postmitotic neurons and revealed that the precise elimination of MNs is linked to the control of TFs expressed in the MNs.

      (2) I raised the issue that it is unclear if Imp/Syp acts in the NB, and/or in IMC/GMC, and/or in the daughter cells generated from these.

      I agree with the reviewer's concern regarding the unclear function of Imp/Syp, i.e., whether it acts in the NB, IMC/GMC, or daughter cells. To address this, one possible approach would be to attempt rescuing Imp and Syp mutants by transgenic expression in specific cell types, such as NBs, IMC/GMC, or GB/daughter cells. However, we have not conducted such experiments as we were skeptical about the outcome. Previously published work has used drivers expressed in NBs, IMC/GMC, or postmitotic neurons to decipher the function of a gene in a specific cell type. But the results of these experiments must be taken with caution. Using NB/GMC drivers to study gene function can lead to effects not only in the NB but also in its progeny, including GMC or postmitotic neurons, due to the perdurance and stability of the Gal4 and UAS-gene expression system. For instance, dpn-Gal4 UAS-GFP not only labels the NB but also many of its progeny, even if Dpn is only expressed in NBs. And elav-Gal4 is expressed in the NB and GMCs.

      However, our overexpression of Imp in immature neurons using the VGlut driver demonstrates that Imp promotes cell survival through an autonomous function in these neurons. This driver is only expressed in postmitotic neurons (elav+) and not in the NB, IMC/GMC, or in the hemilineage eliminated by cell death (elav- VGlut-).

      Reviewer #2 (Recommendations For The Authors):

      Oddly knockdown of Imp in the neuroblast (Fig. 5D) only led to death at 8h APF, when Imp is no longer expressed. Do the authors have an explanation as to how the stem cell can survive until this point? A discussion would be helpful.

      The simple explanation is the efficiency of RNAi. The imp-/- MARCM clones (Guan et al., 2022) lead to a stronger reduction of MNs in LL3.

      A simple experiment I would recommend is to repeat the antibody stainings of staged larvae/pupae (Fig. 4) having the anti-Imp/Syp antibodies in the same brain sample, and perhaps a quantification of the ratio in the NB. Given the species in which the ABs were raised seem compatible, this should be feasible. As it stands now, there is no indication of whether the ratio of Imp vs Syp changes over time.

      We have now quantified the transcriptional activity of Imp and Syp in the NB over time. We have also quantified the relative expression of Imp and Syp proteins in the NB over time and quantified the relative expression of Imp and Syp proteins in postmitotic neurons as a function of their birth order in late LL3 larvae. How these gradients are established in our system still remains to be identified.

      Minor errors/suggestions:

      Fig 4. Time legend at the top goes A, B, C, E, F (no D). So it doesn't match the panels below

      Yes, we have made the corrections.

      Sentence repeated in Intro:

      The process of terminating NB neurogenesis through autophagic cell death or terminal differentiation is commonly referred to as decommissioning.

      Yes, corrections have been made.

      In Figure 1 they say 'type Ib' and in Figure 2 they say 'type 1b'.

      We have changed it to type 1b.

      In Fig2A-It's hard to see lack of Elav and Fig2G-It's hard to see presence of Dcp1. Panels could be adjusted to emphasize these results

      We have increased the size of the panels and made two separate panels where only the elav and Dcp1 signals are present.

      Observations that the result is equivalent in all thoracic segments is expected, since all legs need the same number of neurons. This is nice to have but can be in the supplement.

      Overall the figure number seems excessive, especially considering much of the results included (particularly the NB results) are findings consistent with previous papers and some is characterization of the system that does not fit well with the main focus regarding Imp/Syp (i.e. death of one hemi-lineage).

      Figure 5 and 6 can be joined as one.

      We have combined Figures 5 and 6, showing only the T1 segments.

      There is some discrepancy between graphs Fig7F and K: At LL3 the number of neurons is different for the control in 7F and the count in K

      Yes, because the genetic backgrounds are not the same and we are not counting the same type of cells. In 7F, we are counting the elav+ and VGlut+ cells, whereas in Figure 7K, we are counting all the elav+ cells in Lin A, including those that are elav+ VGlut-. VGlut expression begins slightly later than elav expression, which is why we have fewer elav+ cells in 7F. In other words, VGlut MARCM clones do not label all Lin A elav+ cells. I have clarified this in the figure.

      Reviewer #3 (Recommendations For The Authors):

      Main comment: on the notion of Imp and Syp gradients:

      p. 5, related to figure 4 - there are clearly distinct windows for predominantly (if not exclusively) Imp, and later, Syp expression in lineage 15, with a phase of co-expression.

      However, based on the data shown, it is unclear whether these windows represent gradients, as repeatedly stated. If the notion of gradients is derived from other studies, on other lineages, then this would be good to clarify. Alternatively, the idea of temporally opposing gradients of Imp and Syp would need to be demonstrated for this lineage.

      For example, a more accurate way to describe this study's data is given on p.7 "In conclusion, our findings demonstrate that the opposite expression pattern of Imp and Syp in postmitotic neurons precisely shapes the size of Lin A/15 lineage by controlling the pattern of PCD in immature MNs (Fig. 8)."

      We have now quantified the transcriptional activity of Imp and Syp in the NB over time. We have also quantified the relative expression of Imp and Syp proteins in the NB over time, and in postmitotic neurons as a function of their birth order in late LL3 larvae. How these gradients are established in our system still remains to be identified.

      Minor points:

      p.6, related to figure 7: Are numbers of EDU- early born and EDU+, late born, MNs expressed as means in the main text? As written, it suggests absence of any variability, which one would expect and which is shown in Fig.7 data.

      Yes, we have added averages in the text.

      Methods: the author name 'Lacin' has been mis-spelled

      Sorry about that, it's been corrected.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This paper proposes a valuable new method for the assessment of the mean kurtosis for diffusional kurtosis imaging by utilizing a recently introduced sub-diffusion model. The evidence supporting the claims that this technique is robust and accurate in brain imaging is incomplete. The work could be of interest in the research and clinical arena.

      We thank the editors for their assessment and the reviewers for their careful reading and feedback that helped to improve the manuscript. We have addressed all the reviewers’ concerns and would like to request an update of the assessment to reflect the revisions we have made.

      Below, we address the reviewers’ comments.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study introduces an innovative method for assessing the mean kurtosis, utilizing the mathematical foundation of the sub-diffusion framework. In particular, a new fitting technique that incorporates two different diffusion times is proposed to estimate the parameters of the sub-diffusion model. The evaluation of this technique, which generates kurtosis maps based on the sub-diffusion framework, is conducted through simulations and the examination of data obtained from human subjects.

      We thank Reviewer #1 for pointing out the novelty and innovation of our work.

      Strengths:

      The utilization of the sub-diffusion model for tissue characterization is a significant conceptual advancement for the field of diffusion MRI. This study adeptly harnesses this approach for an accurate estimation of the parameters of the widely employed diffusion model, DKI, leveraging their established analytical interconnection as evidenced in prior research. Notably, this approach not only proposes a robust, fast, and accurate technique for DKI parameter estimation but also underscores the viability of deploying the sub-diffusion model for tissue characterization, substantiated by both simulated and human subject analyses. The paper is very well written, well organized, and coherent. The simulation study included different aspects of water diffusion as captured by diffusion-weighted MRI such as varying diffusion times and different b-value subpopulations, resulting in a comprehensive and thorough discussion.

      We thank Reviewer #1 for highlighting the strengths of our work.

      Weaknesses:

      The primary objective of this study is to demonstrate a robust approach for estimating DKI parameters by directly calculating them using the parameters of the sub-diffusion model. This premise, however, relies on the assumption that the sub-diffusion model effectively characterizes the diffusion MRI signal and that its parameters are both robust and accurate. Throughout the manuscript, the term "ground truth kurtosis K" is frequently used to denote the "true K" value in the context of the simulation study. Nonetheless, given that the data is simulated using the new sub-diffusion model - an approximation of the DKI-based signal expression- this value cannot truly be considered the "ground truth K". The simulation study highlights the robustness and accuracy of D* and K*, but it inherently operates under the assumption that the observed data is in the form of the sub-diffusion model.

      It is correct that our study operates under the assumption that the observed data is in the form of the sub-diffusion model, and indeed one of the key outcomes of this work is to demonstrate the effectiveness of that assumption and the new possibilities it brings. Naturally, using any mathematical model at all carries assumptions. Over the past two decades, many mathematical and biophysical models have been proposed to characterise diffusion MRI signals. However, model validation remains an open challenge in the field. In this, as well as in our previous work (Yang et al, NeuroImage, 2022), we have shown that our proposed sub-diffusion model not only provides a much better fit than the traditional DKI method, overcoming the major limitation of the traditional DKI method on the maximum b-value, but also generates brain maps with superior tissue contrast and elucidates previously unseen structure.

      We have replaced the term “ground truth kurtosis K” with “true kurtosis K”.

      The comment “… using the new sub-diffusion model – an approximation of the DKI-based signal expression…” is a bit misleading. In fact we propose that the reverse interpretation is the more suitable way to view the relationship: the DKI model is a degree-2 approximation of the sub-diffusion model, as in eq. (7).
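
      The relationship can be sketched numerically. The snippet below is a minimal illustration, not the authors' code. It assumes the sub-diffusion signal is a Mittag-Leffler function, S = E_β(-b D_SUB) with D_SUB = D_β / Δ_eff^(1-β), and that matching its series to the DKI expansion up to second order in b gives D* = D_SUB / Γ(1+β) and K* = 6 Γ(1+β)^2 / Γ(1+2β) - 3. These forms are not reproduced in this response, so eqs. (5) and (7)-(9) of the manuscript remain authoritative. The function name, argument names and the example values are hypothetical.

```python
import math


def dki_from_subdiffusion(d_beta, beta, delta_eff):
    """Map sub-diffusion parameters to DKI-style D* and K*.

    Assumed forms (not quoted in the response itself):
      D_SUB = D_beta / delta_eff**(1 - beta)
      D*    = D_SUB / Gamma(1 + beta)
      K*    = 6 * Gamma(1 + beta)**2 / Gamma(1 + 2*beta) - 3
    """
    if d_beta <= 0.0 or delta_eff <= 0.0 or not (0.0 < beta <= 1.0):
        raise ValueError("expected d_beta > 0, delta_eff > 0 and 0 < beta <= 1")
    d_sub = d_beta / delta_eff ** (1.0 - beta)
    d_star = d_sub / math.gamma(1.0 + beta)
    k_star = 6.0 * math.gamma(1.0 + beta) ** 2 / math.gamma(1.0 + 2.0 * beta) - 3.0
    return d_star, k_star


if __name__ == "__main__":
    # Hypothetical parameter values, evaluated at two effective diffusion times.
    for delta_eff in (0.019, 0.049):
        d_star, k_star = dki_from_subdiffusion(3e-4, 0.75, delta_eff)
        print(delta_eff, d_star, k_star)
```

      Under these assumptions K* depends on β only, so it takes the same value at both diffusion times, while D* changes with the diffusion time. At β = 1 the kurtosis vanishes and D* reduces to D_β, which is the mono-exponential limit.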

      Reviewer #2 (Public Review):

      Summary: The authors present a technique for fitting diffusion magnetic resonance images (dMRI) to a sub-diffusion model of the diffusion process within brain imaging. The authors suggest that their technique provides robust and accurate calculation of diffusional kurtosis imaging parameters from which high quality images can be calculated from short dMRI data acquisitions at two diffusion times.

      Strengths: If the authors can show that the dMRI signal in brain tissue follows a sub-diffusion model decay curve then their technique for accurately and robustly calculating diffusional kurtosis parameters from multiple diffusion times would be of benefit for tissue microstructural imaging in research and clinical arenas.

      In Figure 7, we showed that the diffusion MRI signals follow the sub-diffusion model decay curves.

      Weaknesses: The applied sub-diffusion model has two parameters that are invariant to diffusion time, D_β and β which are used to calculate the diffusional kurtosis measures of a diffusion time dependent D* and a diffusion time invariant K*. However, the authors do not demonstrate that the D_β, β and K* parameters are invariant to diffusion time in brain tissue.

      In our proposed sub-diffusion model, D_β and β are assumed to be time-independent parameters, which is a key strength of the approach. The goal is to characterise tissue-specific properties (D_β for diffusivity and β for the extent of tissue complexity) that do not rely on the diffusion time setting in diffusion MRI experiments. To extract such time-independent properties, we proposed a new sampling and fitting strategy: jointly fitting data acquired at two or more diffusion times.

      The authors' results visually show that there is time dependence of the K* measure (in Figure 6) that is more apparent in white matter with K* values being higher for diffusion times of ∆=49 ms than ∆ = 19 ms. The diffusion time dependence of K* indicates there is also diffusion time dependence of β.

      The discrepancies in the fitted K* for ∆ = 19 ms and ∆ = 49 ms separately do not necessarily imply that there is a true time dependence in these parameters. Rather, this can be explained by a deficiency of data when fitting a two-dimensional surface (S is a function of q and ∆) based on data along a single curve for a fixed value of ∆.  Without properly sampling the surface across two independent coordinates, one cannot expect a fully reliable fit.  Indeed, a great advantage of our proposed method is to allow fitting data with multiple values of ∆, and thereby getting a richer data set with which to fit the full signal surface S(q, ∆).  The results for fitting ∆ = 19 ms and ∆= 49 ms data together clearly show the benefits of this approach, with superior contrast achieved.

      Furthermore, Figure 7 shows that there is a tissue specific root mean squared error in model fitting over the two diffusion times which indicates greater deviation from the model fit in white matter than grey matter.

      Although the errors are not completely tissue-independent, please note the magnitude of the RMSE is very small. The quality of the fitting in both white and grey matter is shown in sub-figures (A)-(H) for several representative voxels.

      To show that the sub-diffusion model is robust and accurate (and consequently that K* is robust and accurate) the authors would have to demonstrate that there is no diffusion time-dependence in both D_β and β in application to brain imaging data for each diffusion time separately. Simulated data should not be used to demonstrate the robustness and accuracy of the sub-diffusion model or to determine optimization of dMRI acquisition parameters without first demonstrating that D_β and β are invariant to diffusion time. This is because simulated signals calculated by using the sub-diffusion characteristic equation of dMRI signal decay will necessarily have diffusion time invariant D_β and β parameters. Without further information demonstrating diffusion time invariance of D_β, β and K* it is not possible to determine whether the authors have achieved their aims or that their results support their conclusions.

      First, as explained above, the dMRI signal S is a function of q and ∆, i.e., a two-dimensional surface S(q, ∆), and hence fitting data sampled from a single diffusion time (i.e., one curve on the surface) cannot provide reliable parameters, as seen in the discrepancies in K* in Figure 6 (bottom two rows). Our proposed new sampling and fitting strategy overcomes this issue. That is, to obtain a reliable fit, one should fit data from at least two diffusion times together (i.e., sampling data from at least two curves on the signal surface).

      Second, to demonstrate that D_β and β are time invariant, one would require data at several diffusion times with high b values. Such data cannot be easily obtained. The data used in this current study is the MGH Connectome 1.0 human brain data, which only contains two diffusion times, ∆ = 19 ms and ∆ = 49 ms.

      Hence, we conducted numerical experiments to demonstrate our idea. In Figure 3, we showed that (i) the variability of the fitted parameters is significantly reduced when moving from fitting single diffusion time data to two diffusion time data, and (ii) the difference in fitting three diffusion times compared to two is very minor, indicating convergence towards the correct time-independent parameter values. The results from fitting human brain data (Figure 6 and Tables 2-4) agree with the expectations from our numerical experiments. Hence, we believe that we have provided sufficient evidence to support our proposed sub-diffusion model and its optimal fitting strategy.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      It is clear that the authors preferred generating the data by using sub-diffusion model's signal expression as it has many benefits, such as allowing different diffusion times to be incorporated, and hence investigation of the effect of the number of diffusion times on the accuracy of the parameter fitting. I recommend adding another simulation study by generating the data with the DKI model expression (as the goal of the study is to provide an accurate mapping of diffusional mean kurtosis), fitting the data to the sub-diffusion model's expression in Eq. (10), and then calculating K* and D* by Eqs. (8) and (9) only for a fixed diffusion time and one b-value subset.

      We appreciate the suggestion. However, unfortunately it is not appropriate to generate data with the DKI model, as the maximum b-value is limited to approximately 2000 to 3000 s/mm^2 and hence the DKI model cannot represent diffusion MRI signals from a full spectrum of b-values. A key strength of our proposed model is that it removes this limitation.

      There is a typo on Page 24, Line 581; "b<=2400" should be b>=2400.

      We have fixed this typo.

      Reviewer #2 (Recommendations For The Authors):

      As the authors state the sub-diffusion model has two parameters, D_β and β that are invariant to diffusion time, and give rise to a time-varying diffusion coefficient in mm^2s^-1 and a time invariant kurtosis. However, there is a need to be clearer and more specific about the implications of the sub-diffusion model. The manuscript would be improved by the authors:

      (a) Defining the time-varying diffusion coefficient that arises from the model, its functional form and properties.

      We refer Reviewer #2 to eq. (5) and eq. (8) for the definition of time-varying diffusion coefficients D* and D_SUB and their relationship.

      (b) Clearly discuss the implications of this with respect to other time-varying diffusion coefficient methods in the current literature.

      We refer Reviewer #2 to the section “Time-dependence of diffusivity and kurtosis” under “Discussions”.

      (c) Demonstrating that D_β and β do not vary with diffusion time when estimated from dMRI acquired on human participants.

      We have addressed this comment in the public review.

      The manuscript would benefit from increases in clarity in all sections and the authors identifying typographical errors.

      We have updated the relevant text in the revised manuscript to make it clearer, including fixing typos.

      Specific improvements to clarity in the methods and results section would include:

      Line 620: Why were parameter approximations for model fitting to simulated data restricted to the ranges D_β∈[10^(-4),10^(-3) ] and β∈[0.5,1] but in fitting to brain imaging data the ranges were D_β>0 and 0<β<=1.

      The parameter ranges for model fitting were the same for the simulated and human data: D_β>0 and 0<β<=1. To generate simulated data, D_β and β ranges were restricted to reflect observations in human brain data. We have updated the text to make this clearer.

      Lines 622, 628 & 629: Which goodness of fit measure was used?

      The goodness of fit measure for all simulated results is the coefficient of determination, or R^2 value, as noted in the “Goodness-of-fit and region-based statistical analysis” section under Methods. We have updated the text to make this clearer.

      Line 666: The method for computation of R^2 within the coefficient of determination should be stated as there are several ways of calculating an R^2 value.

      The formula for computing R^2 has been added to the text.
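
      Because the formula itself is not reproduced in this response, the sketch below shows only the most common definition of the coefficient of determination, R^2 = 1 - SS_res / SS_tot. It is an assumption that this matches the formula added to the manuscript; the function and variable names are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot.

    This is the common textbook definition; other variants exist
    (for example the squared Pearson correlation).
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equally long")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: misfit between data and model prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the data around its own mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when the observed data are constant")
    return 1.0 - ss_res / ss_tot
```

      With this definition a perfect fit gives 1, a model that only predicts the mean of the data gives 0, and a model worse than the mean gives a negative value.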

      Line 685: A t-test is mentioned but it is not clear as to the inputs to this test, or where the results of this analysis are presented.

      We have updated the text to make this clearer. The results of this analysis are presented in Table 5. The entries identified in italic under the optimal b-value heading were found to be significantly different from the benchmark mean K* reported in Table 2.

      Line 696: It is not clear how the intra-class correlation coefficient histograms are computed from six subjects. This applies to results in Figure 10 that require greater clarity in the description.

      The formula for computing the intra-class correlation coefficient has been added to the sub-section “Scan-rescan analysis using intraclass correlation coefficient (ICC)” under “Methods”.
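
      Several ICC variants exist, and the one added to the manuscript is not stated in this response. As an illustration only, the sketch below computes the one-way random-effects form, often written ICC(1,1), from a table with one row per subject and one column per scan session. The choice of variant, the function name and the data layout are assumptions.

```python
def icc_oneway(measurements):
    """One-way random-effects ICC(1,1).

    measurements: list of rows, one row per subject, each row holding
    k repeated measurements (for scan-rescan data, k = 2).
    """
    n = len(measurements)
    k = len(measurements[0])
    if n < 2 or k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("need at least 2 subjects with the same number (>= 2) of scans")
    grand_mean = sum(sum(row) for row in measurements) / (n * k)
    row_means = [sum(row) / k for row in measurements]
    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(measurements, row_means) for x in row
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0.0:
        raise ValueError("ICC is undefined when all measurements are identical")
    return (ms_between - ms_within) / denominator
```

      To build a histogram like the one described for Figure 10, such a function would presumably be applied once per voxel or region across the six subjects, but that step is not specified here.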

      It would be helpful if the authors primarily report results pertaining to the model parameters D_β and β. This is because D* and K* are calculated from D_β and β. Conditions for robust and accurate estimation of D_β and β will provide robust and accurate measures for D* and K*.

      Two new tables for the model parameters D_β and β have been added. Please see Tables 3 and 4 in the revised manuscript.

      The authors state that fitted model parameters are not affected by maximum b-value (paragraph beginning line 366). This statement is based on their model simulation results. Could the authors provide data to support this based on the application of their model to the human brain imaging data?

      We would like to clarify that our statement is indeed based on human brain imaging. As stated in the paragraph beginning line 366, both results in Table 2 (using full dataset) and Table 5 (using dataset with optimal b-value sampling) are generated from the Connectome human brain data. If maximum b-value dependence is present, benchmark (Table 2) versus optimal region-specific results (Table 5, or previously Table 3) should show some systematic difference.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors investigate the role of chirping in a species of weakly electric fish. They subject the fish to various scenarios and correlate the production of chirps with many different factors. They find major correlations between the background beat signals (continuously present during any social interactions) or some aspects of social and environmental conditions with the propensity to produce different types of chirps. By analyzing more specifically different aspects of these correlations they conclude that chirping patterns are related to navigation purposes and the need to localize the source of the beat signal (i.e. the location of the conspecific).

      The study provides a wealth of interesting observations of behavior and much of this data constitutes a useful dataset to document the patterns of social interactions in these fish. Some data, in particular the high propensity to chirp in cluttered environments, raises interesting questions. Their main hypothesis is a useful addition to the debate on the function of these chirps and is worth considering and exploring further.

      After the initial reviewers' comments, the authors performed a welcome revision of the way the results are presented. Overall the study has been improved by the revision. However, one piece of new data is perplexing to me. The new Figure 7 presents the results of a model analysis of the strength of the EI caused by a second fish to localize when the focal fish is chirping. From my understanding of this type of model, EOD frequency is not a parameter in the model since it evaluates the strength of the field at a given point in time. Therefore the only thing that matters is the phase relationship and strength of the EOD. Assuming that the second fish's EOD is kept constant and the phases relationship is also the same, the only difference during a chirp that could affect the result of the calculation is the potential decrease in EOD amplitude during the chirp. It is indeed logical that if the focal fish decreased its EOD amplitude the target fish's EOD becomes relatively stronger. Where things are harder to understand is why the different types of chirps (e.g. type 1 vs type 2) lead to the same increase in signal even though they are typically associated with different levels of amplitude modulations. Also, it is hard to imagine that a type 2 chirps that is barely associated with any decrease in EOD amplitude (0-10% maybe), would cause doubling of the EI strength. There might be something I don't understand but the authors should provide a lot more details on how this result is obtained and convince us that it makes sense.

      We thank the Reviewer for the comments and we agree that the approach could have been better detailed. As anticipated by the Reviewer, the Boundary Element Method (BEM) model can be used simply to calculate the electric field and electric image at a specific point in time (instantaneously), regardless of EOD frequency. However, our model allows for the concatenation of consecutive instants and thus is able to render an entire sequence of electric fields - and resulting electric images - incorporating realistic EOD characteristics such as shape, duration, and frequencies (see Pedraja et al., 2014).

      Chirp-triggered EIs were modeled using real chirps produced by interacting fish. Each chirp was thus associated with its duration and peak parameters, as well as the fish positional information (distance and angle).

      However, since we did not know the beat phase at which chirps were produced, we computed electric images for each fish position and chirp scenario by simulating various phases (here meaning the initial offset between the two EODs, set at 4 equally spaced values). These are intended as phases of the sender EOD and simply refer to the initial offset between the two interacting EODs. However, since our simulations were run over a time window of 500 msec, all phases are likely to be covered, with a different temporal order relative to the chirp (always centered within the 500 msec).

      The simulation was run maintaining consistent timing for both chirp and non-chirp conditions, across approximately 800 body nodes. At each node, the current flow was calculated from the peak-to-peak of the EOD sum (i.e. the point-to-point of the difference between the beat positive and negative envelopes). Analyzing the EIs over this fixed time window enables us to assess the unitary changes of current flow induced by chirps over units of time (ΔI/Δt). From this, we can calculate a cumulative sum of current flow changes - expressed as delta(EI) and use it to show the effect of the chirps on the spatiotemporal EI (Figure 7C).

      One can express this cumulative change mapped onto the fish body (keeping the 800 points separated, as in Figure 7C) or further sum the current changes to obtain a single total (as shown in Figure 7D).

      One can check this by considering a sum over, for example, 500 of the 800 points (judging from the size of the blue areas in C, not all 800 points show a detectable change): with each point valued at 0.1 to 0.3 mA/s, one obtains circa 100 mA/s, which is what is shown in D. (Is this what is happening?)
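
      The order-of-magnitude check can be written out explicitly. The snippet below only restates the arithmetic from the paragraph above, using the illustrative figures quoted there (about 500 responsive nodes, each contributing 0.1 to 0.3 mA/s); these are not measured values, and the function name is hypothetical.

```python
def total_current_change(n_nodes, per_node_rate):
    """Sum of identical per-node current-flow changes (mA/s) over n_nodes."""
    return n_nodes * per_node_rate


# Illustrative figures from the text: ~500 of the 800 body nodes respond.
RESPONSIVE_NODES = 500
low = total_current_change(RESPONSIVE_NODES, 0.1)
mid = total_current_change(RESPONSIVE_NODES, 0.2)
high = total_current_change(RESPONSIVE_NODES, 0.3)

if __name__ == "__main__":
    print(low, mid, high)
```

      The total therefore falls between roughly 50 and 150 mA/s, with the midpoint near the value of circa 100 mA/s read from Figure 7D.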

      We do not know why chirps of different types triggered similar effects. It is possible that, since EI measurements are pooled over several chirps produced at different angles and distances, when only a small number of chirps is available for a given type (as in the case of rises, very low) these measurements may not highlight more marked differences among types. In a publication we are currently working on, we are considering a larger dataset to better assess these results.

      The methods section has been edited to clarify the approach (not yet).

      Reviewer #2 (Public Review):

      Studying Apteronotus leptorhynchus (the weakly electric brown ghost knifefish), the authors provide evidence that 'chirps' (brief modulations in the frequency and amplitude of the ongoing electric signal) function in active sensing (specifically homeoactive sensing) rather than communication. Chirping is a behavior that has been well studied, including numerous studies on the sensory coding of chirps and the neural mechanisms for chirp generation.

      Chirps are largely thought to function in communication behavior, so this alternative function is a very exciting possibility that could have a great impact on the field.

      We thank the Reviewer for the extensive and constructive comments. We would like to add that, while it is true that many detailed studies have been published on the anatomy and physiology of the circuits implicated in the production and modulation of “electric chirps”, most of this research assumed, and focused exclusively on, their possible role in communication. In addition, most behavioral studies did the same, and a meta-analysis of the existing literature on chirping allows one to trace the communication idea back mainly to two studies: Hagedorn and Heiligenberg, 1985 (“Court and spark: electric signals in the courtship and mating of gymnotoid fish”) and Hopkins, 1974 (“Electric Communication: Functions in the Social Behavior of Eigenmannia Virescens”). Importantly, in these studies only contextual observations were made (no playback experiments or other attempts to analyze more quantitatively the correlation of chirping with other behaviors).

      The authors do provide convincing evidence that chirps may function in homeoactive sensing. However, their evidence arguing against a role for chirps in communication is not as strong, and fails to sufficiently consider the evidence from a large body of existing research. Ultimately, the manuscript presents very interesting data that is sure to stimulate discussion and follow-up studies, but it suffers from dismissing evidence in support of, or consistent with, a communicative function for chirps.

      Although the tone of some statements present in our earlier draft may suggest otherwise, through our revisions we have made an effort to clarify that we do not intend to dismiss a function of chirps in communication; we only intend to debate and discuss a valid alternative hypothesis, advanced from reasonable considerations.

      Before writing this manuscript, we attempted to survey literally all the existing literature on chirps (including studies focused on behavior, peripheral sensory physiology as well as brain physiology). Although it is not unlikely that some studies have eluded our attention, an effort for a comprehensive review was made. Based on this survey we realized that none of the studies provided a clear and unambiguous piece of evidence to support the communication hypothesis (we refer here to the weak points highlighted in the discussion and mentioned in the previous comment), which in fact does not come without its weak points and contradictions (see later comments).

      What follows is a summary of the mentions of the communication theory in the different sections of the manuscript, including several edits we have applied in response to the Reviewer’s concern:

      In the abstract we clearly state that we are considering an alternative that is only hypothetically complementary, not for sure.  Nonetheless, we have identified a couple of instances that could sound dismissive of the “communication hypothesis” in the following section.

      In the introduction we write in fact about the possibility of interference between communication signals and conspecific electrolocation cues, as they are both detected as beat perturbations. We did not mean to use “Interference” here as “reciprocal canceling”, rather we intended it as “partial or more or less conspicuous overlap” in the responses triggered in electroreceptors.

      Hoping to convey a clearer message, we have edited the related statement and changed it to “both types of information are likely to overlap and interact in highly variable ways”.

      We have also removed the statement: “According to this idea, beats and chirps are not only detected through the same input channel, but also used for the same purpose.” as at this point in the manuscript it may be too strong.

      In the results section we do not include statements that might be seen as dismissive of the communication hypothesis but only statements in support of the “probing with chirps” idea (which is the central hypothesis of the study).

      In the discussion paragraphs we elaborate on why the current functional view is either flawed or incomplete (first paragraph, “existing functional hypotheses”). Namely: 1) multiple triggering factors implied in chirp responses covary and need to be disentangled (example: DF/sex); 2) findings on brown ghosts and a few other gymnotiforms have been used to advance the hypothesis of “communication through chirps” in all weakly electric fish (including pulse species); 3) social encounters, in which chirps are recorded, also imply other behaviors (such as probing) which have not been considered so far (this point is related to the first one on covariates); 4) most studies referring to big chirps as courtship chirps were not done in reproductive animals (added now); and 5) no causal evidence has been provided so far to justify a role of chirps in social communication.

      We are discussing these points as challenges to the communication hypothesis, not to dismiss the hypothesis, but rather to motivate future studies addressing these challenges.

      We do not want to appear dismissive of the communication hypothesis and had therefore previously edited the manuscript to avoid the impression of exclusivity of the probing hypothesis. We have now gone over the manuscript once more and edited several sentences. Nevertheless, we want to point out again that - despite the large consensus - the communication hypothesis has, until now, never been investigated with the kind of rigor applied here.

      The authors do acknowledge that chirps could function as both a communication and homeactive sensing signal, but it seems clear they wish to argue against the former and for the latter, and the evidence is not yet there to support this.

      In both rounds of revision we have made an effort to convey a more inclusive interpretation of our findings. We tried our best to express our ideas as hypothetical, not as proof that communication through chirps does not exist. The aim of this study is to propose an alternative view, and this cannot be done without underlining the weak points of an existing hypothesis while providing and supporting reasonable arguments in favor of the alternative we advance. The actual evidence for a role of chirping in communication is much less strong than appears from the pure number of articles that have discussed chirps in this context.

      Regarding the weak evidence against communication, here we can list a few additional important points related to the proposed interpretations of chirp function (more specific than those made earlier):

      (1) A formally sound assessment of signal value/meaning, as typically done in animal communication studies, should involve:

      a) the isolation of a naturally occurring signal and determination of the context in which it is produced 

      b) the artificial replication of the signal

      c) the observation that such a mimic is capable of triggering reliable and stereotyped responses in a group of individuals (identified by sex and/or species) under the same conditions (conditioned, unconditioned, state-dependent, etc.), as discussed for instance in Bradbury and Vehrencamp, 2011; Laidre and Johnstone, 2013; Wyatt, 2015; Rutz et al., 2023.

      This approach has so far not been applied to weakly electric fish. The initial purpose of the present study was in fact to conduct this type of validation.

      (2) The hypothesis of chirps used for DF-sign discrimination - for “social purposes” - although plausible in the face of theoretical considerations,  does not seem to be reasonable in practice, when one considers emission rates of 150 chirps per minute. We do find a strong correlation of chirp type with DF, which is often very abrupt and sudden (as if the fish were tracking beat frequency to guess its value) but the consideration made above on chirp rates seems to discourage this interpretation.

      (3) The hypothesis of chirp-patterning (i.e. chirping may have meaning based on the sequence of chirps of different types, a bit like syllables in birdsongs), assessed by only one study conducted in our group, has not been sufficiently substantiated by replication. We have surveyed all possible combinations of chirps produced by interacting pairs in different behavioral conditions using different values for chirp sequence size: 2, 3, ..., 8 chirps (both considering the sender alone as well as sender+receiver together). In all cases we found no evidence for a context-dependent “modulation” of chirp types (i.e. no specific chirp type sequence in specific contexts).

      (4) The hypothesized role of “large chirps” as courtship signals could be easily criticized by noting the symmetrical distribution of these events around a DF of 0 Hz. One could invoke a failure to discriminate DF-sign to explain this well-known pattern. However, we know from Walter Heiligenberg’s work and physiological considerations that such a task can be solved easily through t-units and … in principle even just by motion (which would change the EOD phase in frequency-dependent ways, thus potentially revealing the DF sign).
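
      The sequence survey mentioned in point (3) can be illustrated with a simple sliding-window count. The sketch below is not the authors' analysis: it only tallies how often each ordered run of 2 to 8 consecutive chirp types occurs in one recording, which is the kind of table one would then compare across behavioral contexts. The function name and the chirp labels in the usage example are hypothetical.

```python
from collections import Counter


def chirp_ngram_counts(chirp_types, min_len=2, max_len=8):
    """Count every run of consecutive chirp types for each window length.

    chirp_types: chirp labels in order of emission.
    Returns a dict mapping window length -> Counter of label tuples.
    """
    counts = {}
    for n in range(min_len, max_len + 1):
        # Slide a window of n chirps along the sequence; recordings shorter
        # than n simply yield an empty Counter.
        counts[n] = Counter(
            tuple(chirp_types[i:i + n]) for i in range(len(chirp_types) - n + 1)
        )
    return counts


if __name__ == "__main__":
    # Hypothetical labels for a short recording.
    sequence = ["type2", "type2", "type1", "type2", "type2"]
    for length, counter in chirp_ngram_counts(sequence).items():
        print(length, counter.most_common(3))
```

      Running the same count on sender-only sequences and on merged sender+receiver sequences would give the two views described above; the statistical comparison between contexts is left out of this sketch.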

      Overall, these considerations made us think that chirping certainly occurs in a social context, but it is the meaning of this behavior that remains elusive. We noticed that environmental factors are also strongly implicated … we then formulated an alternative hypothesis to explain chirping, but we do so without dismissing the communication idea.

      All this seems to us just a careful way to critically discuss our results and those of other studies, without considering the issue resolved.

      In the introduction, the authors state, "Since both chirps and positional parameters (such as size, orientation or motion) can only be detected as perturbations of the beat, and via the same electroreceptors, the inputs relaying both types of information are inevitably interfering." I disagree with this statement, which seems to be a key assumption. Both of these features certainly modulate the activity of electroreceptors, but that does not mean those modulations are ambiguous as to their source. You do not know whether the two types of modulations can be unambiguously decoded from electroreceptor afferent population activity.

      We thank the Reviewer for noting this imprecision. We have addressed the Reviewer’s concern in another reply (see above).

      My biggest issue with this manuscript is that it is much too strong in dismissing evidence that chirping correlates with context. In your behavioral observations, you found sex differences in chirping as well as differences between freely interacting and physically separated fish. Chirps tended to occur in close proximity to another fish. Your model of chirp variability found that environmental experience, social experience, and beat frequency (DF) are the most important factors explaining chirp variability. Are these not all considered behavioral or social context? Beat frequency (DF) in particular is heavily downplayed as being a part of "context" but it is a crucial part of the context, as it provides information about the identity of the fish you're interacting with. The authors show quite convincingly that the types of chirps produced do not vary with these contexts, but chirp rates do.

      We believe the “perceived claim” may be an issue of unclear writing. We have now tried to better clarify that “context” affects chirp rates, but it does not affect chirp types as much (except when beat frequency is high).  

      We have edited two statements possibly susceptible to misinterpretation: 

      (1) In the results: “It also indicates that chirp parameters such as duration and FM do not seem to be associated with any particular context in a meaningful way, other than being affected by beat frequency.”

      (2) In the discussion: the statement

      “Recordings from interacting fish pairs confirmed the absence of any significant correlation between chirp type choice and behavioral context (Figure S2) although the variance of chirp parameters appears to be significantly affected by this factor (Figure 2). This may suggest that the effect of behavioral context is mainly detectable in the number of chirps produced (Figure S1), rather than the type (Figure S2).”

      has been changed to:

      “Recordings from interacting fish pairs confirmed the absence of any significant correlation between chirp type choice and behavioral context, except for those cases characterized by higher beat frequencies  (Figure S2). This suggests that the effect of behavioral context highlighted in our factor analysis (Figure 2) is mainly due to the number of chirps produced (Figure S1), rather than their type (Figure S2).”

      Finally, in the results we emphasize the relatively higher impact of previously unexplored factors on chirp variance: “The plot of individual chirps (Figure 2C) shows the presence of clustering around different categorical variables and it reveals that experience levels or swimming conditions are important factors affecting chirp distribution (note for instance the large central “breeding” cluster in which fish are divided and the smaller ones in which fish are free). Sender or receiver identity does not individuate any clear clustering relative to either sex (see the overlap of male_s/male_r and female_s/female_r) or social status (dominant/subordinate). Chirps labeled based on tank experience (i.e. resident vs intruder) are instead clearly separated.”

      Further, in your playback experiments, fish responded differently to small vs. large DFs, males chirped more than females, type 2 chirps became more frequent throughout a playback, and rises tended to occur at the end of a playback. These are all examples of context-dependent behavior.

      We do note that male brown ghosts chirp more than females, but we also say, and show in Figure 8, that males move more in proximity to and around conspecifics. We acknowledge that the chirp time course may differ during playbacks in a type-dependent manner, but how this can support the communication hypothesis, or other alternatives, is unclear. This result could equally imply the use of different chirp types for different probing needs. Since we cannot be sure about either, we do not want to put too much emphasis on it. Ultimately, the fact that “context” (here meant broadly to define different experimental situations in which social but also physical and environmental parameters are altered) affects chirping is undeniable: cluttered and non-cluttered environments do represent different contexts which differently affect chirping in conspicuous ways.

      In the results, the authors state, "Overall, the majority of chirps were produced by male subjects, in comparable amounts regardless of environmental experience (resident, intruder or equal; Figure S1A,C), social status (dominant or subordinate; Figure S1B) or social experience (novel or experienced; Figure S1D)." This is not what is shown in Figure S1. S1A shows clear differences between resident vs. intruder males, S1B shows clear differences between dominant vs. subordinate males, and S1D shows clear differences between naïve and experienced males. The analysis shown in Figure 2 would seem to support this. Indeed, the authors state, "Overall, this analysis indicated that environmental and social experience, together with beat frequency (DF) are the most important factors explaining chirp variability."

      The Reviewer is right in pointing out this imprecise reference, and we are grateful for spotting this incongruence. The text probably refers to an earlier version of the figure in which data were grouped and analyzed differently. We have now edited the text and changed it to: “Overall, the majority of chirps were produced by male subjects, at rates that seemed affected by environmental experience (resident, intruder or equal; Figure S1A,C), social status (dominant or subordinate; Figure S1B) and social experience (novel or experienced; Figure S1D).”

      The choice of chirp type varied widely between individuals but was relatively consistent within individuals across trials of the same experiment. The authors interpret this to mean that chirping does not vary with internal state, but is it not likely that the internal states of individuals are stable under stable conditions, and that individuals may differ in these internal states across the same conditions? Stable differences in communication signals between individuals are frequently interpreted as reflecting differences between those individuals in certain characteristics, which are being communicated by these signals.

      It seems we have been unclear in the writing here. While it is true that behavioral states are stable and can imply stable chirp patterning (if the two are related), chirp types vary abruptly and in a reliable DF-dependent manner. Different types of chirps are therefore unlikely to be matched to different internal states following the same temporal order in such a reliable way (similarly repeated through consecutive trials).

      This would imply the occurrence of different internal states in rapid sequence, reliably triggered by repeated EOD ramps, regardless of whether the playback is 20 sec long or 180 sec long.

      We have edited this paragraph to better explain this: “The reliability by which the chirping response adapts to both the rate and direction of beat frequency is variable across individuals but rather stable across trials (relative to a given subject), further suggesting that chirp type variations may not reflect changes in internal states or in the animal motivation to specific behavioral displays (which are presumably subject to less abrupt variations and stereotypical patterning based on DF).”

      I am not convinced of the conclusion drawn by the analysis of chirp transitions. The transition matrices show plenty of 1-2 and 2-1 transitions occurring.

      The only groups in which 1-2 and 2-1 transitions are as frequent as 1-1 and 2-2 (1 and 2 being the numerical IDs of the two interacting fish) are F-F pairs. This is because chirp rates in females are so low that within-fish correlations end up being as low as between-fish correlations. We believe the Reviewer’s impression could be due to the fact that these are normalized maps (see legend of Figure 5A-B).

      Further, the cross-correlation analysis only shows that chirp timing between individuals is not phase-locked at these small timescales. It is entirely possible that chirp rates are correlated between interacting individuals, even if their precise timing is not.

      We agree with the Reviewer that this is a possibility. To address this point, we edited the Results section to acknowledge that what we see may be related to the time window chosen (i.e. 4 sec):

      “More importantly, they show that - at least in the social conditions analyzed here and within small-sized time windows - chirp time series produced by different fish during paired interactions are consistently independent of each other.”
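      To make the reviewer's timescale point concrete, the following is a minimal sketch of a lag histogram (cross-correlogram) between two chirp time series. It is not the authors' analysis code; the function name `lag_histogram`, the 0.5 s bin width, and the example chirp times are assumptions chosen for illustration. Only the 4 s window comes from the text.

```python
def lag_histogram(times_a, times_b, window=4.0, bin_width=0.5):
    """Histogram of lags (t_b - t_a), in seconds, that fall within +/- window."""
    n_bins = int(round(2 * window / bin_width))
    counts = [0] * n_bins
    for ta in times_a:
        for tb in times_b:
            lag = tb - ta
            if -window <= lag < window:
                # Shift lags so that -window maps to bin 0.
                index = int((lag + window) // bin_width)
                counts[min(index, n_bins - 1)] += 1
    return counts

# Hypothetical chirp times (seconds) for two interacting fish.
fish1_chirps = [1.0, 10.0]
fish2_chirps = [1.2, 12.5, 30.0]
counts = lag_histogram(fish1_chirps, fish2_chirps)
```

      A histogram of this kind only captures chirp pairs closer than the chosen window. Correlations in chirp rate over longer timescales would need a wider window or a comparison of binned rates.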

      Further, it is not clear to me how "transitions" were defined. The methods do not make this clear, and it is not clear to me how you can have zero chirp transitions between two individuals when those two individuals are both generating chirps throughout an interaction.

      We thank the Reviewer for bringing up this unclear point. We have now clarified how transitions were calculated in the Methods section: “The number of chirp transitions present in each recording (dataset used for Figures 1, 2, 5) was measured by searching in a string array containing the 4 chirp types per fish pair, all their possible pairwise permutations (i.e. all possible permutations of 4+4=8 elements are: 1-1, 1-2, 1-3 … 7-6, 7-7, 7-8; considering the following legend 1 = fish1 type 1, 2 = fish 1 type 2, 3 = fish1 type 3 … 6 = fish2 type 2, 7 = fish2 type 3 and 8 = fish2 rise).”

      Zero transitions are possible if two fish (or groups of fish) do not produce chirps of all types. Only transitions of produced types can be counted.
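      As a minimal sketch of the counting described above (not the authors' code, which is not shown in this response), the transitions in one recording can be tallied as ordered pairs of consecutive labels. The names `count_transitions` and `transition_matrix`, the example sequence, and the assumption that labels 4 and 5 denote fish 1 rise and fish 2 type 1 are illustrative only.

```python
from collections import Counter

def count_transitions(labels):
    """Tally ordered pairs of consecutive chirp labels in one recording."""
    return Counter(zip(labels, labels[1:]))

def transition_matrix(labels, n_labels=8):
    """Arrange the tallies in a grid (row = first chirp, column = next chirp)."""
    counts = count_transitions(labels)
    return [[counts[(a, b)] for b in range(1, n_labels + 1)]
            for a in range(1, n_labels + 1)]

# Hypothetical label sequence. Assumed coding: 1-4 = fish 1 (types 1, 2, 3, rise),
# 5-8 = fish 2 (types 1, 2, 3, rise).
sequence = [1, 1, 2, 6, 6, 2, 1]
transitions = count_transitions(sequence)
matrix = transition_matrix(sequence)
```

      Pairs involving a label that never occurs in the recording, such as type 3 chirps in this example, stay at zero in the matrix, which is how zero transition counts arise.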

      In the results, "Although all chirp types were used during aggressive interactions, these seemed to be rather less frequent in the immediate surround of the chirps (Figure 6A)." A lack of precise temporal correlation on short timescales does not mean there is no association between the two behaviors. An increased rate of chirping during aggression is still a correlation between the two behaviors, even if chirps and specific aggressive behaviors are not tightly time-locked.

      The Reviewer is right in pointing out the limited temporal scaling of our observations/analysis. We have now edited the last paragraph of the results related to figure 6 to include the possibility mentioned by the Reviewer: “The significantly higher extent of chirping during swimming and locomotion, consistently confirmed by 4 different approaches (PSTH, TM, CN, MDS), suggests that - although chirp-behavior correlations may exist at time-scales larger than those here considered - chirping may be linked more strongly with scanning and environmental exploration than with a particular motivational state, thus confirming findings from our playback experiments.”

      The Reviewer raises an important point here; yet, due to space limitations, we have considered only a sub-second scale. Most playback experiments in weakly electric fish involved the use of EOD mimics for a few tens of seconds, to avoid habituation in the fish behavioral responses, while inter-chirp intervals usually range from a few hundred milliseconds to seconds (depending on how often a fish chirps). This suggested to us that a 4-second time window may not be a bad choice to start with.

      In summary, it is simply too strong to say that chirping does not correlate with context, or to claim that there is convincing evidence arguing against a communication function of chirps. Importantly, however, this does not detract from your exciting and well-supported hypothesis that chirping functions in homeoactive sensing. A given EOD behavior could serve both communication and homeoactive sensing. I actually suspect this is quite common in electric fish (both gymnotiforms and mormyrids), and perhaps in other actively sensing species such as echolocating animals. The two are not mutually exclusive.

      We agree with the Reviewer that context, broadly speaking, does affect chirping (as we mentioned above). We hope we have improved the writing and clarified that we do not dismiss communication functions of chirping, but we do lean towards electrolocation based on the considerations made above and our results.

      We do conclude the manuscript by remarking that communication and electrolocation are not mutually exclusive: “probing cues could function simultaneously as proximity signals to signal presence, deter approaches, or coordinate behaviors like spawning, if properly timed (Henninger et al., 2018).” (see the conclusion paragraph of the discussion).

      Therein, we further add “These findings aim to stir the pot and initiate a discussion on possible alternative functions of chirps beyond their presumed communication role.”.

      With this, we hope we’ve made it clear how we intend our manuscript to be read.

      Reviewer #3 (Public Review):

      Summary:

      This important paper provides the best-to-date characterization of chirping in weakly electric fish using a large number of variables. These include environment (free vs divided fish, with or without clutter), breeding state, gender, intruder vs resident, social status, locomotion state and social and environmental experience, without and with playback experiments. It applies state-of-the-art methods for reducing the dimensionality of the data and finding patterns of correlation between different kinds of variables (factor analysis, K-means). The strength of the evidence, collated from a large number of trials with many controls, leads to the conclusion that the traditionally assumed communication function of chirps may be secondary to its role in environmental assessment and exploration that takes social context into account. Based on their extensive analyses, the authors suggest that chirps are mainly used as probes that help detect beats caused by other fish as well as objects.

      Strengths:

      The work is based on completely novel recordings using interaction chambers. The amount of new data and associated analyses is simply staggering, and yet, well organized in presentation. The study further evaluates the electric field strength around a fish (via modelling with the boundary element method) and how its decay parallels the chirp rate, thereby relating the above variables to electric field geometry.

      The main conclusions are that the lack of any significant behavioural correlates for chirping, and the lack of temporal patterning in chirp time series, cast doubt on a primary communication goal for most chirps. Rather, the key determinants of chirping are the difference frequency between two interacting conspecifics as well as individual subjects' environmental and social experience. The paper concludes that there is a lack of evidence for stereotyped temporal patterning of chirp time series, as well as of sender-receiver chirp transitions beyond the known increase in chirp frequency during an interaction.

      These conclusions by themselves will be very useful to the field. They will also allow scientists working on other "communication" systems to perhaps reconsider and expand the goals of the probes used in those senses. A lot of data are summarized in this paper, with thorough referencing to past work.

      The alternative hypotheses that arise from the work are that chirps are mainly used as environmental probes for better beat detection and processing and object localization, and in this sense are self-directed signals. This led to their prediction that environmental complexity ("clutter") should increase chirp rate, which in fact was revealed by their new experiments. The authors also argue that waveform EODs have less power across high spatial frequencies compared to pulse-type fish, with a resulting relatively impoverished power of resolution. Chirping in wave-type fish could temporarily compensate for the lower frequency resolution while still being able to resolve EOD perturbations with a good temporal definition (which pulse-type fish lack due to low pulse rates).

      The authors also advance the interesting idea that the sinusoidal frequency modulations caused by chirps are the electric fish's solution to the minute (and undetectable by neural wetware) echo-delays available to it, due to the propagation of electric fields at the speed of light in water. The paper provides a number of experimental avenues to pursue in order to validate the non-communication role of chirps.

      We thank the reviewer for the kind assessment.

      Weaknesses:

      My main criticism is that the alternative putative role for chirps as probe signals that optimize beat detection could be better developed. The paper could be clearer as to what that means precisely, especially since beating - and therefore detection of some aspects of beating due to the proximity of a conspecific - most often precedes chirping. One meaning the authors suggest, tentatively, is that the chirps could enhance electrosensory responses to the beat, for example by causing beat phase shifts that remediate blind spots in the electric field of view.

      We agree with the Reviewer that a better and more detailed explanation of how beat processing for conspecific electrolocation may be positively affected by chirps would be important to provide. We are currently working on a follow-up manuscript in which we intend to include these aspects. For space limitations and readability we had to discard from the current manuscript a lot of results that could further clarify these issues.

      A second criticism is that the study links the beat detection to underwater object localization. The paper does not significantly develop that line of thought given their data - the authors tread carefully here given the speculative aspect of this link. It is certainly possible that the image on the fish's body of an object in the environment will be slightly modified by introducing a chirp on the waveform, as this may enhance certain heterogeneities of the object in relation to its environment. The thrust of this argument derives mainly from the notion of Fourier analysis with pulse type fish EOD waveforms (see above, and radar theory more generally), where higher temporal frequencies in the beat waveform induced by the chirp will enable a better spatial resolution of objects. It remains to be seen whether experiments can show this to be significant.

      Perhaps the Reviewer refers to the last discussion paragraph before the conclusions in which we mention the performance of pulse or wave-type EODs in electrolocation (referring here to ideas illustrated in a recent review by Crampton, 2019). We added to this paragraph a statement which could better clarify that we do not propose that chirping could enhance object electrolocation. What we mean is that, in a context in which object electrolocation occurs through wave-type EODs - given the generally lower performance of such narrow-band signals in resolving the spatial features of any object, even a 3D electric field  - chirping could improve beat detection during social encounters by increasing the amount of information obtained by the fish.

      The edited paragraph now reads: “While broadband pulse signals may be useful to capture highly complex environments rich in foliage, roots and other structures common in vegetation featuring the more superficial habitats in which pulse-type fish live, wave-type EODs may be a better choice in the relatively simpler river-bed environments in which many wave-type fish live (e.g., the benthic zone of deep river channels; Crampton, 2019). In this case, achieving a good spatial resolution is critical during social encounters, especially considering the limited utility of visual cues in these low-light conditions. In such habitats, social encounters may “electrically” be less “abrupt”, but spatially less “conspicuous” or blurred (as a 3D electric field may be). In such a scenario, chirps could serve as a means to supplement the spatial information acquired via the beat, accentuating these cues during periods of reduced resolution.”

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      None, my points in the original review have been properly addressed in this resubmission.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This manuscript presented a useful toolkit designed for CyTOF data analysis, which integrates 5 key steps as an analytical framework. A semi-supervised clustering tool was developed, and its performance was tested in multiple independent datasets. The tool was compared to human experts as well as supervised and unsupervised methods. 

      Strengths: 

      The study employed multiple independent datasets to test the pipeline. A new semi-supervised clustering method was developed. 

      Weaknesses: 

      The examination of the whole pipeline is incomplete. Lack of descriptions or justifications for some analyses. 

      We thank the reviewer for the overall summary and comments on this manuscript. In the last part of the results, we showcased the functionalities of ImmCellTyper on the covid dataset, including quality checks, BinaryClust clustering, cell abundance quantification, comparison of state marker expression within each identified cell type, cell population extraction, subpopulation discovery using unsupervised methods, and data visualization. We added more descriptions in the text based on the reviewer’s suggestions.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors have developed marker selection and k-means (k=2) based binary clustering algorithm for the first-level supervised clustering of the CyTOF dataset. They built a seamless pipeline that offers the multiple functionalities required for CyTOF data analysis. 

      Strengths: 

      The strength of the study is the potential use of the pipeline for the CyTOF community as a wrapper for multiple functions required for the analysis. The concept of the first line of binary clustering with known markers can be practically powerful. 

      Weaknesses: 

      The weakness of the study is that there's little conceptual novelty in the algorithms suggested from the study and the benchmarking is done in limited conditions. 

      We thank the reviewer for the overall summary and comments on this manuscript. While the concept of binary clustering by k-means is not novel, BinaryClust applies it only to individual markers to identify positive and negative cells, then combines the result with the pre-defined matrix for cell type identification. This has not been introduced elsewhere. Furthermore, ImmCellTyper streamlines the entire analysis process and enhances data exploration on multiple levels. For instance, users can evaluate functional marker expression level/cellular abundance across both main cell types and subpopulations. In addition, this computational framework leverages the advantages of both semi-supervised and unsupervised clustering methods to facilitate subpopulation discovery. We believe these contributions warrant consideration as advancements in the field.

      As for the benchmarking, we limited the depth to main cell types rather than subpopulations, because we only apply BinaryClust to identify main cell types. For cell subset discovery, the unsupervised methods integrated in this pipeline have already been published and are widely used by the research community. Additional benchmarking therefore does not seem necessary.

      Reviewer #3 (Public Review): 

      Summary: 

      ImmCellTyper is a new toolkit for Cytometry by time-of-flight data analysis. It includes BinaryClust, a semi-supervised clustering tool (which takes into account prior biological knowledge), designed for automated classification and annotation of specific cell types and subpopulations. ImmCellTyper also integrates a variety of tools to perform data quality analysis, batch effect correction, dimension reduction, unsupervised clustering, and differential analysis. 

      Strengths: 

      The proposed algorithm takes into account the prior knowledge. 

      The results on different benchmarks indicate competitive or better performance (in terms of accuracy and speed) depending on the method. 

      Weaknesses: 

      The proposed algorithm considers only CyTOF markers with binary distribution. 

      We thank the reviewer for the overall summary and comments on this manuscript. Binary classification can be considered an imitation of the human gating strategy, as it is applied to each marker. For example, when characterizing CD8 T cells, we aim for the CD19-CD14-CD3+CD4- population, which is binary in nature (either positive or negative) and follows the same logic as the method (BinaryClust) we developed. Results indicated that it works very well for well-defined main cell lineages, particularly when the expression of the defining marker is not continuous. The limitation lies in subpopulation identification, because a handful of markers behave in a continuous manner. We therefore suggest applying an unsupervised method after BinaryClust, which brings the additional advantage of identifying unknown subsets beyond our current knowledge, something none of the semi-supervised tools can achieve. To address the reviewer’s concern, we considered the limitation of binary distribution, but it does not profoundly affect the application of the pipeline.
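      The per-marker logic described above can be sketched as follows. This is a simplified illustration written in Python, not the BinaryClust implementation: the function names, the rule table format, and the example expression values are all assumptions. Only the CD8 T cell definition (CD19-CD14-CD3+CD4-) is taken from the response; the other two rules are added for illustration.

```python
def two_means_1d(values, n_iter=50):
    """Split 1-D values into low/high groups with k-means (k=2).

    Returns one boolean per value (True = marker-positive).
    """
    lo, hi = min(values), max(values)  # initial centroids
    for _ in range(n_iter):
        threshold = (lo + hi) / 2.0  # nearest-centroid boundary in 1-D
        low = [v for v in values if v <= threshold]
        high = [v for v in values if v > threshold]
        if not low or not high:
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    threshold = (lo + hi) / 2.0
    return [v > threshold for v in values]

# Hypothetical classification table: "+" = must be positive, "-" = must be negative.
# Markers not listed for a cell type are ignored for that type.
CELL_TYPES = {
    "CD8 T cells": {"CD19": "-", "CD14": "-", "CD3": "+", "CD4": "-"},
    "CD4 T cells": {"CD19": "-", "CD14": "-", "CD3": "+", "CD4": "+"},
    "B cells": {"CD19": "+", "CD14": "-", "CD3": "-"},
}

def classify(cells, cell_types):
    """Binarize each marker across all cells, then match cells against the table."""
    markers = sorted({m for rule in cell_types.values() for m in rule})
    positive = {m: two_means_1d([cell[m] for cell in cells]) for m in markers}
    labels = []
    for i in range(len(cells)):
        label = "unclassified"
        for name, rule in cell_types.items():
            if all(positive[m][i] == (sign == "+") for m, sign in rule.items()):
                label = name
                break
        labels.append(label)
    return labels

# Hypothetical transformed expression values for four cells.
cells = [
    {"CD19": 0.1, "CD14": 0.2, "CD3": 4.0, "CD4": 0.3},
    {"CD19": 0.2, "CD14": 0.1, "CD3": 3.8, "CD4": 3.5},
    {"CD19": 4.2, "CD14": 0.3, "CD3": 0.2, "CD4": 0.1},
    {"CD19": 0.1, "CD14": 3.9, "CD3": 0.1, "CD4": 1.5},
]
labels = classify(cells, CELL_TYPES)
```

      Cells that match no rule are left unclassified in this sketch. A marker expressed along a continuum has no natural two-group split, which is the limitation acknowledged above.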

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Many thanks for the reviewer’s comments and suggestions. Please see the point-by-point response below:

      (1) The style of in-text reference citation is not consistent. Many do not have published years.

      The style of the reference citation has been revised and improved.  

      (2) The font size in the table of Figure 1 is too small, so is Figure 2. 

      The font size has been increased.

      (3) Is flowSOM used as part of BinaryClust? How should the variable running speed of BinaryClust be interpreted, given that it is occasionally slower and sometimes faster than flowSOM in the datasets?

      To answer the reviewer’s question, flowSOM is not a part of BinaryClust. They are separate clustering methods that have been incorporated into the ImmCellTyper pipeline. As described in Figure 1, BinaryClust, a semi-supervised method, is used to classify the main cell lineages, while flowSOM, an unsupervised method, is recommended here for further subpopulation discovery. They therefore operate independently of each other. To avoid confusion, we slightly modified Figure 1 for clarification.

      Regarding the variability in running speed in Figure 4, the performance of algorithms can indeed be influenced by the characteristics of the datasets, such as size and complexity. Differences between the covid dataset and the MPN dataset, such as marker panel, experimental protocol, and data acquisition process, could account for this variation. Our explanation is that flowSOM is better suited to the data structure of the covid dataset, which might be why that dataset is slightly faster to analyse than the MPN dataset. Moreover, for the covid dataset, the runtime for both BinaryClust and flowSOM is less than 100 s, and the difference is not notable.

      (4) In the Method section ImmCellTyper workflow overview, it is difficult to link the description of the pipeline to Figure 8. There are two sub-pipelines in the text and seven steps in the figure. What are their relations? Some steps are not introduced in the text, such as Data transformation and SCE object construction. What is co-factor 5?

      Figure 8 provides an overview of the entire workflow for CyTOF data analysis, starting from the raw fcs file data and proceeding until downstream analysis (seven steps). But the actual implementation of the pipeline was divided into two separate sections, as outlined in the vignettes of the ImmCellTyper GitHub page (https://github.com/JingAnyaSun/ImmCellTyper/tree/main/vignettes).

      Users will initially run ‘Intro_to_batch_exam_correct’ to perform the data quality check and identify potential batch effects, followed by ‘Intro_to_data_analysis’ for data exploration. We agree with the reviewer that the Methods text for this section is a bit confusing, so we have added more description for clarification.

      In processing mass cytometry data, an arcsinh (inverse hyperbolic sine) transformation is commonly applied to handle zero values and skewed distributions, and to improve visualization as well as clustering performance. The co-factor is a parameter that scales down the data before the transformation, controlling the width of the linear region. We usually get the best results by using a co-factor of 5 for CyTOF data.
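      For readers unfamiliar with the co-factor, the transformation is commonly implemented as the inverse hyperbolic sine of the scaled signal, arcsinh(x / cofactor). The sketch below illustrates this in Python; the function name `arcsinh_transform` and the example values are assumptions, and this is not the ImmCellTyper code.

```python
import math

def arcsinh_transform(values, cofactor=5.0):
    """Apply arcsinh(x / cofactor) to each raw signal value."""
    return [math.asinh(x / cofactor) for x in values]

# Hypothetical raw ion counts for one marker.
raw = [0.0, 1.0, 5.0, 50.0, 500.0]
transformed = arcsinh_transform(raw)
```

      Values well below the co-factor are mapped almost linearly (arcsinh(0.2) is about 0.199), zero stays at zero, and values far above the co-factor are compressed roughly logarithmically. A larger co-factor therefore widens the near-linear region.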

      (5) For differential analysis, could the pipeline analyze paired/repeated samples?

      For the statistical step, ImmCellTyper supports both two-group comparison using the Mann-Whitney (Wilcoxon rank-sum) test and multiple-group comparison (n>2) using the Kruskal-Wallis test followed by post hoc analysis (pairwise Wilcoxon test or Dunn’s test), with multiple testing correction using the Benjamini-Hochberg procedure.

      The pipeline is also flexible: users can extract the raw cell frequency data and apply suitable statistical methods for testing.
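      The multiple testing step mentioned above can be illustrated with a short, self-contained sketch of the Benjamini-Hochberg adjustment. The function name and the example p-values are assumptions, and the rank-based tests themselves are left to a statistics library.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is
    # no larger than those of higher-ranked p-values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values, e.g. one per cell type compared between two groups.
p_values = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(p_values)
```

      Each adjusted value is then compared with the chosen false discovery rate threshold.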

      (6) In Figure 2A, the range of the two axes is different for Dendritic cells, which could be misleading. Why the agreement is bad for dendritic cells?

      The range of each axis is automatically adapted to the data, which explains why the two may not be equal. The correlation coefficient for DCs is 0.958; relative to the other cell types (> 0.99) it is lower, but it does not indicate poor agreement.

      Moreover, DCs are much less abundant than other cell types, comprising approximately 2-5% of all cells. As a result, even small differences in abundance may appear as significant variations. For example, a difference of 1% in DC abundance represents a 2-fold change, which can be perceived as substantial.

      Overall, while the agreement for DCs may appear comparatively lower, it is not necessarily indicative of poor performance, considering both the correlation coefficient and the low abundance of DCs relative to other cell types.

      (7) In the Results section BinaryClust achieves high accuracy, what method was used to get the p-value, such as lines 212, 213, etc.?

      The accuracy of BinaryClust was tested using the F-measure and ARI against ground truth (manual gating); the detailed description and calculation can be found in the Methods. For lines 212 and 213, the p-value was calculated using ANOVA for the interaction plot shown in Figure 3. We have now added the statistical information to the figure legend.
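      For reference, the two metrics named above can be computed from per-cell labels as in the following sketch. It uses the standard definitions of the F-measure (harmonic mean of precision and recall) and the adjusted Rand index; the exact calculation used in the manuscript is described in its Methods and may differ in detail. The function names and the toy labels are assumptions.

```python
from collections import Counter
from math import comb

def f_measure(truth, predicted, cell_type):
    """F-measure for one cell type: harmonic mean of precision and recall."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == cell_type and p == cell_type for t, p in pairs)
    fp = sum(t != cell_type and p == cell_type for t, p in pairs)
    fn = sum(t == cell_type and p != cell_type for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def adjusted_rand_index(truth, predicted):
    """Adjusted Rand index between two labelings (requires at least two cells)."""
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, predicted)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(predicted).values())
    expected = sum_rows * sum_cols / comb(len(truth), 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings put all cells in one group
    return (sum_cells - expected) / (max_index - expected)

# Hypothetical labels: manual gating (truth) versus an automated method.
truth = ["T", "T", "T", "B", "B", "DC"]
predicted = ["T", "T", "B", "B", "B", "DC"]
f_t = f_measure(truth, predicted, "T")
ari = adjusted_rand_index(truth, predicted)
```

      The F-measure is reported per cell type, whereas the ARI summarizes agreement across all cell types in a single value.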

      (8) The performance comparison between BinaryClust and LDA is close. The current comparison design looks unfair. Given LDA only trained using half data, LDA may outperform BinaryClust.

      It is true that LDA was trained using half of the data. This is because the method requires manual gating results as a training dataset to build a model, which is then applied to the remaining files to label cell types. Here we used 50% of the whole dataset as the training set. We are of course very happy to implement any suggestions for a better partition ratio.

      (9) There are 5 key steps in the proposed workflow. However, not every step was presented in the Results.

      Thanks for the comments. The results primarily focused on demonstrating the precision and performance of BinaryClust in comparison with ground truth and existing tools. Additionally, a case study showcasing the application and functions of the entire pipeline in a dataset was presented. Due to space limitations, the implementation details of the pipeline were described in the Methods section and the GitHub documentation, which users and readers can easily access.

      Reviewer #2 (Recommendations For The Authors): 

      The tools suggested by the authors could be potentially useful to the community. However, it's difficult to understand the conceptual novelty of the algorithms suggested here. The concept of binary clustering has been described before (https://doi.org/10.1186/s12859-022-05085-z, https://doi.org/10.1152/ajplung.00104.2022), and it mainly utilizes k-means clustering set to generate binary clusters based on selected markers. Other algorithms associated with the package are taken from other studies.

      We acknowledge the reviewer’s comment regarding the novelty of our method. While the concept of binary clustering by k-means has been previously described for transcriptome data, our approach applies it to CyTOF data analysis, which has not been introduced elsewhere. Furthermore, ImmCellTyper streamlines the entire analysis process and enhances data exploration on multiple levels. For instance, users can evaluate functional marker expression level/cellular abundance across both main cell types and subpopulations. Also, as stated in the manuscript, this computational framework leverages the advantages of both semi-supervised and unsupervised clustering methods to facilitate subpopulation discovery. We believe these contributions warrant consideration as advancements in the field.

      In addition, the benchmarking of clustering performance, especially to reproduce manual gating and comparison to tools such as flowSOM is not comprehensive enough. The result for the benchmarking test could significantly vary depending on how the authors set the ground truth (resolution of cell type annotations). The authors should compare the tool's performance by changing the depth of cell type annotations. Especially, the low abundance cell types such as gdT cells or DCs were not effectively captured by the suggested methods. 

      Thanks for the comment. We appreciate the reviewer’s concern. However, as illustrated in figure 1, our approach uses BinaryClust, a semi-supervised method, to identify main cell types rather than directly targeting subpopulations. The reason is because semi-supervised method relies on users’ prior definition thus is limited to discover novel subsets. In the ImmCellTyper framework, unsupervised method was subsequently applied for subset exploration following the BinaryClust step.

      Regarding benchmarking, we focused on testing the precision of BinaryClust for main cell type characterization, because it is what the method is used for in the pipeline, and we believe this is sufficient. As for the cell subsets discovery, the unsupervised methods we integrated has already been published and widely used by the research community. Therefore, it does not seem to be necessary for additional benchmarking.

      Moreover, as shown in Figure 3 and Table 1, our results indicate that the F-measures for DCs and gdT cells in BinaryClust are 0.80 and 0.92, respectively, which are very close to the ground truth and outperform flowSOM, demonstrating its effectiveness.
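
      For readers unfamiliar with the metric, the F-measure for one cell type is the harmonic mean of precision and recall of the automated labels against the manual-gating labels. The following is a minimal sketch, not the authors' code: the function name `f_measure` and the toy labels are hypothetical and serve only to illustrate the calculation.

```python
def f_measure(true_labels, predicted_labels, cell_type):
    """F-measure (harmonic mean of precision and recall) for one cell type."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == cell_type and p == cell_type for t, p in pairs)  # true positives
    fp = sum(t != cell_type and p == cell_type for t, p in pairs)  # false positives
    fn = sum(t == cell_type and p != cell_type for t, p in pairs)  # false negatives
    if tp == 0:
        return 0.0  # no correct calls: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels: manual gating (ground truth) vs. automated calls.
manual = ["DC", "DC", "DC", "DC", "T", "T", "T", "T", "T", "T"]
predicted = ["DC", "DC", "DC", "T", "T", "T", "T", "T", "T", "DC"]

for cell_type in ("DC", "T"):
    print(cell_type, round(f_measure(manual, predicted, cell_type), 3))
```

      Because the score is computed per cell type, a rare population such as DCs is evaluated on its own cells and is not swamped by abundant lineages.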

      We hope these clarifications address the reviewer’s concern.

      Minor comments: 

      (1) In Figure 4, it's perplexing to note that BinaryClust shows the slowest runtime for the COVID dataset, compared to the MPN dataset, which features a similar number of cells. What causes this variation? Is it dependent on the number of markers utilized for the clustering? This should be clarified/tested. 

      Thanks for the comment, but we are not sure that we fully understand the question. As shown in Figure 4, BinaryClust has a slightly higher runtime on the MPN dataset than on the COVID dataset, which is reasonable because the MPN dataset contains around 1.6 million more cells than the COVID dataset.

      (2) Some typos are noted: 

      - DeepCyTOF and LDA use a maker expression matrix extracted → "marker"?* 

      Corrected.

      - Datasets(Chevrier et al.)which → spacing* 

      Corrected.

      - This is due to the method's reliance → spacing*

      Corrected.

      Reviewer #3 (Recommendations For The Authors): 

      Is it possible to accommodate more than two levels within the clustering process, i.e., can the proposed semi-supervised clustering tool be extended to multi-levels instead of binary?

      Thanks for the comments. Binary classification can be considered an imitation of the human gating strategy, as it is applied to each marker. For example, when characterizing CD8 T cells, we aim for the CD19-CD14-CD3+CD4- population, which is binary in nature (either positive or negative) and follows the same logic as the method (BinaryClust) we developed. Results indicated that it works very well for well-defined main cell lineages. However, its limitation lies in subpopulation identification, because a handful of markers behave in a continuous manner. We would therefore suggest applying an unsupervised method after BinaryClust, which brings the additional advantage of identifying unknown subsets beyond our current knowledge, something none of the semi-supervised tools can achieve. To answer the reviewer’s question, it is possible to set the number of levels to 3, 4, or 5 rather than just 2, but considering the design and rationale of the entire framework (as described in the manuscript and above), it does not seem to be necessary.
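
      The per-marker binary logic described above can be illustrated with a short sketch. This is not the BinaryClust implementation: it assumes a plain one-dimensional 2-means split per marker, and the helper names (`two_means_threshold`, `gate`) and the expression values are hypothetical.

```python
def two_means_threshold(values, n_iter=50):
    """Split one marker into low/high with 1-D k-means (k=2).

    Returns the midpoint between the two cluster centres.
    """
    lo, hi = min(values), max(values)  # initial centres
    for _ in range(n_iter):
        cut = (lo + hi) / 2
        low = [v for v in values if v <= cut]
        high = [v for v in values if v > cut]
        if not low or not high:
            break  # all values identical: nothing to split
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # converged
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


def gate(cells, phenotype):
    """Return one boolean per cell: does it match the +/- phenotype?"""
    thresholds = {
        marker: two_means_threshold([cell[marker] for cell in cells])
        for marker in phenotype
    }
    return [
        all((cell[m] > thresholds[m]) == (sign == "+") for m, sign in phenotype.items())
        for cell in cells
    ]


# Hypothetical transformed expression values for four cells.
cells = [
    {"CD19": 0.1, "CD14": 0.2, "CD3": 5.0, "CD4": 0.3},  # CD8 T-like
    {"CD19": 0.2, "CD14": 0.1, "CD3": 4.8, "CD4": 4.9},  # CD4 T-like
    {"CD19": 5.1, "CD14": 0.2, "CD3": 0.1, "CD4": 0.2},  # B-like
    {"CD19": 0.1, "CD14": 4.7, "CD3": 0.2, "CD4": 0.4},  # monocyte-like
]
cd8_phenotype = {"CD19": "-", "CD14": "-", "CD3": "+", "CD4": "-"}
print(gate(cells, cd8_phenotype))
```

      Only the first cell satisfies all four positive/negative conditions. Markers that vary along a continuum do not separate cleanly under such a split, which is the situation in which the response recommends following up with an unsupervised method.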

      Could you please comment on why on the COVID dataset, BinaryClust was slower as compared to flowSOM?

      Thanks for the question. The performance of algorithms can indeed be affected by the characteristics of the datasets, such as their size and complexity. The COVID and MPN datasets differ in various aspects, including marker panel, experimental protocol, and data acquisition process, among others, which would account for the observed variation in speed. Our explanation is that flowSOM is better suited to the structure of the COVID dataset than to that of the MPN dataset. Additionally, for the COVID dataset, both BinaryClust and flowSOM have runtimes of less than 100 s, and the difference between the two is not particularly dramatic.

      Minor errors: 

      Line#215 "(ref) " reference is missing

      Added.

      Figure 3, increase the font of the text in order to improve readability. 

      Increased.

      Line#229 didn't --> did not. 

      Corrected

      Line#293 repetition of the reference. 

      The repetition is due to the format of the citation, which has been revised.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):  

      Summary: 

      In this study, Nandy and colleagues examine neural and behavioral correlates of perceptual variability in monkeys performing a visual change detection task. They used a laminar probe to record from area V4 while two macaque monkeys detected a small change in stimulus orientation that occurred at a random time in one of two locations, focusing their analysis on stimulus conditions where the animal was equally likely to detect (hit) or not-detect (miss) a briefly presented orientation change (target). They discovered two behavioral measures that are significantly different between hit and miss trials - pupil size tends to be slightly larger on hits vs. misses, and monkeys are more likely to miss the target on trials in which they made a microsaccade shortly before target onset. They also examined multiple measures of neural activity across the cortical layers and found some measures that are significantly different between hits and misses. 

      Strengths: 

      Overall the study is well executed and the analyses are appropriate (though multiple issues do need to be addressed). 

      We thank the reviewer for their enthusiasm and their constructive comments which we address below.

      Weaknesses: 

      My main concern with this study is that with the exception of the pre-target microsaccades, the physiological and behavioral correlates of perceptual variability (differences between hits and misses) appear to be very weak and disconnected. Some of these measures rely on complex analyses that are not hypothesis-driven and where statistical significance is difficult to assess. The more intuitive analysis of the predictive power of trial outcomes based on the behavioral and neural measures is only discussed at the end of the paper. This analysis shows that some of the significant measures have no predictive power, while others cannot be examined using the predictive power analysis because these measures cannot be estimated in single trials. Given these weak and disconnected effects, my overall sense is that the current results do not significantly advance our understanding of the neural basis of perceptual variability. 

      Reviewer #1 (Recommendations For The Authors): 

      (1) Most of the effects are very small. For example, the difference in pupil size between hits and misses is ~0.08 z-score units. The differences in firing rates between hits and misses are in the order of 1-2% of normalized firing rates. While these effects may be significant, their contribution to perceptual variability could be negligible, as suggested by the analysis of predictive power at the end of the result section. On a related note, it would be useful to mention the analysis of predictive power earlier in the paper. The finding that some of the measures do not have significant predictive power w/r to behavioral outcome raises questions regarding their importance. Finally, it would strengthen the paper if the authors could come up with methods to assess the predictive power of the PPC and interlaminar SSC. Without such analyses, it is difficult to assess the importance of these measures. 

      We expect that relatively small differences in early to intermediate sensory areas could cumulatively result in large differences in higher areas and contribute to the binary distinction between hits and misses. We certainly do not claim that these results completely explain state-dependent differences that determine the outcome of these trials. Instead, we have focused on neural signatures at the level of the V4 columnar microcircuit that might ultimately contribute to the variability in perception.

      We would like to emphasize that, based on the reviewer’s recommendation, we have now analyzed our results separately for each animal (see below). The consistency and significance of our findings across both animals give us confidence that what we have reported here are important neural signatures underlying perceptual variability at threshold.

      We would also like to note that SSC and PPC are now part of the standard toolkit of systems neuroscience and have been employed in numerous studies to our knowledge. While all measures come with their set of caveats and limitations, these two measures provide a frequency-resolved metric of the relationship between two temporal processes (point or continuous), which we believe provide insights into the interlaminar flow of information that we report here.

      Unfortunately, limitations in the GLM method and the reliability of these analyses with limited data make it impossible for these two measures to be included. The GLM requires all variables to be defined for each trial in the input. SSC and PPC can be undefined at low firing rates and require a substantial amount of data to be reliably calculated. While we did consider imputing data or estimating SSC and PPC using multiple trials, we ultimately did not pursue this idea as the purpose of the GLM was to use simultaneous measurements from single trials. 

      (2) What is the actual predictive power of the GLM model (i.e., what is the accuracy of predicting whether a given held-out trial will lead to a hit or a miss)? How much of this predictive power is accounted for by the effect of microsaccades? 

      As the GLM is not a decoder, it does not classify whether a given left out trial will be a hit or a miss. However, the GLM was highly predictive compared to a constant model. This information has been added to Table 3. The deviance of the GLM with and without microsaccades as a variable was not significantly different (p >0.9).  
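
      To make the comparison with a constant model concrete, the sketch below computes the binomial deviance of a set of fitted hit probabilities and of an intercept-only model that predicts the overall hit rate on every trial. It is illustrative only: the outcomes and fitted probabilities are invented, and the authors' exact fitting and testing procedure is not described in this response.

```python
import math


def binomial_deviance(outcomes, probabilities):
    """-2 * log-likelihood of binary outcomes under predicted probabilities."""
    eps = 1e-12  # guard against log(0)
    return -2.0 * sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
        for y, p in zip(outcomes, probabilities)
    )


# Hypothetical data: 1 = hit, 0 = miss, with made-up fitted probabilities.
outcomes = [1, 1, 0, 1, 0, 0, 1, 0]
fitted = [0.9, 0.8, 0.3, 0.7, 0.2, 0.4, 0.6, 0.1]

# Constant (intercept-only) model: every trial gets the overall hit rate.
hit_rate = sum(outcomes) / len(outcomes)
null_deviance = binomial_deviance(outcomes, [hit_rate] * len(outcomes))
model_deviance = binomial_deviance(outcomes, fitted)

print("deviance reduction:", null_deviance - model_deviance)
```

      The same quantity can be compared between two nested models (for example, with and without a microsaccade regressor). A small difference in deviance indicates that the extra variable adds little explanatory power.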

      (3) The role of stimulus contrast is not explained clearly. Are all the analyses and figures restricted to a single contrast level? Was the contrast the same on both sides? If multiple contrasts are used, could contrast account for some of the observed neural-behavioral covariations? 

      All of the analyses include stimuli of all tested contrast levels. Stimulus contrasts were the same at both locations (attended and unattended). We have added a more detailed description of the contrast in hit and miss trials (Lines 289-296), reproduced here:

      “Non-target stimulus contrasts were slightly different between hits and misses (mean: 33.1% in hits, 34.0% in misses, permutation test, p = 0.02), but the contrast of the target was higher in hits compared to misses (mean: 38.7% in hits, 27.7% in misses, permutation test, p = 1.6e−31). Firing rates were normalized by contrast in Figure 3. In all other figures, we considered only non-target stimuli, which had very minor differences in contrast (<1%) across hits and misses. While we cannot completely rule out any other effects of stimulus contrast, the normalization in Figure 3 and minor differences for non-target stimuli should minimize them.”
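
      The quoted passage reports permutation tests on mean contrast. As a rough illustration of how such a test works, the sketch below shuffles group labels and asks how often the shuffled difference in means is at least as large as the observed one. The choice of test statistic (absolute difference in means), the number of permutations, and the toy values are assumptions, not details taken from the manuscript.

```python
import random


def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        perm_a, perm_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_perm + 1)


# Hypothetical contrast values (%) for a handful of hit and miss trials.
hit_contrasts = [38.0, 41.5, 36.2, 40.1, 39.4, 37.8, 42.0, 38.9]
miss_contrasts = [27.1, 29.3, 26.5, 28.8, 25.9, 27.7, 30.2, 26.4]
print(permutation_test(hit_contrasts, miss_contrasts))
```

      With this estimator the smallest reportable p-value is 1 / (n_perm + 1), so a very small p-value such as the one quoted above would need far more permutations or a different computation.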

      (4) Do the animals make false alarms (i.e., report seeing a target in non-target epochs)? If not, then it is not clear that the animals are performing near their perceptual threshold. If the false-alarm rate is non-zero, it should be reported and analyzed for neural/behavioral correlates. Does the logistic regression fit allow for a false alarm rate? More generally, it would be useful to see a summary of behavioral performance, such as distribution of thresholds, lower and upper asymptotes, and detection rates on foil trials vs. matched target trials. 

      The logistic regression does allow for a false alarm rate. We have reported additional behavioral parameters in Figure 1-figure supplement 3A-G.  

      (5) As far as I can tell, all the analyses in the paper are done on data combined across the two animals. Given that these effects are weak and that the analyses are complex, it is important to demonstrate for each analysis/figure that the results hold for each animal separately before combining the data across animals. This can be done in supplementary figures. 

      We have updated the paper to include all main results plotted separately for each animal as supplementary figures. 

      - Figure 2-figure supplement 2

      - Figure 3-figure supplement 1

      - Figure 3-figure supplement 2

      - Figure 4-figure supplement 1

      - Figure 5-figure supplement 2

      - Figure 7-figure supplement 1

      All the results except for the canonical correlation analysis were present, consistent, and significant when we analyzed them in each monkey independently.

      (6) The selection of the temporal interval used for the various analyses appears somewhat post hoc and is not explained clearly. Some analyses are restricted to the period immediately before or during target onset (e.g., 400 ms before target onset for analysis of the effect of microsaccade, 60 ms before stimulus onset for the analysis of the effect of neural variability). Other analyses are done on non-target rather than target stimuli. What is the justification for selecting these particular periods for these analyses? The differences in firing rates between hits and misses are restricted to the target epoch and are not present in the non-target epochs. Given these results, it seems important to compare the effects in target and non-target epochs in other analyses as well.

      Restricting the analysis of the Fano Factor to 60 ms before non-target onset seems odd. Given that the duration of the interval between stimulus presentations is random, how could this pre-stimulus effect be time-locked to target onset? 

      We selected a 200ms time window during the pre-stimulus or stimulus-evoked period for almost all our analyses. The results relating to microsaccade occurrence were robust to narrower time windows more consistent with the other pre-stimulus windows we used, but we chose to use the 400ms window to capture a larger fraction of trials with microsaccades. 

      Only the Fano factor time window was selected post hoc based on the traces in Figure 4A, and the result is robust across animals (new Figure 4-figure supplement 1). The inter-stimulus intervals are random, and we do not believe the neural variability is time-locked to upcoming stimuli, but rather that lower variability in this pre-stimulus window is characteristic of hits.
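
      For reference, the Fano factor is the variance of the spike count across trials divided by its mean, computed in a fixed window. The sketch below uses the 60 ms pre-stimulus window discussed above. The function name, the use of the sample variance, and the spike times are illustrative assumptions, not the authors' analysis code.

```python
from statistics import mean, variance


def fano_factor(spike_times_per_trial, window):
    """Variance-to-mean ratio of spike counts within a time window.

    spike_times_per_trial: one list of spike times (ms, relative to
    stimulus onset) per trial. window: (start, stop) in ms, stop excluded.
    """
    start, stop = window
    counts = [
        sum(start <= t < stop for t in trial) for trial in spike_times_per_trial
    ]
    mean_count = mean(counts)
    if mean_count == 0:
        return float("nan")  # undefined when no spikes fall in the window
    return variance(counts) / mean_count  # sample variance across trials


# Hypothetical spike times for four trials.
trials = [
    [-50, -20, 10],
    [-55, -30, -5],
    [-10],
    [-40, -15, 30],
]
print(fano_factor(trials, window=(-60, 0)))
```

      Computing this ratio separately for hit and miss trials, and repeating it in sliding windows, yields traces of the kind from which a pre-stimulus window can be selected.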

      We believe that the consistency of our results across both animals provides further evidence that our time window selection was appropriate. 

      We are interested in the extent to which these effects would remain consistent when applied only to target stimuli. However, restricting our analyses to only target stimuli substantially reduces the amount of neural data available for analysis. We plan to explore target stimulus representation more thoroughly in future studies.   

      (7) Can the measured neural response be used to discriminate between target and non-target stimuli? If so, is the discriminability between target and non-target higher in hits vs. misses? 

      Thank you for raising this interesting point. We performed this analysis and found that target stimuli are more discriminable from non-targets in hits compared to misses. This has been added as a new Figure 3A.

      (8) How many trials were performed per session? Did miss probability tend to increase over time over the session? If so, could this slow change in hit probability account for some of the observed neural and behavioral correlations with perceptual decisions? 

      Monkeys initiated a median of 905 trials (range of 651 to 1086). This has been added to the manuscript (Line 106). Approximately 1/8 of those trials were at perceptual threshold. Hit probability at threshold does not change substantially over the course of the session. We now report this in new Figure 1-figure supplement 3I (error bars show standard deviation). 

      (9) Did miss probability depend on the time of the change within the trial? If so, do any of the behavioral/neural metrics share a similar within-trial time course? 

      Change times were not significantly different across hit and miss trials (p=0.15, Wilcoxon rank sum test). We now report this in new Figure 1-figure supplement 3H.

      (10) "Deep layer neurons exhibit reduced low-frequency phase-locking in hit trials than in misses (Figure 5B), suggesting an improvement in pooled signal-to-noise among this neural population." - why does this metric suggest improved SNR? Is there any evidence for improved SNR in the data? Why just in deep layers? 

      Thank you for raising this question. We agree this statement is not fully supported by the data and have removed it.  

      (11) I may have missed this but what were the sizes of the Gabor stimuli? 

      This has been added to the methods section (Line 454). The Gaussian halfwidth was 2 degrees.  

      Reviewer #2 (Public Review):  

      In this manuscript, the authors conducted a study in which they measured eye movements, pupil diameter, and neural activity in V4 in monkeys engaged in a visual attention task. The task required the monkeys to report changes in the orientation of Gabors' visual stimuli. The authors manipulated the difficulty of the trials by varying the degree of orientation change and focused their analysis on trials of intermediate difficulty where the monkeys' hit rate was approximately 50%. Their key findings include the following: 1) Hit trials were preceded by larger pupil diameter, reflecting higher arousal, and by more stable eye positions; 2) V4 neurons exhibit larger visual responses in hit trials; 3) Superficial and deep layers exhibited greater coherence in hit trials during both the pre-target stimulus period and the non-target stimulus presentation period. These findings have useful implications for the field, and the experiments and analyses presented in this manuscript validly support the authors' claims. 

      Strengths: 

      The experiments were well-designed and executed with meticulous control. The analyses of both behavioural and electrophysiological data align with the standards in the field. 

      We thank the reviewer for their enthusiasm about our study and their constructive comments which we address below.

      Weaknesses: 

      Many of the findings appear to be incremental compared to previous literature, including the authors' own work. While incremental findings are not necessarily a problem, the manuscript lacks clear statements about the extent to which the dataset, analysis, and findings overlap with the authors' prior research. For example, one of the main findings, which suggests that V4 neurons exhibit larger visual responses in hit trials (as shown in Fig. 3), appears to have been previously reported in their 2017 paper. Additionally, it seems that the entire Fig1-S1 may have been reused from the 2017 paper. These overlaps should have been explicitly acknowledged and correctly referenced. 

      While the raw data used in this paper overlaps entirely with Nandy et al. (2017), all the analyses and findings in this manuscript are new and have not been previously reported. Figure 1-figure supplement 1 is modified and reproduced from that paper only to allow readers to understand the recording methods used to collect the data without needing to go back to the previous paper. We have added an explicit acknowledgment of this to the figure caption.

      Previous studies have demonstrated that attention leads to decorrelation in V4 population activity. The authors should have discussed how and why the high coherence across layers observed in the current study can coexist with this decorrelation. 

      We have updated the discussion section (Lines 347-351) to further elaborate on this interpretation. 

      Furthermore, the manuscript does not explore potentially interesting aspects of the dataset. For instance, the authors could have investigated instances where monkeys made 'false' reports, such as executing saccades towards visual stimuli when no orientation change occurred. It would be valuable to provide the fraction of the monkeys' responses in a session, including false reports and correct rejections in catch trials, to allow for a broader analysis that considers the perceptual component of neural activity over pure sensory responses. 

      We appreciate this feedback. While we agree these are interesting directions, we decided to limit the scope of this study to only focus on trials at threshold with an orientation change, and are considering these directions for future studies. 

      Reviewer #2 (Recommendations For The Authors): 

      • Figure Design: Since eLife does not impose space limitations, it is advisable for the authors to avoid using very small font sizes. Consistency in font size throughout the figures is recommended. Some figures are challenging to discern, for example, the mean+-sem in Fig. 2B, and the alpha values of green and purple colours for superficial/deep layers are too high, making them too transparent or pale. 

      We have increased the size of some small fonts and improved font size consistency throughout the figures. We have changed the layer colors to improve legibility. 

      • Line 119: trail, 

      This has been fixed.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The present study provides a phylogenetic analysis of the size of prefrontal areas in primates, aiming to investigate whether the relative size of the rostral prefrontal cortex (frontal pole) and dorsolateral prefrontal cortex volume vary according to known ecological or social variables.

      I am very much in favor of the general approach taken in this study. Neuroimaging now allows us to obtain more detailed anatomical data in a much larger range of species than ever before and this study shows the questions that can be asked using these types of data. In general, the study is conducted with care, focusing on anatomical precision in definition of the cortical areas and using appropriate statistical techniques, such as PGLS.

      I have read the revised version of the manuscript with interest. I agree with the authors that a focus on ecological vs laboratory variables is a good one, although it might have been useful to reflect that in the title.

      I am happy to see that the authors included additional analyses using different definitions of FP and DLPFC in the supplementary material. As I said in my earlier review, the precise delineation of the areas will always be an issue of debate in studies like this, so showing the effects of different decisions is vital.

      We thank the reviewer for these positive remarks and for these very useful suggestions on the previous version of this article.

      I am sorry the authors are so dismissive of the idea of looking at models where brain size and area size are directly compared in the model, rather preferring to run separate models on brain size and area size. This seems to me a sensible suggestion.

      We agree with reviewer 1, and the response of reviewer 3 also made it clear to us why this was an important issue. We have therefore addressed it more thoroughly this time.

      First, we have added a new analysis, with whole brain volume included as covariate in the model accounting for regional volumes, together with the socio-ecological variables of interest. As expected given the very strong correlation across all brain measures (>90%), the effects of all socio-ecological factors disappear for both FP and DLPFC volumes when ‘whole brain’ is included as covariate. This is coherent with our previous analysis showing that the same combination of socio-ecological variables could account for the volume of FP, DLPFC and the whole brain. Nevertheless, the interpretation of these results remains difficult, because of the hidden assumptions underlying the analysis (see below).

      Second, we have clarified the theoretical reasons that made us choose absolute vs relative measures of brain volumes. In short, we understand the notion of specificity associated with relative measures, but 1) the interpretation of relative measures is confusing and 2) we have alternative ways to evaluate the specificity of the effects (which are complementary to the idea of adding whole brain volume as covariate). 

      Our goal here was to evaluate the influence of socio-ecological factors on specific brain regions, based on their known cognitive functions in laboratory conditions (working memory for the DLPFC and metacognition for the frontal pole). Thus, the null hypothesis is that socio-ecological challenges supposed to mobilize working memory and metacognition do not affect the size of the brain regions associated with these functions (respectively DLPFC and FP). This is what our analysis is testing, and from that perspective, it seems to us that direct measures are better, because within regions (across species), volumes provide a good index of neural counts (since densities are conserved), which are indicative of the amount of computational resources available for the region. It is not the case when using relative measures, or when using the whole brain as covariate, since densities are heterogeneous across brain regions (e.g. Herculano-Houzel, 2011; 2017, but see below for further details on this).

      Quantitatively, the theoretical level of specificity of the relation between brain regions and socio-ecological factors is difficult to evaluate, given that our predictions are based on the cognitive functions associated with DLPFC and FP, namely working memory and metacognition, and that each of these cognitive functions also involves other brain regions. We would actually predict that other brain regions associated with the same cognitive functions as DLPFC or FP also show a positive influence of the same socio-ecological variables. Given that the functional mapping of cognitive functions in the brain remains debated, it is extremely difficult to evaluate quantitatively how specific the influence of the socio-ecological factors should be on DLPFC and FP compared to the rest of the brain, in the frame of our hypothesis.

      Critically, given that FP and DLPFC show a differential sensitivity to population density, a proxy for social complexity, and that this difference is in line with laboratory studies showing a stronger implication of the FP in social cognition, we believe that there is indeed some specificity in the relation between specific regions of the PFC and socio-ecological variables. Thus, our results as a whole seem to indicate that the relation between prefrontal cortex regions and socio-ecological variables shows a small but significant level of specificity. We hope that the addition of the new analysis and the corresponding modifications of the introduction and discussion section will clarify this point.

      Similarly, the debate about whether area volume and number of neurons can be equated across the regions is an important one, of which they are a bit dismissive.

      We are sorry that the reviewer found us a bit dismissive on this issue, and there may have been a misunderstanding.

      Based on the literature, it is clearly established that for a given brain region, area volume provides a good proxy for the number of neurons, and it is legitimate to generalize this relation across species if neuronal densities are conserved for the region of interest (see for example Herculano-Houzel 2011, 2017 for review). It seems to be the case across primates because cytoarchitectonic maps are conserved for FP and DLPFC, at least in humans and laboratory primates (Petrides et al, 2012; Sallet et al, 2013; Gabi et al, 2016; Amiez et al, 2019). But we make no claim about the difference in number of neurons between FP and DLPFC, and we never compared regional volumes across regions (we only compared the influence of socio-ecological factors on each regional volume), so their difference in cellular density is not relevant here. As long as the neuronal density is conserved across species but within a region (DLPFC or FP), the difference in volume for that region, across species, does provide a reliable proxy for the influence of the socio-ecological regressor of interest (across species) on the number of neurons in that region.

      Our claims are based on the strength of the relation between 1) cross-species variability in a set of socio-ecological variables and 2) cross-species variability in neural counts in each region of interest (FP or DLPFC). Since the effects of interest relate to inter-specific differences, within a region, our only assumption is that the neural densities are conserved across distinct species for a given brain region. Again (see previous paragraph), there is reasonable evidence for that in the literature. Given that assumption, regional volumes (across species, for a given brain region) provide a good proxy for the number of neurons. Thus, the influence of a given socio-ecological variable on the inter-specific differences in the volume of a single brain region provides a reliable estimate of the influence of that socio-ecological variable on the number of neurons in that region (across species), and potentially of the importance of the cognitive function associated with that region in laboratory conditions. None of our conclusions are based on direct comparison of volumes across regions, and we only compared the influence of socio-ecological factors (beta weights, after normalization of the variables).

      Note that this is yet another reason for not using relative measures and not including whole brain as covariate in the regression model: Given that whole brain and any specific region have a clear difference in density, and that this difference is probably not conserved across species, relative measures (or covariate analysis) cannot be used as proxies for neuronal counts (e.g. Herculano-Houzel, 2011). In other words, using the whole brain to rescale individual brain regions relies upon the assumption that the ratios of volumes (specific region/whole brain) are equivalent to the ratios of neural counts, which is not valid given the differences in densities.
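
      The density argument in the preceding paragraphs can be written out explicitly. The notation below is introduced here for illustration only and does not appear in the manuscript: $N$ is a neuron count, $\rho$ a neuronal density, $V$ a volume, $r$ indexes a region (FP or DLPFC), $b$ the whole brain, and $s_1$, $s_2$ two species.

```latex
% Neuron count as density times volume, for region r in species s:
N_{r,s} = \rho_{r,s}\, V_{r,s}

% Within a region, across species: if density is conserved
% (\rho_{r,s_1} = \rho_{r,s_2}), volume ratios equal neuron-count ratios.
\frac{N_{r,s_1}}{N_{r,s_2}}
  = \frac{\rho_{r,s_1}\, V_{r,s_1}}{\rho_{r,s_2}\, V_{r,s_2}}
  = \frac{V_{r,s_1}}{V_{r,s_2}}

% Region relative to whole brain, within a species: the two ratios
% differ by the density ratio, which need not be 1 nor constant across species.
\frac{N_{r,s}}{N_{b,s}}
  = \frac{\rho_{r,s}}{\rho_{b,s}} \cdot \frac{V_{r,s}}{V_{b,s}}
```

      Under these definitions, a relative volume $V_{r,s}/V_{b,s}$ tracks the relative neuron count only when $\rho_{r,s}/\rho_{b,s}$ is the same in every species, which is the assumption the response argues does not hold.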

      Nevertheless, I think this is an important study. I am happy that we are using imaging data to answer wider phylogenetic questions. Combining detailed anatomy, big data, and phylogenetic statistical frameworks is an important approach.

      We really thank the reviewer for these positive remarks, and we hope that this study will indeed stimulate others using a similar approach.

      Reviewer #2 (Public Review):

      In the manuscript entitled "Linking the evolution of two prefrontal brain regions to social and foraging challenges in primates" the authors measure the volume of the frontal pole (FP, related to metacognition) and the dorsolateral prefrontal cortex (DLPFC, related to working memory) in 16 primate species to evaluate the influence of socio-ecological factors on the size of these cortical regions. The authors select 11 socio-ecological variables and use a phylogenetic generalized least squares (PGLS) approach to evaluate the joint influence of these socio-ecological variables on the neuro-anatomical variability of FP and DLPFC across the 16 selected primate species; in this way, the authors take into account the phylogenetic relations across primate species in their attempt to discover the influence of socio-ecological variables on FP and DLPFC evolution.

      The authors run their studies on brains collected from 1920 to 1970 and preserved in formalin solution. Also, they obtained data from the Muséum national d’Histoire naturelle in Paris and from the Allen Brain Institute in California. The main findings consist in showing that the volume of the FP, the DLPFC, and the Rest of the Brain (ROB) across the 16 selected primate species is related to three socio-ecological variables: body mass, daily traveled distance, and population density. The authors conclude that metacognition and working memory are critical for foraging in primates and that FP volume is more sensitive to social constraints than DLPFC volume.

      The topic addressed in the present manuscript is relevant for understanding human brain evolution from the point of view of primate research, which, unfortunately, is a shrinking field in neuroscience. But the experimental design has two major weak points: the absence of lissencephalic primates among the selected species and the delimitation of FP and DLPFC. Also, a general theoretical and experimental frame linking evolution (phylogeny) and development (ontogeny) is lacking.

      We are sorry that the reviewer still believes that these two points are major weaknesses.

      - We have added a point on lissencephalic species in the discussion. In short, we acknowledge that our work may not be applied to lissencephalic species because they cannot be studied with our method, but on the other hand, based on laboratory data there is no evidence showing that the functional organization of the DLPFC and FP in lissencephalic primates is radically different from that of other primates (Dias et al, 1996; Roberts et al, 2007; Dureux et al, 2023; Wong et al, 2023). Therefore, there is no a priori reason to believe that not including lissencephalic primates prevents us from drawing conclusions that are valid for primates in general. Moreover, as explained in the discussion, including lissencephalic primates would require using invasive functional studies, only possible in laboratory conditions, which would not be compatible with the number of species (>15) necessary for phylogenetic studies (in particular PGLS approaches). Finally, as pointed out by the reviewer, our study is also relevant for understanding human brain evolution, and as such, including lissencephalic species should not be critical to this understanding.

      - In response to the remarks of reviewer 1 on the first version of the manuscript, we had included a new analysis in the previous version of the manuscript, to evaluate the validity of our functional maps given another set of boundaries between FP and DLPFC. But one should keep in mind that our objective here is not to provide a definitive definition of what the regions usually referred to as DLPFC and FP should be from an anatomical point of view. Rather, as our study aims at taking into account the phylogenetic relations across primate species, we chose landmarks that enable a comparison of the volume of cortex involved in metacognition (FP) and working memory (DLPFC) across species. We have also updated the discussion accordingly.

      We agree that this is a difficult point and we have always acknowledged that this was a clear limitation in our study. In the light of the functional imaging literature in humans and non-human primates, as well as the neurophysiological data in macaques, defining the functional boundary between FP and DLPFC remains a challenging issue even in very well controlled laboratory conditions. As mentioned by reviewer 1, “the precise delineation of the areas will always be an issue of debate in studies like this, so showing the effects of different decisions in vital”. Again, an additional analysis using different boundaries for FP and DLPFC was included in the supplementary material to address that issue. Now, we are not aware of solid evidence showing that the boundaries that we chose for DLPFC vs FP were wrong, and we believe that the comparison between 2 sets of measures as well as the discussion on this topic should be sufficient for the reader to assess both the strength and the limits of our conclusion. That being said, if the reviewer has any reference in mind showing better ways to delineate the functional boundary between FP and DLPFC in primates, we would be happy to include it in our manuscript.

      - The question of development, which is an important question per se, is neither part of the hypothesis nor central for the field of comparative cognition in primates. Indeed, major studies in the field do not mention development (e.g. Byrne, 2000; Kaas, 2012; Barton, 2012). De Casien et al (2022) even showed that developmental constraints are largely irrelevant (see Claim 4 of their article): [« The functional constraints hypothesis […] predicts more complex, ‘mosaic’ patterns of change at the network level, since brain structure should evolve adaptively and in response to changing environments. It also suggests that ‘concerted’ patterns of brain evolution do not represent conclusive evidence for developmental constraints, since allometric relationships between developmentally linked or unlinked brain areas may result from selection to maintain functional connectivity. This is supported by recent computational modeling work [81], which also suggests that the value of mosaic or concerted patterns may fluctuate through time in a variable environment and that developmental coupling may not be a strong evolutionary constraint. Hence, the concept of concerted evolution can be decoupled from that of developmental constraints »].

      Finally, when studies on brain evolution and cognition mention development, it is generally to discuss energetic constraints rather than developmental mechanisms per se (Heldstab et al, 2022; Smaers et al, 2021; Preuss & Wise, 2021; Dunbar & Schutz, 2017; MacLean et al, 2012; Mars et al, 2018; 2021). Therefore, development does not seem to be a critical issue, neither for our article nor for the field.

      Reviewer #3 (Public Review):

      This is an interesting manuscript that addresses a longstanding debate in evolutionary biology - whether social or ecological factors are primarily responsible for the evolution of the large human brain. To address this, the authors examine the relationship between the size of two prefrontal regions involved in metacognition and working memory (DLPFC and FP) and socioecological variables across 16 primate species. I recommend major revisions to this manuscript due to: 1) a lack of clarity surrounding model construction; and 2) an inappropriate treatment of the relative importance of different predictors (due to a lack of scaling/normalization of predictor variables prior to analysis).

      We thank the reviewer for his/her remarks, and for the clarification of his/her criticism regarding the use of relative measures. We are sorry to have missed the importance of this point in the first place. We also thank the reviewer for the cited references, which were very interesting and which we have included in the discussion. As reviewer 1 also shared these concerns, we wrote a detailed response above to explain how we addressed the issue.

      First, we did run a supplementary analysis where whole brain volume was added as covariate, together with socio-ecological variables, to account for the volume of FP or DLPFC. As expected given the very high correlation across all 3 brain measures, none of the socio-ecological variables remained significant. We have added a long paragraph in the discussion to tackle that issue. In short, we agree with the reviewer that the specificity of the effects (on a given brain region vs the rest of the brain) is a critical issue, and we acknowledge that since this is a standard in the field, it was necessary to address the issue and run this extra analysis. But we also believe that specificity could be assessed by other means: given the differential influence of ‘population density’ on FP and DLPFC, in line with laboratory data, we believe that some of the effects that we describe do show specificity. Also, we prefer absolute measures to relative measures because they provide a better estimate of the corresponding cognitive operation, since standard allometric rules (i.e., body size or whole brain scaling) may not apply to the scaling and evolution of FP and DLPFC in primates. Indeed, given that we use these measures as proxies of functions (metacognition for FP and working memory for DLPFC), it is clear that other parts of the brain should show the same effect since these functions are supported by entire networks that include not only our regions of interest but also other cortical areas in the parietal lobe. Thus, the extent to which the relation with socio-ecological variables should be stronger in regions of interest vs the whole brain depends upon the extent to which other regions are involved in the same cognitive function as our regions of interest, and this is clearly beyond the scope of this study.
      More importantly, volumetric measures are taken as proxies for the number of neurons, but this is only valid when comparing data from the same brain region (across species), but not across brain regions, since neural densities are not conserved. Thus, using relative measures (scaling with the whole brain volume) would only work if densities were conserved across brain regions, but it is not the case. From that perspective, the interpretation of absolute measures seems more straightforward, and we hope that the specificity of the effects could be evaluated using the comparison between the 3 measures (FP, DLPFC and whole brain) as well as the analysis suggested by the reviewer. We hope that the additional analysis and the updated discussion will be sufficient to cover that question, and that the reader will have all the information necessary to evaluate the level of specificity and the extent to which our findings can be interpreted.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      In my previous review of the present manuscript, I pointed out the fact that defining parts, modules, or regions of the primate cerebral cortex based on macroscopic landmarks across primate species is problematic because it prevents comparisons between gyrencephalic and lissencephalic primate species. The authors have rephrased several paragraphs in their manuscript to acknowledge that their findings do apply to gyrencephalic primates.

      I also said that "Contemporary developmental biology has showed that the selection of morphological brain features happens within severe developmental constrains. Thus, the authors need a hypothesis linking the evolutionary expansion of FP and DLPFC during development. Otherwise, the claims form the mosaic brain and modularity lack fundamental support". I insisted that the authors should clarify their concept of homology of cerebral cortex parts, modules, or regions across species (in the present manuscript, the frontal pole and the dorsolateral prefrontal cortex). Those are not trivial questions because any phylogenetic explanation of brain region expansion in contemporary phylogenetic and evolutionary biology must be rooted in evolutionary developmental biology. In this regard, the authors could have discussed their findings in the frame of contemporary studies of cerebral cortex evolution and development, but, instead, they have rejected my criticism just saying that they are "not relevant here" or "clearly beyond the scope of this paper".

      The question of development, which is an important question per se, is neither part of the hypothesis nor central for the field of comparative cognition in primates. Indeed, the major studies in the field do not mention development and some even showed that developmental constraints were not relevant (see De Casien et al., 2022 and details in our response to the public review). When studies on brain evolution and cognition mention development, it is generally to discuss energetic constraints rather than developmental mechanisms per se (Heldstab et al, 2022; Smaers et al, 2021; Preuss & Wise, 2021; Dunbar & Schutz, 2017; MacLean et al, 2012; Mars et al, 2018; 2021).

      If the other reviewers agree, the authors are free to publish in eLife their correlations in a vacuum of evolutionary developmental biology interpretation. I just disagree. Explanations of neural circuit evolution in primates and other mammalian species should tend to standards like the review in this link: https://royalsocietypublishing.org/doi/full/10.1098/rstb.2020.0522

      In this article, Paul Cisek (a brilliant neurophysiologist) speculates on potential evolutionary mechanisms for some primate brain functions, but there is surprisingly very little reference to the existing literature on primate evolution and cognition. There is virtually no mention of studies that involve a large enough number of species to address evolutionary processes and/or a comparison with fossils and/or an evaluation of specific socio-ecological evolutionary constraints. Most of the cited literature refers to laboratory studies on brain anatomy of a handful of species, and their relevance for evolution remains to be evaluated. These ideas are very interesting and they could definitely provide an original perspective on evolution, but they are mostly based on speculations from laboratory studies, rather than from extensive comparative studies. This paper is interesting for understanding developmental mechanisms and their constraints on neurophysiological processes in laboratory conditions, but we do not think that it would fit in the framework of our paper as it goes far beyond our main topic.

      Reviewer #3 (Recommendations For The Authors):

      Yes, I am suggesting that the authors also include analyses with brain size (rather than body size) as a covariate to evaluate the effects of other variables in the model over and above the effect on brain size. In a very simplified theoretical scenario: two species have the same body sizes, but species A has a larger brain and therefore a larger FP. In this case, species A has a larger FP because of brain allometric patterns, and models including body size as a covariate would link FP size and socioecological variables characteristic of species A (and others like it). However, perhaps the FP of species A is actually smaller than expected for its brain size, while the FP of species B is larger than expected for its brain size.

      As explained in our response to the public review, we did run this analysis and we agree with the reviewer’s point from a practical point of view: it is important to know the extent to which the relation with a set of socio-ecological variables is specific to the region of interest, vs less specific and present for other brain regions. Again, we are sorry to not have understood that earlier, and we acknowledge that since it is a standard in the field, it needs to be addressed thoroughly.

      We understand the scaling intuition, and the need for a reference point for volumetric measures, but here the volume of each brain region is taken as a proxy for the number of neurons and therefore for the region’s computational capacities. Since, for a given brain region (FP or DLPFC) the neural densities seem to be well conserved across species, comparing regional volumes across species provides a good proxy for the contrast (across species) in neural counts for that region. All we predicted was that for a given brain region, associated with a given cognitive operation, the volume (number of neurons) would be greater in species for which socio-ecological constraints potentially involving that specific cognitive operation were greater. We do not understand how or why the rest of the brain would change this interpretation (of course, as discussed just above, beyond the question of specificity). And using whole brain volume as a scaling measure is problematic because the whole brain density is very different from the density of these regions of the prefrontal cortex (see above for further details). Again, we acknowledge that allometric patterns exist, and we understand how they can be interpreted, but we do not understand how they could prove or disprove our hypothesis (brain regions involved in specific cognitive operations are influenced by a specific set of socio-ecological variables). When using volumes as a proxy for computational capacities, the theoretical implications of scaling procedures might be problematic. For example, it implies that the computational capacities of a given brain region are scaled by the rest of the brain. All other things being equal, the computational capacities of a given brain region, taken as the number of neurons, should decrease when the size of the rest of the brain increases. But to our knowledge there is no evidence for that in the literature.
      Clearly these are very challenging issues, and our position was to take absolute measures because they do not rely upon hidden assumptions regarding allometric relations and their consequence on cognition.

      But since we definitely understand that scaling is a reference in the field, we have not only completed the corresponding analysis (including the whole brain as a covariate, together with socio-ecological variables) but also expanded the discussion to address this issue in detail. We hope that between this new analysis and the comparison of effects between non-scaled measures of FP, DLPFC and the whole brain, the reader will be able to judge the specificity of the effect.

      Models including brain (instead of body) size would instead link FP size and socioecological variables characteristic of species B (and others like it). This approach is supported by a large body of literature linking comparative variation in the relative size of specific brain regions (i.e., relative to brain size) to behavioral variation across species - e.g., relative size of visual/olfactory brain areas and diurnality/nocturnality in primates (Barton et al. 1995), relative size of the hippocampus and food caching in birds (Krebs et al. 1989).

      Barton, R., Purvis, A., & Harvey, P. H. (1995). Evolutionary radiation of visual and olfactory brain systems in primates, bats and insectivores. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 348(1326), 381-392.

      Krebs, J. R., Sherry, D. F., Healy, S. D., Perry, V. H., & Vaccarino, A. L. (1989). Hippocampal specialization of food-storing birds. Proceedings of the National Academy of Sciences, 86(4), 1388-1392. 

      We are grateful to the reviewer for mentioning these very interesting articles, and more generally for helping us to understand this issue and clarify the related discussion. Again, we understand the scaling principle but the fact that these methods provide interesting results does not make other approaches (such as ours) wrong or irrelevant. Since we have used both our original approach and the standard version as requested by the reviewer, the reader should be able to get a clear picture of the measures and of their theoretical implications. We sincerely hope that the present version of the paper will be satisfactory, not only because it is clearer, but also because it might stimulate further discussion on this complex question.
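
      The notion, debated in the exchange above, of a region being larger or smaller than expected for a given brain size can be illustrated with a small sketch. It is not the analysis run in the manuscript: it uses ordinary least squares on log volumes and ignores phylogeny, whereas a PGLS model would additionally weight the fit by the phylogenetic covariance between species. The function names and all volumes below are hypothetical and chosen only for illustration.

```python
import math


def ols_fit(x, y):
    """Slope and intercept of a simple least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def allometric_residuals(brain_volume, region_volume):
    """Residuals of log(region) regressed on log(brain).

    A positive residual means the region is larger than expected for that
    brain size; a negative residual means it is smaller than expected.
    """
    log_brain = [math.log(v) for v in brain_volume]
    log_region = [math.log(v) for v in region_volume]
    slope, intercept = ols_fit(log_brain, log_region)
    return [yi - (intercept + slope * xi) for xi, yi in zip(log_brain, log_region)]


# Hypothetical volumes in arbitrary units, invented for illustration only.
brain_volume = [50.0, 100.0, 100.0, 200.0, 400.0]
region_volume = [1.0, 1.8, 2.6, 4.0, 8.5]

residuals = allometric_residuals(brain_volume, region_volume)
for brain, region, res in zip(brain_volume, region_volume, residuals):
    print(f"brain={brain:6.1f} region={region:4.1f} residual={res:+.3f}")
```

      In this toy data set the second and third species have the same brain volume, so any difference in their residuals reflects regional volume alone. This is the quantity that a model with brain size as a covariate relates to socio-ecological predictors, whereas the absolute-volume approach defended by the authors relates the raw regional volumes to those predictors.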

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer 1:

      Thank you for your review and for pointing out multiple things to be discussed and clarified! Below, we go through the various limitations you pointed out and refer to the places where we have tried to address them.

      (1) It's important to keep in mind that this work involves simplified models of the motor system, and often the terminology for 'motor cortex' and 'models of motor cortex' are used interchangeably, which may mislead some readers. Similarly, the introduction fails in many cases to state what model system is being discussed (e.g. line 14, line 29, line 31), even though these span humans, monkeys, mice, and simulations, which all differ in crucial ways that cannot always be lumped together.

      That is a good point. We have clarified this in the text (Introduction and Discussion), to highlight the fact that our model isn’t necessarily meant to just capture M1. We have also updated the introduction to make it more clear which species the experiments which motivate our investigation were performed in.

      (2) At multiple points in the manuscript thalamic inputs during movement (in mice) are used as a motivation for examining the role of preparation. However, there are other more salient motivations, such as delayed sensory feedback from the limb and vision arriving in the motor cortex, as well as ongoing control signals from other areas such as the premotor cortex.

      Yes – the motivation for thalamic inputs came from the fact that those have specifically been shown to be necessary for accurate movement generation in mice. However, it is true that the inputs in our model are meant to capture any signals external to the dynamical system modeled, and as such are likely to represent a mixture of sensory signals, and feedback from other areas. We have clarified this in the Discussion, and have added this additional motivation in the Introduction.

      (3) Describing the main task in this work as a delayed reaching task is not justified without caveats (by the authors' own admission: line 687), since each network is optimized with a fixed delay period length. Although this is mentioned to the reader, it's not clear enough that the dynamics observed during the delay period will not resemble those in the motor cortex for typical delayed reaching tasks.

      Yes, we completely agree that the terminology might be confusing. While the task we are modeling is a delayed reaching task, it does differ from the usual setting since the network has knowledge of the delay period, and that is indeed a caveat of the model. We have added a brief paragraph just after the description of the optimal control objective to highlight this limitation.

      We have also performed additional simulations using two different variants of a model-predictive control approach that allow us to relax the assumption that the go-cue time is known in advance. We show that these modifications of the optimal controller yield results that remain consistent with our main conclusions, and can in fact in some settings lead to preparatory activity plateaus during the preparation epoch as often found in monkey M1 (e.g. in Elsayed et al. 2016). We have modified the Discussion to explain these results and their limitations, which are summarized in a new Supplementary Figure (S9).

      (4) A number of simplifications in the model may have crucial consequences for interpretation.

      a) Even following the toy examples in Figure 4, all the models in Figure 5 are linear, which may limit the generalisability of the findings.

      While we agree that linear models may be too simplistic, many prior analyses of M1 data suggest that they are often good enough to capture key aspects of M1 dynamics; for example, the generative model underlying jPCA is linear, and Sussillo et al. (2015) showed that the internal activity of nonlinear RNN models trained to reproduce EMG data aligned best with M1 activity when heavily regularized; in this regime, the RNN dynamics were close to linear. Nevertheless, this linearity assumption is indeed convenient from a modeling viewpoint: the optimal control problem is more easily solved for linear network dynamics and the optimal trajectories are more consistent across networks. Indeed, we had originally attempted to perform the analyses of Figure 5 in the nonlinear setting, but found that while the results were overall similar to what we report in the linear regime, iLQR was occasionally trapped in local minima, resulting in more variable results especially for inhibition-stabilized networks at the strongly connected end of the spectrum. Finally, Figure 5 is primarily meant to explore to what extent motor preparation can be predicted from basic linear control-theoretic properties of the Jacobian of the dynamics; in this regard, it made sense to work with linear RNNs (for which the Jacobian is constant).

      b) Crucially, there is no delayed sensory feedback in the model from the plant. Although this simplification is in some ways a strength, this decision allows networks to avoid having to deal with delayed feedback, which is a known component of closed-loop motor control and of motor cortex inputs and will have a large impact on the control policy.

      This comment resonates well with Reviewer 3's remark regarding the autonomous nature (or not) of M1 during movement. Rather than thinking of our RNN models as anatomically confined models of M1 alone, we think of them as models of the dynamics which M1 implements possibly as part of a broader network involving “inter-area loops and (at some latency) sensory feedback”, and whose state appears to be near-fully decodable from M1 activity alone. We have added a paragraph of Discussion on this important point.
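
      Regarding point (a) above, the remark that the optimal control problem is more easily solved for linear dynamics can be made concrete: with linear dynamics and a quadratic cost, the optimal feedback law follows from a backward Riccati recursion, with no iterative search and no local minima. The sketch below shows this for a scalar system. It is an illustration only, not the authors' model or their iLQR implementation, and the parameters (`a`, `b`, `q`, `r`, `q_final`, `horizon`) are hypothetical.

```python
def lqr_gains(a, b, q, r, q_final, horizon):
    """Finite-horizon LQR for x[t+1] = a*x[t] + b*u[t].

    Minimises sum_t (q*x[t]**2 + r*u[t]**2) + q_final*x[T]**2.
    Returns the time-ordered feedback gains and the cost-to-go
    coefficient at t = 0 (optimal cost is p0 * x0**2).
    """
    p = q_final  # cost-to-go coefficient at the final time step
    gains = []
    for _ in range(horizon):
        k = (a * b * p) / (r + b * b * p)    # gain computed from the next step's p
        p = q + a * a * p - a * b * p * k    # Riccati update, one step backward
        gains.append(k)
    gains.reverse()  # the recursion ran backward in time
    return gains, p


def rollout(a, b, q, r, q_final, x0, gains):
    """Simulate u[t] = -gains[t] * x[t] and accumulate the quadratic cost."""
    x, cost = x0, 0.0
    for k in gains:
        u = -k * x
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return x, cost + q_final * x * x


# Hypothetical parameters: a mildly unstable scalar system.
a, b, q, r, q_final, horizon, x0 = 1.1, 0.5, 1.0, 0.1, 10.0, 20, 1.0

gains, p0 = lqr_gains(a, b, q, r, q_final, horizon)
x_final, lqr_cost = rollout(a, b, q, r, q_final, x0, gains)
_, passive_cost = rollout(a, b, q, r, q_final, x0, [0.0] * horizon)
print(f"LQR cost {lqr_cost:.4f}, predicted {p0 * x0 ** 2:.4f}, no control {passive_cost:.4f}")
```

      For nonlinear dynamics, iLQR repeats a similar backward pass around a trajectory that is re-linearised at every iteration, which is where sensitivity to the starting point and local minima can enter.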

      (5) A key feature determining the usefulness of preparation is the direction of the readout dimension. However, all readouts had a similar structure (random Gaussian initialization). Therefore, it would be useful to have more discussion regarding how the structure of the output connectivity would affect preparation, since the motor cortex certainly does not follow this output scheme.

      We agree with this limitation of our model — indeed one key message of Figure 4 is that the degree of reliance on preparatory inputs depends strongly on how the dynamics align with the readout. However, this strong dependence is somewhat specific to low-dimensional models; in higher-dimensional models (most of our paper), one expects that any random readout matrix C will pick out activity dimensions in the RNN that are sufficiently aligned with the most controllable directions of the dynamics to encourage preparation.

      We did consider optimizing C away (which required differentiating through the iLQR optimizer, which is possible but very costly), but the question inevitably arises what exactly should C be optimized for, and under what constraints (e.g. fixed norm or not). One possibility is to optimize C with respect to the same control objective that the control inputs are optimized for, and constrain its norm (otherwise, inputs to the M1 model, and its internal activity, could become arbitrarily small as C can grow to compensate). We performed this experiment (new Supplementary Figure S7) and obtained a similar preparation index; there was one notable difference, namely that the optimized readout modes led to greater observability compared to a random readout; thus, the same amount of “muscle energy” required for a given movement could now be produced by a smaller initial condition. In turn, this led to smaller control inputs, consistent with a lower control cost overall.

      Whilst we could have systematically optimized C away, we reasoned that (i) it is computationally expensive, and (ii) the way M1 affects downstream effectors is presumably “optimized” for much richer motor tasks than simple 2D reaching, such that optimizing C for a fixed set of simple reaches could lead to misleading conclusions. We therefore decided to stick with random readouts.

      Additional comments:

      (1) The choice of cost function seems very important. Is it? For example, penalising the square of u(t) may produce very different results than penalising the absolute value.

      Yes, the choice of cost function does affect the results, at least qualitatively. The absolute value of the inputs is a challenging cost to use, as iLQR relies on a local quadratic approximation of the cost function. However, we have included additional experiments in which we penalized the squared derivative of the inputs (Supplementary Figure S8; see also our response to Reviewer 3's suggestion on this topic), and we do see differences in the qualitative behavior of the model (though the main takeaway, i.e. the reliance on preparation, continues to hold). This is now referred to and discussed in the Discussion section.

      (2) In future work it would be useful to consider the role of spinal networks, which are known to contribute to preparation in some cases (e.g. Prut and Fetz, 1999).

      (3) The control signal magnitude is penalised, but not the output torque magnitude, which highlights the fact that control in the model is quite different from muscle control, where co-contraction would be a possibility and therefore a penalty of muscle activation would be necessary. Future work should consider the role of these differences in control policy.

      Thank you for pointing us to this reference! Regarding both of these concerns, we agree that the model could be greatly improved and made more realistic in future work (another avenue for this would be to consider a more realistic biophysical model, e.g. using the MotorNet library). We hope that the current Discussion, which highlights the various limitations of our modeling choices, makes it clear that a lot of these choices could easily be modified depending on the specific assumptions/investigation being performed.
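
      The contrast raised in additional comment (1), between penalising the square of u(t) and penalising its absolute value, can be seen in a one-dimensional toy problem. The sketch below is not the authors' cost function: it assumes a single scalar input `u` that should match a hypothetical `target`, and it compares the closed-form minimisers under the two penalties.

```python
import math


def quadratic_penalty_minimizer(target, weight):
    """argmin over u of 0.5*(u - target)**2 + weight*u**2 (smooth shrinkage)."""
    return target / (1.0 + 2.0 * weight)


def absolute_penalty_minimizer(target, weight):
    """argmin over u of 0.5*(u - target)**2 + weight*abs(u) (soft thresholding)."""
    magnitude = max(abs(target) - weight, 0.0)
    return math.copysign(magnitude, target)


weight = 0.5  # hypothetical penalty strength
for target in (-2.0, -0.3, 0.3, 2.0):
    u_sq = quadratic_penalty_minimizer(target, weight)
    u_abs = absolute_penalty_minimizer(target, weight)
    print(f"target={target:+.1f}  squared penalty: u={u_sq:+.3f}  absolute penalty: u={u_abs:+.3f}")
```

      The squared penalty scales every input down smoothly, whereas the absolute-value penalty sets small inputs exactly to zero and has a kink at u = 0. That kink is one way to see why a method built on local quadratic approximations of the cost, such as iLQR, handles the absolute-value case poorly.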

      Reviewer 2:

      Thank you for your positive review! We very much agree with the limitations you pointed out, some of which overlapped with the comments of the other reviewers. We have done our best to address them through additional discussion and new supplementary figures. We briefly highlight below where those changes can be found.

      (1) Though the optimal control theory framework is ideal to determine inputs that minimize output error while regularizing the input norm, it however cannot easily account for some other varied types of objectives especially those that may lead to a complex optimization landscape. For instance, the reusability of parts of the circuit, sparse use of additional neurons when learning many movements, and ease of planning (especially under uncertainty about when to start the movement), may be alternative or additional reasons that could help explain the preparatory activity observed in the brain. It is interesting to note that inputs that optimize the objective chosen by the authors arguably lead to a trade-off in terms of other desirable objectives. Specifically, the inputs the authors derive are time-dependent, so a recurrent network would be needed to produce them and it may not be easy to interpolate between them to drive new movement variants. In addition, these inputs depend on the desired time of output and therefore make it difficult to plan, e.g. in circumstances when timing should be decided depending on sensory signals. Finally, these inputs are specific to the full movement chain that will unfold, so they do not permit reuse of the inputs e.g. in movement sequences of different orders.

      Yes, that is a good point! We have incorporated further Discussion related to this point. We have additionally included a new example in which we regularize the temporal complexity of the inputs (see also our response to Reviewer 3's suggestion on this topic), which leads to more slowly varying inputs, and may indeed represent a more realistic constraint and lead to simpler inputs that can more easily be interpolated between. We also agree that uncertainty about the upcoming go cue may play an important role in the strategy adopted by the animals. While we have not performed an extensive investigation of the topic, we have included a Supplementary Figure (S9) in which we used Model Predictive Control to investigate the effect of planning under uncertainty about the go cue arrival time. We hope that this will give the reader a better sense of what sort of model extensions are possible within our framework.

      (2) Relatedly, if the motor circuits were to balance different types of objectives, the activity and inputs occurring before each movement may be broken down into different categories that may each specialize into one objective. For instance, previous work (Kaufman et al. eNeuron 2016, Inagaki et al., Cell 2022, Zimnik and Churchland, Nature Neuroscience 2021) has suggested that inputs occurring before the movement could be broken down into preparatory inputs 'stricto sensu' - relating to the planned characteristics of the movement - and a trigger signal, relating to the transition from planning to execution - irrespective of whether the movement is internally timed or triggered by an external event. The current work does not address which type(s) of early input may be labeled as 'preparatory' or may be thought of as a part of 'planning' computations.

      Yes, our model does indeed treat inputs in a very general way, and does not distinguish between the different types of processes they may be composed of. This is partly because we do not explicitly model where the inputs come from, such that our inputs likely encompass multiple processes. We have added discussion related to this point.

      (3) While the authors rightly point out some similarities between the inputs that they derive and observed preparatory activity in the brain, notably during motor sequences, there are also some differences. For instance, while both the derived inputs and the data show two peaks during sequences, the data reproduced from Zimnik and Churchland show preparatory inputs that have a very asymmetric shape that really plummets before the start of the next movement, whereas the derived inputs have larger amplitude during the movement period - especially for the second movement of the sequence. In addition, the data show trigger-like signals before each of the two reaches. Finally, while the data show a very high correlation between the pattern of preparatory activity of the second reach in the double reach and compound reach conditions, the derived inputs appear to be more different between the two conditions. Note that the data would be consistent with separate planning of the two reaches even in the compound reach condition, as well as the re-use of the preparatory input between the compound and double reach conditions. Therefore, different motor sequence datasets - notably, those that would show even more coarticulation between submovements - may be more promising to find a tight match between the data and the authors' inputs. Further analyses in these datasets could help determine whether the coarticulation could be due to simple filtering by the circuits and muscles downstream of M1, planning of movements with adjusted curvature to mitigate the work performed by the muscles while permitting some amount of re-use across different sequences, or - as suggested by the authors - inputs fully tailored to one specific movement sequence that maximize accuracy and minimize the M1 input magnitude.

      Regarding the exact shape of the occupancy plots, it is important to note that some of the more qualitative aspects (e.g., the relative height of the two peaks) will change if we change the parameters of the cost function. Right now, we have chosen the parameters to ensure that both reaches would be performed at roughly the same speed (as a way to very loosely constrain the parameters based on the observed behavior). However, small changes to the hyperparameters can lead to changes in the model output (e.g., one of the two consecutive reaches being performed using greater acceleration than the other), and since our biophysical model is fairly simple, changes in the behavior are directly reflected in the network activity. Essentially, what this means is that while the double occupancy is a consistent feature of the model, the exact shape of the peaks is more sensitive to hyperparameters, and we do not wish to draw any strong conclusions from them, given the simplicity of the biophysical model. However, we do agree that our model exhibits some differences with the data. As discussed above, we have included additional discussion regarding the potential existence of separate inputs for planning vs triggering the movement in the context of single reaches.

      Overall, we are excited about the suggestions made by the Reviewer here about using our approach to analyze other motor sequence datasets, but we think that in order to do this properly, one would need to adopt a more realistic musculo-skeletal model (such as one provided by MotorNet).

      (4) Though iLQR is a powerful optimization method to find inputs optimizing the authors' cost function, it also has some limitations. First, given that it relies on a linearization of the dynamics at each timestep, it has a limited ability to leverage potential advantages of nonlinearities in the dynamics. Second, the iLQR algorithm is not a biologically plausible learning rule and therefore it might be difficult for the brain to learn to produce the inputs that it finds. It remains unclear whether using alternative algorithms with different limitations - for instance, using variants of BPTT to train a separate RNN to produce the inputs in question - could impact some of the results.

      We agree that our choice of iLQR has limitations: while it offers the advantage of convergence guarantees, it does indeed restrict the choice of cost function and dynamics that we can use. We have now included extensive discussion of how the modeling choices affect our results.

      We do not view the lack of biological plausibility of iLQR as an issue, as the results are agnostic to the algorithm used for optimization. However, we agree that any structure imposed on the inputs (e.g., by enforcing them to be the output of a self-contained dynamical system) would likely alter the results. A potentially interesting extension of our model would be to do just what the reviewer suggested, and try to learn a network that can generate the optimal inputs. However, this is outside the scope of our investigation, as it would then lead to new questions (e.g., what brain region would that other RNN represent?).

      (5) Under the objective considered by the authors, the amount of input occurring before the movement might be impacted by the presence of online sensory signals for closed-loop control. It is therefore an open question whether the objective and network characteristics suggested by the authors could also explain the presence of preparatory activity before e.g. grasping movements that are thought to be more sensory-driven (Meirhaeghe et al., Cell Reports 2023).

      It is true that we aren’t currently modeling sensory signals explicitly. However, some of the optimal inputs we infer may be capturing upstream information, which could encompass some sensory information. This is currently unclear, and would likely depend on how exactly the model is specified. We have added new discussion to emphasize that our dynamics should not be understood as just representing M1, but more general circuits whose state can be decoded from M1.

      Reviewer #2 (Recommendations For The Authors):

      Additionally, thank you for pointing out various typos in the manuscript, we have fixed those!

      Reviewer 3:

      Thank you very much for your review, which makes a lot of very insightful points, and raises several interesting questions. In summary, we very much agree with the limitations you pointed out. In particular, the choice of input cost is something we had previously discussed, but we had found it challenging to decide on what a reasonable cost for “complexity” could be. Following your comment, we have however added a first attempt at penalizing “temporal complexity”, which shows promising behavior. We have only included those additional analyses as supplementary figures, and we have included new discussion, which hopefully highlights what we meant by the different model components, and how the model behavior may change as we vary some of our choices. We hope this can be informative for future models that may use a similar approach. Below, we highlight the changes that we have made to address your comments.

      The main limitation of the study is that it focuses exclusively on one specific constraint - magnitude - that could limit motor-cortex inputs. This isn't unreasonable, but other constraints are at least as likely, if less mathematically tractable. The basic results of this study will probably be robust with regard to such issues - generally speaking, any constraint on what can be delivered during execution will favor the strategy of preparing - but this robustness cuts both ways. It isn't clear that the constraint used in the present study - minimizing upstream energy costs - is the one that really matters. Upstream areas are likely to be limited in a variety of ways, including the complexity of inputs they can deliver. Indeed, one generally assumes that there are things that motor cortex can do that upstream areas can't do, which is where the real limitations should come from. Yet in the interest of a tractable cost function, the authors have built a system where motor cortex actually doesn't do anything that couldn't be done equally well by its inputs. The system might actually be better off if motor cortex were removed. About the only thing that motor cortex appears to contribute is some amplification, which is 'good' from the standpoint of the cost function (inputs can be smaller) but hardly satisfying from a scientific standpoint.

      The use of a term that punishes the squared magnitude of control signals has a long history, both because it creates mathematical tractability and because it (somewhat) maps onto the idea that one should minimize the energy expended by muscles and the possibility of damaging them with large inputs. One could make a case that those things apply to neural activity as well, and while that isn't unreasonable, it is far from clear whether this is actually true (and if it were, why punish the square if you are concerned about ATP expenditure?). Even if neural activity magnitude is an important cost, any costs should pertain not just to inputs but to motor cortex activity itself. I don't think the authors really wish to propose that squared input magnitude is the key thing to be regularized. Instead, this is simply an easily imposed constraint that is tractable and acts as a stand-in for other forms of regularization / other types of constraints. Put differently, if one could write down the 'true' cost function, it might contain a term related to squared magnitude, but other regularizing terms would be very likely to dominate. Using only squared magnitude is a reasonable way to get started, but there are also ways in which it appears to be limiting the results (see below).

      I would suggest that the study explore this topic a bit. Is it possible to use other forms of regularization? One appealing option is to constrain the complexity of inputs; a long-standing idea is that the role of motor cortex is to take relatively simple inputs and convert them to complex time-evolving inputs suitable for driving outputs. I realize that exploring this idea is not necessarily trivial. The right cost-function term is not clear (should it relate to low-dimensionality across conditions, or to smoothness across time?) and even if it were, it might not produce a convex cost function. Yet while exploring this possibility might be difficult, I think it is important for two reasons.

      First, this study is an elegant exploration of how preparation emerges due to constraints on inputs, but at present that exploration focuses exclusively on one constraint. Second, at present there are a variety of aspects of the model responses that appear somewhat unrealistic. I suspect most of these flow from the fact that while the magnitude of inputs is constrained, their complexity is not (they can control every motor cortex neuron at both low and high frequencies). Because inputs are not complexity-constrained, preparatory activity appears overly complex and never 'settles' into the plateaus that one often sees in data. To be fair, even in data these plateaus are often imperfect, but they are still a very noticeable feature in the response of many neurons. Furthermore, the top PCs usually contain a nice plateau. Yet we never get to see this in the present study. In part this is because the authors never simulate the situation of an unpredictable delay (more on this below) but it also seems to be because preparatory inputs are themselves strongly time-varying. More realistic forms of regularization would likely remedy this.

      That is a very good point, and it mirrors several concerns that we had in the past. While we did focus on the input norm for the sake of simplicity, and because it represents a very natural way to regularize our control solutions, we agree that a “complexity cost” may be better suited to models of brain circuits. We have addressed this in a supplementary investigation. We chose to focus on a cost that penalizes the temporal complexity of the inputs, as ||u(t+1) - u(t)||^2. Note that this required augmenting the state of the model, making the computations quite a bit slower; while it is doable if we only penalize the first temporal derivative, it would not scale well to higher orders.
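      The state augmentation mentioned above can be illustrated with a minimal sketch. It assumes linear dynamics x(t+1) = A x(t) + B u(t), which is a simplification chosen for illustration and not the nonlinear network used in the manuscript; the function names are hypothetical. The previous input is appended to the state, so that the penalty on the input difference becomes an ordinary quadratic stage cost in the augmented state and the current input.

```python
import numpy as np

def augment_for_smoothness(A, B):
    """Build dynamics for the augmented state z(t) = [x(t); u(t-1)].

    Assumes linear dynamics x(t+1) = A x(t) + B u(t) (an illustrative choice).
    """
    n, m = B.shape
    A_aug = np.block([[A, np.zeros((n, m))],
                      [np.zeros((m, n)), np.zeros((m, m))]])
    B_aug = np.vstack([B, np.eye(m)])  # lower block copies u(t) into the state
    return A_aug, B_aug

def smoothness_cost(z, u, n):
    """Stage cost ||u(t) - u(t-1)||^2, reading u(t-1) from the augmented state."""
    diff = u - z[n:]
    return float(diff @ diff)

def rollout_cost(A, B, x0, inputs):
    """Accumulate the smoothness cost along a trajectory, assuming u(-1) = 0."""
    n, m = B.shape
    A_aug, B_aug = augment_for_smoothness(A, B)
    z = np.concatenate([x0, np.zeros(m)])
    total = 0.0
    for u in inputs:
        total += smoothness_cost(z, u, n)
        z = A_aug @ z + B_aug @ u
    return total, z
```

      The augmented state has dimension n + m rather than n, which is one way to see why the computations become slower; penalizing higher-order differences would require storing further past inputs in the state.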

      Interestingly, we did find that the activity in that setting was somewhat more realistic (see new Supplementary Figure S8), with more sustained inputs and plateauing activity. While we have kept the original model for most of the investigations, the somewhat more realistic nature of the results under that setting suggests that further exploration of penalties of that sort could represent a promising avenue to improve the model.

      We also found the idea of a cost that would ensure low-dimensionality of the inputs across conditions very interesting. However, it is challenging to investigate with iLQR as we perform the optimization separately for each condition; nevertheless, it could be investigated using a different optimizer.

      At present, it is also not clear whether preparation always occurs even with no delay. Given only magnitude-based regularization, it wouldn't necessarily have to be. The authors should perform a subspace-based analysis like that in Figure 6, but for different delay durations. I think it is critical to explore whether the model, like monkeys, uses preparation even for zero-delay trials. At present it might or might not. If not, it may be because of the lack of more realistic constraints on inputs. One might then either need to include more realistic constraints to induce zero-delay preparation, or propose that the brain basically never uses a zero delay (it always delays the internal go cue after the preparatory inputs) and that this is a mechanism separate from that being modeled.

      I agree with the authors that the present version of the model, where optimization knows the exact time of movement onset, produces a reasonably realistic timecourse of preparation when compared to data from self-paced movements. At the same time, most readers will want to see that the model can produce realistic looking preparatory activity when presented with an unpredictable delay. I realize this may be an optimization nightmare, but there are probably ways to trick the model into optimizing to move soon, but then forcing it to wait (which is actually what monkeys are probably doing). Doing so would allow the model to produce preparation under the circumstances where most studies have examined it. In some ways this is just window-dressing (showing people something in a format they are used to and can digest) but it is actually more than that, because it would show that the model can produce a reasonable plateau of sustained preparation. At present it isn't clear it can do this, for the reasons noted above. If it can't, regularizing complexity might help (and even if this can't be shown, it could be discussed).

      In summary, I found this to be a very strong study overall, with a conceptually timely message that was well-explained and nicely documented by thorough simulations. I think it is critical to perform the test, noted above, of examining preparatory subspace activity across a range of delay durations (including zero) to see whether preparation endures as it does empirically. I think the issue of a more realistic cost function is also important, both in terms of the conceptual message and in terms of inducing the model to produce more realistic activity. Conceptually it matters because I don't think the central message should be 'preparation reduces upstream ATP usage by allowing motor cortex to be an amplifier'. I think the central message the authors wish to convey is that constraints on inputs make preparation a good strategy. Many of those constraints likely relate to the fact that upstream areas can't do things that motor cortex can do (else you wouldn't need a motor cortex) and it would be good if regularization reflected that assumption. Furthermore, additional forms of regularization would likely improve the realism of model responses, in ways that matter both aesthetically and conceptually. Yet while I think this is an important issue, it is also a deep and tricky one, and I think the authors need considerable leeway in how they address it. Many of the cost-function terms one might want to use may be intractable. The authors may have to do what makes sense given technical limitations. If some things can't be done technically, they may need to be addressed in words or via some other sort of non-optimization-based simulation.

      Specific comments

      As noted above, it would be good to show that preparatory subspace activity occurs similarly across delay durations. It actually might not, at present. For a zero ms delay, the simple magnitude-based regularization may be insufficient to induce preparation. If so, then the authors would either have to argue that a zero delay is actually never used internally (which is a reasonable argument) or show that other forms of regularization can induce zero-delay preparation.

      Yes, that is a very interesting analysis to perform, which we had not considered before! When investigating this, we found that the zero-delay strategy does not rely on preparation in the same way as is seen in the monkeys. This seems to be a reflection of the fact that our “Go cue” corresponds to an “internal” go cue which would likely come after the true, “external go cue” – such that we would indeed never actually be in the zero delay setting. This is not something we had addressed (or really considered) before, although we had tried to ensure we referred to “delta prep” as the duration of the preparatory period but not necessarily the delay period. We have now included more discussion on this topic, as well as a new Supplementary Figure S10.

      I agree with the authors that prior modeling work was limited by assuming the inputs to M1, which meant that prior work couldn't address the deep issue (tackled here) of why there should be any preparatory inputs at all. At the same time, the ability to hand-select inputs did provide some advantages. A strong assumption of prior work is that the inputs are 'simple', such that motor cortex must perform meaningful computations to convert them to outputs. This matters because if inputs can be anything, then they can just be the final outputs themselves, and motor cortex would have no job to do. Thus, prior work tried to assume the simplest inputs possible to motor cortex that could still explain the data. Most likely this went too far in the 'simple' direction, yet aspects of the simplicity were important for endowing responses with realistic properties. One such property is a large condition-invariant response just before movement onset. This is a very robust aspect of the data, and is explained by the assumption of a simple trigger signal that conveys information about when to move but is otherwise invariant to condition. Note that this is an implicit form of regularization, and one very different from that used in the present study: the input is allowed to be large, but constrained to be simple. Preparatory inputs are similarly constrained to be simple in the sense that they carry only information about which condition should be executed, but otherwise have little temporal structure. Arguably this produces slightly too simple preparatory-period responses, but the present study appears to go too far in the opposite direction. I would suggest that the authors do what they can to address these issues via simulations and/or discussion. I think it is fine if the conclusion is that there exist many constraints that tend to favor preparation, and that regularizing magnitude is just one easy way of demonstrating that. Ideally, other constraints would be explored. But even if they can't be, there should be some discussion of what is missing - preparatory plateaus, a realistic condition-invariant signal tied to movement onset - under the present modeling assumptions.

      As described above, we have now included two additional figures. In the first one (S8, already discussed above), we used a temporal smoothness prior, and we indeed get slightly more realistic activity plateaus. In a second supplementary figure (S9), we have also considered using model predictive control (MPC) to optimize the inputs under an uncertain go cue arrival time. There, we found that removing the assumption that the delay period is known came with new challenges: in particular, it requires the specification of a “mental model” of when the Go cue will arrive. While it is reasonable to expect that monkeys will have a prior over the go cue arrival time that will be shaped by the design of the experiment, some assumptions must be made about the utility functions that should be used to weigh this prior. For instance, if we imagine that monkeys carry a model of the possible arrival time of the go cue that is updated online, they could nonetheless act differently based on this information, for instance by either preparing so as to be ready for the earliest go cue possible or alternatively to be ready for the average go cue. This will likely depend on the exact task design and reward/penalty structure. Here, we added simulations with those two cases (making simplifying assumptions to make the problem tractable/solvable using model predictive control), and found that the “earliest preparation” strategy gives rise to more realistic plateauing activity, while the model where planning is done for the “most likely go time” does not. We suspect that more realistic activity patterns could be obtained by e.g., combining this framework with the temporal smoothness cost. However, the main point we wished to make with this new supplementary figure is that it is possible to model the task in a slightly more realistic way (although here it comes at the cost of additional model assumptions). We have now added more discussion related to those points. Note that we have kept our analyses on these new models to a minimum, as the main takeaway we wish to convey from them is that most components of the model could be modified/made more realistic. This would impact the qualitative behavior of the system and match to data but – in the examples we have so far considered – does not appear to modify the general strategy of networks relying on preparation.
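      To make the two planning strategies concrete, here is a minimal sketch of how a planning target time could be chosen from a discrete prior over go cue arrival times, with the prior updated online as time passes without a cue. This is an illustration under stated assumptions, not the implementation behind Supplementary Figure S9: the discrete prior, the function names and the "earliest"/"average" labels are hypothetical, and the MPC optimization itself is omitted.

```python
import numpy as np

def condition_on_no_cue_yet(times, probs, t_now):
    """Renormalize the prior over go cue times, given that no cue came before t_now."""
    times = np.asarray(times, dtype=float)
    probs = np.asarray(probs, dtype=float)
    posterior = np.where(times >= t_now, probs, 0.0)
    if posterior.sum() == 0.0:
        raise ValueError("no go cue time with positive probability remains")
    return posterior / posterior.sum()

def planning_target(times, probs, strategy):
    """Pick the time the controller plans to be ready for (hypothetical strategies)."""
    times = np.asarray(times, dtype=float)
    probs = np.asarray(probs, dtype=float)
    if strategy == "earliest":
        # Be ready for the earliest go cue that is still possible.
        return float(times[probs > 0].min())
    if strategy == "average":
        # Be ready for the expected go cue time.
        return float(times @ probs)
    raise ValueError(f"unknown strategy: {strategy}")
```

      In a receding-horizon loop, one would call `condition_on_no_cue_yet` at each replanning step and pass the resulting target to the optimizer; under the "earliest" strategy the target stays close to the current time, which is consistent with a sustained state of readiness.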

      On line 161, and in a few other places, the authors cite prior work as arguing for "autonomous internal dynamics in M1". I think it is worth being careful here because most of that work specifically stated that the dynamics are likely not internal to M1, and presumably involve inter-area loops and (at some latency) sensory feedback. The real claim of such work is that one can observe most of the key state variables in M1, such that there are periods of time where the dynamics are reasonably approximated as autonomous from a mathematical standpoint. This means that you can estimate the state from M1, and then there is some function that predicts the future state. This formal definition of autonomous shouldn't be conflated with an anatomical definition.

      Yes, that is a good point, thank you for making it so clearly! Indeed, as in previous work, we do not think of our “M1 dynamics” as being internal to M1; they may instead include sensory feedback and inter-area loops, which we summarize in the connectivity, chosen so that the resulting dynamics qualitatively resemble data. We have now incorporated more discussion regarding what exactly the dynamics in our model represent.

      Round 2 of reviews

      Reviewer 3:

      My remaining comments largely pertain to some subtle (but to me important) nuances at a few locations in the text. These should be easy for the authors to address, in whatever way they see fit.

      Specific comments:

      (1) The authors state the following on line 56: "For preparatory processes to avoid triggering premature movement, any pre-movement activity in the motor and dorsal pre-motor (PMd) cortices must carefully exclude those pyramidal tract neurons."

      This constraint is overly restrictive. PT neurons absolutely can change their activity during preparation in principle (and appear to do so in practice). The key constraint is looser: those changes should have no net effect on the muscles. E.g., if d is the vector of changes in PT neuron firing rates, and b is the vector of weights, then the constraint is that b'd = 0. d = 0 is one good way of doing this, but only one. Half the d's could go up and half could go down. Or they all go up, but half the b's are negative. Put differently, there is no reason the null space has to be upstream of the PT neurons. It could be partly, or entirely, downstream. In the end, this doesn't change the point the authors are making. It is still the case that d has to be structured to avoid causing muscle activity, which raises exactly the point the authors care about: why risk this unless preparation brings benefits? However, this point can be made with a more accurate motivation. This matters, because people often think that a null-space is a tricky thing to engineer, when really it is quite natural. With enough neurons, preparing in the null space is quite simple.
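      The constraint b'd = 0 can be checked numerically with a short sketch. The weight and firing-rate vectors below are made-up illustrative values, and the function names are hypothetical; the projection shows that any pattern of firing-rate changes can be made output-null by removing its component along b.

```python
import numpy as np

def muscle_drive(b, d):
    """Net effect b'd of firing-rate changes d on a muscle with readout weights b."""
    return float(b @ d)

def project_to_null_space(b, d):
    """Remove the component of d along b, so the result has zero net effect."""
    return d - b * (b @ d) / (b @ b)

# Case 1: all weights positive, half of the rates go up and half go down.
b_uniform = np.ones(4)
d_balanced = np.array([1.0, 1.0, -1.0, -1.0])

# Case 2: all rates go up, but half of the weights are negative.
b_mixed = np.array([1.0, 1.0, -1.0, -1.0])
d_all_up = np.ones(4)
```

      With a single readout, the null space has dimension N - 1 for N neurons, so almost all of the space of firing-rate changes is available for preparation.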

      That is a good point – we have now reformulated this sentence to instead say “to avoid triggering premature movement, any pre-movement activity in the motor and dorsal premotor (PMd) cortices must engage the pyramidal tract neurons in a way that ensures their activity patterns will not lead to any movement”.

      (2) Line 167: 'near-autonomous internal dynamics in M1'.

      It would be good if such statements, early in the paper, could be modified to reflect the fact that the dynamics observed in M1 may depend on recurrence that is NOT purely internal to M1. A better phrase might be 'near-autonomous dynamics that can be observed in M1'. A similar point applies on line 13. This issue is handled very thoughtfully in the Discussion, starting on line 713. Obviously it is not sensible to also add multiple sentences making the same point early on. However, it is still worth phrasing things carefully, otherwise the reader may have the wrong impression up until the Discussion (i.e. they may think that both the authors, and prior studies, believe that all the relevant dynamics are internal to M1). If possible, it might also be worth adding one sentence, somewhere early, to keep readers from falling into this hole (and then being stuck there till the Discussion digs them out).

      That is a good point: we have now edited the text after line 170 to make it clear that the underlying dynamics may not be confined to M1, and have referenced the later discussion there.

      (3) The authors make the point, starting on line 815, that transient (but strong) preparatory activity empirically occurs without a delay. They note that their model will do this but only if 'no delay' means 'no external delay'. For their model to prepare, there still needs to be an internal delay between when the first inputs arrive and when movement generating inputs arrive.

      This is not only a reasonable assumption, but is something that does indeed occur empirically. This can be seen in Figure 8c of Lara et al. Similarly, Kaufman et al. 2016 noted that "the sudden change in the CIS [the movement triggering event] occurred well after (~150 ms) the visual go cue... (~60 ms latency)". Behavioral experiments have also argued that internal movement-triggering events tend to be quite sluggish relative to the earliest they could be, causing RTs to be longer than they should be (Haith et al. Independence of Movement Preparation and Movement Initiation). Given this empirical support, the authors might wish to add a sentence indicating that the data tend to justify their assumption that the internal delay (separating the earliest response to sensory events from the events that actually cause movement to begin) never shrinks to zero.

      While on this topic, the Haith and Krakauer paper mentioned above is good to cite because it does ponder the question of whether preparation is really necessary. By showing that they could get RTs to shrink considerably before behavior became inaccurate, they showed that people normally (when not pressured) use more preparation time than they really need. Given Lara et al, we know that preparation does always occur, but Haith and Krakauer were quite right that it can be very brief. This helped -- along with neural results -- change our view of preparation from something more cognitive that had to occur, to something more mechanical that was simply a good network strategy, which is indeed the authors' current point. Working a discussion of this into the current paper may or may not make sense, but if there is a place where it is easy to cite, it would be appropriate.

      This is a nice suggestion, and we thank the reviewer for pointing us to the Haith and Krakauer paper. We have now added this reference and extended the paragraph following line 815 to briefly discuss the possible decoupling between preparation and movement initiation that is shown in the Haith paper, emphasizing how this may affect the interpretation of the internal delay and comparisons with behavioral experiments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Review: 

      This study used ATAC-Seq to characterize chromatin accessibility during stages of GABAergic neuron development in induced pluripotent stem cells (iPSCs) derived from both Dravet Syndrome (DS) patients and healthy donors. The authors report accelerated GABAergic maturation to a point, followed by further differentiation into a perturbed chromatin profile, in the cells from patients. In a preliminary analysis, valproic acid, an anti-seizure medication commonly used in patients with DS, increased open chromatin in both patient and control iPSCs in a nonspecific manner, and to different degrees in cultures derived from different patients. These findings provide new information about DS-associated changes in chromatin, and provide further evidence for developmental abnormalities in interneurons with DS. 

      Strengths:

      This is a novel study that aims to investigate the epigenetic changes that occur in a sodium channel model of epilepsy; these changes are often ignored but may be an interesting area for future therapeutics. In general, the flow of the paper is good, and the figures are well-designed.

      Reply: Thank you for your positive feedback about our work.

      Weaknesses:

      The most substantial weakness relates to the observation that DS is often viewed as a monogenic form of epilepsy. It is directly linked to SCN1A gene haploinsufficiency (Yu et al, 2006; Ogiwara et al, 2007). The gene product is Nav1.1, the alpha subunit of voltage-gated sodium channel type I that regulates neuronal excitability. Yet, analysis was conducted at time points of GABAergic interneuron differentiation in which SCN1A is likely not expressed. The paper would be strengthened if SCN1A expression and Nav1.1 protein were examined across the experimental time course. If SCN1A is not yet expressed, this would complicate any explanation of how the observed epigenetic changes might arise. It also seems counterintuitive that the absence of a sodium channel can accelerate differentiation, when, a priori, one might expect the opposite (a 'less neuronal' signal). 

      Thanks, this is an important point! In our revised manuscript, we have incorporated data on the expression of SCN1A at d19 and d65 of GABAergic development in both the control and patient groups. We first retrieved data from our previous RNA-Seq analysis, showing SCN1A gene expression in our cells at both d19 and d65. We have now updated our text on the SCN1A gene expression in the revised manuscript (Revised Supplementary Figure 1A, revised text Line 108-109). Second, we confirmed the dynamics of SCN1A expression by real-time quantitative RT/PCR analysis at four time points of GABAergic development (d0, d19, d35 and d65). Notably, expression of SCN1A was detected by qRT-PCR from d19 and the expression increased with differentiation. We have now included this information in the revised manuscript (Revised Supplementary Figure 1B, revised text Line 112).

      Related to this, another important limitation of the study is that the controls are cells derived from healthy individuals and not from isogenic lines. The usage of isogenic lines is extremely relevant for every study in which iPSC-derived somatic cells are used to model a disease, but specifically in diseases like DS, in which the genetic background has an ascertained impact on disease phenotype (Cetica et al, 2017 and others). This serious limitation should be considered.

      Yes, we fully agree that isogenic, edited patient-derived iPSC would have been the ideal controls. At an early stage we therefore invested considerable time and effort in generating isogenic lines from patient-derived iPSC. However, editing of the SCN1A variants in patient-derived iPSC was unsuccessful after several trials and modifications, so we finally turned to iPSC from healthy donors. This is now discussed together with other limitations of our study in the revised manuscript (end of discussion section, lines 499-506).

      In addition, the authors should provide data on variability across cell lines and differentiations to help convince the reader that the results can be attributed to genetic defects, rather than variability across individuals. 

      This is a valuable point. In the revised manuscript, we have now added plots and IF staining from individual samples to give the readers a complete picture on how they are distributed (Revised Supplementary Figure 1C, Revised Supplementary Figure 2, and Revised Supplementary Figure 4).

      In the revised manuscript, we explain in more detail the strategy used to compare the two groups (cases vs. controls). In our analysis, we first compared the dynamic changes of chromatin accessibility cell line by cell line across differentiation. We then extracted the changes common to the different cell lines at each time point (Revised text line 152-155, line 226-228). Using this strategy, we extracted the common changes confined to the control and patient groups, respectively. With this approach we avoid capturing the variability across individuals.
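
      The "common changes" strategy described above can be illustrated with a minimal sketch. It assumes that each cell line yields a set of peak identifiers that changed between two time points and that the group-level result is the intersection of those sets. The peak names and the helper `common_changes` are hypothetical and are not taken from the manuscript; only the cell line names appear in the response.

```python
# Hypothetical peak IDs that gained accessibility between two time points,
# listed per control cell line (illustrative values, not data from the study).
gained_by_line = {
    "Ctl1B": {"peak_1", "peak_2", "peak_3", "peak_7"},
    "Ctl7C": {"peak_2", "peak_3", "peak_5"},
    "Ctl8F": {"peak_2", "peak_3", "peak_9"},
}


def common_changes(changes_by_line):
    """Return the peaks reported as changed in every cell line of a group."""
    peak_sets = list(changes_by_line.values())
    if not peak_sets:
        return set()
    # Keep only peaks present in all per-line sets.
    return set.intersection(*peak_sets)


shared = common_changes(gained_by_line)
print(sorted(shared))
```

      Requiring a change to appear in every line is a conservative choice: it discards line-specific changes at the cost of some sensitivity.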

      Additionally, the authors acknowledge the variability of the differentiations and cell lines, which is commendable, and they attribute this to "possibly reflecting cell line specific and endogenous differences reported previously", but could also have to do with cell death. This is a large confounding factor for ATAC-seq. Certainly, Sup Fig 1C shows lower FrIP scores, consistent with cell death, and there seems to be a lot of death in the representative images. Moreover, the iGABA neurons are very difficult to keep alive, especially to 65 days, without co-culturing with glia and/or glutamatergic neurons. The authors should comment on how much these factors may have influenced their results. 

      With this point in mind, we re-examined the QC of our ATAC-Seq across all samples. As shown in revised Supplementary Figure 2C and Supplementary Figure 4C, our cutoff for FRiP is 15%, and all samples have a FRiP of more than 15%. At the later time points (d35 or d65), we did not observe a FRiP <15%. We therefore feel confident that the quality of ATAC-Seq is good enough for downstream analysis and data interpretation.
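
      For readers unfamiliar with the metric, FRiP (fraction of reads in peaks) is the number of mapped reads overlapping called peaks divided by the total number of mapped reads. The sketch below applies the 15% cutoff quoted above to two made-up samples; the read counts and sample names are hypothetical and the code is not the authors' QC pipeline.

```python
FRIP_CUTOFF = 0.15  # 15% threshold quoted in the response


def frip(reads_in_peaks, total_mapped_reads):
    """Fraction of mapped reads that fall inside called peaks."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return reads_in_peaks / total_mapped_reads


# Hypothetical (reads in peaks, total mapped reads) per sample.
samples = {
    "sample_A": (4_200_000, 20_000_000),
    "sample_B": (2_400_000, 20_000_000),
}

# True if the sample meets the cutoff, False otherwise.
passing = {name: frip(*counts) >= FRIP_CUTOFF for name, counts in samples.items()}
print(passing)
```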

      Regarding the differentiation protocol, we follow a directed protocol for differentiating iPSC towards interneurons. The protocol is described in detail by Maroof et al (reference 34) and was slightly modified in our lab (described in reference 13). With our modified protocol, GABAergic cells are viable beyond day 65 without the need for co-cultures with astrocytes or microglia. This is also reflected by the electrophysiological activity of interneurons at d65 and at later time points (reference 13). Additionally, our ambition was to obtain a homogeneous cell population for further analysis. Adding other cell types to the cultures would have interfered with downstream processes and would have required cell sorting. Using our protocol, we obtain viable GABA interneurons after up to 100 days in culture. To assess the viability of our cells at the point of sampling (other than by morphological assessment), we used Trypan blue staining and an automated cell counter. Only samples with a viability >90%, a commonly used cut-off for cell viability, were processed for ATAC-seq. We have now modified the method section in the revised version to describe the GABAergic differentiation and sampling (line 519-529).

      Finally, changes in gene expression are only inferred, as no RNA levels were measured. If RNA-seq was not possible it would have been good to see at least some of the key genes/findings corroborated with RNA/protein levels vs chromatin accessibility alone, particularly given that these molecular readouts do not always correlate. 

      In our revised manuscript, we include our recently published RNA-seq performed at d19 and d65. We also correlated the RNA-seq and ATAC-seq data obtained from the same samples. The Pearson correlations between gene expression and chromatin accessibility were within the range 0.49-0.57 (Revised Supplementary Figure 2G, Revised Supplementary Figure 4G), which is acceptable according to standard criteria. The results confirmed that the quality of ATAC-Seq is good enough for analysis of expression levels and chromatin openness in key genes. We also added gene expression levels from RNA-seq (d19 and d65) in our revised manuscript (Revised Figure 1G, Revised Figure 2G). Finally, we performed qRT-PCR analysis of key genes in each cluster and the results are now included in the revised version (Revised Supplementary Figure 3E, Revised Supplementary Figure 5E).
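
      As a reminder of what the reported coefficient measures, the following sketch computes a Pearson correlation between per-gene expression and per-gene accessibility values. The numbers are invented for illustration, and how the authors summarised accessibility per gene is not stated here, so that pairing is an assumption.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-gene values (e.g. log-scaled expression and accessibility).
expression = [1.2, 3.4, 2.2, 5.1, 4.0]
accessibility = [0.8, 2.9, 3.1, 4.2, 2.5]
r = pearson(expression, accessibility)
print(round(r, 2))
```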

      Additional Points:

      (1) Representative images for cell-identity markers for only D65 are shown, and not D0, D19, and D35 though it is stated in the text that this was performed. At a minimum, these representative images should be shown for all lines. 

      As suggested, we have now added images for cell identity markers of all iPSC lines in the revised version (Revised Supplementary Figure 1C).

      (2) What QC was performed on iPSC lines, i.e. karyotype/CNV analysis and confirmation of genotypes?

      All iPSC lines used in this study have been fully characterized according to standard and state-of-the-art procedures: expression of pluripotency and stemness genes has been shown by immunostaining, flow cytometry and scorecard analysis; integrity of the genome has been assessed by karyotyping using G-banding; differentiation capacity was characterized using an embryoid body assay in combination with scorecard analysis; and genotypes were verified by Sanger sequencing. Please see the following publications for full datasets: Schuster et al, Neurobiol Dis 2019; Schuster et al, Stem Cell Res 2019; Sobol et al, Stem Cells and Development 2015. In our lab, the integrity of iPSC lines is routinely verified using flow cytometry (expression of TRA-1-60 and SSEA4), immunostaining (expression of NANOG, SOX2 and OCT4), Sanger sequencing (targeting variants in the SCN1A gene), cell morphology analysis and analysis of mycoplasma by MycoAlert® (Lonza).

      (3) Were all experiments performed on a single differentiation? Or multiples? Were the differentiations performed with the same type? If not, was batch considered in the analysis? 

      Thank you for raising this question. The Materials and Methods text has been modified as follows to better describe the differentiation and sampling procedure:

      “GABAergic interneuron differentiation from iPSCs was performed as previously described (reference 13). The protocol utilizes DUAL SMAD inhibition to induce neurogenesis towards neural stem cells for 10 days, followed by patterning with high levels of sonic hedgehog for nine days towards cortically fated neuronal progenitor cells (NPC) and subsequent maturation for 46 days, i.e. a total of 65 days (Figure 1A). Neuronal cells at day 65 and onwards are healthy and viable as judged by morphological assessment by light microscopy. Differentiation was performed at least 3 times per cell line.  

      Cell cultures were sampled at days 0 (D0), D19, D35 and D65, respectively, by harvesting cells with TryplE and centrifugation (300 x g, 3 min). Harvested cells were counted and assessed for viability using trypan blue staining and an automated EVE cell counter (Nano Entek). Samples with a viability of >90% were chosen for ATAC-Seq library preparation (see below).”.  

      I also assume that technical replicates were merged, and then all three biological replicates were kept for each analysis and outliers were not removed, e.g. Control_D19_8F seems like an example of an outlier. 

      This is a valuable point. We agree that there is variability across the three healthy donors and the patients, respectively, but the quality of ATAC-Seq is good after multiple QC assessments (Revised Supplementary Figure 2B-D). The color code in Supplementary Figure 1C may be misleading, as the Pearson correlation of all samples was displayed. Overall, the correlations among ATAC-seq replicates are over 0.8. At the same time, we observed that samples at d0 cluster together, but not at the later time points. We interpret this as related to the cell-line-specific plasticity of chromatin dynamics during differentiation. The observation agrees with our results from PCA (Revised Supplementary Figure 2F).

      (4) In Figure 1C, it is intriguing that the ATACseq signal gets stronger in imN. One might expect it to be strongest in the iPSCs which are undifferentiated and have the highest levels of open chromatin. Is this a function of sequencing depth, or are all the Y-axes normalized across all time points? 

      This is another valuable point. Figure 1C presents the average chromatin openness for cluster-specific regions, not chromatin openness across the entire genome, which is why the chromatin openness at D35 is higher than at other time points. The genome-wide chromatin openness is presented in revised Supplementary Figure 2D, and we have now updated the figure legend to avoid any potential misunderstanding.

      The sequencing depth is in a similar range for all samples. To give the readers a complete picture, we also present the depth of sequencing reads for each sample (Revised Supplementary Figure 2A and Revised Supplementary Figure 4A). The Y-axes of genome browser tracks were normalized, and we added the normalized value in the figures.

      (5) In Figure 1F, are these all enriched terms, or were they prioritized somehow? 

      Yes, the enriched terms are prioritized based on biological meanings, and we have now clarified this in the updated legend of the manuscript. In addition, all enriched terms are now included in revised Supplementary Table 2 and Supplementary Table 4. 

      (6) In Figure 1G (also the same plots in Fig 2/3), are all these images normalized, i.e. there is no scale bar for each track, and do they represent an aggregate BAM/bigwig?

      Yes, the genome browser tracks were normalized and we have now revised the figures by adding scale bars.

      It would be good to show in supplement the variability across cell lines/diffs - particularly given the variability in the heatmap/PCA - and demonstrate the rigor/reproducibility of these results. This comment applies to all these plots across the 3 figures, particularly as in some instances the samples appear to cluster by individual first and then time point (Sup Fig 3B). 

      Thanks. We have now revised the figure with plots showing individual samples. 

      How confident are the authors that these effects are driven by genotype and not a single cell line? In the Fig 3D representation of NANOG, it is very difficult to see any difference between patient and control. 

      In Figure 3D, we showed common chromatin dynamics in the control and patient groups. To avoid any misunderstanding, we have now updated our legend in the revised manuscript. 

      (7) For the changes in occupancy annotation (UTR/exon/intron etc), are these differences still significant after correcting for variability from cell line to cell line at each time point? I.e. rather than average across all three samples, what is the range?

      Revised accordingly.

      (8) The VPA timepoint is not well-justified. Given that VPA would be administered in patients with fully mature inhibitory neurons, it is difficult to determine the biological relevance. I appreciate that this is a limitation of the model, but this should at least be addressed in the manuscript. 

      We agree that our model system of GABAergic interneuron development has limitations and that cells may not fully recapitulate the development and physiology in vivo. Obvious factors to consider in our system are the directed protocol to enrich for GABAergic interneurons and the differentiation timeline restricted to 65d. This is now discussed (lines 499-506).

      Recommendations for the authors:

      (1) The term 'mutation' has been replaced with the term ' pathogenic variant' or likely pathogenic variant depending on the context, please see PMID: 25741868 

      Thank you for pointing this out. We have replaced all instances of “mutation” with “pathogenic variant” throughout the manuscript.

      (2) It is unclear what the nomenclature for sample labelling is in Supplementary Figure 1, e.g. 7C, 8F, 1B.  

      We apologize for this confusion. These are cell line names. We labeled all data and images according to cell line name, i.e. control lines: Ctl1B, Ctl7C and Ctl8F; patient lines: DD1C, DD4A, DD5A. To avoid any potential confusion, we have added a note in the revised legend of Supplementary Figure 1B.

      (3) Can the authors confirm that the Deseq2 FDR values are Benjamini-Hochberg procedure corrected per default settings? If so, this should ideally be added to methods or legend for clarity 

      Yes, the DESeq2 FDR values were computed with the default settings, i.e. with Benjamini-Hochberg correction. This has been added to the Methods section of the revised manuscript.
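
      For clarity, the Benjamini-Hochberg adjustment itself can be sketched in a few lines. This is a generic illustration of the procedure and not DESeq2's code; DESeq2 additionally applies steps such as independent filtering by default, which are not reproduced here. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four tested regions.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)
```

      A region is then called significant when its adjusted value falls below the chosen FDR threshold.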

      (4) While it makes sense that the authors present the data in the order of Figure 1, and Figure 2, this actually makes it quite difficult to compare the two datasets, especially for the functional enrichment in the "F" figures. It may be helpful to consider re-organizing the figure order. For instance, for the long-term potentiation signal in the DS-iPSCs, what does this mean in terms of biological relevance? Or maybe Figure 2 needs to be supplementary given that Figure 3 is a more direct comparison.  

      Thank you for the suggestions. We attempted to reorganize during our revision. We still believe it is easier for the audience to grasp the main message if we organize it according to our current workflow—first presenting an individual differential landscape for controls and patients, and then comparing the common and unique aspects among them.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript, entitled "Merging Multi-OMICs with Proteome Integral Solubility Alteration Unveils Antibiotic Mode of Action", Dr. Maity and colleagues aim to elucidate the mechanisms of action of antibiotics through combined approaches of omics and the PISA tool to discover new targets of five drugs developed against Helicobacter pylori.

      Strengths:

      Using transcriptomics, proteomic analysis, protein stability (PISA), and integrative analysis, Dr. Maity and colleagues have identified pathways targeted by five compounds initially discovered as inhibitors against H. pylori flavodoxin. This study underscores the necessity of a global approach to comprehensively understanding the mechanisms of drug action. The experiments conducted in this paper are well-designed and the obtained results support the authors' conclusions.

      Weaknesses:

      This manuscript describes several interesting findings. A few points listed below require further clarification:

      (1) Compound IVk exhibits markedly different behavior compared to the other compounds. The authors are encouraged to discuss these findings in the context of existing literature or chemical principles.

      This is a good point. We have added the following paragraph (Page No-13).

      “In several of our studies, compound IVk, which has a higher MIC, exhibits markedly different behavior. This difference in behavior may stem from different sources, including intercellular availability, inactivation inside the cell, or loss of target specificity. Multiple studies have previously demonstrated that there is only a 30% chance for a structurally similar compound to have similar biological activity32.”

      (2) The incubation time for treating H. pylori with the drugs was set at 4 hours for transcriptomic and proteomic analyses, compared to 20 min for PISA analysis. The authors need to explain the reason for these differences in treatment duration.

      This is now explained in Pages 17 and 19, where the following paragraphs have been included:

      “The incubation time for transcriptomics and proteomics assays was determined based on the Time-Kill Curves assay (Fig. 6(A)). The 4-hour time point shows a significant amount of cell death compared to the control population.”

      “The target deconvolution method aims to evaluate the initial interaction with intracellular proteins. We selected a 20-minute time point based on intracellular ROS generation (not shown). It is a well-reported phenomenon that bactericidal drugs induce early production of ROS.”

      (3) The PISA method facilitates the identification of proteins stabilized by drug treatment. DnaJ and Trigger factor (Tig), well-known molecular chaperones, prevent protein aggregation under stress. Their enrichment in the soluble fraction is expected and does not necessarily indicate direct stabilization by the drugs. The possibility that their stabilization results from binding to other proteins destabilized by the drugs should be considered. To prevent any misunderstanding, the authors should clarify that their methodology does not solely identify direct targets. Instead, the combination of their findings sheds light on various pathways affected by the treatment.

      This is also a very valuable observation. We now clearly state this in new paragraphs at Pages 8 and 13:

      “Another target shared among several compounds is the chaperone protein trigger factor (Tig), which plays a crucial role in facilitating proper protein folding and is indispensable for the survival of bacterial cells. The solubility of this protein has been altered by all the compounds except IVk (Fig. 2(I-J)) in a concentration-dependent manner (Fig. S4(B, D, and E)). The possibility of Tig interacting with other proteins destabilized by the drug, along with the influence of the heat gradient during the PISA assay, may introduce potential noise in the data. Further investigation is required to confirm the interaction of the drug with Tig.”

      “The module “black” associated with this compound contains Tig, which is involved in facilitating proper protein folding, as a target, and it down-regulates multiple proteins associated closely with S12 ribosomal protein of the 30S subunit (Fig. S9(D)) indicating its involvement in stabilization of ribosomal protein.”

      (4) At the end of the manuscript, the authors conclude that four compounds "strongly interact with CagA". However, detailed molecule/protein interaction studies are necessary to definitively support this claim. The authors should exercise caution in their statement. As the authors mentioned, additional research (not mandated in the scope of this current paper) is necessary to determine the drug's binding affinity to the proposed targets.

      We have modified the sentence (Page -15) to say:

      “This study identifies four out of our five compounds that induce significant change in the solubility of CagA, the major virulence factor of H. pylori.”

      (5) The authors should clarify the PISA-Express approach over standard PISA. A detailed explanation of the differences between both methods in the main text is important.

      This was already explained on Page 5 (no changes have been made).

      Reviewer #2 (Public Review):

      Summary:

      This work has an important and ambitious goal: understanding the effects of drugs, in this case antimicrobial molecules, from a holistic perspective. This means that the effect of drugs on a group of genes and whole metabolic pathways is unveiled, rather than its immediate effect on a protein target only. To achieve this goal the authors successfully implement the PISA-Express method (Protein Integral Solubility Alteration), using combined transcriptomics, proteomics, and drug-induced changes in protein stability to retrieve a large number of genes and proteins affected by the used compounds. The compounds used in the study (compound IVa, IVb, IVj, and IVk) were all derived from the precursor compound IV, they are effective against Helicobacter pylori, and their mode of action on clusters of genes and proteins has been compared to that of the known H. pylori drug metronidazole (MNZ). Due to this comparison, and confirmed by the diversity of responses induced by these very similar compounds, it can be understood that the approach used is reliable and very informative. Notably, although all compound IV derivatives were designed to target H. pylori flavodoxin (Fld), only one showed a statistically significant shift of Fld solubility (compound IVj, FIG S11). For most other compounds, instead, the involvement of other possible targets affecting diverse metabolic pathways was also observed, notably concerning a series of genes with other important functions: CagA (virulence factor), FtsY/FtsA (cell division), AtpD (ATP-synthase complex), the essential GTPase ObgE, Tig (protein export), as well as other proteins involved in ribosomal synthesis, chemotaxis/motility and DNA replication/repair. Finally, for all tested molecules, in vivo functional data have been collected that parallel the omics predictions, supporting them and showing that compound IV derivatives differently affect cellular generation of reactive oxygen species (ROS), oxygen consumption rates (OCR), DNA damage, and ATP synthesis.

      Strengths:

      The approach used is very potent in retrieving the effects of chemically active molecules (in this case antimicrobial ones) on whole cells, evidencing protein and gene networks that are involved in cell sensitivity to the studied molecules. The choice of these compounds against H. pylori is perfect, showcasing how different the real biological response is, compared to the hypothetical one. In fact, although all molecules were retrieved based on their activity on Fld, the authors unambiguously show that large unexpected gene clusters may, and in fact are, affected by these compounds, and each of them in different manners.

      Impact:

      The present work is the first report relying on PISA-Express performed on living bacterial cells. Because of its findings, this work will certainly have a high impact on the way we design research to develop effective drugs, allowing us to understand the fine effects of a drug on gene clusters, drive molecule design towards specific metabolic pathways, and eventually better plan the combination of multiple active molecules for drug formulation. Beyond this, however, we expect this article to impact other related and unrelated fields of research as well. The same holistic approaches might also allow gaining deep, and sometimes unexpected, insight into the cellular targets involved in drug side effects, drug resistance, toxicity, and cellular adaptation, in fields beyond the medicinal one, such as cellular biology and environmental studies on pollutants.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Please modify these few concerns:

      -  It is unclear from the introduction and discussion whether conventional transcriptomic and proteomic analyses have previously been conducted on the compounds examined in this study. If only targeted studies have been performed please clarify this further.

      To make this clearer, we have added the following paragraph on Page 5:

      “Our investigation into understanding the mode of action of nitro-benzoxadiazole compounds commenced with a comparison of the conventional transcriptional and translational changes induced by these compounds, the vehicle control (DMSO), and the commercially used drug MNZ. RNA sequencing (RNA-seq) and expressional proteomics were employed to identify transcriptional and translational changes, respectively.”

      -  The decision to monitor the oxygen consumption rate (OCR) is based on the hypothesis that the drugs would impact flavodoxin function. Could the authors cite specific studies that suggest a reduction in flavodoxin leads to decreased OCR that can be measured?

      The reviewer is correct that we performed this study based on our hypothesis that a reduction in flavodoxin may lead to decreased OCR. To our knowledge, there are no previous studies indicating this, so we now clearly state (Page 14) that it is our hypothesis.

      “On the other hand, given that these drugs indicated involvement of multiple factors from the electron transport chain including flavodoxin and we observed significant drop in the ATP production rate (Fig. 6(D)) associated to compounds IV and IVj, we have investigated the changes in oxygen consumption rate (OCR) as we hypothesize that a reduction in soluble flavodoxin could lead to decreased OCR”

      -  Increase font size in some figures and supplemental materials for clarity.

      We acknowledge the reviewer's comment and have addressed it to the best possible extent in the figures.

      -  Correct figure references throughout the text (example of mistake p4, Fig S1D, p6 S1C).

      We have corrected the figure references.

      -  Check spelling errors, for example, Figure S1B: "library preparation".

      We have revised the figures and corrected spelling errors.

      -  Ensure H. pylori is in italics.

      Done!

      -  Figure S4: Replace (D) by (E).

      Done! Thank you.

      -  Page 7: Check the sentence: "...RpleE, InfC) and F Furthermore, we..." .

      Corrected!  

      “The 20 common essential targets are mostly associated with cell division (for example, FtsZ), small subunit ribosomal proteins (RspC, RspE, RspL, RplE, InfC). Furthermore, we identified a few unique changes for compound IV (DnaN, involved in DNA tethering and processivity of DNA polymerases, and C694_06445, which could be a functional equivalent of delta subunit of DNA polymerase III).”

      -  Page 9: Please modify the name of one compound "Compounds IV, IVj (and not IVk) and MnZ downregulate...".

      Both reviewers raised this point, so we revisited the data. Fig S8(B) shows that compounds IV, IVk, and MNZ cluster together and downregulate the genes associated with this pathway. Based on this, we have not changed anything in the text.

      -  Figure S9: please clarify symbols (triangles and others) in the Figure legend.

      Done!

      -  Page 9: Is it the Figure S9B you are referring to? Talking about proteomics?

      Sorry, we have not understood the above comment.

      Reviewer #2 (Recommendations For The Authors):

      All figures are printed as one per page. In this format, almost all pictures suffer a severe problem with dimensions. Notably graph axes and axis values, subtitles, and legends within the pictures are too small, although the graphical part is almost always appropriate. Negative example (higher fonts are needed): Figure 1. Positive example (font ok): Figure 2A or Figure 3 right panels.

      We have carefully revised our figures to address the issues you mentioned, ensuring that elements are visible when printed one per page. In Fig 1: We have increased the font sizes of the graph axes, axis values, subtitles, and legends to improve readability. Additionally, we have color-matched different Gene Ontology (GO) terms for better readability. In Fig 2: To enhance clarity, we have resized the figure by removing the top 10 protein list, now presented in a separate table. This ensures that the figure's main content remains prominent. These modifications have been made across figures to maintain consistency and readability.

      For all figures, particularly for non-experts, not only a list of what is found in the picture should be provided, but also a minimal, simplified key of interpretation (of what is to be noticed). Particularly relevant for scatter plots.

      We have modified the legends to provide simplified key interpretation for the scatter plots. 

      In general for most analyses I see the involvement of FtsA, whereas most discussions concern FtsY and FtsZ. Maybe this point should be clarified. For example: i) FtsZ is quoted in the Second "Results" paragraph (page 6), but we can't find this gene in Figure 2, nor in the corresponding table (Figure 2A); ii) FtsY downregulation is quoted in the Fifth "Results" paragraph (page 9), but we can't find this gene in Figure 5, 9S or 10S.

      We are not entirely sure if we have understood the reviewer's comment correctly, as we did not mention FtsY in our discussion section. In the discussion section, we have focused on the involvement of FtsZ and FtsA with some of our compounds. We decided to discuss them together because FtsZ is the primary component that is recruited to the membrane by the actin-related protein FtsA, while the role of FtsY remains highly debated.

      Figure 1: same colour for the same GO: term in different panels should be used.

      Done!

      Figure 4: please specify (being it essential throughout the whole paper) that the group colouring only refers to Figure 4A, lower bar.

      Done!

      Figure 5, S9, and S10: having the combination of analysed sets (brown / IV , magenta / IVb, etc....) as a panel subtitle is almost a necessity, to avoid constant page turning. I did rewrite all of them by hand to be able to follow the main text story.

      Done!

      What are the triangles? (this is not written anywhere).

      We have now explained this in the legend of Fig. 5.

      Figures S9 and S10 are too crowded (please refer to Figure 5 for a good format/size).

      For supplementary figures S9 and S10 we prefer to keep the gene names, but in order to make them more legible we have now added subtitles to each panel.

      Second and third "Results" paragraph. Explicitly saying that the Second is only focused on TOP 10 hits, at the beginning of the paragraph (while the third on essential genes) would help enormously the non-specialist in orienting among the different sections.

      On page 7, we have revised the text to indicate that the paragraph is only focused on the top 10 hits. Additionally, we have included a table of top 10 hits for better clarity and accessibility. 

      Page 6: the following sentence should be in the introduction, to stress the novelty of the work: "This is the first time PISA assay, in the form of PISA-Express, has been successfully performed in living bacterial cells, with protocols adapted and modified from previous PISA studies in mammalian cells".

      Page- 2 

      We agree this is an important point. However, having stated it in both the abstract and the PISA section of the Results, we prefer not to state it once more in the Introduction.

      (no changes made)

      I couldn't find any reference to Figure S3 in the text.

      Included! (P 9)

      "Compounds IV, IVk, and MNZ downregulate the genes associated with this pathway (Fig. 4(B) & S8(B))": it seems to me that it is IVj rather than IVk to downregulate. Please check carefully.

      We note that both reviewers raised this point, so we revisited the data. Fig. S8(B) shows that compounds IV, IVk, and MNZ cluster together and downregulate the genes associated with this pathway. We have therefore left the text unchanged.

      Page 12: of the pre-defined target like flavodoxin => of the pre-defined target flavodoxin.

      Thanks! We have removed “like” from the sentence.

      Metronidazol (=MNZ) only appears on page 13 (MNZ already on page 8).

      Corrected!  The correspondence is now first indicated in P. 3.

      Please resolve the ambiguity metronidazol/metronidazole (main text and figures).

      We now always say “metronidazole”.

      The Sixth "Results" paragraph (pages 10-11) should be developed a bit more. All Figure 6 results are summarized in 8 lines at the end of the paragraph. This doesn't bring much, particularly to a non-specialist reader. Please, for each panel, clearly explain what is to be noticed and what main conclusion(s) can be extracted.

      We have improved the description of the section. The modified part now reads:

      …This indicates that the nitro-bearing groups have a higher propensity to generate ROS. We have also observed that the genes associated with the generation of ROS are significantly overexpressed for compounds IV, IVb, IVj, and MNZ (Fig. S12(A)). As described above and depicted in Fig. S12(B), multiple DNA damage repair proteins and genes are down-regulated in the presence of compounds IV, IVb, IVj, and MNZ. Additionally, DNA PolA was found to be a major target for compound IVj. Following these results, we investigated compound-induced DNA damage using the APO BrdU TUNEL assay. All the compounds, particularly IV and IVj, caused significant DNA damage (Fig. 6(C)).

      On the other hand, given that these drugs indicated involvement of multiple factors from the electron transport chain, including flavodoxin, and that we observed a significant drop in the ATP production rate (Fig. 6(D)) associated with compounds IV and IVj, we investigated the changes in oxygen consumption rate (OCR), as we hypothesize that a reduction in soluble flavodoxin could lead to decreased OCR. Though the signal-to-noise ratio of these data is poor…

      and we added figure S12 for clarity.   

      In the same section I found: "Compound IV and its derivatives cause a marked increase in ROS generation when compared to the control (DMSO)" => refers to THIS work or previous work? (in the latter case, please quote it).

      This data is from our current paper, as shown in Fig 6(B).

      In the same paragraph, "the signal-to-noise ratio of these data is considerable" => does it mean that you have good (high signal-to-noise) data, or that you have too high noise for precise quantification? I rather understood the latter, but this sentence definitely needs to be rewritten.

      Thank you for pointing out the mistake. Your interpretation is correct. We have corrected the sentence.

    1. Author response:

      The following is the authors’ response to the original reviews.

      (1) The conclusions in the text are very broad and general but often based on a limited number of examples. It would be important that the authors hit the appropriate tone when most of the analysis (in Figure 5) is derived from n=3 events.

      We have tried to hit the correct tone here by modifying our manuscript text. In particular, we have added a pie chart to Figure 4 (Figure 4C) that summarises data from all RBMX targets, not just the original n=3, and shows that most RBMX targets are rescued by RBMXL2.

      (2) The fractions of long/ultra-long exons actually bound by/regulated by RBMX are not clearly stated - which is in contrast to the general statement of the title (implying a global role for RBMX in proper splicing of ultra-long exons).

      (i) We have changed our title (now “An anciently diverged family of RNA binding proteins maintain correct splicing of a class of ultra-long exons through cryptic splice site repression”).

      (ii) We also include much clearer text about the fractions of long/ultra-long exons bound by RBMX, as follows:

      “…..This led us to test whether RBMX protein is preferentially associated with long exons. For this we plotted the distribution of internal exons bound and regulated by RBMX together with all internal exons expressed from HEK293 mRNA genes (Liu et al., 2017) (Figure 2 – Source Data 1). We found that RBMX controls and binds two different classes of exons: the first have comparable length to the average HEK293 exon, while the second were extremely long, exceeding 1000 bp in length (Figure 2F). We defined this second class as ‘ultra-long exons’, which represented the 18.9% of internal exons regulated by RBMX and 17.6% of the ones that contained RBMX iCLIP tags. These proportions were significantly enriched compared to the general abundance of internal ultra-long exons expressed from HEK293 cells, which was only 0.4% (Figure 2G)……”

      “…….We next wondered whether ultra-long exons regulated by RBMX (which represented 11.6% of all ultra-long internal exons from genes expressed in HEK293) had any particular feature compared to ultra-long exons that were RBMX-independent……..”

      (3) The authors should state what fraction of ultra-long exons show cryptic splicing in the RBMX siRNA that are corrected by RBMXL2 overexpression (rather than just showing the 3 events). There's some confusion about the global nature of the conclusions relative to the data displayed.

      This is a good point. We have used the RNAseq information as suggested, and included a pie chart (Figure 4C) that includes this information.

      (4) It would be helpful if the authors could identify if there are some motifs more present in ultra-long exons than others.

      Good point, we have included k-mer analysis of the ultra-long exons bound by RBMX, and also more generally ultra-long exons in the human genome, in Figure 2H and 2I. We also add the following text:

      K-mer analyses also showed that while ultra-long exons within mRNAs are rich in AT-rich sequences compared to shorter exons (Figure 2H), the ultra-long exons that are either regulated or bound by RBMX displayed enrichment of AG-rich sequences (Figure 2I), consistent with our identified RBMX-recognised sequences (Figure 2C).
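
      As a rough illustration of the kind of k-mer enrichment comparison described above, the following is a minimal sketch, assuming a simple pooled frequency ratio between a foreground set of exons and a background set. The function names, the pseudocount, and the toy sequences are all hypothetical; the response does not state which tool or statistic the authors actually used.

```python
from collections import Counter


def kmer_frequencies(sequences, k):
    """Return relative k-mer frequencies pooled over a list of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing N or any other non-ACGT symbol.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def kmer_enrichment(foreground, background, k, pseudo=1e-6):
    """Foreground/background frequency ratio for every k-mer seen in either set.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for k-mers that are absent from one of the two sets.
    """
    fg = kmer_frequencies(foreground, k)
    bg = kmer_frequencies(background, k)
    return {
        kmer: (fg.get(kmer, 0.0) + pseudo) / (bg.get(kmer, 0.0) + pseudo)
        for kmer in set(fg) | set(bg)
    }


# Toy sequences invented for illustration only (not data from the study).
bound = ["AGAGAGGA", "GAAGAGAG"]    # stand-in for RBMX-bound exons
unbound = ["ATATTTAA", "TTATATAT"]  # stand-in for other ultra-long exons
ratios = kmer_enrichment(bound, unbound, k=2)
for kmer in sorted(ratios, key=ratios.get, reverse=True)[:3]:
    print(kmer, round(ratios[kmer], 2))
```

      A ratio above 1 marks a k-mer that is over-represented in the foreground set. A real analysis would also need a significance estimate, for example from shuffled or length-matched background sequences.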

      (5) The authors should evaluate if RBMX-repressed 3' splice sites have similar or low splice site scores/strengths than natural 3' splice sites.

      We have added splice site score analyses in Figure 1F and Figure 1 Supplement 1B. These show that the cryptic splice sites repressed by RBMX are not significantly different from those that are normally used. We add the following text to accompany these figure panels:

      “Furthermore, analysis of splice site strength revealed that, unlike splice sites activated by RBMX (Figure 1 – Figure supplement 1B), alternative splice sites repressed by RBMX have comparable strength to more commonly used splice sites (Figure 1F). This means that RBMX operates as a splicing repressor in human somatic cells to prevent use of ‘decoy’ splice sites that could disrupt normal patterns of gene expression.”

      (6) The section "RBMX protein-RNA interactions may insulate important splicing signals from the spliceosome." is a very preliminary look at possible mechanisms. Can you integrate the RNA Seq and CLIP datasets to generate "splicing maps" that would provide more generalized insights? In fact, where possible, it would be great to integrate the iCLIP data from the same cell types to generate RNA splicing maps (with the KD RNA-seq data)

      We have added “RNA map-type” plots to integrate iCLIP data with splicing patterns (Figure 2 Figure supplement 1D and 1E), and made corresponding changes to the text.

      Additional changes

      We also made some extra changes to respond to the further points raised by reviewers.

      (1) We have carried out gene ontology analysis of those genes that contain RBMX-regulated ultra-long exons versus all ultra-long exons (now Figure 3A, and also Figure 3- Figure supplement 1A and 1B).

      (2) We have corrected the cartoon summarising the branch point analysis (now Figure 3 – Figure Supplement 2F).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      In this manuscript, by using simulation, in vitro and in vivo electrophysiology, and behavioral tests, Peng et al. nicely showed a new approach for the treatment of neuropathic pain in mice. They found that terahertz (THz) waves increased Kv conductance and decreased the frequency of action potentials in pyramidal neurons in the ACC region. Behaviorally, terahertz (THz) waves alleviated neuropathic pain in the mouse model. Overall, this is an interesting study. The experimental design is clear, the data is presented well, and the paper is well-written. I have a few suggestions.

      (1) The authors provide strong theoretical and experimental evidence for the impact of voltage-gated potassium channels by terahertz wave frequency. However, the modulation of action potential also relies on non-voltage-dependent ion channels. For example, I noticed that the RMP was affected by THz application (Figure 3F) as well. As the RMP is largely regulated by the leak potassium channels (Tandem-pore potassium channels), I would suggest testing whether terahertz wave photons have also any impact on the Kleak channels as well.

      Thank you for your positive comment and for providing us with this valuable suggestion. After testing the leak K+ current with and without HFTS on the SNI model, we observed a notable increase in the leak K+ current with HFTS when the holding potential surpassed -40 mV (please see the revised Figs. 2m and n). This finding prompted us to delve deeper into the shifts in the resting membrane potential (RMP). The data, along with statistical analysis, are detailed in Tables S1-3.

      (2) The activation curves of the Kv currents in Figure 2h seem to be not well-fitted. I would suggest testing a higher voltage (>100 mV) to collect more data to achieve a better fitting.

      Thanks for your advice. We repeated the experiment while maintaining the voltage of patched neurons at a higher level (>100 mV) to collect ample data for better fitting. The outcomes are illustrated in the revised Figs. 2g-j. Clearly, the data reveals a significant increase in K+ conductance in the HFTS group as compared to the SNI group. We have integrated these discoveries into the revised manuscript, replacing the earlier results.

      (3) In the part of behavior tests, the pain threshold increased after THz application and lasted within 60 mins. I suggest conducting prolonged tests to determine the end of the analgesic effect of terahertz waves.

      Thank you for your insightful comment. We echo your curiosity about the duration of the HFTS effect. In the process of revising our work, we conducted a comparative analysis of the analgesic duration resulting from 10-minute and 15-minute applications of HFTS. The findings are visualized in the revised Fig. 5c. Our observations indicate that after 160 minutes, the PWMT value for the 15-minute HFTS group decreased to a level comparable to that of the SNI group. Meanwhile, the analgesic effects persisted for 140 minutes in the case of the 10-minute HFTS application. These results imply a direct correlation between the duration of HFTS application and the duration of analgesia.

      (4) Regarding in vivo electrophysiological recordings, the post-HFTS recordings were acquired from a time window of up to 20 min. It seems that the HFTS effect lasted for minutes, but this was not tested in vitro where they looked at potassium currents. This long-lasting effect of HFTS is interesting. Can the authors discuss it and its possible mechanisms, or test it in slice electrophysiological experiments?

      Thank you for your comment. Based on the results from in vivo electrophysiological recordings, it was observed that the effect of HFTS can endure for a minimum of 20 minutes, and this duration was even more extended in behavioral assessments. Taking your advice, we employed slice electrophysiological recording for further testing. Following a 15-minute application of HFTS, we evaluated the K+ current at 5 and 20 minutes after incubation. Our observations clearly indicated a substantial and lasting increase in K+ current, with the effect persisting for at least 20 minutes (refer to Fig. 2l). This provides confirmation of the long-lasting influence of HFTS. The relevant data and statistical analysis are documented in Table S1-2.

      (5) How did the authors arrange the fiber for HFTS delivery and the electrode for in vivo multi-channel recordings? Providing a schematic illustration in Figure 4 would be useful.

      Thank you for your comment. To enhance the reader's understanding of the HFTS delivery device during multi-channel recording, we have included a schematic illustration in Fig. 4a in the revised manuscript. The top portion of Fig. 4a depicts a quantum cascade laser (QCL) with a center frequency located at approximately 36 THz. This laser is then connected to the recording electrode via a PIR fiber. The left section illustrates the detailed structure of the recording electrode.

      (6) Some grammatical errors should be corrected.

      Thank you for your thorough review. We have carefully checked and corrected grammar errors we found throughout the entire text to ensure that readers can better comprehend the content of the article.

      Reviewer #2 (Public Review):

      In this manuscript, Peng et al., reported that 36 THz high-frequency terahertz stimulation (HFTS) can suppress the activity of pyramidal neurons by enhancing the conductance of voltage-gated potassium channel. The authors also demonstrated the effectiveness of using 36THz HFTS for treating neuropathic pain.

      Strengths:

      The manuscript is well written and the conclusions are supported by robust results. This study highlighted the potential of using 36 THz HFTS for neuromodulation.

      Weaknesses:

      More characterization of HFTS is needed, so the readers can have a better assessment of the potential usage of HFTS in their own applications.

      Thank you for your suggestion. We have created schematic diagrams illustrating the HFTS delivery (Fig. 4a and Fig. 5a in the revised manuscript). Fig. 4a presents the structure designed for in vivo multi-channel recording. Fig. 5a shows the structure used in the behavior test, in which the recording electrode is replaced by a hollow metal tube, allowing the PIR fiber to pass through the tube and target the ACC region of the mice.

      (1) It would be very helpful to estimate the volume of tissue that can be influenced by HFTS. It is not clear how 15 mins HFTS was chosen for this functional study. Does a longer time have a stronger effect? A better characterization of the relationship between the stimulus duration of HFTS and its beneficial effects would be very useful.

      Thank you for your feedback. The degree of tissue influence is directly related to the size of the spot emerging from the fiber outlet. In our experiment, we used a PIR fiber with a 630 nm inner core diameter to propagate high-frequency THz waves. This core features a refractive index of 2.15 and has an effective numerical aperture (NA) of 0.35 ± 0.05.

      Our decision to apply HFTS for 15 minutes in the behavioral study was primarily based on observations from in vivo multi-channel recordings. Specifically, we noticed a considerable reduction in the average firing rate of PYR cells after 15 minutes of HFTS exposure. To further investigate the correlation between the duration of HFTS stimulation and its effects, we conducted a comparative study using a 10-minute HFTS session. The results, depicted in revised Fig. 5c, reveal that the PWMT value decreased to the level seen in the SNI group after approximately 160 minutes following 15 minutes of HFTS, and after about 140 minutes with 10 minutes of HFTS. This suggests a direct relationship between the length of HFTS application and its beneficial outcomes.

      (2) How long does the behavioral effect last after 15 minutes of HFTS? Figure 5b only presents the behavioral effect for one hour, but the pain level is still effectively reduced at this time point. The behavioral measurement should last until pain sensitization drops back to pre-stim level.

      Thank you for your feedback. A similar question was also raised by Reviewer 1. As depicted in Fig. 5c, the analgesic effects lasted for 140-160 minutes following a 10-15 minute application of HFTS. Based on these findings, we can conclude that in the SNI model, targeting the ACC brain region with HFTS for a duration of 10-15 minutes results in an analgesic effect that lasts for roughly 140-160 minutes. This provides valuable insights into the potential clinical applications and duration of relief that can be achieved through HFTS treatment.

      (3) Although the manuscript only tested in ACC, it will also be useful to demonstrate the neural modulation effect on other brain regions. Would 36THz HFTS also robustly modulate activities in other brain regions? Or are different frequencies needed for different brain regions?

      Thank you for your comment. We hypothesize that light waves at a frequency of approximately 36 THz effectively modulate neuronal activities in various brain regions, primarily due to their impact on K channels. Additionally, we speculate that the application of THz waves at different frequencies may influence other channels, such as Na and Ca channels, potentially facilitating or inhibiting neuronal activities. We believe this is a fascinating and significant area of research to explore in the future.

      Reviewer #3 (Public Review):

      Summary:

      This manuscript by Peng et al. presents intriguing data indicating that high-frequency terahertz stimulation (HFTS) of the anterior cingulate cortex (ACC) can alleviate neuropathic pain behaviors in mice. Specifically, the investigators report that terahertz (THz) frequency stimulation widens the selectivity filter of potassium channels thereby increasing potassium conductance and leading to a reduction in the excitability of cortical neurons. In voltage clamp recordings from layer 5 ACC pyramidal neurons in acute brain slice, Peng et al. show that HFTS enhances K current while showing minimal effects on Na current. Current clamp recording analyses show that the spared nerve injury model of neuropathic pain decreases the current threshold for action potential (AP) generation and increases evoked AP frequency in layer 5 ACC pyramidal neurons, which is consistent with previous studies. Data are presented showing that ex-vivo treatment with HFTS in slice reduces these SNI-induced changes to excitability in layer 5 ACC pyramidal neurons. The authors also confirm that HFTS reduces the excitability of layer 5 ACC pyramidal neurons via in vivo multi-channel recordings from SNI mice. Lastly, the authors show that HFTS is effective at reducing mechanical allodynia in SNI using both the von Frey and Catwalk analyses. Overall, there is considerable enthusiasm for the findings presented in this manuscript given the need for non-pharmacological treatments for pain in the clinical setting.

      Strengths:

      The authors use a multifaceted approach that includes modeling, ex-vivo and in-vivo electrophysiological recordings, and behavioral analyses. Interpretation of the findings is consistent with the data presented. This preclinical work in mice provides new insight into the potential use of directed high-frequency stimulation to the cortex as a primary or adjunctive treatment for chronic pain.

      Weaknesses:

      There are a few concerns noted that if addressed, would significantly increase enthusiasm for the study.

      (1) The left Na current trace for SNI + HFTS in Figure 2B looks to have a significant series resistance error. Time constants (tau) for the rate of activation and inactivation for Na currents would be informative.

      Thank you for your feedback. We have carefully considered your comments and made several adjustments in the revised Figs. 2b-f to improve clarity and accuracy. Firstly, we have conducted a comparison of the time constants (tau) between the SNI group and the SNI+HFTS group. These time constants represent the latency of Na current activation or inactivation relative to the half-activated/inactivated voltage. Our analysis reveals that there is no statistically significant difference in tau between the two groups for both activation and inactivation curves. Secondly, we have updated the sample traces in Fig. 2b of the revised manuscript. These new traces illustrate that tau does not significantly differ between the SNI and SNI+HFTS groups, providing a visual representation of our findings. We believe that these modifications strengthen the presentation of our study's details and results, making the data more accessible and understandable for readers.

      (2) It is unclear why an unpaired t-test was performed for paired data in Figure 2. Also, statistical methods and values for non-significant data should be presented.

      Thank you for your comment. I think you mean the results in Fig. 3. We agree with you that we should use one-way ANOVA to analyze the data since there are more than 2 groups for comparison. We thus re-analyzed the data by using one-way ANOVA in Figs. 3g-k, and have included detailed statistical methods and P values in the revised manuscript.

      (3) It would seem logical to perform HFTS on ACC-Pyr neurons in acute slices from sham mice (i.e. Figure 3 scenario). These experiments would be informative given the data presented in Figure 4.

      Thank you for your valuable advice. During the revision process, we performed HFTS on ACC-PYR neurons in acute slices obtained from sham mice. The findings from this experiment have been integrated into the updated Fig. 3, where the sham group is represented by the green line and histogram (the revised Fig. 3 in the manuscript). It is noteworthy that a significant decrease in spike frequency was observed in the sham mice following HFTS.

      (4) As the data are presented in Figure 4g, it does not seem as if SNI significantly increased the mean firing rate for ACC-Pyr neurons, which is observed in the slice. The data were analyzed using a paired t-test within each group (sham and SNI), but there is no indication that statistical comparisons across groups were performed. If the argument is that HFTS can restore normal activity of ACC-Pyr neurons following SNI, this is a bit concerning if no significant increase in ACC-Pyr activity is observed in in-vivo recordings from SNI mice.

      Thank you for highlighting the inaccuracies in the analysis. After reviewing the data, we re-analyzed them using alternative statistical methods. In the revised version, since the data did not follow a normal distribution, we employed Wilcoxon matched-pairs signed-rank tests within the sham and SNI groups, and Mann-Whitney tests between the sham and SNI groups.

      Upon comparing the statistical outcomes across the groups, we found that the mean firing rate of 130 ACC neurons in SNI mice was significantly higher compared to that of 108 ACC neurons in sham mice (P = 0.0447, Mann-Whitney test). Notably, the mean firing rate of ACC-PYR exhibited a more pronounced increase with a P value of 0.0274 in SNI pre-HFTS versus sham pre-HFTS, while the mean firing rate of ACC-INT did not display a significant change across the groups. These findings align with the observations we made in the slice, reinforcing the validity of our results.
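
      For readers unfamiliar with the between-group test named above, the following is a minimal sketch of a two-sided Mann-Whitney U test, assuming the large-sample normal approximation without tie or continuity correction. The helper names and the firing-rate values are hypothetical and are not taken from the study; the response does not say which software or variant of the test the authors used, and the within-group paired test is not shown here.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for x, two-sided p) using the normal approximation.

    Simplifications of this sketch: no tie correction and no continuity
    correction, so it is only reasonable for fairly large samples.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, p


# Hypothetical mean firing rates (Hz), invented for illustration only.
sham = [1.2, 0.8, 1.5, 1.1, 0.9]
sni = [1.9, 2.4, 1.7, 2.1, 1.6]
u, p = mann_whitney_u(sham, sni)
print(f"U = {u}, p = {p:.4f}")
```

      Because the test works on ranks, it makes no normality assumption, which matches the authors' stated reason for switching away from t-tests.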

      (5) The authors indicate that the effects of HFTS are due to changes in Kv1.2. However, they do not directly test this. A blocking peptide or dendrotoxin could be used in voltage clamp recordings to eliminate Kv1.2 current and then test if this eliminates the effects of HFTS. If K current is completely blocked in VC recordings then the authors can claim that currents they are recording are Kv1.1 or 1.2.

      Thank you for your kind suggestion. In our research, we employed the Kv1.2 structure as a model to determine the response frequency of terahertz waves. Through both in vitro and in vivo experiments, we were able to demonstrate that the frequency of approximately 36 THz affects the Kv channel and its corresponding spike frequency. Upon analyzing the action potential waveform, we observed a notable variance in the resting membrane potential (RMP). This RMP is predominantly controlled by leak potassium channels, specifically the Tandem-pore potassium channels. In accordance with the recommendation of reviewer 1, we have addressed this particular aspect of our experimentation in the revised manuscript.

      We agree that we should use blocking peptides or dendrotoxin to eliminate Kv1.2 current. However, we encountered problems with the purchase and delivery of these reagents. We have therefore added an explanation to the Discussion emphasizing the value of this pharmacological experiment, which we intend to carry out in future work.

      (6) The ACC is implicated in modulating the aversive aspect of pain. It would be interesting to know whether HFTS could induce conditioned place preference in SNI mice via negative reinforcement (i.e. alleviation of spontaneous pain due to the injury). This would strengthen the clinical relevance of using HFTS in treating pain.

      Thank you for this valuable advice. We share your intrigue regarding this experiment, and we fully recognize the importance and potential of further exploring this area. At present, however, our equipment and platform limitations prevent us from conducting the necessary tests. We remain committed to pursuing relevant research opportunities in the future.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1:

      (1) The study suggests that the effects of the tumor models on mouse behavior are largely non-specific to the tumor, as most behaviors are rescued by analgesic treatment. So, most of the changes were likely due to site-specific pain and not a unique signal from the tumor.

      The tumor generates pain at the site it is implanted, and it is likely amplified by the oral activities tumor bearing mice have to engage in. As there is no pain in the absence of the tumor, the pain is, by definition, caused by the tumor, not by the site. Concerning the relationship between pain and behavior, the behavioral assays undertaken in our study (nesting, cookie test, wheel running) were very limited in scope.  Two of these assays (nesting, cookie test) require use of the oral cavity. Only nesting and wheel running were assessed in the context of treatment for pain. Nesting behavior was completely restored with carprofen and buprenorphine treatment suggesting that in the absence of pain, mice were able to make perfect nests. Consistent with this, carprofen and buprenorphine treated animals also gained weight indicating that eating (another activity dependent on the oral cavity) was also restored.  Wheel running, an activity that does not rely on the oral cavity, was only partially restored with drug treatment. While additional behavioral tests are necessary to confirm this finding, the data suggest that there is pain-independent information relayed to the brain which accounts for this decline in wheel running.

      Reviewer #2:

      (1) The main claim is that tumor-infiltrating nerves underlie cancer-induced behavioral alterations, but the experimental interventions are not specific enough to support this. For example, all TRPV1 neurons, including those innervating the skin and internal organs, are ablated to examine sensory innervation of the tumor. Within the context of cancer, behavioral changes may be due to systemic inflammation, which may alter TRPV1 afferents outside the local proximity of tumor cells. A direct test of the claims of this paper would be to selectively inhibit/ablate nerve fibers innervating the tumor or mouth region.

      We agree with the reviewer that a direct test of the hypothesis would require selectively inhibiting the nerve fibers innervating the tumor and assessing the impact on behavior. Studies in the lab are on-going using pharmacological interventions to do this. These studies are beyond the scope of this current manuscript.

      (2) Behavioral results from TRPV1 neuron ablation studies are in part confounded by differing tumor sizes in ablated versus control mice. Are the differences in behavior potentially explained by the ablated animals having significantly smaller tumors? The differences in tumor sizes are not negligible. One way to examine this possibility might be to correlate behavioral outcomes with tumor size.

      As suggested by the reviewer, we have graphed nesting scores and time-to-interact (cookie test) relative to tumor volume. In both cases, we used simple linear regression to fit the data and analyzed the slopes of the lines. In the case of nesting, there was no significant difference between the slopes. This is now included as Supplemental Figure 4A. In the case of the cookie test, there was a significant difference between the slopes. This is now included as Supplemental Figure 4B. Graphing the data in this way allows one to look at any given tumor volume and infer what the nesting score and the time-to-interact are for the two groups of mice. The linear regression model fits the time to interact with the cookie reasonably well; thus, from this graph, we can see that at any given tumor volume the time to interact with the cookie was generally shorter in TRPV1cre::DTAfl/wt animals as compared to C57BL/6 mice. Unfortunately, the linear regression does not fit the nesting data very well, and thus it is more difficult to compare tumor volume and nesting score.

      The following text has been added to the results section.

      Given the impact of nociceptor neuron ablation on tumor growth, we wondered whether differences in tumor volume contributed to the behavioral differences we noted. Thus, the behavior data were graphed as a function of tumor volume (Supplemental Fig 4A, B). A simple linear regression model was used to fit the data. In the case of nesting scores, the linear regression did not fit the data points very well making it difficult to assess nesting scores at a given tumor volume (Supplemental Fig 4A). However, the linear regression model fit the time to interact data better. Here, the graph suggests that tumor volume did not influence behavior as at any given tumor volume the time to interact with the cookie is generally smaller in TRPV1-Cre::Floxed-DTA animals as compared to C57BL/6 animals (Supplemental Fig 4B).
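
      The per-group regression described above can be sketched as follows. This is a minimal illustration, assuming an ordinary least-squares line fitted separately to each group and then evaluated at a shared tumor volume. The function names and all numbers are hypothetical, and the formal test the authors used to compare slopes is not stated in the response and is not reproduced here.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept; also returns R^2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 measures how well the line describes the points (1.0 = perfect fit).
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 1.0
    return slope, intercept, r_squared


def predict(fit, x):
    """Evaluate a fitted line at x."""
    slope, intercept, _ = fit
    return slope * x + intercept


# Invented tumor volumes (mm^3) and times to interact (s), illustration only.
volume_control = [10, 20, 30, 40]
time_control = [30, 50, 70, 90]
volume_ablated = [10, 20, 30, 40]
time_ablated = [20, 25, 30, 35]

fit_control = fit_line(volume_control, time_control)
fit_ablated = fit_line(volume_ablated, time_ablated)

# Compare the two groups at the same tumor volume.
shared_volume = 25
print(predict(fit_control, shared_volume), predict(fit_ablated, shared_volume))
```

      Reading both fitted lines at one shared volume is what lets a behavioral difference be separated from a tumor-size difference. A low R^2, as the authors report for the nesting data, signals that such a reading should not be trusted.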

      Reviewer #3:

      (1) The authors mention in their Discussion the need for additional experiments. Could they also include / comment on the potential impact on the anti-tumor immune system in their model?

      The following text has been added to the discussion:

      Neuro-immune interactions have been studied in the context of a variety of conditions including, but not limited to, infection 109, inflammation 110,111, homeostasis in the gut 112-114, as well as neurological diseases 115,116. Neuro-immune communications in the context of cancer and behavior have also been studied (e.g., sickness behavior, depression) 117-119; however, these studies did not assess these interactions at the tumor bed. Investigations into neuro-immune interactions occurring within primary malignancies which harbor nerves have shed light on these critical communications. In the context of melanoma, which is innervated by sensory nerves, we identified that release of the neuropeptide calcitonin gene-related peptide (CGRP) induces immune suppression. This effect is mediated by CGRP binding to its receptor, RAMP1, which is expressed on CD8+ T cells 49. A study utilizing a different syngeneic model of oral cancer similarly found an immune suppressive role for CGRP 120-122. These studies demonstrate that neuro-immune interactions occur at the tumor bed. Our current findings indicating that tumor-infiltrating nerves connect to a circuit that includes regions within the brain suggest that neuro-immune interactions within the peripheral malignancy may contribute to the behavioral alterations we studied.

      (2) The authors mention the importance of inflammation contributing to pain in cancer but do not clearly highlight how this may play a role in their model. Can this be clarified?

      The following text has been added to the discussion section of the manuscript.

      Moreover, given that carprofen and buprenorphine decrease inflammation 104, their ability to restore normal nesting and cookie test behaviors (which require the use of the oral cavity where the tumor is located) suggests that inflammation at the tumor site contributed to the decline in these behaviors in vehicle-treated animals. Since both drugs were given systemically and each only partially restored wheel running, it suggests that systemic inflammation alone cannot fully account for the decline in wheel running seen in vehicle-treated animals. We posit that the inflammation- and pain-independent component of this behavioral decline is mediated via the transcriptional and functional alterations in the cancer-brain circuit.

      (3) The tumor model apparently requires isoflurane injection prior to tumor growth measurements. This is different from most other transplantable types of tumors used in the literature. Was this treatment also given to control (i.e., non-tumor) mice at the same time points? If not, can the authors comment on the impact of isoflurane (if any) in their model?

      Mice in all groups (tumor and non-tumor) were treated with isoflurane. This important detail has been added to the methods section.

      (4) The authors emphasize in several places that this is a male mouse model. They mention this as a limitation in the Discussion. Was there an original reason why they only tested male mice?

      The following text has been added in the discussion section:

      Head and neck cancer is predominantly a cancer in males; it occurs in males three times more often than in females 123, and this disparity increases in certain parts of the world. While smoking cigarettes and drinking alcohol are risk factors for HPV-negative head and neck squamous cell carcinoma, even males that do not smoke and drink have a higher susceptibility to this cancer than females 124,125. Thus, our studies used only male mice. However, we do recognize that females also get this cancer. In fact, female patients with head and neck cancer, particularly oral cancer, report more pain than their male counterparts 126,127. These findings suggest that differences in tumor innervation exist in males and females.

      Therefore, another project in the lab has been to compare disease characteristics (including innervation and behavior) in male and female mice. The findings from this second study are the topic of a separate manuscript.

      Recommendations For The Authors:

      Reviewing editor:

      (1) Tumors can communicate with the brain via blood-borne agents from the tumor itself or immune cells that are activated by the tumor in addition to neurons that invade the tumor. The anorexia and malaise that accompany some tumors can be mediated by direct innervation and/or the humoral factors because both can activate the same parabrachial pathway. This paper makes the case for the direct innervation being important but ignores the possibility of both being involved. The interesting observation that innervation supports tumor growth (perhaps via substance P) is troublesome because the slower appearance of behavioral consequences (Figures 4 & 5) could be attributed to the smaller tumor size. A nice control for humoral effects would be to implant the tumor cells someplace in the body where innervation does not occur (if possible) and then examine behavioral outcomes.

      In the course of several projects, we have implanted different tumor cell lines in different locations in mice (oral cavity, hind limb, flank, peritoneal cavity). In each location, tumor innervation occurs. This is not a phenomenon found only in mice as we completed an immunohistological survey of human cancers from different sites and found they are all innervated (PMID 34944001). These data are consistent with tumor and locally-released factors that recruit nerves to the tumor bed (PMID: 30327461)(PMID: 32051587)(PMID: 27989802). Thus, an implantation site that does not result in tumor innervation is currently unknown and likely does not exist.

      (2) The authors should address whether there is an inflammatory component in this tumor model.

      MOC2-7 tumors have been characterized as non-inflamed and poorly immunogenic 129-131.

      This information has been added to the methods section.

      (3) The RTX experiment in Figure 5 would be more compelling if the drug was injected directly into the tumor rather than injecting it in the flank, thus ablating all TRPV1-expressing neurons as in the genetic approach.

      While we agree with the reviewer that ablating the TRPV1-expressing neurons at the tumor site directly would be ideal, RTX treatment takes approximately one week for ablation to occur, and a significant amount of inflammation is associated with this. Therefore, we wait a total of 4 weeks for the inflammation to resolve. By this time, tumors have generally reached sacrifice criteria. Thus, this approach would not enable the question to be answered. Moreover, we are not aware of any studies in which RTX has been injected in the oral cavity or face. While RTX is utilized clinically to treat pain, it is typically administered intrathecally, epidurally or intra-ganglionically (PMID: 37894723).

      (4) The authors address affective aspects of pain but do not adequately address the sensory aspects, e.g., sensitivity to touch, heat and/or cold. They attribute the decrease in food disappearance (consumption) and nest building to oral pain, but it could be due to anhedonia and anorexia that can accompany tumor progression.

      Assaying for touch and heat/cold sensitivity in the oral cavity is a critical aspect of studying head and neck cancer that needs to be addressed. However, in rodents these assays are not trivial given that any touch/heat/cold in the area of the tumor (oral cavity) impacts the sensitive whiskers in that region which directly influence these assays. Thus, we have been refining assays (e.g., OPAD, facial von Frey) to address these important questions. The findings from these studies are beyond the scope of this manuscript.

      The reviewer makes a good point about anhedonia and anorexia. The following text has been added to the results section:

      Pain-induced anhedonia is mediated by changes in the reward pathway. Specifically, in the context of pain, dopaminergic neurons in the ventral tegmental area (VTA) become less responsive to pain and release less serotonin. This decreased serotonin results in disinhibition of GABA release; the resulting increased GABA promotes an increased inhibitory drive leading to anhedonia 82 and, when extreme, anorexia. Carprofen and buprenorphine treatments completely reversed nesting behavior and significantly improved eating. Inflammation 83 and opioids 84 directly influence reward processing and though our tracing studies did not indicate that the tumor-brain circuit includes the VTA, this brain region may be indirectly impacted by tumor-induced pain in the oral cavity. Thus, an alternative interpretation of the data is that the effects of carprofen and buprenorphine treatments on nesting and food consumption may be due to inhibition of anhedonia (and anorexia) rather than, or in addition to, relieving oral pain.

      (5) Comment on why only males were used in this study.

      Please see response to public reviews.

      Reviewer #1:

      (1) Please provide a justification for the use of exclusively male mice and expand in the discussion if there is potential for these findings to be directly applicable to female mice as well.

      Please see response to public reviews.

      The following text has been added to the discussion:

      Head and neck cancer is predominantly a cancer in males; it occurs in males three times more often than in females 123, and this disparity increases in certain parts of the world. While smoking cigarettes and drinking alcohol are risk factors for HPV-negative head and neck squamous cell carcinoma, even males that do not smoke and drink have a higher susceptibility to this cancer than females 124,125. Thus, our studies used only male mice. However, we do recognize that females also get this cancer. In fact, female patients with head and neck cancer, particularly oral cancer, report more pain than their male counterparts 126,127. These findings suggest that differences in tumor innervation exist in males and females.

      (2) When discussing the results shown in Figure 2, please include some mention of Fus, since it was the highest expressed transcript.

      The following text has been added to the results section regarding Fus.

      The gene demonstrating the highest increase in expression, Fus, was of particular interest; it increases in expression within DRG neurons following nerve injury and contributes to injury-induced pain 51,52. Of note, we purposefully used whole trigeminal ganglia rather than FACS-sorted tracer-positive dissociated neurons to avoid artificially imposing injury and altering the transcript levels of these cells 53,54. Thus, significantly elevated expression of Fus by ipsilateral TGM neurons from tumor-bearing animals suggests the presence of neuronal injury induced by the malignancy. This is consistent with our previous findings 55 and those of others 56 showing that tumor-infiltrating nerves harbor higher expression of nerve-injury transcripts and neuronal sensitization.

      (3) In line 197 please clarify the mice used. Were all mice tumor-bearing and some had nociceptors ablated, or was there a control (no tumor) group as well?

      Line 197 refers to Figure 4D. In this figure, panels B-D show quantification of cFos and ΔFosB in the spinal nucleus of the TGM (SpVc), the parabrachial nucleus (PBN) and the central nucleus of the amygdala (CeA). These data are from C57BL/6 and TRPV1cre::DTAfl/wt animals, all of which were tumor-bearing. Supplementary Figure 3C also shows quantification of cFos and ΔFosB, but these data are from control, non-tumor-bearing animals. The fact that controls are non-tumor-bearing has been added to the supplemental figure legend, and the text of the results section has been clarified as follows.

      While Fos expression was similar between non-tumor bearing mice of the two genotypes (Supplemental Fig. 3C-E), the absence of nociceptor neurons in tumor-bearing animals decreases cFos and ΔFosB in the PBN, and ΔFosB in the SpVc (Fig. 4B, C).

      (4) Overall it would improve the readability of the figures if the colors for the IHC channels were on the image itself and not exclusively in the figure legend.

      The colors for all the staining have been added to each panel.

      (5) It is not a problem that complete cartography was not done, but please include a justification for why the brain regions that were focused on were chosen.

      In order to ensure that our neural tracing technique captured only nerves present within the tumor bed, we restricted the injection of tracer to only 2 µl. We demonstrated that this small volume did not leak out of the tumor (Figure 1) and thus any tracer labeled neurons we identified were deemed as being connected in a circuit to nerves in the tumor bed. While we acknowledged that this calculated technical approach restricted our ability to tracer label all neurons in the tumor bed (as well as those they share circuitry with), it ensured no tracer leakage and inadvertent labeling of non-tumoral nerves. In non-tumor animals injected with 10 µl of tracer, labeled regions in the brain included the spinal nucleus of the trigeminal, the parabrachial nucleus, the central amygdala, the facial nucleus and the motor nucleus of the trigeminal. The regions that were tracer positive when tumor was injected were limited to the spinal nucleus of the trigeminal, the parabrachial nucleus and the central amygdala. Thus, the regions in the brain that we focused on were the areas that became tracer-positive following injection of tracer into the tumor.

      (6) Were the cells that were injected cultured in media with 10% fetal calf serum? If so was any inflammatory response seen? If not please state in the methods section the media that cells for injection were cultured in.

      The cells injected into animals were cultured in media containing 10% fetal calf serum. When cells are harvested for tumor injections, they are first washed two times with PBS and then trypsinized to detach the cells from the plate. Cells are collected, washed again with PBS and resuspended with DMEM without serum; this is what is injected into animals. We harvest cells in this way in order to eliminate any serum being injected into mice. This information has been added to the Methods section.

      (7) Would any of the differences in drug treatment (Carprofen vs Buprenorphine) be due to the differing routes of administration and metabolism of the drugs?

      Since carprofen and buprenorphine each resulted in similar behavioral impacts (nesting and wheel running), their different routes of administration seem to play a minor or no role in the behaviors assessed.

      (8) Please include in the methods section the specific approach and software that was used for processing calcium imaging data and calculating a relative change in fluorescence.

      The specific approach used for processing calcium imaging data and calculating relative change in fluorescence as well as the software used are all included in the methods section. Please see below:

      Ca2+ imaging. TGM neurons from non-tumor and tumor-bearing animals (n=4-6 mice/condition) were imaged on the same day. Neurons were incubated with the calcium indicator, Fluo-4AM, at 37°C for 20 min. After dye loading, the cells were washed, and Live Cell Imaging Solution (Thermo-Fisher) with 20 mM glucose was added. Calcium imaging was conducted at room temperature. Changes in intracellular Ca2+ were measured using a Nikon scanning confocal microscope with a 10x objective. Fluo-4AM was excited at 488 nm using an argon laser with intensity attenuated to 1%. The fluorescence images were acquired in the confocal frame (1024 × 1024 pixels) scan mode. After 1 min of baseline measure, capsaicin (300 nM final concentration) was added. Ca2+ images were recorded before, during and after capsaicin application. Image acquisition and analysis were achieved using NIS-Elements imaging software. Fluo-4AM responses were standardized and shown as percent change from the initial frame. Data are presented as the relative change in fluorescence (ΔF/F0), where F0 is the basal fluorescence and ΔF = F - F0, with F being the measured intensity recorded during the experiment. Calcium responses were analyzed only for neurons responding to ionomycin (10 µM, positive control) to ensure neuronal health. Treatment with the cell permeable Ca2+ chelator, BAPTA (200 µM), served as a negative control.
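
      As an illustration of the relative-fluorescence definition above, a minimal sketch might look like the following. It is not the NIS-Elements procedure itself: the function name `delta_f_over_f0` is hypothetical, the trace values are placeholders, and taking F0 as the mean of the baseline frames is an assumption (setting the baseline to a single frame reproduces normalization to the initial frame).

```python
def delta_f_over_f0(trace, n_baseline=1):
    """Return (F - F0) / F0 for every frame of a fluorescence trace.

    F0 is taken as the mean of the first n_baseline frames (an assumption;
    n_baseline=1 normalizes to the initial frame).
    """
    if n_baseline < 1 or n_baseline > len(trace):
        raise ValueError("n_baseline must be between 1 and len(trace)")
    f0 = sum(trace[:n_baseline]) / n_baseline
    if f0 <= 0:
        raise ValueError("basal fluorescence F0 must be positive")
    return [(f - f0) / f0 for f in trace]


# Placeholder intensities: two baseline frames, then a response to a stimulus.
trace = [100.0, 100.0, 150.0, 200.0, 120.0]
relative = delta_f_over_f0(trace, n_baseline=2)
percent_change = [100.0 * r for r in relative]
print(percent_change)
```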

      (9) Suggestions for Figure 1:

      - In Figures 1C, D, E, include labels for the days of tumor harvest.

      - Please make the size of the labels the same for 1K and 1L and align them.

      - Microscopy image in Figure 1L for SpVc looks like it may be at a different magnification.

      - If possible, include (either in the figure or the supplement) IHC images staining for Dcx and tau, which would complement the western blot data.

      The requested changes to the figures have been made. Unfortunately, we do not have Dcx and tau IHC staining of the day 4, 10 and 20 tumors.

      (10) Suggestions for Figure 2:

      - Include directly onto the graph in Figure 2a the legend for tumor-bearing (red) and non-tumor bearing (blue).

      - Keep consistent between Figure 2G and 2H/I if the tumor/nontumor will be labeled as T/N or Tumor/Control.

      The requested changes to the figures have been made.

      (11) Suggestions for Figure 3:

      - An example trace of calcium signal would complement Figure 3G, H well.

      Example tracings of calcium signal are already provided in Supplementary Figure 3A and B.

      Reviewer #2:

      (1) While the use of male mice is acknowledged, there is not a rationale for why female mice were not included in the study.

      Please see the response to Reviewer #1 (first question).

      (2) Criteria for euthanasia should be described in the Methods. This is especially needed for interpreting the survival curve in Figure 4H.

      Criteria for euthanasia in our IACUC approved protocol include:

      - maximum tumor volume of 1000 mm³

      - edema

      - extended period of weight loss progressing to emaciation

      - impaired mobility or lesions interfering with eating, drinking or ambulation

      - rapid weight loss (>20% in 1 week)

      - weight loss at or more than 20% of baseline

      In addition to tumor size and weight loss, we use the body condition score to evaluate the state of animals and to determine euthanasia. These details have been added to the Methods section.

      (3) At what stage in cancer progression were the Fos studies conducted for Figure 4A-D?

      The brains used for Fos staining (Fig 4B-D) were harvested at week 5 post-tumor implantation.

      (4) For Fos counts, what are the bregma coordinates for the sections that were quantified?

      SpVc: -7.56 to -8.24 mm

      PBN: -4.96 to -5.52 mm

      CeA: -0.82 to -1.94 mm

      (5) Statistics are needed for the claim in Lines 171-173.

      The statistical analyses of Fos staining from tumor-bearing and non-tumor-bearing brains are included in Figure 3D-F. The statistical analyses of ex vivo Ca2+ imaging of brains from tumor-bearing and non-tumor-bearing animals are included in Figure 3I and J.

      (6) How long was the baseline period for weight and food intake measurements? How long were the animals single-housed before taking the baseline measurements?  

      Baseline weight and food intake were measured for 2 weeks, and animals were singly housed for 2 weeks before the baseline measurements began (a total of 4 weeks).

      Minor:

      (7) The authors might consider rewording the sentence on lines 59-62, given that it is abundantly clear from rodent studies that both the tumor and chemotherapy are associated with adverse behavioral outcomes.

      We have reworded the sentence as follows:  The association of cancer with impaired mental health is directly mediated by the disease, its treatment or both; these findings suggest that the development of a tumor alters brain functions.

      (8) Line 212 needs a space between the two sentences.

      This has been fixed.

      (9) Font size in Figure 2 is not consistent with the other figures.

      This has been fixed.

      (10) "DAPI" is more conventional than "DaPi".

      This has been fixed.

      Editorial Comments and Suggestions:

      (1) The Abstract would be better if it were more concise, e.g. ~175 words.

      The abstract has been shortened as requested and now reads:

      Cancer patients often experience changes in mental health, prompting an exploration into whether nerves infiltrating tumors contribute to these alterations by impacting brain functions. Using a mouse model for head and neck cancer and neuronal tracing we show that tumor-infiltrating nerves connect to distinct brain areas. The activation of this neuronal circuitry altered behaviors (decreased nest-building, increased latency to eat a cookie, and reduced wheel running). Tumor-infiltrating nociceptor neurons exhibited heightened calcium activity and brain regions receiving these neural projections showed elevated cFos and delta FosB as well as increased calcium responses compared to non-tumor-bearing counterparts. The genetic elimination of nociceptor neurons decreased brain Fos expression and mitigated the behavioral alterations induced by the presence of the tumor. While analgesic treatment restored nesting and cookie test behaviors, it did not fully restore voluntary wheel running indicating that pain is not the exclusive driver of such behavioral shifts. Unraveling the interaction between the tumor, infiltrating nerves, and the brain is pivotal to developing targeted interventions to alleviate the mental health burdens associated with cancer.

      (2) Lines 28, 104, 258, 486, 521, and many other places, "utilized" should be "used" because the former refers to an application for which it is not intended, e.g. a hammer was utilized as a doorstop.

      The requested changes have been made.

      (3) Lines 32 and 73, it is not clear whether the basal activity is heightened or whether excitability is increased. "manifest" might be better than "harbor" on line 73.

      We have changed the wording in the abstract to be clearer. Moreover, our finding that TGM neurons from tumor-bearing animals have increased expression of the σ1 receptor and phosphorylated TRPV1 (Fig 2G-I) indicates that these neurons have increased excitability.

      (4) Line 34 and elsewhere, it would be better to refer to Fos because there is no need to distinguish cellular, cFos, from viral, vFos, in this context.

      The requested changes have been made.

      (5) Line 38, It would be better to refer to what was actually measured rather than "oral movements".

      The requested changes have been made. The sentence now reads: “While analgesic treatment restored nesting and cookie test behaviors, it did not fully restore voluntary wheel running.”

      (6) Line 84, CXCR3-null mouse on a C57BL/6 background.

      The requested change has been made.

      (7) Lines 86,129 wild-type, male mice.

      The requested change has been made.

      (8) Lines 114-115, the brackets are not necessary.

      The requested change has been made.

      (9) Lines 118, 384, 409, 527, 589, 971, 974 always leave a space between numbers and units. Use Greek u for micro.

      The requested change has been made.

      (10) Lines 123-124, it is not clear that there is meaningful labeling within the CeA.

      We have replaced this image with a more representative one of the CeA from a tumor-bearing animal with clear tracer labeling.

      (11) Lines 125, 138, and 246 transcription was not measured, only transcript levels were measured.

      The requested changes have been made.

      (12) Line 133, I think >4 fold is meant.

      Thank you for catching that. I have fixed it to >4 fold.

      (13) Line 165, single-time-point assessment (add hyphens).

      The requested change has been made.

      (14) Line 181 and elsewhere including figure, the superscripts refer to alleles of the genes; hence approved gene names should be used in italics (as in Methods), TRPV1-Cre:: Floxed-DTA (without italics) would be acceptable.

      The requested changes have been made.

      (15) Line 182, nociceptor-neuron-ablated mice (add hyphens).

      The requested changes have been made.

      (16) Line 197, It is not clear that the "speed" of food disappearance was measured or that it is due to oral pain vs loss of appetite.

      The reviewer makes a good point. We have changed the sentence to read:

      To evaluate the effects of this disruption on cancer-induced behavioral changes, we assessed the animals’ general well-being through nesting behavior 32 and anhedonia using the cookie test 76,77, as well as body weight and food disappearance as surrogates for oral pain and/or loss of appetite.

      (17) Line 199, The reduced tumor growth after ablation could account for most of the changes in the other parameters that were measured.

      We have graphed the nesting scores and time-to-interact with the cookie as a function of tumor volume. These data are now included as Supplemental Figure 4 and suggest that at the same tumor volume, nesting scores and times-to-interact with the cookie are different between the groups.

      (18) Line 204 TPVP1 spelling. Is the TGN smaller after ablation of half of the neurons?

      The requested change has been made.

      (19) Line 235, "now" is not necessary.

      The requested change has been made.

      (20) Lines 238-239 and elsewhere, a few references as to why the TGN-SpVc-PBN-CeA circuit is relevant would be helpful.

      The following references have been added regarding the relevance of this circuit to behavior:

      Molecular Brain 14: 94 (2021) (PMID 34167570)

      Neuropharmacology 198: 108757 (2021) (PMID 34461068)

      Frontiers in Cellular Neuroscience 16: 997360 (2022)  (PMID 36385947)

      Neuropsychopharmacology  49(3): 508-520 (2024) (PMID 37542159)

      (21) Lines 371, 434 and Figures, gm should be g or grams in scientific usage. Include JAX lab stock numbers for these mouse lines.

      The requested changes have been made.

      (22) Line 432, removing food for one hour is not a fast.

      The sentence has been reworded as follows: One hour prior to testing, mouse food is removed and the animals are acclimated to the brightly lit testing room.

      (23) Line 476, 5-um sections (add hyphen).

      The hyphen has been added.

      (24) Lines 988, and 1023, DAPI are usually shown this way.

      The requested change has been made.

      (25) Figure 1K, add Bregma levels to figures.

      SpVc: -8.12 mm

      PBN: -5.34 mm

      CeA: -1.34 mm

      (26) Figure 3 line 1033, "area under the curve" What curve was examined?

      The curve examined was the change in fluorescence over time. This curve has been added as Supplemental Figure 3C.
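
      For readers unfamiliar with the measure, the area under a fluorescence-change curve can be approximated as sketched below. The response does not state which integration method was used, so the trapezoidal rule here is an assumption, and the function name `area_under_curve` and the sample values are hypothetical.

```python
def area_under_curve(times, values):
    """Approximate the area under a sampled curve with the trapezoidal rule.

    times must be strictly increasing; values are the curve samples
    (for example, relative fluorescence change at each time point).
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Area of the trapezoid between consecutive samples.
        total += 0.5 * (values[i] + values[i - 1]) * dt
    return total


# Placeholder samples: time in seconds, relative change in fluorescence.
times = [0.0, 10.0, 20.0, 30.0, 40.0]
change = [0.0, 0.5, 1.0, 0.4, 0.1]
print(area_under_curve(times, change))
```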

      (27) Figure 3B, the circled area is the lateral PBN. At first glance, I thought scp was meant as the label for the circled area.

      Scp is noted in the figure legend as a landmark.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      For the colony analysis, it is unclear from the methods and main text whether the initial individual sorted colonies were split and subject to different conditions to support the claim of bi-potency. The finding that 40% of colonies displayed tenogenic differentiation, may instead suggest heterogeneity of the sorted progenitor population. The methods as currently described, suggest that two different plates were subject to different induction conditions. It is therefore difficult to assess the strength of the claim of bi-potency.

      Thanks for your valuable comment. We are sorry for the confusing description of the colony assay. We first obtained CD29+/CD56+ cells by FACS. These freshly isolated cells were then randomly seeded into 96-well plates at a density of 1 cell/well. Subsequently, the single cell in each well was cultured in growth medium for ten days to form a colony. Myogenic induction was then performed in three 96-well plates and tenogenic induction in another three 96-well plates for subsequent analyses. Thus, we agree with your point that the sorted progenitor population could be heterogeneous. Almost all the cells highly expressed the myogenic progenitor genes PAX7/MYOD1/MYF5 (Figure 1g), and over 95% of colonies successfully differentiated into myotubes (Figure 2g). Thus, we believe the obtained CD29+/CD56+ cells were myogenic progenitor cells, and that a subgroup of these cells possessed bi-potency.

      This group uses the well-established CD56+/CD29+ sorting strategy to isolate muscle progenitor cells, however recent work has identified transcriptional heterogeneity within these human satellite cells (ie Barruet et al, eLife 2020). Given that they identify a tenocyte population in their human muscle biopsy in Figure 1a, it is critical to understand the heterogeneity contained within the population of human progenitors captured by the authors' FACS strategy and whether tenocytes contained within the muscle biopsy are also CD56+/CD29+.

      Thanks for your constructive suggestion. We will include more samples to perform scRNA-seq and reanalyze the data.

      The bulk RNA sequencing data presented in Figure 3 to contrast the expression of progenitor cells under different differentiation conditions are not sufficiently convincing. In particular, it is unclear whether more than one sample was used for the RNAseq analyses shown in Figure 3. The volcano plots have many genes aligned on distinct curves suggesting that there are few replicates or low expression. There is also a concern that the sorted cells may contain tenocytes as tendon genes SCX, MKX, and THBS4 were among the genes upregulated in the myogenic differentiation conditions (shown in Figure 3b).

      Thanks for your comment. Each group consisted of three samples for the RNAseq analyses. We are sorry that there is a minor analysis mistake in Figure 3b and Figure 3c; these panels will be reanalyzed in the revised version. As for contamination by tenocytes, almost all the obtained cells highly expressed the myogenic progenitor markers PAX7/MYOD1/MYF5 (Figure 1g-h), and only low expression levels of tendon markers were identified in these cells (Figure 2a-c). Furthermore, although tendon genes were slightly upregulated under myogenic differentiation conditions, these markers were dramatically upregulated under tenogenic differentiation conditions (Figure 2c). Thus, we believe the tenogenic differentiation ability of the sorted cells was mainly attributable to CD29+/CD56+ myogenic progenitor cells.

      Reviewer #2 (Public Review):

      The scRNAseq assay using the total mononuclear cell population did not provide meaningful insight that enriched knowledge of the CD56+/CD29+ cell population. CD56+/CD29+ cell information may have been lost due to the minority identity of these cells in the total skeletal muscle mononuclear population, especially given that the total cell number used for scRNAseq was very low and that no information was provided on the participant number and repeat sample number used for this assay. Using these data to claim a stem cell lineage relationship for MuSCs and tenocytes may not be convincing, as seeing both cell types in the total muscle mononuclear population does not establish a lineage connection between them.

      Thanks for your constructive suggestion. We will include more samples to perform scRNA-seq and reanalyze the data.

      The TGF-b pathway assay uses a small molecular inhibitor of TGF-b to probe Smad2/3. The assay conclusion regarding Smad2/3 pathway responsible for tenocyte differentiation may be overinterpretation without Smad2/3 specific inhibitors being applied in the experiments.

      Thanks for your comment. We agree with your comment and will revise this conclusion in the revised version.

      Reviewer #3 (Public Review):

      Comment: This dual differentiation capability was not observed in mouse muscle stem cells.

      Thanks for your comment. We explored the tenogenic differentiation potential of mouse MuSCs both in vivo and in vitro. However, only low tenogenic differentiation ability was revealed (Figure 4), which might be due to species differences. It may be more demanding for humans to maintain the homeostasis of the locomotion system and whole-organism locomotion ability over a much longer life span and with a bigger body size. Thus, the current study also indicates that animal studies may not be clinically relevant when investigating human diseases.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the manuscript entitled "Magnesium modulates phospholipid metabolism to promote bacterial phenotypic resistance to antibiotics", Li et al demonstrated the role of magnesium in promoting phenotypic resistance in V. alginolyticus. Using standard microbiological and metabolomic techniques, the authors have shown the significance of fatty acid biosynthesis pathway behind the resistance mechanism. This study is significant as it sheds light on the role of an exogenous factor in altering membrane composition, polarization, and fluidity which ultimately leads to antimicrobial resistance.

      Strengths:

      (1) The experiments were carried out methodically and logically.

      (2) An adequate number of replicates were used for the experiments.

      Weaknesses:

      (1) The introduction section needs to be more informative and to the point.

      (2) The weakest point of this paper is in the logistics through the results section. The way authors represented the figures and interpreted them in the results section (or the figure legends) does not match. The figures are difficult to interpret and are not at all self-explanatory.

      (3) There is too much mislabeling of the figure panels in the main text, which makes it difficult to find out which figures the authors are explaining. There should be more explanation of why and how they did the experiments and how the results were interpreted.

      (1) We will extensively revise the introduction to make it more informative than the current version.

      (2) We will check the descriptions in the text and the labeling in the figures to make them consistent and logical.

      (3) We will add explanations of the experiments to make clear why we performed the assays.

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors aimed to identify if and how magnesium affects the ability of two particular bacteria species to resist the action of antibiotics. In my view, the authors succeeded in their goals and presented a compelling study that will have important implications for the antibiotic resistance research community. Since metals like magnesium are present in all lab media compositions and are present in the host, the data presented in this study certainly will inspire additional research by the community. These could include research into whether other types of metals also induce multi-drug resistance, whether this phenomenon can be observed in other bacterial species, especially pathogenic species that cause clinical disease, and whether the underlying molecular determinants (i.e. enzymes) of metal-induced phenotypic resistance could be new antimicrobial drug targets themselves.

      Strengths:

      This study's strengths include that the authors used a variety of methodologies, all of which point to a clear effect of exogenous Mg2+ on drug resistance in the targeted species. I also commend the authors for carrying out a comprehensive study, spanning evaluation of whole cell phenotypes, metabolic pathways, genetic manipulation, to enzyme activity level evaluation. The fact that the authors uncovered a molecular mechanism underlying Mg2+-induced phenotypic resistance is particularly important as the key proteins should be studied further.

      Weaknesses:

      I believe there are weaknesses in the manuscript, however. The authors take for granted that the reader is familiar with all the assays utilized, and do not properly explain some experiments, and thus I highly suggest that the authors add a brief statement in each situation describing the rationale for each selected methodology (more details are in the private review to the authors). The Results section is also quite long and bogs down at times, and I suggest that the authors reduce its length by 10 to 20%. In contrast, the Introduction is sparse and lacks key aspects, for example, there should be mention of the study's main purpose and approaches, plus an introduction to the authors' choice of species and their known drug resistance properties, as well as the drug of choice (balofloxacin). Another notable weakness is that the authors evaluated Mg2+-induced phenotypic resistance only against two closely related species, and thus the generalizability of this mechanism of drug resistance is not known. The paper would be strengthened if the authors could demonstrate this type of phenotypic resistance in at least one more Gram-negative species and at least one Gram-positive species (antimicrobial susceptibility evaluations would suffice), each of which should be pathogenic to humans. Demonstrating magnesium-induced phenotypic drug resistance in the WHO Priority Bacterial Pathogens would be particularly important.

      We will add explanations of the experiments to make clear why we performed the assays. We will also revise the introduction and shorten the manuscript. Expanding the range of bacterial species is a very good idea, and we will perform such experiments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      In this work, Odenwald and colleagues show that mutant biotin ligases used to perform proximity-dependent biotin identification (TurboID) can be used to amplify signal in fluorescence microscopy and to label phase-separated compartments that are refractory to many immunofluorescence approaches. Using the parasite Trypanosoma brucei, they show that fluorescent methods such as expansion microscopy and CLEM, which require bright signals for optimal detection, benefit from the elevated signal provided by TurboID fusion proteins when coupled with labeled streptavidin. Moreover, they show that phase-separated compartments, where many antibody epitopes are occluded due to limited diffusion and potential sequestration, are labeled reliably with biotin deposited by a TurboID fusion protein that localizes within the compartment. They show successful labeling of the nucleolus, likely phase-separated portions of the nuclear pore, and stress granules. Lastly, they use a panel of nuclear pore-TurboID fusion proteins to map the regions of the T. brucei nuclear pore that appear to be phase-separated by comparing antibody labeling of the protein, which is susceptible to blocking, to the degree of biotin deposition detected by streptavidin, which is not. 

      Strengths: 

      Overall, this study shows that TurboID labelling and fluorescent streptavidin can be used to boost signal compared to conventional immunofluorescence in a manner similar to tyramide amplification, but without having to use antibodies. TurboID could prove to be a viable general strategy for labeling phase-separated structures in cells, and perhaps as a means of identifying these structures, which could also be useful. 

      Weaknesses: 

      However, I think that this work would benefit from additional controls to address if the improved detection that is being observed is due to the increased affinity and smaller size of streptavidin/biotin compared to IgGs, or if it has to do with the increased amount of binding epitope (biotin) being deposited compared to the number of available antibody epitopes. I also think that using the biotinylation signal produced by the TurboID fusion to track the location of the fusion protein and/or binding partners in cells comes with significant caveats that are not well addressed here, mostly due to the inability to discern which proteins are contributing to the observed biotin signal. 

      To dissect the contributions of the TurboID fusion to elevating signal, anti-biotin antibodies could be used to determine if the abundance of the biotin being deposited by the TurboID is what is increasing detection, or if streptavidin is essential for this.

      We agree with the reviewer that it would be very interesting to distinguish whether the increase in signal comes from the multiple biotinylation sites, from streptavidin being a very good binder, or from both. However, this question is very hard to answer, as antibodies differ massively in their affinity to the antigen, which is further dependent on the respective IF conditions, and are therefore not directly comparable. Even if anti-biotin gives a better signal than anti-HA, this can be caused either by the increase in antigen number (more biotin than HA tag), by the higher binding affinity, or by a combination of both, and these are hard to distinguish. Nevertheless, we have tested monoclonal mouse anti-biotin targeting the (non-phase-separated) NUP158. We found the signal from the biotin antibody to be much weaker than that from anti-HA, indicating that at least this particular biotin antibody is not a very good binder in IF.

      Alternatively, HaloTag or CLIP tagging could be used to see if diffusion of a small molecule tag other than biotin can overcome the labeling issue in phase-separated compartments. There are Halo-biotin substrates available that would allow the conjugation of 1 biotin per fusion protein, which would allow the authors to dissect the relative contributions of the high affinity of streptavidin from the increased amount of biotin that the TurboID introduces. 

      This is a very good idea, as in this case both signals come from streptavidin and are directly comparable. We expressed NUP158 with HaloTag and added PEG-biotin as a Halo ligand. However, PEG-biotin is poorly cell-permeable and is in general only used on lysates. In trypanosomes, cell permeability is particularly restricted, and even Halo ligands that are considered highly cell-penetrant give only a weak signal. Even after overnight incubation, we could not get any signal with PEG-biotin. Our control, the TMR-ligand 647, gave a weak nuclear pore staining, confirming the correct expression and function of the HaloTag-NUP158.

      The idea of using the biotin signal from the TurboID fusion as a means to track the changing localization of the fusion protein or the location of interacting partners is an attractive idea, but the lack of certainty about what proteins are carrying the biotin signal makes it very difficult to make clear statements. For example, in the case of TurboID-PABP2, the appearance of a biotin signal at the cell posterior is proposed to be ALPH1, part of the mRNA decapping complex. However, because we are tracking biotin localization and biotin is being deposited on a variety of proteins, it is not formally possible to say that the posterior signal is ALPH1 or any other part of the decapping complex. For example, the posterior labeling could represent a localization of PABP2 that is not seen without the additional signal intensity provided by the TurboID fusion. There are also many cytoskeletal components present at the cell posterior that could be being biotinylated, not just the decapping complex. Similar arguments can be made for the localization data pertaining to MLP2 and NUP65/75. I would argue that the TurboID labeling allows you to enhance signal on structures, such as the NUPs, and effectively label compartments, but you lack the capacity to know precisely which proteins are being labeled.  

      We fully agree with the reviewer that tracking proteins by streptavidin imaging alone is problematic, because it cannot distinguish which protein is biotinylated. We therefore used words like “likely” in the description of the data. However, we still think it is a valid method, as long as it is confirmed by an orthogonal method. We have added this paragraph to the end of this chapter:

      “Importantly, tracking of proteins by streptavidin imaging requires orthogonal controls, as the imaging alone does not provide information about the nature of the biotinylated proteins. These can be proximity ligation assay, mass spectrometry or specific tagging visualisation of protein suspects by fluorescent tags. Once these orthogonal controls are established for a specific tracking, streptavidin imaging is an easy and cheap and highly versatile method to monitor protein interactions in a specific setting.”

      Reviewer #2 (Public Review): 

      Summary: 

      The authors noticed that there was an enhanced ability to detect nuclear pore proteins in trypanosomes using a streptavidin-biotin-based detection approach in comparison to conventional antibody-based detection, and this seemed particularly acute for phase-separated proteins. They explored this in detail for both standard imaging but also expansion microscopy and CLEM, testing resolution, signal strength, and sensitivity. An additional innovative approach exploits the proximity element of biotin labelling to identify where interacting proteins have been as well as where they are. 

      Strengths: 

      The data is high quality and convincing and will have obvious application, not just in the trypanosome field but also more broadly where proteins are tricky to detect or inaccessible due to phase separation (or some other steric limitations). It will be of wide utility and value in many cell biological studies and is timely due to the focus of interest on phase separation, CLEM, and expansion microscopy. 

      Thank you! We are glad you liked it.

      Reviewer #3 (Public Review): 

      Summary: 

      The authors aimed to investigate the effectiveness of streptavidin imaging as an alternative to traditional antibody labeling for visualizing proteins within cellular contexts. They sought to address challenges associated with antibody accessibility and inconsistent localization by comparing the performance of streptavidin imaging with a TurboID-HA tandem tag across various protein localization scenarios, including phase-separated regions. They aimed to assess the reliability, signal enhancement, and potential advantages of streptavidin imaging over antibody labeling techniques. 

      Overall, the study provides a convincing argument for the utility of streptavidin imaging in cellular protein visualization. By demonstrating the effectiveness of streptavidin imaging as an alternative to antibody labeling, the study offers a promising solution to issues of accessibility and localization variability. Furthermore, while streptavidin imaging shows significant advantages in signal enhancement and preservation of protein interactions, the authors must consider potential limitations and variations in its application. Factors such as the fact that tagging may sometimes impact protein function, background noise, non-specific binding, and the potential for off-target effects may impact the reliability and interpretation of results. Thus, careful validation and optimization of streptavidin imaging protocols are crucial to ensure reproducibility and accuracy across different experimental setups. 

      Strengths: 

      - Streptavidin imaging utilizes multiple biotinylation sites on both the target protein and adjacent proteins, resulting in a substantial signal boost. This enhancement is particularly beneficial for several applications with diluted antigens, such as expansion microscopy or correlative light and electron microscopy. 

      - This biotinylation process enables the identification and characterization of interacting proteins, allowing for a comprehensive understanding of protein-protein interactions within cellular contexts. 

      Weaknesses: 

      - One of the key advantages of antibodies is that they label native, endogenous proteins, i.e. without introducing any genetic modifications or exogenously expressed proteins. This is a major difference from the approach in this manuscript, and it is surprising that this limitation is not really mentioned, let alone expanded upon, anywhere in the manuscript. Tagging proteins often impacts their function (if not their localization), and this is also not discussed.

      - Given that BioID proximity labeling encompasses not only the protein of interest but also its entire interacting partner history, ensuring accurate localization of the protein of interest poses a challenge. 

      - The title of the publication suggests that this imaging technique is widely applicable. However, the authors did not show the ability to track the localization of several distinct proteins on the same sample, which could be an additional factor demonstrating the outperformance of streptavidin imaging compared with antibody labeling. Similarly, the work focuses only on small 2D samples. It would have been interesting to be able to compare this with 3D samples (e.g. cells encapsulated in an extracellular matrix) or to tissues.  

      Recommendations for the authors:

      To enhance the assessment from 'incomplete' to 'solid', the reviewers recommend that the following major issues be addressed: 

      Major issues: 

      (1) Anti-biotin antibodies in combination with TurboID labeling should be used to compare the signal/labelling penetrance to streptavidin results. That would show if elevated biotin deposition matters, or if it is really the smaller size, more fluors, and higher affinity of streptavidin that's making the difference. 

      We agree with the reviewer that it would be very interesting to distinguish whether the increase in signal comes from the multiple biotinylation sites, from streptavidin being a very good binder, or from both, and whether the size matters (IgG versus streptavidin). However, this question is very hard to answer, as antibodies differ massively in their affinity to the antigen. Thus, even if anti-biotin gave a better signal than anti-HA, this could be caused either by the increase in antigen number (more biotin than HA tag), by the better binding affinity, or by a combination, and it would not truly answer the question. We have now tested anti-biotin antibodies, also in response to reviewer 1, and got a much poorer signal in comparison to anti-HA or streptavidin.

      Please note that we made another attempt using nanobodies to target phase-separated proteins, to see whether size matters (Fig. 2I). The nanobody did not stain Mex67 at the nuclear pores, but gave a weak nucleolar signal for NOG1, which may suggest that the nanobody penetrates slightly better than IgG, but it does not rule out that the nanobody simply binds with higher affinity. Reviewer 1 has suggested using the HaloTag with PEG-biotin: this would indeed allow a direct comparison of the streptavidin signal caused by the TurboID with that of a single biotin added by the HaloTag. Unfortunately, the PEG-biotin does not penetrate trypanosome cells. In conclusion, we are not aware of a method that would establish why streptavidin, but not IgGs, can penetrate phase-separated areas. We therefore prefer not to overinterpret our data, but to stick to what is supported by the data: “the inability to label phase-separated areas is not restricted to anti-HA but applies to other antibodies”.

      (3) Figure 4 A-B. The validity of claiming the correct localization demonstrated by streptavidin imaging comes into question, especially when endogenous fluorescence, via the fusion protein, remains undetectable (as indicated by the yellow arrow at apex). 

      In this figure, the streptavidin imaging does NOT show the correct localisation of the bait protein, but it does show proteins from historic interactions that have a localisation distinct from that of the bait. We had therefore introduced this chapter with the paragraph below, to make sure the reader is aware of the limitations (which we also see as an opportunity, if properly controlled):

      “We found that in most cases, streptavidin labelling faithfully reflects the steady state localisation of a bait protein, e.g., the localisation resembles those observed with immunofluorescence or direct fluorescence imaging of GFP-fusion proteins. For certain bait proteins, this is not the case, for example, if the bait protein or its interactors have a dynamic localisation to distinct compartments, or if interactions are highly transient. It is thus essential to control streptavidin-based de novo localisation data by either antibody labelling (if possible) or by direct fluorescence of fusion-proteins for each new bait protein.”

      In particular, on lines 450-460, there's a fundamental issue with the argument put forward here. It is not possible to formally know that the posterior labeling is ALPH1 vs. another part of the decapping complex that was associated with PABP2-Turbo, or if the higher detection capacity of the Turbo-biotin label is uncovering a novel localization of the PABP2. While it is likely that it is ALPH1, it is not possible to rule out other possibilities with this approach. These issues should be discussed here and more generally the possibility of off-target labeling with this approach should be addressed in the discussion. 

      We fully agree with the reviewer that tracking proteins by streptavidin imaging alone is problematic, because it cannot distinguish which protein is biotinylated. We therefore used words like “likely” in the description of the data. However, we still think it is a valid method, as long as it is backed up by an orthogonal method. We have added this paragraph to the end of this chapter:

      “Importantly, tracking of proteins by streptavidin imaging requires orthogonal controls, as the imaging alone does not provide information about the nature of the biotinylated proteins. These can be proximity ligation assay, mass spectrometry or specific tagging visualisation of protein suspects by fluorescent tags. Once these orthogonal controls are established for a specific tracking, streptavidin imaging is an easy and cheap and highly versatile method to monitor protein interactions in a specific setting.”

      (4) More discussion and acknowledgment of the general limitations in using tagged proteins are needed to balance the manuscript, especially if the hope is to draw a comparison with antibody labeling, which works on endogenous proteins (not requiring a tag). For example: (a) tagging proteins requires genetic/molecular work ahead of time to engineer the constructs and/or cells if trying to tag endogenous proteins; (b) tagged proteins should technically be validated in rescue experiments to confirm the tag doesn't disrupt function in the cell/tissue/context of interest; and (c) exogenous tagged proteins compete with endogenous untagged proteins, which can complicate the interpretation of data.  

      We have added this paragraph to the first paragraph of the discussion part:

      “Like many methods that are frequently used in cell- and molecular biology, streptavidin imaging is based on the expression of a genetically engineered fusion protein: it is essential to validate both, function and localisation of the TurboID-HA tagged protein by orthogonal methods. If the fusion protein is non-functional or mis-localised, tagging at the other end may help, but if not, this protein cannot be imaged by streptavidin imaging. Likewise, target organisms not amenable to genetic manipulation, or those with restricted genetic tools,  are not or less suitable for this method.”

      We would also like to point out that for non-mainstream organisms like trypanosomes, antibodies are often not commercially available, and genetic manipulation is frequently more time-efficient and cheaper than the production of antiserum against the target protein.

      Also, the introduction would ideally be more general in scope and introduce the pros and cons of antibody labeling vs biotin/streptavidin, which are mentioned briefly in the discussion. The fact that the biotin-streptavidin interaction is ~100-fold higher affinity than an IgG binding to its epitope is likely playing a key role in the results here. The difference in size between IgG and streptavidin, the likelihood that the tetrameric streptavidin carries more fluors than a IgG secondary, and the fact that biotin can likely diffuse into phase-separated environments should be clearly stated. The current introduction segues from a previous paper that a more general audience may not be familiar with. 

      We have now included this paragraph to the introduction:

      “It remains unclear, why streptavidin was able to stain biotinylated proteins within these antibody inaccessible regions, but possible reasons are: (i) tetrameric streptavidin is smaller and more compact than IgGs (60 kDa versus a tandem of two IgGs, each with 150 kDa) (ii) the interaction between streptavidin and biotin is ~100 fold stronger than a typical interaction between antibody and antigen and (iii) streptavidin contains four fluorophores, in contrast to only one per secondary IgG.”

      Minor issues: 

      The copy numbers of the HA and Ty1 epitope tags vary depending on the construct being used. For example, Ty1 is found as a single copy tag in the TurboID tag, but on the mNeonGreen tag there are 6 copies of the epitope. It makes it hard to know if differences in detection are due to variations in copies of the epitope tags. Line 372-374: can the authors explain why they chose to use nanobodies in this case? It would be great to show the innate mNeonGreen signal in 2K to compare to the Ty1 labeling. The presence of 6 copies of the Ty1 epitope could be essential to the labeling seen here.

      We agree with the reviewer that these data are a bit confusing. We have now removed Figure 3K, as it is the only construct with 6 Ty1 instead of one, and it does not add to the conclusions. (The mNeonGreen signal is entirely in the nucleolus, as shown by TrypTag.) We have also added an explanation of why we used nanobodies (“The absence of a nanobody signal rules out that its simply the size of IgGs that prevents the staining of Mex67 at the nuclear pores, as nanobodies are smaller than (tetrameric) streptavidin”). However, as stated above, we prefer not to overinterpret the data, as signals from different antibody/nanobody–antigen combinations are not comparable. It was important to us to stress that the absence of signal in phase-separated areas is NOT restricted to the anti-HA antibody, which is clearly supported by the data.

      What is the innate streptavidin background labeling look like in cells that are not carrying a TurboID fusion, from the native proteins that are biotinylated? That should be discussed. 

      We have now included the controls without the TurboID fusions for trypanosomes and HeLa cells: “Wild type cells of both Trypanosomes and human showed only a very low streptavidin signal, indicating that the signal from naturally biotinylated proteins is neglectable (Figure S8 in supplementary material).”

      Line 328-331: This is likely to be dependent on whether or not the protein moves to different localizations within the cell. 

      True, we agree, and we have added this paragraph:

      “The one exception are very motile proteins that produce a “biotinylation trail” distinct to the steady state localisation; these exceptions, and how they can be exploited to understand protein interactions, are discussed in chapter 4 below.”

      Line 304-305: Does biotin supplementation not matter at all? 

      No, we never saw any increase in biotinylation when we added extra biotin to trypanosomes. The 0.8 µM biotin concentration in the medium was sufficient.

      Line 326-327: Was the addition of biotin checked for enhancement in the case of the mammalian NUP98? I would argue that there is a significant number of puncta in Figure 1D that are either green or magenta, not both. The amount of extranuclear puncta in the HA channel is also difficult to explain. Biotin supplementation to 500 µM was used in mammalian TurboID experiments in the original Nature Biotech paper- perhaps nanomolar levels are too low. 

      We have now tested HeLa cells with 500 µM biotin and saw an increase in signal, but also in background; due to the increased background, we conclude that low biotin concentrations are more suitable. We have also repeated the experiment using 4HA tags instead of 1HA, and we found a minor improvement in the antibody signal for NUP88 (while the phase-separated NUP54 was still not detectable). We have replaced the images in Figure 1D (NUP88) and in Figure 2F (NUP54) with improved images using 4HA tags. However, we would like to note that single nuclear pore resolution is beyond what can be expected of light microscopy.

      Line 371: In 2I, I see a signal that looks like the nucleus, similar to the Ty1 labeling in 2G, so I don't think it's accurate to say that that Mex67 was "undetectable". Does the serum work for blotting? 

      Thank you, yes, “undetectable” was not the correct phrase here. Mex67 localises to the nuclear pores, to the nucleoplasm and to the nucleolus (GFP-tagging or streptavidin). Antibodies, either to the tag or to the endogenous protein, fail to detect Mex67 at the nuclear pores and also do not show any particular enrichment in the nucleolus. They do, however, detect Mex67 in the (non-phase-separated) area of the nucleoplasm. We have changed the text to make this clearer. The Mex67 antiserum works well on a western blot (see for example: Pozzi, B., Naguleswaran, A., Florini, F., Rezaei, Z. & Roditi, I. The RNA export factor TbMex67 connects transcription and RNA export in Trypanosoma brucei and sets boundaries for RNA polymerase I. Nucleic Acids Res. 51, 5177–5192 (2023)).

      Line 477: "lacked" should be "lagged".

      Thank you, corrected.

      Line 468-481: My previous argument holds here - how do you know that the difference in detection here is just a matter of much higher affinity/quantity of binding partner for the avidin?

      See answer to the second point of (3), above.

      483-491: Same issue - without certainty about what the biotin is on, this argument is difficult to make. 

      See answer to the second point of (3), above.

      Line 530: "bone-fine" should be "bonafide"

      Thank you, corrected.

      Line 602: biotin/streptavidin labeling has been used for expansion microscopy previously (Sun, Nature Biotech 2021; PMID: 33288959). 

      Thank you, we had overlooked this! We have now included this reference and describe the differences from our approach more clearly in the discussion part:

      “Fluorescent streptavidin has been previously used in expansion microscopy to detect biotin residues in target proteins produced by click chemistry (Sun et al., 2021). However, to the best of our knowledge, this is the first report that employs fluorescent streptavidin as a signal enhancer in expansion microscopy and CLEM, by combining it with multiple biotinylation sites added by a biotin ligase. Importantly, for both CLEM and expansion, streptavidin imaging is the only alternative approach to immunofluorescence, as denaturing conditions associated with these methods rule out direct imaging of fluorescent tags.”

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment:

      This study presents a valuable framework and findings that advance our understanding of the brain as a fractal object by observing the stability of its shape property within 11 primate species and by highlighting an application to the effects of aging on the human brain. The evidence provided is solid but the link between brain shape and the underlying anatomy remains unclear. This study will be of interest to neuroscientists interested in brain morphology, whether from an evolutionary, fundamental or pathological point of view, and to physicists and mathematicians interested in modeling the shapes of complex objects.

      We have now clarified the outstanding question of whether our model outputs can be related to actual primate brain anatomy, which we believe arose mainly from comments on the validity of our output, which appeared to contain cortices thicker than nature can produce.

      We address this point in more detail in the point-by-point response below, but want to address this misunderstanding directly here: Our algorithm does not produce thicker cortices with increasing coarse-graining scales; in fact, the cortical thickness never exceeds the actual cortical thickness in our outputs, but rather thins with each coarse-graining scale. In other words, we believe that our outputs are fully in line with neuroanatomy across species.

      Reviewer #2 (Public Review): 

      In this manuscript, the authors analyze the shapes of cerebral cortices from several primate species, including subgroups of young and old humans, to characterize commonalities in patterns of gyrification, cortical thickness, and cortical surface area. The authors state that the observed scaling law shares properties with fractals, where shape properties are similar across several spatial scales. One way the authors assess this is to perform a "cortical melting" operation that they have devised on surface models obtained from several primate species. The authors also explore differences in shape properties between brains of young (~20 year old) and old (~80) humans. A challenge the authors acknowledge struggling with in reviewing the manuscript is merging "complex mathematical concepts and a perplexing biological phenomenon." This reviewer remains a bit skeptical about whether the complexity of the mathematical concepts being drawn from are justified by the advances made in our ability to infer new things about the shape of the cerebral cortex. 

      To allow scientists from all backgrounds to adopt these complex ideas, we have made publicly available our code to “melt” the brains and to run the downstream analysis. We have now also provided a graphical user interface, so that users without substantial coding experience can run the analysis. We also believe that the algorithmic concepts are easy to understand because they resemble the coarse-graining procedures found in long-standing and well-accepted box-counting algorithms.

      Beyond the theoretical insight of the fractal nature of cortices and providing an explicit and crucial link between vastly different brains that are gyrified and those that are not, we believe that the advance gained by our methods for future applications is clearly demonstrated in our proof-of-principle with a four-fold increase in effect size. For reference, an effect size of 8 would translate to an almost perfect separation of groups, i.e. an ideal biomarker with near 100% sensitivity and specificity.
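
      To make the last statement concrete, the following is a minimal sketch under stated assumptions: two normally distributed groups with equal variance, a standardised effect size d (Cohen's d), and a decision threshold at the midpoint between the group means. Under these assumptions sensitivity and specificity both equal the standard normal CDF evaluated at d/2. The function name and the example values of d are illustrative and are not taken from our published code.

```python
import math


def midpoint_accuracy(d: float) -> float:
    """Sensitivity (= specificity) for two unit-variance normal groups
    whose means differ by d, classified with a threshold at the midpoint."""
    # Standard normal CDF evaluated at d / 2, written via the error function.
    return 0.5 * (1.0 + math.erf((d / 2.0) / math.sqrt(2.0)))


# Illustrative effect sizes: a baseline of 2 and a four-fold larger value of 8.
for d in (2.0, 8.0):
    acc = midpoint_accuracy(d)
    print(f"d = {d:.0f}: sensitivity = specificity = {acc:.5f}")
```

      Under these assumptions, d = 2 corresponds to roughly 84% sensitivity and specificity, whereas d = 8 corresponds to more than 99.99%.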

      (1) The series of operations to coarse-grain the cortex illustrated in Figure 1 produces image segmentations that do not resemble real brains.

      As re-iterated in our Methods and Discussion: “Note, of course, that the coarse-grained brain surfaces are an output of our algorithm alone and are not to be directly/naively likened to actual brain surfaces, e.g. in terms of the location or shape of the folds. Our comparisons here between coarse-grained brains and actual brains is purely on the level of morphometrics across the whole cortex.”

      Fig. 1 therefore serves as an explanation to the reader on the algorithmic outputs, but each melted brain is not supposed to be directly/visually compared to actual brains. Similar to algorithms measuring the fractal dimension, or the exposed surface area of a given brain, the intermediate outputs of these algorithms are not supposed to represent any biologically observed brain structures, but rather serve as an abstraction to obtain meaningful morphometrics.

      We additionally added a note to the caption of Fig. 1 to clarify this point:

      “Note that the actual size of the brains for analysis are rescaled (see Methods and Fig. 3); we display all brains scaled at an equal size here for the ease of visualisation of the method.”

      Finally, we also edited the entire paper for terminology to clearly distinguish the terms of (1) the cortex as a 3D object, (2) coarse-grained and voxelised versions thereof, and (3) summary morphological measures derived from the former. When we invite comparisons in our paper between real brains and coarse-grained brains, this is always at the level of summary morphological measures, not at the level of the 3D objects/voxelisations themselves.

      The process to assign voxels in downsampled images to cortex and white matter is biased towards the former, as only 4 corners of a given voxel are needed to intersect the original pial surface, but all 8 corners are needed to be assigned a white matter voxel. The reason for introducing this bias (and to the extent that it is present in the authors' implementation) is not provided.

      This detail was in the Supplementary, and we have now added additional clarification on this specific point to our Supplementary:

      “In detail, we assign all voxels in the grid with at least four corners inside the original pial surface to the pial voxelization. This process allows the exposed surface to remain approximately constant with increasing voxel sizes. A constant exposed surface is desirable, as we only want to gradually ‘melt’ and fuse the gyri, but not grow the bounding/exposed surface as well. We want the extrinsic area to remain approximately constant as we decrease the intrinsic area via coarse-graining; it is like generating iterates of a Koch curve in reverse, from more to less detailed, by increasing the length of smallest line segment.

      We then assign voxels with all eight corners inside the original white matter surface to the white matter voxelization. This is to ensure integrity of the white matter, as otherwise white matter voxels in gyri may become detached from the core white matter, and thus artificially increase white matter surface area. Indeed, the main results of the paper are not very sensitive to this decision using all eight corners, vs. e.g. only four corners, as we do not directly use white matter surface area for the scaling law measurements. However, we still maintained this choice in case future work wants to make use of the white matter voxelisations or derivative measures.”
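
      The corner-counting rule quoted above can be sketched as follows. This is a toy illustration, not our implementation: the pial and white matter surfaces are replaced by two concentric spheres with arbitrary radii, and all function names are hypothetical.

```python
from itertools import product


def corners_inside(origin, size, inside):
    """Count how many of the 8 corners of an axis-aligned cube satisfy inside(point)."""
    x0, y0, z0 = origin
    return sum(
        inside((x0 + dx * size, y0 + dy * size, z0 + dz * size))
        for dx, dy, dz in product((0, 1), repeat=3)
    )


def classify_voxel(origin, size, inside_pial, inside_white):
    """Return (in_pial_voxelisation, in_white_voxelisation) for one voxel."""
    in_pial = corners_inside(origin, size, inside_pial) >= 4    # at least four corners
    in_white = corners_inside(origin, size, inside_white) == 8  # all eight corners
    return in_pial, in_white


def sphere(radius):
    """Toy stand-in for a closed surface: points within `radius` of the origin."""
    return lambda p: p[0] ** 2 + p[1] ** 2 + p[2] ** 2 <= radius ** 2


# Assumed toy geometry: pial radius 10, white matter radius 8 (arbitrary units).
inside_pial, inside_white = sphere(10.0), sphere(8.0)

for origin in [(0.0, 0.0, 0.0), (7.5, 0.0, 0.0), (9.5, 0.0, 0.0), (10.5, 0.0, 0.0)]:
    print(origin, classify_voxel(origin, 1.0, inside_pial, inside_white))
```

      In this toy geometry the voxel at (9.5, 0, 0) has exactly four corners inside the outer sphere and is therefore still included in the pial voxelisation, whereas the voxel at (7.5, 0, 0) straddles the inner sphere and is excluded from the white matter voxelisation.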

      On the point of white matter integrity, note that if both the grey and the white matter voxelisations required all eight corners to be inside the respective mesh, some voxels at the grey/white matter interface would be assigned to neither, which could cause downstream issues.

      We further acknowledge:

      “Of course, our proposed procedure is not the only conceivable way to erase shape details below a given scale; and we are actively working on related algorithms that are also computationally cheaper. Nevertheless, the current version requires no fine-tuning, is computationally feasible and conceptually simple, thus making it a natural choice for introducing the methodology and approach.”

      The authors provide an intuitive explanation of why thickness relates to folding characteristics, but ultimately an issue for this reviewer is, e.g., for the right-most panel in Figure 2b, the cortex consists of several 4.9 mm-sided voxels and thus a >2 cm thick cortex. A structure with these morphological properties is not consistent with the anatomical organization of typical mammalian neocortex.

      We assume the reviewer refers to Fig. 1B and its panel at scale = 4.9 mm. We would like to point out that Fig. 1 serves as an explanation of the voxelisation method. For the actual analysis and Results, we use rescaled brains (see Fig. 2 with the ever-decreasing brain sizes). The description of the rescaling procedure has now been expanded as follows:

      “Morphological properties, such as cortical thicknesses measured in our ‘melted’ brains are to be understood as a thickness relative to the size of the brain. Therefore, to analyse the scaling behaviour of the different coarse-grained realisations of the same brain, we apply an isometric rescaling process that leaves all dimensionless shape properties unaffected (more details in Suppl. S3.1). Conceptually, this process fixes the voxel size, and instead resizes the surfaces relative to the voxel size, which ensures that we can compare the coarse-grained realisations to the original cortices, and test if the former, like the latter, also scale according to Eqn. (1). Resizing, or more precisely, shrinking the cortical surface is mathematically equivalent to increasing the box size in our coarse-graining method. Both achieved an erasure of folding details below a certain threshold. After rescaling, as an example, the cortical thickness also shrinks with increasing levels of coarse-graining, and never exceeds the thickness measured at native scale.”
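
      As a minimal numerical sketch of why isometric rescaling leaves dimensionless shape properties unaffected: under a linear scaling factor s, areas scale as s² and lengths as s, so any ratio with no net length dimension is unchanged. The input values and the two dimensionless descriptors below are hypothetical and chosen only for illustration; they are not the quantities of Eqn. (1).

```python
import math


def rescale(total_area, exposed_area, thickness, s):
    """Isometric rescaling by a linear factor s: areas scale as s**2, lengths as s."""
    return total_area * s ** 2, exposed_area * s ** 2, thickness * s


def dimensionless(total_area, exposed_area, thickness):
    """Two dimensionless descriptors, used here purely for illustration."""
    return total_area / exposed_area, thickness / math.sqrt(exposed_area)


# Hypothetical morphometrics: total area (mm^2), exposed area (mm^2), thickness (mm).
original = (200000.0, 80000.0, 2.5)
shrunk = rescale(*original, s=0.5)

print("absolute values:", original, "->", shrunk)
print("dimensionless:  ", dimensionless(*original), "->", dimensionless(*shrunk))
```

      The absolute thickness halves when the surface is shrunk by a factor of two, while both dimensionless descriptors stay the same up to floating-point rounding.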

      We additionally added a note to the caption of Fig. 1 to clarify this point:

      “Note that the actual size of the brains for analysis are rescaled (see Methods and Fig. 3); we display all brains scaled at an equal size here for the ease of visualisation of the method.”

      Finally, we also edited the entire paper for terminology to clearly distinguish the terms of (1) the cortex as a 3D object, (2) coarse-grained versions thereof, and (3) summary morphological measures derived from the former. When we invite comparisons in our paper between real brains and coarse-grained brains, this is always at the level of summary morphological measures, not at the level of the 3D objects themselves and their detailed anatomical features.

      (2) For the comparison between 20-year-old and 80-year-old brains, a well-documented difference is that the older age group possesses more cerebrospinal fluid due to tissue atrophy, and the distances between the walls of gyri become greater. This difference is borne out in the left column of Figure 4b. It seems this additional spacing between gyri in 80-year-olds requires more extensive down-sampling (larger scale values in Figure 4a) to achieve a similar shape parameter K as for the 20-year-olds. The authors assert that K provides a more sensitive measure (associated with a large effect size) than currently used ones for distinguishing brains of young vs. old people. A more explicit, or elaborate, interpretation of the numbers produced in this manuscript, in terms of brain shape, might make this analysis more appealing to researchers in the aging field.

      We already removed the main results relating to K and aging in our last revision to avoid confusion. These results now appear only in the supplementary analysis, and our claim that K is a more sensitive measure for age and ageing, whilst still true, will be presented in more detail in a series of upcoming papers.

      (3) In the Discussion, it is stated that self-similarity, operating on all length scales, should be used as a test for existing and future models of gyrification mechanisms. Given the lack of association between the abstract mathematical parameters described in this study and explicit properties of brain tissue and its constituents, it is difficult to envision how the coarse-graining operation can be used to guide development of "models of cortical gyrification."

      We have clarified in more detail what we meant originally in Discussion:

      “Finally, this dual universality is also a more stringent test for existing and future models of cortical gyrification mechanisms at relevant scales, and one that moreover is applicable to individual cortices. For example, any models that explicitly simulate a cortical surface as an output could be directly coarse-grained with our method and the morphological trajectories can be compared with those of actual human and primate cortices. The simulated cortices would only be ‘valid’ in terms of the dual universality, if it also produces the same morphological trajectories.”

      However, we agree with the reviewer that our paper could be misread as demanding direct comparisons of each coarse-grained brain with an actual brain, and we have now added the following text to clarify that this is not our intention for the proposed method or outputs.

      “Note, we do not suggest to directly compare coarse-grained brain surfaces with actual biological brain surfaces. As we noted earlier, the coarse-grained brain surfaces are an output of our algorithm alone and not to be directly/naively likened to actual brain surfaces, e.g. in terms of the location or shape of the folds. Our comparisons here between coarse-grained brains and actual brains is purely on the level of morphometrics across the whole cortex.”

      Indeed, the dual universality imposes restrictive constraints on the possible shapes of real cortices, but does not fully specify them. Presumably, the location of individual folds in different individuals and species will depend on their respective evolutionary histories, so there is no reason to expect a match in fold location between the ‘melted’ cortices of more gyrified species, on one hand, and the cortex of a less-gyrified one, on the other, even if their global morphological parameters and global mechanism of folding coincide.

      (4) There are several who advocate for analyzing cortical mid-thickness surfaces, as the pial surface over-represents gyral tips compared to the bottoms of sulci in the surface area. The authors indicate that analyses of mid-thickness representations will be taken on in future work, but this seems to be a relevant control for accepting the conclusions of this manuscript.

      In the context of some applications and methods, we agree that the mid-surface is a meaningful surface to analyse. In our work, however, it is not. The fractal estimation rests on the assumption that the exposed area hugs the object of interest (hence the convex hull of the pial surface), as the relationship between the extrinsic and intrinsic areas across scales determines the fractal relationship (Eq. 2). If we used the mid-surface instead of the pial surface for all estimates, it would not represent the actual object of interest, and it would be separated from the convex hull. Estimating a new convex hull based on the mid-surface would be the equivalent of asking for the fractal dimension of the mid-surface, not of the cortical ribbon. In other words, it would be a different question, bound to yield a different answer.
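
      As a one-dimensional analogue of reading a fractal dimension from how a measured (intrinsic) size changes with the scale of observation, the sketch below uses the Koch curve. It illustrates the general principle only and does not implement Eq. 2 or our cortical analysis; the function names are hypothetical. For the n-th Koch iterate with unit base, the smallest segment has length 3^(-n) and the total length is (4/3)^n, so log(length) is linear in log(segment size) with slope 1 - D.

```python
import math


def koch_length(n):
    """Segment size and total length of the n-th Koch curve iterate (unit base)."""
    return 3.0 ** -n, (4.0 / 3.0) ** n


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Going from high n to low n mimics coarse-graining:
# detail below the current segment size is erased at each step.
scales, lengths = zip(*(koch_length(n) for n in range(6, -1, -1)))
log_eps = [math.log(e) for e in scales]
log_len = [math.log(length) for length in lengths]

fractal_dimension = 1.0 - slope(log_eps, log_len)
exact = math.log(4) / math.log(3)
print(f"estimated D = {fractal_dimension:.4f}, exact log4/log3 = {exact:.4f}")
```

      The estimate depends on which curve is measured: applying the same procedure to a different object, such as a mid-surface rather than the pial surface, answers a different question.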

      Hence, we indicated in our original response that we only have a provisional answer, but more work beyond the scope of this paper is required to answer this question, as it is a separate question. The mid-surface, as a morphological structure in its own right, will have its own scaling properties, and our provisional understanding is that these also yield a scaling law parallel to those of the cortical ribbon with the same or a similar fractal dimension. But more systematic work is required to investigate this question at native scale and across scales.

      Reviewer #3 (Public Review):

      Summary: Through a rigorous methodology, the authors demonstrated that within 11 different primates, the shape of the brain followed a universal scaling law with fractal properties. They enhanced the universality of this result by showing the concordance of their results with a previous study investigating 70 mammalian brains, and the discordance of their results with other folded objects that are not brains. They incidentally illustrated potential applications of this fractal property of the brain by observing a scale-dependant effect of aging on the human brain. 

      Strengths: 

      - New hierarchical way of expressing cortical shapes at different scales derived from previous report through implementation of a coarse-graining procedure 

      - Investigation of 11 primate brains and contextualisation with other mammals based on prior literature 

      - Proposition of tool to analyse cortical morphology requiring no fine tuning and computationally achievable 

      - Positioning of results in comparison to previous works reinforcing the validity of the observation. 

      - Illustration of scale-dependance of effects of brain aging in the human. 

      Weaknesses: 

      - The notion of cortical shape, while being central to the article, is not really defined, leaving some interpretation to the reader 

      - The organization of the manuscript is unconventional, leading to mixed contents in different sections (sections mixing introduction and method, methods and results, results and discussion...). As a result, the reader discovers the content of the article along the way, it is not obvious at what stages the methods are introduced, and the results are sometimes presented and argued in the same section, hindering objectivity. 

      To improve the document, I would suggest a modification and restructuring of the article such that: 1) by the end of the introduction the reader understands clearly what question is addressed and the value it holds for the community, 2) by the end of the methods the reader understands clearly all the tools that will be used to answer that question (not just the new method), 3) by the end of the results the reader holds the objective results obtained by applying these tools on the available data (without subjective interpretations and justifications), and 4) by the end of the discussion the reader understands the interpretation and contextualisation of the study, and clearly grasps the potential of the method depicted for the better understanding of brain folding mechanisms and properties. 

      We thank this reviewer again for their attention to detail and constructive comments. We have followed the detailed suggestions provided to us in the Recommendations For The Authors, and summarise the main changes here:

      - We have restructured all sections to be more clearly following Introduction, Methods, Results, and Discussion; by using subsections, we believe the structure is now more accessible to readers.

      -  We have now clarified the concept of “cortical shape”, as we use it in our paper in several places, by distinguishing clearly the object of study, and the morphological properties measured from it.

      Recommendations for the authors: 

      Reviewer #2 (Recommendations For The Authors): None 

      Reviewer #3 (Recommendations For The Authors): 

      I once again compliment the authors for their elegant work. I am happy with the way they covered my first feedback. My second review takes into account some comments made by other reviewers with which I agree. 

      We thank this reviewer again for their attention to detail and constructive comments.

      Recommendations for clarifications: 

      General comments: The purpose of the article could be made clearer in the introduction. When I differentiate results from discussion, I think of results as objective measures or observations, while discussion will relate to the interpretation of these results (including comparison with previous literature, in most cases). 

      We have restructured all sections to be more clearly following Introduction, Methods, Results, and Discussion; by using subsections, we believe the structure is now more accessible to readers.

      - l.39: define or discuss "cortical shape" 

      We have gone through the entire paper and corrected for any ambiguities. We specifically distinguish between the cortex as a structure overall, shape measures derived from this structure, and coarse-grained versions of the structure.

      - l.48-74: this would match either an introduction or a discussion rather than a methods section. 

      Done

      - l.98-106: this would match a discussion rather than a methods section. 

      Done

      - l.111: here could be a good spot to discuss the 4 vs 8 corners for inclusion of pial vs white matter voxelization 

      We have discussed this in the more detailed Supplementary section now, as after restructuring, this appears to be the more suitable place.

      - l.140-180: it feels that this section mixes methods, results and discussion of the results 

      We agree and we have resolved this by removing sentences and re-arranging sections.

      - l.183-217: mix of results and discussion 

      We agree and we have resolved this by removing sentences and re-arranging sections.

      Small cosmetic suggestions: 

      - l.44: conservation of 'some' quantities: vague 

      Changed to conservation of morphological relationships across evolution

      - l.66: order of citations ([24, 22,23]) 

      Will be fixed at proof stage depending on format of references.

      - l.77: delete space between citation and period 

      Done

      - l.77: I would delete 'say' 

      Done

      - l.86: 'but to also analyse' -> 'to analyse' 

      Done

      - l.105: remove 'we are encouraged that' 

      Done

      - l.111: 'also see' -> 'see also' 

      Done

      - l.164: 'remarkable': subjective 

      Done

      - l.189: define approx. abbreviation 

      Done

      - l.190: 'approx' -> 'approx.' 

      Revised

      - l.195: 'dramatic': subjective 

      removed

      - l.246: 'much' -> vague 

      explained

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Plasmacytoid dendritic cells (pDCs) represent a specialized subset of dendritic cells (DCs) known for their role in producing type I interferons (IFN-I) in response to viral infections. It was believed that pDCs originated from common DC progenitors (CDP). However, recent studies by Rodrigues et al. (Nature Immunology, 2018) and Dress et al. (Nature Immunology, 2019) have challenged this perspective, proposing that pDCs predominantly develop from lymphoid progenitors expressing IL-7R and Ly6D. A minor subset of pDCs arising from CDP has also been identified as functionally distinct, exhibiting reduced IFN-I production but a strong capability to activate T-cell responses. On the other hand, clonal lineage tracing experiments, as recently reported by Feng et al. (Immunity, 2022), have demonstrated a shared origin between pDCs and conventional DCs (cDCs), suggesting a contribution of common DC precursors to the pDC lineage.

      In this context, Araujo et al. investigated the heterogeneity of pDCs in terms of both development and function. Their findings revealed that approximately 20% of pDCs originate from lymphoid progenitors common to B cells. Using Mb1-Cre x Bcl11a floxed mice, the authors demonstrated that the development of this subset of pDCs, referred to as "B-pDCs," relied on the transcription factor BCL11a. Functionally, B-pDCs exhibited a diminished capacity to produce IFN-I in response to TLR9 agonists but secreted more IL-12 compared to conventional pDCs. Moreover, B-pDCs, either spontaneously or upon activation, exhibited increased expression of activation markers (CD80/CD86/MHC-II) and a heightened ability to activate T-cell responses in vitro compared to conventional pDCs. Finally, Araujo et al. characterized these B-pDCs at the transcriptomic level using bulk and single-cell RNA sequencing, revealing them as a unique subset of pDCs expressing certain B cell markers such as Mb1, as well as specific markers (Axl) associated with cells recently described as transitional DCs.

      Thus, in contrast to previous findings, this study posits that a small proportion of pDCs derive from B cell-committed lymphoid progenitors, and this subset of B-pDCs exhibits distinct functional characteristics, being less specialized in IFN-I production but rather in T cell activation.

      Strengths:

      Previously, the same research group delineated the significance of BCL11a as a critical transcription factor in pDC development (Ippolito et al., PNAS, 2014). This study elucidates the precise stage during hematopoiesis at which BCL11a expression becomes essential for the emergence of a distinct subset of pDCs, substantiated by robust genetic evidence in vivo. Furthermore, it underscores the shared developmental origin between pDCs and B cells, reinforcing prior research in the field that suggests a lymphoid origin of pDCs. Finally, this work attributes specific functional properties to pDCs originating from these lymphoid progenitors shared with B cells, emphasizing the early imprinting of functional heterogeneity during their development.

      Weaknesses:

      The authors delineate a subset of pDCs dependent on the BCL11a transcription factor, originating from lymphoid progenitors, and compare it to conventional pDCs, which they suggest differentiate from common DC progenitors of myeloid origin. However, this interpretation lacks support from the authors' data. Their single-cell RNA sequencing data identifies cells corresponding to progenitors (Prog2), from which the majority of pDCs, termed conventional pDCs, likely originate. This progenitor cell population expresses Il7r, Siglech, and Ly6D, but not Csfr1. The authors describe this progenitor as resembling a "pro-pDC myeloid precursor," yet these cells align more closely with lymphoid (Il7r+) progenitors described by Rodrigues et al. (Nature Immunology, 2018) and Dress et al. (Nature Immunology, 2019). Furthermore, analysis of their Mb1 reporter mice reveals that only a fraction of common lymphoid progenitors (CLP) express YFP, giving rise to a fraction of YFP+ pDCs. However, this does not exclude the possibility that YFP- CLP could also give rise to pDCs. The authors could address this caveat by attempting to differentiate pDCs from both YFP+ and YFP- CLPs in vitro in the presence of FLT3L. Additionally, transfer experiments using these lymphoid progenitors could be conducted in vivo to assess their differentiation potential in competitive settings.

      Dear Reviewer 1, we appreciate your thoughtful comments. We made the decision to address the Prog2 cluster as “pro-pDC myeloid precursor” because despite its lack of CSFR-1, its CIPR similarity score showed highest transcriptional similarity to the population “SC.CDP.BM” (GEO accession number: GSM791114), which is shown to be Sca1- Flt3+ cKitlo.

      A similar population identified as “common dendritic cell progenitor” is shown by Onai and colleagues (Onai et al. 2013, Immunity) to be capable of differentiating into pDCs by upregulating E2-2 and subsequently downregulating M-CSFR. In addition, we were unable to infer a developmental trajectory between Prog2 and B-pDCs using SimplePPT on Monocle3 (Figure 5B). Since we know our B-pDCs are CLP derived and most likely share a B cell progenitor population, we feel this lack of connectivity to the UMAP myeloid partition corroborates our assignment of Prog2 as a myeloid pDC progenitor (not CLP derived). Of note, recent work by Medina and colleagues has shown that while IL-7Rα knockout mice exhibit a block in B cell development at the all-lymphoid progenitor (ALP) stage, PDCA-1+ pDCs identified within the initially gated BLP population persisted (PLoS One, 2013), suggesting the IL7R chain is not required for the development of PDCA1+ cells. 

      Using their Mb1-reporter mice, the authors demonstrate that YFP pDCs originating from lymphoid progenitors are functionally distinct from conventional pDCs, mostly in vitro, but their in vivo relevance remains unknown. It is crucial to investigate how Bcl11a conditional deficiency in Mb1-expressing cells affects the anti-viral immune response, for example, using the M-CoV infection model as described by Sulczewski et al. in Nature Immunology, 2023. Particularly, the authors suggest that their B-pDCs act as antigen-presenting cells involved in T-cell activation compared to conventional pDCs. However, these findings contrast with those of Rodrigues et al., who have shown that pDCs of myeloid origin are more effective than pDCs of lymphoid origin in activating T-cell responses. The authors should discuss these discrepancies in greater detail. It is also notable that B-PDCs acquire the expression of ID2 (Figure S3A), commonly a marker of conventional/myeloid DCs. The authors could analyze in more detail the acquisition of specific myeloid features (CD11c, CX3CR1) by this B-PDCs subset and discuss how the expression of ID2 may impair classical pDC features, as ID2 is a repressor of E2-2, a master regulator of pDC fate.

      Both reviewers expressed the need to further investigate how Bcl11a conditional deficiency in Mb1-expressing cells affects anti-viral responses of B-pDCs. While the functional characterization of B-pDCs in the context of infection could be highly informative, it is outside the scope of the present study. However, our discovery that B-pDCs expand robustly upon TLR-9 agonist challenges in vivo and can efficiently prime T cells in vitro suggests that these cells might play an important role during viral infections or in anti-cancer immunity.

      Finally, through the analysis of their single-cell RNA sequencing data, the authors show that the subset of B-pDCs they identified expresses Axl, confirmed at the protein level. Given this specific expression profile, the authors suggest that B-pDCs are related to a previously described subset of transitional DCs, which were reported to share a common developmental path with pDCs, (Sulczewski et al. in Nature Immunology, 2023). While intriguing, this observation requires further phenotypic and functional characterization to substantiate this claim.

      We agree with the reviewer’s comments. We are currently preparing a separate manuscript addressing the commonalities between human transitional DCs and murine non-conventional pDCs.

      Reviewer #2 (Public Review):

      Summary:

      The origin of plasmacytoid dendritic cells and their subclasses continues to be a debated field, akin to any immune cell field that is determined through the expression of surface markers (relative to clear subclass separation based on functional biology and experimentation). In this context, in this manuscript by Araujo et al, the authors attempt to demonstrate that a subtype of pDCs comes from lymphoid origin due to the presence of some B cell gene expression markers. They name these cells B-pDCs. Strikingly, pDCs function via expression of IFNa whereas B-pDCs do not express IFNa, thereby raising the question of what their physiological or pathophysiological properties are. B-pDCs also express AXL, a marker not seen in mouse pDCs but observed in human pDCs. Overall, using a combination of gene expression profiling of immune cells isolated from mice via RNA-seq and single-cell profiling, the authors propose that B-pDCs are a novel subtype of pDCs in mice that were not previously identified and characterized.

      Weaknesses:

      My two points of discussion about this manuscript are as follows.

      (1) How new are these observations that pDCs could also originate from common lymphoid progenitors? This fact has been previously outlined by many laboratories including Shigematsu et al, Immunity 2004. These studies in the manuscript can be considered new based on the single-cell profiling presented, only if the further characterization of the isolated B-pDCs is performed at the functional biology level. Overlapping gene expression profiles are often seen in developing immune cell types, especially when only evaluated at the RNA expression level, and can lead to cell type complexity (and identification of new cell types) that are not biologically and functionally relevant.

      Dear reviewer 2, we appreciate your thoughtful comments. We believe our single-cell sequencing analysis adds new information to the studies mentioned because of our broader approach to BM profiling. By using only one marker (PDCA1+), scRNA-seq allowed us not only to dissect several subpopulations of pDCs that, to our knowledge, were not previously dissected in mice, but also to link the transcriptional similarity of B-pDCs to myeloid-derived pDCs (and even other myeloid cell types), as well as to B cells.

      (2) The authors hardly perform any experiments to interrogate the function of these B-pDCs. The discussion on this topic can be enhanced. Ideally, some biological experiments would confirm that B-pDCs are important.

      Dear reviewer 2, we appreciate your thoughtful comment and agree about the need for further functional characterization of B-pDCs (please see comments directed to reviewer 1 above).

      (1) Considering that Bcl11a conditional deficiency severely impacts the B cell lineage, there is a possibility that such an effect on B cells may indirectly influence pDC development. To address this, the authors could repeat their bone marrow transfer experiments in a competitive setting by mixing both Bcl11a WT and CKO BM cells (using congenic markers to track the origin of the BM cells) and then specifically assess whether BM cells originating from Bcl11a CKO donors have impaired pDC output.

      Dear reviewer 2, while the comment above is valid (that the reduced number of mature B cells in our Bcl11a conditional knockout might indirectly impact B-pDC development), we and many others have previously shown that lack of transcriptional regulation of E2-2 and other pDC differentiation modulators by Bcl11a (including ID2 and MTG16) intrinsically and selectively disrupts the pDC lineage. At the current stage, we feel that rederiving Bcl11a cKOs and performing bone marrow transfers (which usually take several months) only to investigate indirect effects of B cells on pDC development is outside the scope of this publication.

      (2) As mentioned earlier, it is important to assess the potential of CLP, whether YFP- or YFP+, in their ability to give rise to pDCs both in vitro and in vivo. This is also crucial since the authors previously demonstrated that Bcl11a deficiency in all hematopoietic cells had a more drastic impact on pDC development than mb1-cre specific deficiency.

      We agree the manuscript could be strengthened by differentiation experiments. However, in our previous publication (mentioned above by the reviewer), we specifically showed that although fewer overall LSK progenitors were detected in Vav-Cre+ F/F mice, both MDP and CDP progenitor populations persisted within the Flt3+ compartment in cKO mice at percentages similar to controls. MDP (Lin– Flt3+ Sca-1− CD115+ c-kit hi); CDP (Lin– Flt3+ Sca-1− CD115+ c-kit lo). These data confirm that CLPs give rise to a substantial pool of pDC subpopulations. Other works have shown this as well, both in vivo and in vitro (Wang et al., Immunity 2004; Karsunky et al., JEM 2003, etc.). We therefore feel that confirming the previous observations that CLPs can give rise to pDCs is unnecessary, as our main goal in this manuscript was to describe a new pDC subpopulation that emerges primarily from CD79a+ B cell biased progenitors.

      (3) The authors show a more severe impact of Bcl11a CKO on pDC depletion in the spleen than in the BM. Is this effect specific to the spleen, or can it also be observed in lymph nodes? What is the overall impact of Bcl11a conditional deficiency on pDC distribution in tissues such as the liver and lung? These questions are important to address to understand whether the heterogeneity of pDCs is differentially affected by their localization.

      We agree heterogeneity of pDCs can be affected by their microenvironment. Although phenotyping of lymph nodes in Bcl11a cKOs would greatly add to our manuscript, the genetically altered strains required are no longer being bred in our facility and resurrecting them from frozen sperm is outside the realm of this publication.

      (4) Regarding the functional study of pDCs, as emphasized previously, it is important to assess the in vivo relevance of B-pDCs in infectious settings.

      Dear reviewer 2, we appreciate your thoughtful comment. Please see our response directed to reviewer 1 above.

      (5) The authors injected CpG-ODN into mice and analyzed pDC phenotype upon activation. It is important to note that upon activation, especially upon induction of IFN-I production in vivo, mPDCA1 expression is no longer specific to pDCs (Blasius et al, Journal of Immunology, 2006). Therefore, to specifically characterize pDC phenotype upon activation, a differential gating strategy is required (CD11c, B220, Ly6C, and Siglec H) to ensure that bona fide pDCs are analyzed.

      We agree with the reviewer that this would be a more appropriate characterization. Regarding PDCA1 promiscuity in activated states, we are not aware of any cell types that express very high levels of B220 and PDCA1 simultaneously other than pDCs. We therefore firmly believe that our assignment is valid. Interestingly, gating B220+ cells of CpG-challenged mice that show intermediate expression of PDCA1 results in an increase in the frequency of CD19+ B cells, which we were careful to avoid by gating only the cells that most strongly express PDCA1.

      (6) How does pDC activation regulate their mb1 expression? Could conventional pDCs, upon activation, become B-PDCs? Could activation and induction of IFN-I production in vivo also affect CLP and increase the amount of YFP+ lymphoid progenitors and thus B-pDC output?

      Dear reviewer, we agree that this is a valid concern, although it is beyond the scope of the present study. While changes in YFP MFI measured by flow cytometry upon vaccination were not substantial, we have included the following comment in the manuscript discussion, acknowledging the aforementioned possibility: “Of note, whether induction of IFN-I production in vivo could also affect CLP and increase the amount of YFP+ lymphoid progenitors and thus B-pDC output is unclear. Further research is required to answer this question.”

      (7) If pDCs are preferentially expanding upon in vivo stimulation, it would be informative to assess their Ki67 profile. This is a surprising observation since pDCs are generally considered quiescent cells that were previously described to die in response to activation and IFN-I (Swiecki et al, Journal of Experimental Medicine, 2011).

      We agree and have entered the following statement to address this concern: “Functionally, they expand more readily after TLR9 engagement than classical pDCs (either through increased proliferation or differentiation of other cell types) and excel at activating T cells in culture.”

      (8) How does the conditional deficiency of BCL11a affect the production of IFN-I and IL-12 in vivo (serum) upon CpG-ODN stimulation?

      Dear reviewer 2, we are currently unable to rederive the conditional knockout mouse strain in a timely fashion. However, our ELISA experiments performed under controlled in vitro activation conditions, along with the in vivo findings of Zhang et al. (PNAS 2017), warrant the hypothesis that B-pDCs most likely exhibit a similar cytokine-secreting profile under inflammatory conditions.

      (9) Given that B-PDCs show downregulation of pDC canonical markers, including IRF8 and TLR7, could the authors address how B-PDCs respond to TLR7 stimulation in vitro and assess a broader spectrum of cytokines produced by pDCs in response to such stimulation (IL-6, TNFa, CXCL10...)?

      Dear reviewer 2, although expanding our findings to include B-pDC responses to TLR7 stimulation would greatly enhance our manuscript, a technical obstacle stands in our way. As mentioned previously, sorting B-pDCs for new experiments using reporter YFP mice is currently not possible, as we have retired this mouse strain. Sorting of live CD79a+ B-pDCs via FACS is also not feasible, as CD79a staining with most antibody clones requires permeabilization of cells for easier access to the intra-membrane portion of CD79a.

      (10) It would be informative to compare scRNA sequencing data between control and Bcl11a CKO mice to ascertain their contribution to B-PDCs and whether this deficiency may affect other pDC clusters and/or progenitors.

      We are unable to sort B-pDCs for new experiments, as we unfortunately retired the transgenic colony.

      (11) Transitional DCs were reported to give rise to a subset of cDC2. Given that the authors claim that B-PDCs are related to this subset of transitional DCs, could the authors observe any YFP staining in cDC2 upon the generation of their BM chimeras?

      We saw no YFP positivity in CD11c hi cells (cDCs) via flow or through scRNA-seq, indicating CD79a expression is unique in mature B cells and B-pDCs.

      (12) Most of the statistical analysis is done with a student test. This requires a normal distribution of the sample which is highly unlikely given the size of the sample. Therefore, the authors shall rather use a non-parametric test (Mann Whitney) to compare their samples.

      We agree and have redone our statistical analyses using a non-parametric test (Mann-Whitney).
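
      For readers unfamiliar with the test named above, the following is a minimal sketch of how the Mann-Whitney U statistic is computed from ranks. It is an illustration only, not the analysis performed for the manuscript: the function names and the example values are hypothetical, and no p-value is derived here (a statistics package would normally be used for that).

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples."""
    pooled = list(group_a) + list(group_b)
    ranks = average_ranks(pooled)
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical cell frequencies (percent) in two groups of mice.
control = [1.0, 2.0, 3.0]
knockout = [4.0, 5.0, 6.0]
u_control, u_knockout = mann_whitney_u(control, knockout)
```

      Because the statistic depends only on the ordering of the pooled observations, it does not require the normality assumption that the reviewer questioned for small samples.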

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      (1)  In the subsets of the γδ T cells that exhibit reduced BLK expression in B6.SAP KO mice, have the authors examined the expression of Lck and/or Fyn?

      The reviewer raises an excellent point. We have included in the revised manuscript additional data on Lck and Fyn expression in our scRNAseq dataset (new Suppl. Fig. 1 and new Suppl. Fig. 4). These data revealed that in contrast to Blk, which appears primarily restricted to the γδT17 clusters, Lck and Fyn exhibit a much broader distribution and lack restriction to specific clusters. We did note that, like Blk, Lck and Fyn transcripts were abundant in SAP-dependent C2 cluster cells. Pseudobulk analysis on the immature clusters revealed that neither Fyn nor Lck expression level differences reached our cut-off of 0.5 log2 FC (log2 FC Blk = 1.06), leading us to conclude that Blk is particularly dependent on SAP. We did note, however, that the magnitude of Lck differential expression was close to the 0.5 log2 FC cut-off and that its expression was increased in B6.SAP-/- γδ T cells (Suppl. Fig. 4). These results have been added to lines 202-212 in the Results section and lines 491-499 in the Discussion section.
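
      The cut-off logic described above can be sketched as follows. This is a hedged illustration rather than the authors' pipeline: the function names, the pseudocount, the choice of an inclusive threshold on the absolute value, and the Lck value used in the example are all assumptions. Only the 0.5 cut-off and the Blk value of 1.06 come from the response.

```python
import math


def log2_fold_change(mean_a, mean_b, pseudocount=1.0):
    """log2 ratio of two mean expression values.

    The pseudocount (an assumption here) avoids division by zero
    when a gene is not detected in one group.
    """
    return math.log2((mean_a + pseudocount) / (mean_b + pseudocount))


def passes_cutoff(log2_fc, cutoff=0.5):
    """True when the magnitude of the change reaches the cut-off.

    Using the absolute value treats increases and decreases alike;
    whether the cut-off is inclusive is an assumption.
    """
    return abs(log2_fc) >= cutoff


blk_log2_fc = 1.06        # value reported in the response
lck_log2_fc = -0.45       # hypothetical value "close to" the cut-off
blk_called = passes_cutoff(blk_log2_fc)
lck_called = passes_cutoff(lck_log2_fc)
```

      A log2 FC of 1.06 corresponds to roughly a twofold difference (2^1.06 ≈ 2.08), whereas the 0.5 cut-off corresponds to about a 1.41-fold difference.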

      (2)  Does BLK directly associate with SLAMF1 and/or SLAMF6 receptors?

      The reviewer raises an interesting question given previous reports that BLK, LCK, and FYN have all been implicated in γδ T cell development. While SAP has a well-known ability to recruit FYN to SLAMF1 and there is evidence of a similar SAP-mediated recruitment of LCK to SLAMF6, we are not aware of any evidence of a SAP-BLK interaction or of a direct binding of BLK to SLAM family receptors. Future experiments to investigate this possibility are certainly warranted. In the revised ms, we have included additional discussion of these possibilities (lines 491-499).

      (3)  Given the emerging role of γδ T cells in host immunity, it would be useful if the authors could add a discussion of how their findings are relevant in disease conditions such as cancer. 

      We agree and have included new text in the Introduction (lines 37-45). 

      (4)  Delete repeated words in lines 546 and line 553. 

      Thank you—this has been corrected in the revised manuscript.

      Reviewer #2:

      This is a very complete study and requires no additional experimentation. One thing to keep in mind in assessing the ultimate fate of the "ab wannabe cells" is that mechanisms exist to silence the gd TCR as cells differentiate to the DP stage and so their presence as diverted DP cells may not be evident by staining for gdTCR expression - and will only be evident transcriptomically. 

      We appreciate this helpful comment from the reviewer which we will take into consideration in our future experimental design.

      There are a couple of minor points to raise: 

      (1)  Figure 3C is not called out in the text. 

      Thank you—this has been corrected in the revised manuscript.

      (2)  Line 546 - "dependent" is repeated.

      Thank you—this has been corrected in the revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Response to reviewers (minor points):

      We thank all reviewers for their very helpful suggestions and greatly appreciate their positive evaluation of our work.

      Reviewer #1:

      Ad 1) The reviewer states: Fig 5 While the data very nicely show that CPX and Syt1 have interdependent interactions in the chromaffin neurons, this seems to be not the case in neurons, where the loss of complexins and synaptotagmins have additive effects, suggesting independent mechanisms (eg Xue et al., 2010). This would be a good opportunity to discuss some possible differences between secretion in endocrine cells vs neurons.

      We greatly appreciate the insightful suggestion by the reviewer. To accommodate the reviewer’s suggestion, we now discuss this issue on page 21, line 486-491: “In murine hippocampal neurons, loss of CpxI and Syt1 has additive effects on fast synchronous release, suggesting independent mechanisms (Xue et al., 2010). On the other hand, the same study also showed that Syt1 heterozygosity fails to reduce release probability in wild-type neurons, but does so in the absence of Cpx, again suggesting that Cpx and Syt1 may functionally interact in Ca2+-triggered release.”

      Ad 2) The reviewer states: Fig 8 Shows an apparent shift in Ca sensitivity in N-terminal mutants suggesting a modification of Ca sensitivity of Syt1. Could there be also an alternative mechanism, that explains this phenotype which is based on a role of the n-term lowering the energy barrier for fusion, that in turn shifts corresponding fusion rates to take place at lower Ca saturation levels?

      We fully agree with the reviewer. While our data indicate that Cpx and Syt1 act in a dependent manner in accelerating exocytosis, they do not provide decisive evidence that the NTD of CpxII directly modulates the Ca2+ affinity of Syt1, an issue that we discuss on page 23, lines 523-529: “The results favor a model wherein the CpxII NTD either directly regulates the biophysical properties of the Ca2+-sensor by increasing the apparent forward rate of Ca2+-binding or indirectly affects SytI-SNARE or SytI-membrane interactions, thereby lowering the energy barrier of Ca2+-triggered fusion.”

      Reviewer #2:

      Ad 1) The reviewer states: The authors provide a "chromaffin cell-centric" view of the function of mammalian Cplx in vesicle fusion. With the exception of mammalian renal ribbon synapses (and some earlier RNAi knockdown studies that had off-target effects), there is very little evidence for a "fusion-clamp"-like function of Cplxs in mammalian synapses. At conventional mammalian synapses, genetic loss of Cplx (i.e. KO) consistently decreases AP-evoked release, and generally either also decreases spontaneous release rates or does not affect spontaneous release, which is inconsistent with a "fusion-clamp" theory. This is in stark contrast to invertebrate (D. m. and C. e.) synapses where genetic Cplx loss is generally associated with strong upregulation of spontaneous release, providing support for Cplx acting as a "fusion-clamp".

      We agree with the reviewer that it is difficult to reconcile contradictory findings regarding the role of Cpx in membrane fusion in vertebrates and invertebrates or between murine hippocampal neurons and neuroendocrine cells. On the other hand, we respectfully disagree with the statement of providing a "chromaffin cell-centric" view of the function of mammalian Cplx in vesicle fusion. In fact, a large number of model systems (in vitro and in vivo studies) support a scenario where complexin takes center stage in clamping of premature vesicle release. For example, in vitro analyses using a liposome fusion assay (Schaub et al., 2006, Nat Struct Mol Biol 13, 748; Schupp et al., 2016) or HeLa cells that ectopically express “flipped” SNAREs on their cell surface (Giraudo et al., 2008, JBC 283, 21211) showed that complexin can inhibit the SNARE-driven fusion machinery. Likewise, several studies boosting complexin action by either genetic overexpression or peptide supplementation have provided evidence for the complexin clamp function in neuronal and nonneuronal cells (e.g. Itakura et al., 1999, BBRC 265, 691; Liu et al., 2007, Biochemistry 72, 439; Abderrahmani et al., 2004, J Cell Sci 117, 2239; Archer et al., 2002, JBC 277, 18249; Tang et al., 2006, Cell 126, 1175; Vaithianathan et al., 2013, J Neurosci 33, 8216; Roggero et al., 2007, JBC 282, 26335).

      In addition, chromaffin cells enable the investigation of secretion on the background of a well-defined intracellular calcium concentration. Indeed, CplxII knock-out in chromaffin cells demonstrated an enhanced tonic release which is evident at elevated levels of [Ca]i (>100nM), but absent at low resting [Ca]i (Dhara et al., 2014). Given this observation, it is tempting to speculate that variations in [Ca]i among the different preparations may contribute to the deviating expression of the complexin null phenotype in different preparations.

      Ad 2) The reviewer states: The authors use a Semliki Forest virus-based approach to express mutant proteins in chromaffin cells. This strategy leads to a strong protein overexpression (~7-8 fold, Figure 3 Suppl. 1). Therefore, experimental findings under these conditions may not necessarily be identical to findings with normal protein expression levels.

      As shown in Fig. 4, we use the secretion response of wt cells as a control so that we can assess the specificity and quality of the rescue approach in our experiments. In addition, the comparative analysis of the CpxII mutants was performed with respect to the equally overexpressed CpxII wt protein (Fig. 3 Suppl. 1), which we used as a control to determine the standard response under these conditions.

      Ad 3) The reviewer states: Measurements of delta Cm in response to Ca2+ uncaging by ramping [Ca2+] from resting levels up to several µM over a time period of several seconds were used to establish changes in the release rate vs [Ca2+]i relationship. It is not clear to this reviewer if and how concurrently occurring vesicle endocytosis together with a possibly Ca2+-dependent kinetics of endocytosis may affect these measurements.

      By infusing bovine chromaffin cells with 50 µM free Ca2+, Smith and Betz have shown that the total capacitance increase is dominated by exocytosis and that significant endocytosis only sets in after 3 minutes (Smith and Betz, 1996, Nature, 380, 531). Along the same lines, we previously showed that mouse chromaffin cells (infused with 19 µM free calcium over 2 minutes) responded with a robust increase in membrane capacitance which strongly correlated with the number of simultaneously recorded amperometric events monitoring fusion of single vesicles (Dhara et al., 2014, Fig. 5B). Thus, capacitance alterations recorded under tonic intracellular Ca2+ increase in chromaffin cells are solely due to exocytosis and are not contaminated by significant endocytosis. As our Ca2+ ramp experiments were carried out for 6 seconds and the intracellular free [Ca]i did not exceed 19 µM, the observed phenotypical differences between the experimental groups are most likely due to changes in exocytosis rather than endocytosis.

      Ad 4) The reviewer states: It should be pointed out that an altered "apparent Ca2+ affinity" or "apparent Ca2+ binding rate" does not necessarily reflect changes at Ca2+-binding sites (e.g. Syt1).

      We fully agree with the reviewer’s comment. As pointed out also in the response to reviewer 1, our experiments do not provide decisive evidence that the NTD of CpxII directly modulates the Ca2+ affinity of Syt1, an issue that we discuss on page 23, lines 523-529: “The results favor a model wherein the CpxII NTD either directly regulates the biophysical properties of the Ca2+-sensor by increasing the apparent forward rate of Ca2+-binding or indirectly affects SytI-SNARE or SytI-membrane interactions, thereby lowering the energy barrier of Ca2+-triggered fusion.”

      Ad 5) The reviewer states: There are alternative models on how Cplx may "clamp" vesicle fusion (see Bera et al. 2022, eLife) or how Cplx may achieve its regulation of transmitter release without mechanistically "clamping" fusion (Neher 2010, Neuron). Since the data presented here cannot rule out such alternative models (in this reviewer's opinion), the authors may want to mention and briefly discuss such alternative models.

      The study by Bera et al. reiterates the model proposed by the Rothman group, which attributes the clamping function of Cpx to its accessory alpha helix hindering the progressive SNARE complex assembly. We have explicitly stated this issue in the original version of the manuscript (page 19, line 425): “As the accessory helix of Cpx has been found to bind to membrane proximal cytoplasmic regions of SNAP-25 and SybII (Malsam et al., 2012; Bykhovskaia et al., 2013; Vasin et al., 2016), an attractive scenario could be that both domains of CpxII, the CTD and the accessory helix, synergistically cooperate to stall final SNARE assembly”. In this context, we will now also cite the study by Bera et al.

      A related view of the function of complexin suggested that it may act as an allosteric adaptor for sytI (Neher 2010, Neuron). Here, rather than postulating independent "clamp" and "trigger" functions for the dual action of complexin, these were explained as facets of a simple allosteric mechanism by which complexin modulates the Ca2+ dependence of release. Yet, this interpretation appears to be difficult to reconcile with the observations of our and other laboratories, showing that the fusion-promoting and clamping effects are separable (e.g. Dhara et al., 2014; Lai et al., 2014; Makke et al., 2018; Bera et al., 2022).

      Some parts of the Discussion are quite general and not specifically related to the results of the present study. The authors may want to consider shortening those parts.

      Considering the contrary findings in the field of SNARE-regulating proteins, the authors hope that the reviewer will agree that it is necessary to discuss the new observations in a broader context, as also acknowledged by the first reviewer.

      Last but not least, the presentation of the results could be improved to make the data more accessible to non-specialists, this concerns providing necessary background information, choice of colors, and labeling of diagrams.

      Done

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors): 

      Regarding figures: 

      (1) Please use clearly distinct colors in diagrams. For example, in Figure 2 Suppl. 3, four different shades of red (or reddish) are used to color the traces and the respective bars. These different shades of red are difficult to discriminate. In Figure 5 Suppl. 1, the two greens are nearly indistinguishable.  

      Done

      (2) RRP size and SRP size on the one hand, and SR rate on the other represent different quantities which are measured in different units. Please use a separate y-axis for the SR (a rate measured in fF/s) and do not combine with RRP and SRP (pool sizes measured in fF). This would also automatically alleviate the need for axis breaks in the plots of RRP size and SRP size. In general, please do not use axis breaks which make interpretation of data unnecessarily more complicated.  

      In order to clarify the display, we now define the different units together with the quantified parameter (e.g. RRP [fF], SRP [fF], SR [fF/s]) allowing us to omit a second axis in those subpanels.

      (3) When plotting bar graphs showing mean tau_RRP, mean tau_SRP, and mean delay, please always use the correct y-axis labels, i.e. use "tau_RRP", "tau_SRP" and "delay" as y-axis labels as it was done for example in Figure 4D, and do not use "tau_RRP", "tau_SRP" and "delay" as x-axis labels as it was done for example in Figure 1D and many other figure panels.  

      We have standardized the figure display. Yet, we would prefer to keep our way of labelling subpanels, which states the parameter underneath the bar graph and thereby makes the results more accessible.

      (4) Are the asterisks indicating statistical significance perhaps missing in Figure 4D, middle panel (tau_SRP)?

      There was no statistically significant difference (wt vs cpxIIko+CpxII EA, P=0.0826, Kruskal-Wallis with Dunn’s post hoc test).

      (5) According to the Results section (pages 12 to 13), I assume that in Figures 6 and 7 the labels "+Cplx XYZ" are used by the authors to identify an overexpression of Cplx XYZ in a Cplx WT background. The legend text reads however " ... cells expressing either Cplx2 wt or the mutant ...", which would not be correct. Please check.

      We have changed the formulations to “overexpression” accordingly.

      (6) The x-axis unit in Figure 8C is likely "µM" and not "M".

      Done.

      (7) The abbreviations "CplxII LL-EE" and "CplxII LL-WW", and "CplxII LLEE" and "CplxII LLWW" are very similar but refer to different mutants. Could you please think of a more specific and unambiguous abbreviation? Perhaps "CplxII L124E-L128E"?  

      We have changed the abbreviations accordingly (i.e. CpxII L124E-L128E).

      Regarding the manuscript text:  

      Line 65: "prevents" instead of "impairs"? 

      done

      Line 67: why "in vivo"? 

      We changed the formulation to ‘Several’

      Line 83: "in addition to the clamping function ..." This is misleading. Many of the studies listed here did not provide evidence for enhanced spontaneous release following Cplx loss and often observed the opposite, reduced spontaneous release. The enhanced delayed release was observed by Strenzke et al 2009 J.Neurosci. and by Chang et al. 2015 J.Neurosci. (which the authors may want to cite). However, that enhanced delayed release occurred despite reduced spontaneous release indicating that it is not simply the result of a missing "fusion clamp". 

      To accommodate the reviewer’s suggestion, we have changed the formulation to “Independent of the clamping function of Cpx….”

      Line 104: "speeds up exocytosis that is controlled by the forward rate of Ca2+ binding" This is difficult to understand without context.  

      We have now added the corresponding citations (Voets et al., 2001; Sorensen et al., 2003), which showed that exocytosis timing in chromaffin cells is largely determined by the kinetics of Ca2+-binding to SytI.

      Line 116: "Cplx2 knock out ..." Please provide (here or earlier in the manuscript) information to the reader about which Cplx paralogs are expressed in chromaffin cells.  

      We now state on line 111 that “CpxII is the only Cpx isoform expressed in chromaffin cells (Cai et al., 2008)”

      Line 118: "=~" either "=" or "~". 

      done

      Line 120: "instead" seems superfluous.

      done

      Line 272: "calcium binding rates" should perhaps better read "apparent calcium binding rates". 

      done

      Line 290: "enhancing SytI's Ca2+ affinity" should perhaps better be "enhancing the apparent Ca2+ affinity of the release machinery". Ca2+ binding kinetics is never directly assayed here.

      We agree and have phrased the sentence accordingly.

      Line 300: "Expression of Cplx ... in Syt1 R233Q ki cells, ..." Perhaps better "Overexpression of Cplx ... in Syt1 R233Q ki/Cplx2 wt cells, ..." for clarification?

      done

      Lines 313ff: What is assayed here is the apparent Ca2+ binding kinetics and apparent KD values of the release machinery. Ca2+ binding to Syt1 is never directly measured!  

      We agree and have changed the wording accordingly to “CpxII NTD supports the forward rate of calcium binding to SytI in accelerating exocytosis”

      Line 347: "Complexin plays a dual role ..." This is partially misleading. It does so in chromaffin cells and D.m. and C.e. NMJs but not at conventional mammalian synapses. 

      We agree and have changed the formulation to “In many secretory systems, Complexin plays a dual role in the regulation of SNARE-mediated vesicle fusion”

    1. Author response:

      We thank the reviewers for their constructive comments that will help us clarify and strengthen the paper. We will be happy to address all the comments and adjust the text accordingly. Regarding the suggestion in the assessment to include a “more thorough comparison with human behavior”, we believe this reflects one reviewer’s comment asking us to compare with order effects (primacy and recency); we did not see any other comments that would reflect this (our existing simulations do make contact with other human behavior regarding error distributions, including probability of recall, precision, sensitivity to reinforcement history, and dopamine manipulation effects on human WM). We thank the reviewers for this comment and we will conduct the appropriate simulations and analysis to compare with sequential effects in working memory.

    1. Author response:

      Reviewer #1 (Recommendations For The Authors): 

      This paper represents a huge amount of work on a condition whose patients' health and well-being have not always been prioritized, and only relatively recently has the immune dysregulation seen in patients with Down Syndrome (DS) been garnering major research interest. 

      This paper provides an unparalleled examination of immune disorder in patients with DS. In a truly herculean effort, the authors provided the cumulative examination of over 440 patients with DS, confirmed the alterations in immune cell subsets (n=292, 96 controls) and multi-organ autoimmunity seen in these patients as they age, and identified autoantibody production that could contribute to conditions co-occurring in patients with DS. They also sought to look at whether the early immunosenescence seen in DS was due to the inflammatory profile by comparing age-associated markers in DS patients and euploid controls separately, finding that several markers are regulated with age regardless of group, while comparing the effect of age versus DS status on cytokine status identified inflammatory markers elevated in DS patients across the lifespan that do not increase with age or that increase with age only in the DS cohort. This is very interesting in the context of DS in particular, and immunity during aging in general. 

      The second part of the manuscript presents the results from a clinical trial with the JAK inhibitor tofacitinib in DS patients. While the number of DS patients treated with tofacitinib was small, the results were often quite striking. Treatment was well-tolerated and the improvement of dermatological conditions was clear. The less responsive patients AA4 and AA2 provide a very clear illustration that these patients are sensitive to immune triggers during treatment. Additionally, the demonstration that patients' IFN scores and cytokine levels decreased without clear immunosuppression with tofacitinib treatment is encouraging, since treatment with this drug would need to be continuous. I would be curious to see if the patients added past the cutoff for interim analysis follow a similar trajectory. I would not ask the authors to add any data; the paper is well-written and logically constructed. 

      I only have a small comment: I really did not like how Figure 2 a, d, and g tethered the coloring to the magnitude of fold change to show the effect of DS particularly for 2a and 2g. Given that these fold changes are quite modest, the coloring is very light and hard to distinguish. The clear takeaway is that the effect on T cells is greatest, but there must be a better way to illustrate this. Perhaps displaying this graph on a non-white background could help with contrast. 

      We are grateful for the Reviewer’s very positive assessment of the manuscript and constructive feedback. We want to assure the Reviewer that similar analyses will be completed in the future for the entire cohort recruited into the trial to determine if similar trajectories and results are observed with the larger sample size. Additionally, following the Reviewer’s guidance, we will explore alternative ways to present the data in Figure 2 for greater clarity in a revised version of the manuscript.

      Reviewer #2 (Recommendations For The Authors): 

      • Although the focus of the patients in the first part of the paper is on autoimmune/inflammatory conditions, it will be useful to also list the non-autoimmune infectious manifestations for reference with prevalence data. For example, otitis media, or lung infections (mentioned within the paper), or mucosal candidiasis. Same for other manifestations such as cardiac or malignant conditions. Given the impressive number of patients, it will be useful to the readers to have prevalence data for these as well, even in brief statements within the results. 

      We appreciate this inquiry by the Reviewer and will present additional data on the co-occurring conditions mentioned by the Reviewer in a revised version of the manuscript.

      • Have the authors looked at DN T cells and whether they may be enriched in DS patients, given their enrichment in some autoimmune conditions? 

      Thanks for this inquiry. We did examine DN T cells (double negative T cells), which we referred to in our Figure 2 and Figure 2 – figure supplement 1 as non-CD4+ CD8+ T cells. Although this T cell subset is mildly elevated (in terms of frequency among T cells) in individuals with Down syndrome, the result did not reach statistical significance after multiple hypothesis correction. This negative result is shown in the heatmap in Figure 2 – figure supplement 1d.

      • It would be useful to move the segment of the discussion that discusses the interim predefined analysis of the phase 2 trial to the corresponding segment of the results. As this reviewer was reading the paper, it was unclear why the interim analysis was done, whether it was predefined and it was not until the discussion that it became apparent. I believe it will help the readers to have a brief mention that this interim analysis was predefined and set to occur at the first 10 DS enrollees. Also, it would be helpful to state what is the total number of DS patients planned for enrollment in the Phase 2 trial which is continuing recruitment. 

      We appreciate this comment and will modify the text following the Reviewer’s guidance in the revised manuscript. The trial will be considered complete once a total of 40 participants undergo 16 weeks of treatment with good medication compliance (less than 15% missed doses).

      • Although the authors present data on TPO autoantibodies before and after tofacitinib, it remains unclear whether the other non-TPO autoantibodies were altered during treatment or whether this was a TPO autoantibody-specific phenomenon. Was there an alteration in mature B cells or plasmablast populations after tofacitinib? If these data are available, they would further enhance the manuscript. If they are not available, it would be useful for the authors to discuss those in the discussion of the manuscript. 

      We are grateful for this comment, which strongly aligns with our future research interests and plans for the analysis of the full cohort once the trial is completed. In the interim analysis, we analyzed only autoantibodies related to autoimmune thyroid disease and celiac disease, as shown in the manuscript. However, we plan to complete a more comprehensive analysis of the effects of JAK inhibition on autoantibody production once the full sample set is available at the end of the trial. Likewise, the clinical trial protocol contemplates collection and processing of blood samples for immune mapping using mass cytometry, which will enable us to answer the Reviewer’s question about potential changes in B cell or plasmablast populations. Following the Reviewer’s guidance, we will discuss these planned analyses in the Discussion of the revised manuscript.

      Reviewer #3 (Recommendations For The Authors): 

      (1) Cellular immune phenotyping data in Figure 2 presents a large number of patients with DS versus euploid controls (292 and 96 respectively). Given the relatively large cohort there would seem to be an opportunity to determine whether age or sex alters the immune phenotype shown, for example, TEMRAs, etc. Was the data analyzed in this way? 

      We welcome this comment, which clearly aligns with our research interests and planned additional analyses of these datasets generated by the Human Trisome Project. We can share with the Reviewer that although sex as a biological variable has minimal impacts on the strong immune dysregulation observed in Down syndrome, there are clear age-dependent effects, with some immune changes occurring early during childhood versus others taking place later in adult life. A manuscript describing a complete analysis of age-dependent effects on the multi-omics datasets in the Human Trisome Project is currently under preparation.

      (2) The authors should strongly consider incorporating/discussing the findings from Gansa et al, Journal of Clinical Immunology May 2024 - where they reviewed the immune phenotype of 1299 patients with Down syndrome. 

      Thanks for this suggestion, we will surely cite and discuss this recent paper in the revised manuscript.

      (3) It is difficult to differentiate patients Hs2 and Ps1 in Figure 5d. 

      Thanks for this observation, we will modify the labels for greater clarity in the revised manuscript.

      (4) Given their finding of no correlation between cytokine levels/immune phenotype and autoimmunity, some additional discussion of the relevance of hypercytokinemia in the pathogenesis of autoimmunity would seem relevant (given that this was the basis for the clinical trial). The authors mention that cytokine levels may not be appropriate measures of disease in the patients. 

      We welcome this opportunity to expand the discussion of the relevance of hypercytokinemia in the pathogenesis of autoimmunity and will do so in the revised manuscript.

      (5) Data availability statement: appropriate.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors): 

      The authors should perform experiments to answer this question: does Cav3 transcription increase in the G369i-KI, or is there instead some post-transcriptional modulation that permits surface expression of functional Cav3-containing channels in the absence of typical HVA Ca conductances? Also, the authors should determine whether G369i-KI can mediate Ca2+ release from intracellular stores and whether release from stores is upregulated as Cav3-containing channel expression (or function) is increased. 

      We performed transcriptomic (drop-seq) analysis to test whether a Cav3 subtype is upregulated in cones of G369i KI mice. These experiments show that, consistent with previous studies (PMID 35803735, 26000488), Cacna1h appears to be the primary Cav3 subtype expressed in mouse cones. However, as shown in new Supp.Fig.S3, there was no significant difference in the levels of Cacna1h transcripts in WT and G369i KI cones. Therefore, we propose that there may be some post-transcriptional modification, or alteration in a pathway that regulates channel availability, that enables the contribution of Cav3 channels to the whole-cell Ca2+ current in cones in the absence of functional Cav1.4 channels.

      We also performed Ca2+ imaging experiments in WT vs G369i KI cone terminals to assess whether the diminutive Cav3 current in G369i KI cone terminals may be compensated by upregulation of a Ca2+ signal such as from intracellular stores. Arguing against this possibility, depolarization-evoked Ca2+ signals in G369i KI cones were dramatically reduced compared to WT cones (new Fig.9). 

      Reviewer #2 (Recommendations For The Authors): 

      Major points- 

      (1) It is stated in too many places that cone features in the Cav1.4 knock-in are "intact", preserved, or spared, but this representation is not accurate. There are two instances in this study that qualify as intact when comparing KI to WT: 1) the photopic a-waves in the Cav1.4 knock-in (also demonstrated in Maddox et al 2020) and 2) latency to the platform (current MS, Figure 7f). However, in the numerous instances listed below, the authors compared the Cav1.4 knock-in to the Cav1.4 knock-out, and then referred to the KI as exhibiting intact responses. The reference point for intactness needs to be wildtype, as appropriately done for Figures 2 and 3, and when comparing the KI to the KO the phrasing should be altered; for example: "the KI was spared from the extensive degeneration witnessed in the KO....". 

      In most cases, we clearly note that there are key differences in the WT and the G369i KI cone synapses, which highlight the importance of Cav1.4-specific Ca2+ signals for certain aspects of the cone synapse. We disagree with the reviewer on the point that we did not often use the WT as a reference since most of our experiments involved comparisons of only WT and G369i KI (Figs. 3-6) or WT, G369i KI, and Cav1.4 KO (Figs.1,7—and in these cases comparisons specifically between WT and G369i KI mice were included). We used “intact” as a descriptor for G369i KI cone synapses since these are actually present, albeit abnormal in the G369i KI retina, whereas cone synapses are completely absent in the Cav1.4 KO retina. To avoid confusion, we modified our use of “intact” and “preserved” where appropriate.

      A. Abstract, line 34 to 35: ".......preserved in KI but not in KO.". 

      Abstract was rewritten and this line was removed.

      B. Line 36: "....synaptogenesis remains intact". The MS documents many differences in the morphology of KI and WT cones (immunofluorescence and electron microscopy data), which is counter to an intact phenotype. 

      The sentence was: “In CSNB2, we propose that Cav3 channels maintain cone synaptic output provided that the Ca2+-independent role of Cav1.4 in cone synaptogenesis remains intact.”

      Here the meaning of “intact” refers to the Ca2+ -independent role of Cav1.4, not synapses. Thus, we have left the sentence unchanged.

      C. This strikes the right balance, lines 67 to 68: "....although greatly impaired.....". 

      D. Line 149, "Cone signaling to a postsynaptic partner is intact in G369i KI mice". This description is inaccurate. Here there is only WT and KI, and the text reads as follows in line 162: "terminals (Figure 6b). The ON and OFF components of EPSCs in G369i KI HCs were measurable, although lower in amplitude than in WT (Figure 6a,b)." Neither "measurable" nor "lower in amplitude" meet the definition of "intact", and actual numerical values are lacking in the text. 

      We have added results showing that there are no light responses in the Cav1.4 KO horizontal cells and have modified the sentence to: “Cone synaptic responses are present in horizontal cells of G369i KI but not Cav1.4 KO mice”. 

      We have modified discussion of these results as (line 210-213): “Consistent with the lack of mature ribbons and abnormal cone pedicles (Fig.1), HC light responses were negligible in Cav1.4 KO mice (Fig.8a,b). In contrast, the ON and OFF responses were present in G369i KI HCs although significantly lower in amplitude than in WT HCs (Fig. 8a,b).”

      E. Please add a legend to Figure 6a to indicate the intensities. The shape of the KI responses is different from the control which is worthy of discussion: i) there is no clear cessation of HC EPSCs in the KI during the light ON period (when release stops, Im fluctuations should be minimal), and ii) the "peaked" appearances of the initial 500ms of the On and Off periods are very similar in shape for the KI (hard to interpret in the same fashion as a control response). How were the On and Off amplitudes analyzed? Furthermore, the OFF current is not summarized in Figure 6D, but should not this be when Cav3 should be opening and triggering release: Off response-EPSC? Lastly, Figure 6b,d shows a ~70% reduction in On-current in the KI, and the KI example of 6b an 80% reduction in Off current compared to WT. Yet, the only place asterisks are used to indicate sig diff is the DNQX data within each genotype in Fig 6d. These data cannot be described as showing "intact" KI responses, and the absence of numerical and statistical values needs to be addressed. 

      New Fig.8a depicting the horizontal cell light responses has been modified to include the legend indicating light intensities. The ON and OFF amplitudes were analyzed as the peak current amplitudes. This information has been added to the legend.

      The reviewer is correct in that the OFF response represents the EPSC whereas the ON response represents the decrease in the EPSC with light. To avoid confusion, we changed the y axis label for the averaged data to read ON or OFF “response” rather than “current” in new Fig.8b.

      As the reviewer suggests, the more transient nature of the KI response during the light ON period could result from aberrant continuation of vesicular release during the light-induced hyperpolarization of cones in the KI mice, in contrast to the prolonged suppression of release by light which is evident in the WT responses. We speculated on this difference as follows (lines 237-241):

      “In addition to its smaller amplitude, the transient nature of the ON response in G369i KI HCs suggested inadequate cessation of cone glutamate release by light (Fig.8b). Slow deactivation of Cav3 channels and/or their activation at negative voltages20 could give rise to Ca2+ signals that support release following light-induced hyperpolarization of G369i KI cones.”

      We added asterisks to new Fig.8b,d indicating statistical differences and a description of the tests to the legend.

      F. line 168 the section titled "Light responses of bipolar cells and visual behavior is spared in G369i KI but not Cav1.4 KO mice". 

      Changed to: “Light responses of bipolar cells and visual behavior are present in G369i KI but not Cav1.4 KO mice”

      Last sentence of ERG results, 189-190: "These results suggest that cone-to-CBC signaling is intact in G369i KI mice.". "Spared and intact" are not accurate descriptions. The ERG data presented here shows massive differences between WT and the KI, except in the instance of a-waves. 

      This sentence was removed.

      As for Figure 6, the results text related to Figure 7a-d does not present real numbers for ERG responses, and there is no indication of significant differences there or in the Figure panels. For instance, in Figure 7b, b-waves in KI are comparable to KO, except at the two highest-intensity flashes that show KI responses ~20% the amplitude of WT. Presentation of KI and KO data on a 6- to 10-fold expanded scale higher than WT can be misleading: a quick read of these Figure panels might make one incorrectly conclude that the KI is intact while the KO is impaired when compared to WT. The Methods section needs more details on the ERG analysis (e.g. any filtering out of oscillatory potentials when measuring b-wave, and what was the allowable range of time-to-peak for b-wave amplitude, etc.). 

      The vertical scaling of the ERG results in new Fig.10c,d has been changed so as to reflect clearly diminished responses of the KO and KI vs the WT. Further details regarding the ERG analysis were added to the Methods section.

      G. Can you point to other studies that have used the "visible platform swim test" used in Figure 7e, f, and specify further how mice were dark/light adapted prior to the recordings? 

      As referenced in the Methods, original line 674, the methods we used for the swim test were described in our previous study (PMID 29875267). Other studies that have used this assay include PMIDs: 28262416, 26402607.

      (2) The Maddox et al 2020 study does not safely address whether rods have a residual T-type Ca2+ current in the Cav1.4 KO or KI. The study showed that membrane currents measured from rods in the KI and KO retina were distinct from WT, supporting their claim that L-type Ca2+ current is absent in the KI and KO. However, the recordings had shortcomings that challenge the analysis of Ca2+ currents: i) collected at room temp (22-24 °C), ii) at an unknown distance from the terminal (uncertain voltage clamp), iii) with a very slow voltage ramp rate that is not suitable for probing T-type currents (Figure 1d Maddox 2020, 140 mV over 1 sec: 7 msec/1 mV), and iv) at a signal-to-noise that does not allow to resolve a membrane current under 1 pA (avg wt rod Ca2+ current was -3.5 pA, and line noise ~1 pA peak-to-peak in Maddox 2020). Suggestion: say T-type currents were not probed in Maddox et al 2020, but Davison et al 2022 did not find PCR signal for Cav3.2 in rods. 

      We disagree that recordings in the Maddox 2020 study were not sufficient to uncover a T-type current. The voltage ramps in that study were not much slower than those of the Davison et al. 2022 study (they used 0.19 mV/ms). Moreover, in new Supp. Fig.S1, we show that, like the slower voltage ramp (0.15 mV/ms) used in the prior study of G369i KI rods, the voltage ramps we used in the present study (0.5 mV/ms), which clearly evoke currents with T-type properties in G369i KI cones (Fig.2a,b, Fig.3a,b), do not evoke currents in WT or G369i KI rods.
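
      The ramp rates quoted in this exchange are given in mixed units (mV over seconds, ms per mV, mV/ms). As a minimal sketch for readers, the conversion below puts them on one scale. The function and dictionary names are illustrative only; the numerical inputs are the values quoted in the reviewer comment and in our response, and the 140 mV over 1 s figure from the reviewer corresponds to 0.14 mV/ms, which we round to 0.15 mV/ms above.

```python
def ramp_rate_mv_per_ms(span_mv: float, duration_ms: float) -> float:
    """Ramp speed in mV/ms for a voltage ramp covering span_mv in duration_ms."""
    return span_mv / duration_ms


def ms_per_mv(rate_mv_per_ms: float) -> float:
    """Inverse of the ramp speed: time spent per mV of the ramp."""
    return 1.0 / rate_mv_per_ms


# Values as quoted in the text; the labels are shorthand, not official protocol names.
RAMP_RATES_MV_PER_MS = {
    "Maddox 2020 (reviewer: 140 mV over 1 s)": ramp_rate_mv_per_ms(140.0, 1000.0),
    "Davison 2022 (as quoted in the response)": 0.19,
    "present study (as quoted in the response)": 0.5,
}

if __name__ == "__main__":
    reference = RAMP_RATES_MV_PER_MS["Maddox 2020 (reviewer: 140 mV over 1 s)"]
    for label, rate in RAMP_RATES_MV_PER_MS.items():
        # Report the rate, its inverse, and how much faster it is than the 2020 ramp.
        print(f"{label}: {rate:.2f} mV/ms, {ms_per_mv(rate):.1f} ms/mV, "
              f"{rate / reference:.1f}x the 2020 ramp")
```

      On these numbers the Davison et al. ramp is roughly 1.4 times faster than the 2020 ramp, and the ramp used in the present study is roughly 3.6 times faster.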

      Minor comments. 

      (1) Suggestion: add an overview panel to Figure 1 that shows the rod terminals in the KI. The problem is that cropping out the ribbon and active zone signals from rods, to highlight cones, can give the impression that the cones are partially spared in the KI, and the rods are not spared at all. (yet you nicely clarify this in Figure 4 and in the legend and text, etc.). 

      We chose to modify the legend with this information as in Fig.4 rather than modify the figure.

      (2) Mouse wt cone Ca2+ currents look like L-type currents, as do your monkey and squirrel cone recordings, and also much like those of mouse rods (see Figure S5, Hagiwara et al., 2018 or Grabner and Moser 2021). Your pharm data from mice and squirrels further supports your conclusion, and certainly took much effort. Davison et al 2022 J Neurosci showed PCR results that support their claim that a Cav3 current exists in wt cones. Questions: 1) have you tried PCR? 2) Can you offer more details on what Cav3 KO you tried and what antibodies failed to confirm the KO? As the authors know, one complication is that the deletion of one Cav can be compensated for by the expression of a new Cav. There are 3 types of Cav3s and removal of one type may be compensated for by another Cav3. 

      We have included drop-seq data (new Supp.Fig.S3) implicating Cav3.2 as the main Cav3 subtype in cones and have modified our discussion of these results accordingly. These experiments did not reveal any changes in Cav3 subtype expression in G369i KI vs WT cones.

      (3) Lines 95/96- onward, spend more time telling the story. When working out the biophysical and pharmacological behavior of the Ca2+ currents, you might want to initially refer to the membrane current as a membrane current, and then state how your voltage protocols, intra- and extra-cell solutions, and drugs helped you verify 1) L-type and 2) T-type Ca2+ currents. 

      We have modified the text with more detail.

      (4) If data is in hand, add a ramp I-V to Figure S2, which shows the response of the ground squirrel cone. The steps in S2a are excellent for making your point that a transient current is missing, and the bipolar is a great control to illustrate ML218 works. However, a comparison of a squirrel cone ramp to a bipolar ramp response could complete the figure. 

      See Response to #5 below.

      (5) Consider moving Supplementary Figures S2 and S3 to the main text; these are highly relevant to the story, novel, and well-executed. 

      Fig.S2 and S3 were added as new Figs.4,5. The new Fig.4 includes voltage ramps in ground squirrel cones (panel a) to compare with the bipolar data (panel f).

      (6) The nice electron microscopy reconstructions are not elaborated on in any detail, and there is no mention of ribbon size. Is the resolution sufficient to estimate ribbon size, the number of synaptic vesicles around the ribbon and in the adjacent cytosol? The images indicate major changes in the morphology of the terminals. Is the glial envelope similar in WT and KI? 

      Since ribbons were quantified extensively in the confocal analyses in Fig.6, we felt it unnecessary to add this to the EM analysis which focused mainly on aspects of 3D structure (i.e., arrangement of ribbons, postsynaptic wiring, cone pedicle morphology). We added further discussion of the change in morphology of the G369i KI cone pedicle (lines 200-203): “Compared to WT, ribbons in G369i KI pedicles appeared disorganized and were often parallel rather than perpendicular to the presynaptic membrane (Fig.7a-c). Consistent with our confocal analyses (Fig.1), G369i KI cone pedicles extended telodendria in multiple directions rather than just apically (Fig. 7a).”

      While we did not opt to characterize the glial envelope in WT cones, we did add an analysis of synaptic vesicles around ribbons to Table 2.

      (7) Discussion line 250: "we found no evidence for a functional contribution of Cav3 in our recordings of cones in WT mice (Figures. 2,3), ground squirrels, or macaque (Supplementary Figures S2 and S3).". I would not use "functional" in this context because when comparing your work to Davison et al 2022, they defined functional as a separate response component driven by Cav3. For instance, they examined the influence of their T-type current on exocytosis (by membrane capacitance) and other features like spiking Ca2+ transients. Suggestion: substitute functional with "detectable", and say "we found no detectable Cav currents". Or if you had T-type staining, but not T-type Ca2+ currents, then say "no functional current even though there is staining...". 

      We have modified the text as (lines 336-338): “However, in contrast to recordings of WT mouse cone pedicles in a previous study21, we found no evidence for Cav3-mediated currents in somatic recordings of cones in WT mice (Figs.2,3).”

      We propose an alternative interpretation of the results in the Davison et al study concerning the conclusion that Cav3 channels contribute to Ca2+ spikes and exocytosis. That study used 100 µM Ni2+ to block a “T-type” contribution to spike activity in cones. In their Figs.4,5, the spikes are suppressed by 100 µM Ni2+ and 10 µM nifedipine, a Cav1 antagonist, and spared by the T-type selective drug Z944. This is problematic for several reasons. First, as shown by the authors (their Fig.2A1,A2) and others (PMID: 15541900), 100 µM Ni2+ inhibits Cav1-type currents in photoreceptors. Second, Z944 potentiates Cav1 current in their mouse cones (their Fig.2C1,C2). Thus, both reagents are suboptimal for dissecting the contribution of either Cav subtype to spiking activity. With respect to Cav3 channels and exocytosis, these authors interpreted a reduction in exocytosis upon holding at -39 mV compared to at -69 mV as indicating a loss of a T-type driven component of release. However, Cav1 channel inactivation (PMID: 12473074) could lead to the observed reduction in exocytosis at -30 mV.

      (8) Additional literature related to your Intro and Discussion. Regarding CSNB2, related mutations of active zone proteins, and what happens to Ca2+ currents when ribbons are deleted, you might want to consider the following studies that measure Ca2+ currents from rods: conditional KO of RIM1/2 (Grabner et al 2015 JN), KO of ELKS1/2 (Hagiwara et al, 2018 JCB), and KO of Ribeye (Grabner and Moser eLife 2021). In these studies, the Cav currents were absent in rods of the ELKS1/2 DKO, strongly reduced (80%) in the RIM1/2DKO, but altered in more subtle ways (activation-inactivation) without significantly changing steady-state Ca2+ current in the Ribeye KO. This does not seem to support some of the arguments you have made in the Introduction and Discussion regarding ribbon size and Ca2+ currents, yet the suggested literature is related to the topic at hand. 

      A description of these synaptic proteins as potential mediators of the effect of Cav1.4 on ribbon morphogenesis was added to the Discussion, lines 325-327.

      (9) Line 129: "Along with the major constituents of the ribbon, CtBP2, and RIBEYE", for clarity Ribeye has two domains, one that is identical to CtBP2 (B-domain) and the unique Ribeye domain (A-domain) that is only expressed at ribbon synapses. And, Piccolino is also embedded in the ribbon (Brandstaetter lab, Wichmann/Moser labs). In other words, Ribeye and Piccolino are the major constituents of the ribbon. 

      To avoid confusion, we simply mention Ctbp2 and RIBEYE in the context of the corresponding antibodies that were used to label ribbons.

      (10) Abstract: consider rephrasing "Ca2+-independent role of Cav1.4" as "Ca2+-permeation-independent role of Cav1.4" or similar. 

      Sentence changed to: “In CSNB2, we propose that Cav3 channels maintain cone synaptic output provided that the nonconducting role of Cav1.4 in cone synaptogenesis remains intact.”

      Reviewer #3 (Recommendations For The Authors): 

      Cav1.4 voltage-gated calcium channels play an important role in neurotransmission at mammalian photoreceptor synapses. Mutations in the CACNA1f gene lead to congenital stationary night blindness that particularly affects the rod pathway. Mouse Cav1.4 knockout and Cav1.4 knockin models suggest that Cav1.4 is also important for the cone pathway. Deletion of Cav1.4 in the knockout models leads to signaling malfunctions and to abundant morphological re-arrangements of the synapse suggesting that the channel not only has a role in the influx of Ca2+ but also in the morphological organization of the photoreceptor synapse. Of note, also additional Cav-channels have been previously detected in cone synapses by different groups, including L-type Cav1.3 (Wu et al., 2007; pmid; Kersten et al., 2020; pmid), and also T-type Cav3.2 (Davison et al., 2021; pmid 35803735). 

      In order to study a conductivity-independent role of Cav1.4 in the morphological organization of photoreceptor synapses, the authors generated the knockin (KI) mouse Cav1.4 G369i in a previous study (Maddox et al., eLife 2020; pmid 32940604). The Cav1.4 G369i KI channel no longer works as a Ca2+-conducting channel due to the insertion of a glycine in the pore-forming unit (Maddox et al., eLife 2020; pmid 32940604). In this previous study (Maddox et al., eLife 2020; pmid 32940604), the authors analyzed Cav1.4 G369i in rod photoreceptor synapses. In the present study, the authors analyzed cone synapses in this KI mouse. 

      For this purpose, the authors performed a comprehensive set of experimental methods including immunohistochemistry with antibodies (also with quantitative analyses), electrophysiological measurements of presynaptic Ca2+ currents from cone photoreceptors in the presence/absence of inhibitors of L-type and T-type calcium channels, electron microscopy (FIB-SEM), ERG recordings and visual behavior tests of the Cav1.4 G369i KI in comparison to the Cav1.4 knockout and wild-type control mice. 

      The authors found that the non-conducting Cav channel is properly localized in cone synapses and demonstrated that there are no gross morphological alterations (e.g., sprouting of postsynaptic components that are typically observed in the Cav1.4 knockout). These findings demonstrate that cone synaptogenesis relies on the presence of Cav1.4 protein but not on its Ca2+ conductivity. This result, obtained at cone synapses in the present study, is similar to the previously reported results observed for rod synapses (Maddox et al., eLife 2020, pmid 32940604). No further mechanistic insights or molecular mechanisms were provided that demonstrated how the presence of the Cav channels could orchestrate the building of the cone synapse. 

      We respectfully disagree regarding the mechanistic advance of our study. As indicated by Reviewer 2, a major advance of our study is in providing a mechanism that can explain the longstanding conundrum that congenital stationary night blindness type 2 mutations that would be expected to severely compromise Cav1.4 function do not produce complete blindness. Our study provides an important contrast to the Maddox et al 2020 study in showing that rods and cones respond differentially to loss of Cav1.4 function, which is also relevant to the visual phenotypes of CSNB2. How the presence of Cav1.4 orchestrates cone synaptogenesis is an important topic that is outside the scope of our present study.

      In the present study, the authors also propose a homeostatic switch from L-type to (newly occurring) T-type calcium channels in the Cav1.4 G369i KI mouse as a consequence of the deficient calcium channel conductivity in the Cav1.4 G369i KI mouse. In cones of the Cav1.4 G369i, the high-voltage activated, L-type Ca2+-entry was abolished, in agreement with their previous paper (Maddox et al., eLife 2020, pmid 32940604). The authors found a low-voltage activated Ca2+ current instead that they assigned to T-type Ca2+-currents based on pharmacological inhibitor experiments. T-type Ca2+-currents/channels were already previously identified in other studies by independent groups and independent techniques (electrophysiology, RT-PCR, single-cell sequencing) in cones of wild-type mice (Davison et al., 2021, pmid 35803735; Macosko et al., 2015, pmid 26000488; Williams et al., 2022, pmid 35650675). In the present manuscript (Figures 3a/b), the authors also observed a low-voltage activated, T-type like current in cones of wild-type mice, that is isradipine-resistant and affected by the T-type inhibitor ML218. This finding appears compatible with a T-type-like current in wild-type cones and is consistent with the published data mentioned above, although the authors interpret this data in a different way in the discussion. 

      Due to the noise inherent in whole cell voltage clamp measurements and some crossover effects in the pharmacology, we cannot completely exclude the presence of a T-type current in WT mouse cones. However, our results very clearly support a conclusion opposite to that stated by the reviewer. Namely, if WT mouse cones have T-type Ca currents, then they are far smaller than those in the Cav1.4 G369i KI and KO cones. In particular, while we identified message for Cav3.2 in WT mouse cones, we were unable to identify a functional T-type current by either voltage clamp measurements or pharmacology. See below for a detailed rebuttal.

      This proposal of a homeostatic switch is not convincingly supported in this reviewer's opinion (for further details, please see below). Furthermore, no data on possible molecular mechanisms were provided that would support such a proposal of a homeostatic switch of calcium channels. No mechanistic/molecular insights were provided for a proposed homeostatic switch between L-type to T-type channels that the authors propose to occur between wild-type and Cav1.4 G369i as a consequence of conduction-deficient Cav1.4 G369i channels. Is this e.g. based on posttranslational modifications that switch on T-type channels or regulation at the transcriptional level inducing expression of T-type calcium channel or on other mechanisms? The authors remain descriptive with their central hypotheses. No molecular mechanisms/signaling pathways were provided that would support the idea of such a homeostatic switch. 

      Homeostatic plasticity refers to the maintenance of neuronal function in response to some perturbation in neuronal activity and can result from changes in the expression of ion channel genes (PMID: 36377048, 32747440, 19778903) or regulatory pathways that modulate ion channels (PMID: 15051886, 32492405). We present multiple lines of evidence showing that Cav3 currents appear in cones upon genetically induced Cav1.4 loss of function and can support cone synaptic responses and visual behavior if cone synapse structure is maintained. Our new transcriptomic studies show no difference between levels of Cav3 channel transcripts in WT and G369i KI cones, suggesting that the appearance of the Cav3 currents in G369i KI cones does not result from an increase in Cav3 gene expression. We are currently investigating our transcriptomic dataset to determine if Cav3 regulatory pathways are upregulated in G369i KI cones and will present this in a follow-up study.

      The authors show residual photopic signaling in the non-conducting Cav1.4 G369i KI mouse as judged by the recording of postsynaptic currents, ERG recordings and visual behavior tests though in a reduced manner. The residual cone-based signaling could be based on the non-affected T-type Ca2+ channel conductivity in cone synapses. Given that the L-type current through Cav1.4 is gone in the Cav1.4 G369i KI as previously shown (Maddox et al., 2020, pmid 32940604), the T-type calcium current will remain. However as discussed above, this does not necessarily support the idea of a homeostatic switch. 

      A major point which we highlighted with new results is that despite the expression of Cav3 transcripts in WT mouse cones, Cav3 channels do not contribute to the cone Ca2+ current. This is at odds with the Davison et al study (PMID: 35803735, see our response to Reviewer 2, pt 7 for caveats of this study), but our results convincingly show that the Cav3 current appears only when Cav1.4 is genetically inactivated. Pharmacological or electrophysiological methods that should reveal the presence of Cav3 currents do not change the properties of the Ca2+ current in cones of WT mice, ground squirrel, or macaque:

      • Figs.2-4: Voltage steps to -40 mV (Fig 2e) that activate a sizeable T-current in G369i KI mouse cones produce a negligible transient at pulse onset in WT mouse cones. Similarly, transient currents that are obvious in G369i KI mouse cones during the final step to -30 mV are absent in WT cones.  When we block Cav1.4 with isradipine either in cones of WT mice or ground squirrel, the current that remains does not resemble a Cav3 current but rather a scaled down version of the L-type current. ML218, which readily blocks Cav3 channels in HEK293T cells and in G369i KI cones, has only minor effects in cones of WT mice and ground squirrel; these effects of ML218 can be attributed to non-specific actions on Cav1.4 (new Supp.Fig.S2). New Fig.4 (moved from the supplementary data to the main article) clearly shows that the ML218-sensitive current in ground squirrel cones exhibits properties of Cav1.4 not Cav3 channels. 

      • Figs.2,5: Holding voltages that inactivate Cav3 channels have no effect on the Ca2+ current in cones of WT mice or macaque (recordings of macaque cones were moved from the supplement to the main article as new Fig.5).

      In Figure 4 the authors measured an increase in the size of the active zone (as judged by the size of the bassoon cluster) and of the synaptic ribbons in the Cav1.4 G369i. A mechanistic explanation for this phenomenon was not provided and the underlying molecular mechanisms were not unraveled. 

      The FIB-SEM data uncover some ultrastructural alteration/misalignments of the synaptic ribbons and misalignments of the regular arrangement of the postsynaptic dendrites in the G369i KI mice. Also concerning this observation, the study remains descriptive and does not reveal the underlying mechanisms as it would be expected for eLife. 

      We respectfully disagree on the descriptive nature of our study and the need for a full characterization of the molecular mechanism underlying the cone synaptic defects in the G369i KI mouse.   

      An important study in the field (Zanetti et al., Sci. Rep. 2021; pmid 33526839) should be also cited that used a gain-of-function mutation of Cav1.4 to analyze its functional and structural role in the cone pathway. 

      We have added citation of this paper to the Discussion (lines 354-356).

      In conclusion, the study has been expertly performed but remains descriptive without deciphering the underlying molecular mechanisms of the observed phenomena, including the proposed homeostatic switch of synaptic calcium channels. Furthermore, a relevant part of the data in the present paper (presence of T-type calcium channels in cone photoreceptors) has already been identified/presented by previous studies of different groups (Macosko et al., 2015; pmid 26000488; Davison et al., 2021; pmid 35803735; Williams et al., 2022; pmid 35650675). The degree of novelty of the present paper thus appears limited. I think that the study might be better suited in a more specialized journal than eLife. 

      We thank the reviewer for acknowledging the rigor of our study but disagree with their evaluation regarding the novelty of our work as outlined in our responses above.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      My comments are largely limited to suggestions to make the manuscript easier to read and digest.

      In the abstract they say RNA sequencing highlights changes in innate...

      Could they be more specific? Innate immune system up or down? They do not indicate actual findings in the abstract.

      We thank the reviewer for the comment and we have revised the abstract accordingly.  

      Their use of non‐intuitive abbreviations is often confusing. Perhaps they can add a table in methods listing all the abbreviations so that the reader can follow the data better. mNGA, vmHT....etc.

      As suggested, we have now included a list of the abbreviations used in the paper.

      There are mis‐spellings in the manuscript.

      We have gone through the manuscript and corrected the mis-spellings.   

      Has the SPR RNAi line been validated?

      The SPR RNAi line that we used has been extensively validated by Yapici et al., 2007 and several subsequent publications. Importantly, the effectiveness of SPR knockdown is evident in female flies as they exhibit dramatically reduced egg laying and, importantly, lack the typical post-mating behaviors (such as rejection of male flies after initial mating) observed in the wild type mated female flies. In fact, female flies with RNAi-mediated SPR knockdown behave identically to females mated with SP-null male flies, confirming the effective disruption of the SP-SPR signaling pathway. We have revised the manuscript and added these statements in the results section concerning SPR RNAi.  

      In the figures showing the Climbing Index vs time, can they abbreviate seconds as sec vs s? At least I think it is seconds. At first, I thought it was Time or Times, and was confused about what they were indicating on those types of graphs (Figures 1D‐F).

      We have revised the figure as suggested by the reviewer.

      In Figure 3F, they have a significance indicated in an unclear manner. It looks like they are comparing neuropil to the cortex, but I think they really mean to compare the cortex of sham to cortex of D31?

      The reviewer was correct. We have revised figure 3F to make this clear.     

      In Figure 4B, what is the y‐axis? Percentage of what? Is that percentage of total flies?

      The reviewer was correct. We have revised the figure to make this clear. 

      In a figure like SF3 B, what is the y‐axis? "Norm. Accum. CI" Can they explain the abbreviation?

      We have revised the Y-axis label to be “Normalized accumulative CI”.  We have also made this clear in the legend.   

      In the methods, what does this mean: "Regions devoid of Hoechst and phalloidin signal in non‐physiologically appropriate areas were considered vacuoles"? What are non‐physiologically appropriate areas? To me, that would mean outside of the brain. I would have thought the areas should be physiologically appropriate (aka neuropil and cortex)? This is confusing.

      We have revised the method section to be more specific.  In the Drosophila brain, there are structures such as esophagus that are devoid of both Hoechst and phalloidin staining, which were excluded from our vacuole quantification.    

      Reviewer #2 (Recommendations For The Authors):

      Since I use mammalian systems, my comment about the confirmation of siRNA should be removed if this is not possible in the Drosophila system.

      We have revised the figures to include total N values when appropriate. Including individual n values for each experimental assay and condition will inevitably crowd the figure legends, so specific values are available upon request. 

      Regarding RNAi knockdown of sex peptide receptors (SPRs), we agree that confirmation of the knockdown by IHC or qRT-PCR will further strengthen our findings. It should be noted, however, that the RNAi line we used has been extensively validated by Yapici et al., 2007 and several subsequent publications. Importantly, the effectiveness of SPR knockdown is evident in female flies as they exhibit dramatically reduced egg laying and, importantly, lack the typical post-mating behaviors (such as rejection of male flies after initial mating) observed in the wild type mated female flies. In fact, female flies with RNAi-mediated SPR knockdown behave identically to females mated with SP-null male flies, confirming the effective disruption of the SP-SPR signaling pathway. We have revised the manuscript to include these statements in the results concerning the SPR RNAi knockdown.    

      Reviewer #3 (Recommendations For The Authors):

      (1) In Figures 1 and 2, the authors found that females have a lower climbing index in the acute phase in D17 injury, not due to neurodegeneration as shown no significant changes of brain vacuolation and other markers. However, in Figure 3, the authors found that female flies have a lower climbing index, more brain vacuolation, and neurodegeneration in the late phase. It's not very convincing that having a lower climbing index at the late phase is due to neurodegeneration. Is it possible that females suffered from more severe acute effects, at least in D17 injury?

      We thank the reviewer for this point. Female flies injured on D17 displayed acute climbing deficits at 90 minutes post-injury. Since we did not observe significant structural changes in the brain at this time, we believe that this short-term functional deficit is not due to acute neuronal death. Here it is important to note that males did not display any acute climbing deficits when injured on D17, which suggests that the females suffered from more severe acute effects than males. However, these injured female flies recovered fully at 24 hours post-injury and displayed no climbing deficits. At two weeks post-injury, we observe climbing deficits and increased vacuole formation as a direct result of the injuries on D17 (see Supplemental Figure 3). When we assessed sensorimotor behavior and brain vacuolation on D45, we found that the injured females had significantly lower climbing indices and more brain vacuolation than the non-injured females of the same age. In this case, the concurrent observance of decreased climbing ability and increased brain vacuolation suggests chronic neurodegeneration in aged, injured females. This is not to be confused with the acute neuronal death observed by other groups using injury models of stronger severity. Overall, our data are consistent with the current view that in many neurodegenerative diseases, functional deficits often precede observable brain degeneration, which may take years to manifest.

      (2) The authors determined late‐life brain deficits and neurodegeneration purely based on climbing index and vacuole formation. These phenotypes are not really specific to TBI‐related neurodegeneration and the significance and mechanisms of vacuole formation are not clear. Indeed, in Figures 3 A and B, male flies especially D31inj tend to have a much larger variation than any other groups. What could be the reasons? The authors should perform additional analyses on TBI‐related neurodegeneration in flies, which have been shown before, such as retinal degeneration and loss, neuronal degeneration, and loss, neuromuscular junction abnormalities, etc (Genetics. 2015 Oct; 201(2): 377‐402).

      We thank the reviewer for the thorough evaluation of our manuscript. The reviewer raised a very important question: whether the neurodegeneration observed in our model is specific to TBI. As the reviewer rightly pointed out, the neurodegenerative phenotypes are unlikely to be specific to TBI-related neurodegeneration. Throughout the manuscript, we have tried to convey the notion that the mild physical impacts to the head represent one form of environmental insults, which in combination with other risk factors such as aging can lead to the emergence of neurodegenerative conditions. It should be noted that the negative geotaxis assay and vacuolation quantification are two well-established approaches to assess sensorimotor deficits and frank brain degeneration in fly brains. 

      It is important to emphasize that the head-specific impacts delivered to the flies in our study are much milder than those used in previous studies. As we showed in our figure 1, this very mild form of head trauma (referred to as vmHT) did not cause any death, nor affected the lifespan of the injured flies. Our supplemental data also show very minimal structural neuronal damage and no acute and chronic apoptosis induced by vmHT exposure. Consistently, we did not observe any exoskeletal or eye damage immediately following injuries, nor did we observe any retinal degeneration and pseudopupil loss at the chronic stage of these flies. We have incorporated these important points in the revised manuscript.  

      (3) In Figure 4, it would be important to perform the behavior test fly speed and directional movement in the acute phase as well to determine whether the females have reduced performance at the acute phase.

      We thank the reviewer for this suggestion. Please note that our modified NGA has already improved the spatiotemporal resolution over the classic NGA.  The data presented in Fig.3 show that there are no acute deficits for young cohorts.  Therefore, we do not believe that the detailed analysis of the direction and speed of these flies is essential.  

      Unfortunately, the current setup for the AI-based analysis requires manual corrections of tracking errors, which are time-consuming and tedious.  We are building a newly designed AI-based NGA (NGA.ai) that will allow automatic tracking and quantification with minimal manual interventions. Once it is completed, we will perform some of the analyses that the reviewer suggested.  

      (4) In Figure 8, the authors performed an RNA‐seq analysis and identified some dysregulated gene expressions. However, it is really surprising to see so few DEGs even in wild‐type males and mated females, and to see that none of DEGs overlap among groups or related to the SP‐signaling. This raises questions about the validity of the RNAseq analysis. It is critical to independently verify their RNA‐sequencing results and to add some more molecular evidence to support their conclusion.

      We agree that future studies are needed to independently validate our RNA sequencing results. We believe that the small number of DEGs is likely due to two unique features of our study: (1) the very mild nature of our injury paradigm and (2) the chronic examination timepoint that was long after the head injury and SP exposure, which distinguish our study from previous fly TBI studies. As pointed out in the manuscript, our study aimed to understand how early-life exposure to repetitive head traumatic insults could lead to the late-life onset of neurodegenerative conditions. We hope to further validate our results in our next phase of experiments using single-cell RNA sequencing and RT-qPCR.

      (5) The current results raise a series of interesting questions: what implication of female fly mating and its associated Sex Peptide signaling would be to mammalians or humans? Would mammalian female animals mating with wild‐type or sex hormone‐null male animals have different effects on their post‐injury behavior tests or neuropathological changes? What are the mechanisms underlying the sexual dimorphism?

      As the reviewer pointed out, it would be very interesting to explore the possible roles of sex peptide signaling in other animals and humans. As far as we know, there is no known mammalian ortholog to the insect sex peptide, so it would be difficult to study SP or an SP-like molecule in mammalian models. However, we believe that prolonged post-mating changes associated with reproduction in female fruit flies contribute to their elevated vulnerability to neurodegeneration. In this regard, drastic changes within the biology of female mammals associated with reproduction can potentially lead to vulnerability to neurodegeneration. We agree that this demands further study, which may be done with future collaborators using rodent or large animal models. We have discussed this point in the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank you very much for reviewing our manuscript and express our sincere appreciation for the valuable and thoughtful comments that led us to significantly improve the manuscript on Fshr-ZsGreen reporter mice. We have taken your comments seriously and made a major revision of the manuscript. Here is a summary of the revision:

      (1) New data on Fshr expression have been added to the revised manuscript:

      a. Fshr expression in the testis and adipose tissues (WAT and BAT) of B6 mice;

      b. Fshr expression in the testis of B6 by RNA-smFISH;

      c. Comparison of Fshr expression in the testis and ovary between Fshr-ZsGreen and B6 mice by ddRT-PCR, to show that Fshr expression is not interrupted by insertion of the P2A-ZsGreen vector;

      d. Reduction of Fshr expression in osteocytes within the femoral sections from DMP1-CreERT2:Fshrfl/fl mice;

      e. Fshr expression in an established Leydig cell line, TM3, by immunofluorescence and ddRT-PCR, which also shows Fshr located in the nuclei of TM3 cells;

      f. Fshr expression at scRNA-seq level from 5 public single cell portals as Supplementary Data 3 to support our findings of the widespread expression pattern of Fshr, particularly in Leydig cells.

      (2) Re-organization of Figure 2 with a new legend.

      (3) A new paragraph is added to the Discussion section of the revised MS to explain the function of the P2A peptide in the generation of GFP reporter mice and why Fshr expression is not interrupted by the P2A-ZsGreen insertion in the Fshr-ZsGreen reporter.

      (4) Deletion of Figure 1-D-c, as it is not necessary.

      (5) Replacement of Figure 8-A (the left panel) with an image taken at a reduced exposure time.

      (6) Amended parts of the revised MS are labeled in red.

      A point by point response to the Reviewers’ comments:

      Reviewer 1:

      One of the shocking observations in this manuscript is the expression of FSHR in Leydig cells. Other observations are in the osteoblasts and endothelial cells as well as epithelial cells in different organs. The expression of ZsGreen in these tissues seems high and one shall start questioning if there are other mechanisms at play here.

      First, the turnover of fluorescent proteins is long, longer than 48h, which means that they accumulate at a different speed than the endogenous FSHR. This means that ZsGreen will accumulate in time while the FSHR receptor might be degraded almost immediately. This correlated with mRNA expression (by the authors) but does not with the results of other studies in single-cell sequencing (see below).

      The expression of ZsGreen in Leydig cells seems much higher than in Sertoli cells, this is "disturbing" to put it mildly. This is visible in both the ZsGreen expression and the FISH assay (Figure 2 B-D).

      Thank you for these valuable comments. We added new data on Fshr expression to the revised MS (Fig. 2E, F and G) demonstrating the presence of Fshr in Leydig cells of B6 mice, detected by immunofluorescence staining, RNA-smFISH and ddRT-PCR, as well as in TM3 cells, a Leydig cell line isolated from a male mouse. These data demonstrate that normal Fshr expression is not interrupted by insertion of the P2A-ZsGreen vector into the locus between exon 10 and the stop codon. We use ZsGreen as an indicator of active Fshr promoter status rather than as a method to measure Fshr expression, which is done by ddRT-PCR. These data are shown in Figure 2G of the revised MS.

      In addition, we provide scRNA-seq-based evidence of Fshr expression in human Leydig cells from two single-cell portals (DISCO and BioGPS), as shown in Supplementary Data 3 in the revised MS. We also cited a recent scRNA-seq analysis of Fshr expression in Hu sheep as Reference 65 in the revised MS (PMID: 37541020) 1, which clearly showed Fshr expression in Leydig cells at the single-cell level.

      We believe that the lack of Fshr expression in some single-cell databases may be due to degradation of the Fshr transcript during the processing of single-cell populations. In our laboratory, we spent more than 6 months optimizing methods and reagents to preserve an mRNA integrity value above 8 for RNA-seq.

      The expression in WAT and BAT is also questionable as the expression of ZsGreen is high everywhere. That makes it difficult to believe that the images are truly informative. For example, the stainings of aorta show the ZsGreen expression where elastin and collagen fibres are - these are not "cells" and therefore are not expressing ZsGreen.

      FISH expression (for FSHR) in WT mice is missing.

      Also, the tissue sections were stained with the IgG only (neg control) but in practice both the KI and the WT tissues should be stained with the primary and secondary antibodies. The only control that I could think of to truly get a sense of this would be a tagged receptor (N-terminal) that could then be analysed by immunohistochemistry.

      Reply 2 and 3: Thank you for these comments. New data on Fshr expression in WAT and BAT of B6 mice by immunofluorescence staining, and in the testis of B6 mice by immunofluorescence staining and RNA-smFISH, are added to the revised MS (Fig. 2D and E, and Fig. 4G), showing patterns similar to those of Fshr-ZsGreen mice. Furthermore, we provide more evidence as Supplementary Data 3 on Fshr expression obtained from 4 public single-cell portals, showing FSHR expression in a wide range of organs and tissues (including different fractions of adipose cells) of humans, mice and rats at the single-cell level. Please also check the Fshr expression pattern in adipose tissues by immunostaining for Fshr in previous reports (Fig. 3a of PMID: 28538730 and Fig. 2 of PMID: 25754247) 2 3, which showed an expression pattern similar to our finding. These data should address your concerns about Fshr expression in WAT, BAT and other organs/tissues.

      Regarding “For example, the stainings of aorta show the ZsGreen expression where elastin and collagen fibres are - these are not "cells" and therefore are not expressing ZsGreen.” We believe that you referred to the image of the aorta in Supplementary Data 2. However, please take a look at the images of the aorta in Figure 5-C, which show that the layer of ‘elastin and collagen fibres’ stains positively for EMCN and α-SMA, colocalized with Fshr expression and DAPI staining at 1000X magnification, indicating that endothelial cells and cellular membranes are present in this layer, not just ‘elastin and collagen’.

      The authors also claim:

      To functionally prove the presence of FSHR in osteoblasts/osteocytes, we also deleted FSHR in osteocytes using an inducible model. The conditional knockout of FSHR triggered a much more profound increase in bone mass and decrease in fat mass than blockade by FSHR antibodies (unpublished data).

      This would be a good control for all their images. I think it is necessary to make the large claim of extragonadal expression, as well as intragonadal such as Leydig cells.

      Thank you for this very encouraging comment. As you suggested, we added a result showing reduced Fshr expression in osteocytes from DMP1-CreERT2+:Fshrfl/fl mice treated with tamoxifen to the revised MS (Figure 3D), demonstrating that Fshr is present in osteocytes and confirming the specificity of the Fshr antibody. Furthermore, we incorporated your advice on making the ‘large claim of extragonadal and intragonadal expression of Fshr’ into the revised MS in red.

      Claiming that the under-developed Leydig cells in FSHR KO animals are due to a direct effect of the FSHR, and not via a cross-talk between Sertoli and Leydig cells, is too much of a claim. It might be speculated to some degree but as written at the moment it suggests this is "proven".

      Thank you for pointing out this incorrect claim; we apologize for it. In the revised MS, we deleted this claim.

      We also do not know if this FSHR expressed is a spliced form that would also result in the expression of ZsGreen but in a non-functional FSHR, or whether the FSHR is immediately degraded after expression. The insertion of the ZsGreen might have disturbed the epigenetics, transcription, or biosynthesis of the mRNA regulation.

      Thanks for this comment. In the revised MS, we added a new section to explain the function of the P2A peptide in the generation of a GFP reporter by sgRNA-guided site-specific knock-in of the P2A-ZsGreen vector through CRISPR/Cas9. We also provided a new comparison of Fshr expression in the testes and ovaries of Fshr-ZsGreen and B6 mice, showing equivalent Fshr expression between the two (Figure 2G), which indicates that Fshr expression is not interrupted by the insertion of the P2A vector.

      The authors should go through single-cell data of WT mice to show the existence of the FSHR transcript(s). For example here: https://www.nature.com/articles/sdata2018192

      Thank you so much for the valuable comment. Yes, we took your critical advice and checked Fshr expression through 4 single-cell portals, including DISCO, GTEx, BioGPS and Human single cell portal, and present the collected data as Supplementary Data 3 in the revised MS; these data strongly support our findings of wider Fshr expression. In particular, Fshr expression in Leydig cells is supported by scRNA-seq studies of human cells from DISCO and BioGPS, as well as by a recent study in Hu sheep (PMID: 37541020) 1, which we cited in the revised MS.

      Reviewer 2:

      Is the FSHR expression pattern affected by the knockin mice (no side-by-side comparison between wt and GSGreen mice, using in situ hybridization and ddRTPCR, at least in the gonads, is provided)?

      Thanks for the comment. In the revised MS, we provided a set of new data on Fshr expression in the testis, ovary, WAT and BAT of B6 mice by immunofluorescence staining and by RNA-smFISH, showing similar expression patterns. Additionally, we performed ddRT-PCR to compare Fshr expression in the testes and ovaries of Fshr-ZsGreen and B6 mice, demonstrating equivalent Fshr expression between Fshr-ZsGreen and B6 mice. Interestingly, we also observed significantly higher Fshr expression in the testis than in the ovary (more than 30-fold).

      Is the splicing pattern of the FSHR affected in the knockin compared to wt mice, at least in the gonads?

      Thanks for the question. Please see our reply to Reviewer 1 regarding the function of the P2A peptide used for generation of GFP reporters. Although we did not directly assess the splicing pattern, we provide a comparison of Fshr expression in Figure 2F of the revised MS, which indirectly indicates no change in the splicing pattern. In the future, we will assess the splicing pattern of Fshr, which has been neglected in the field.

      “Are there any additional off-target insertions of GSGreen in these mice?” and “Are similar results observed in separate founder mice?”

      Thanks for the questions. As described in detail in the Methods section of the MS, the Fshr-ZsGreen reporter was produced by site-specific long ssDNA recombination of the P2A-ZsGreen targeting vector into the locus between exon 10 and the stop codon by CRISPR/Cas9, guided by a site-specific single guide RNA (sgRNA). We showed the results of Southern blot, DNA sequencing and site-specific PCR, proving the site-specific insertion of P2A-ZsGreen, as shown in Figure 1. Because the recombination is site-specific, in standard practice only one founder line is required for the study, and there are no additional off-target insertions.

      How long is GSGreen half-life? Could a very long half-life be a major reason for the extremely large expression pattern observed?

      Thanks for the question. The half-life of ZsGreen, also called ZsGreen1, is at least 26 h in mammalian cells, or slightly longer due to its tetrameric structure, in contrast with the monomeric configuration of other well-known fluorescent proteins (PMID: 17510373) 4. The rationale for using this GFP protein is that ZsGreen is an exceptionally bright green fluorescent protein, up to 4X brighter than EGFP, and is ideally suited for whole-cell labelling and promoter-reporter studies, considering the high turnover and rapid degradation of the Fshr transcript. In this study, we used ZsGreen as a monitor or indicator of the active endogenous Fshr promoter, rather than as a means of measuring promoter activity. Therefore, regardless of whether it accumulates, ZsGreen driven by the Fshr promoter indicates the presence of an active Fshr promoter in the defined cells. Instead, we used ddRT-PCR to measure Fshr expression levels in this study. In addition, we also provide single-cell sequencing-based evidence from 4 public single-cell portals to support our findings of wide Fshr expression. Please see Supplementary Data 3 in the revised MS.

      References:

      (1) Su J, Song Y, Yang Y, et al. Study on the changes of LHR, FSHR and AR with the development of testis cells in Hu sheep. Anim Reprod Sci. Sep 2023;256:107306. doi:10.1016/j.anireprosci.2023.107306

      (2) Liu P, Ji Y, Yuen T, et al. Blocking FSH induces thermogenic adipose tissue and reduces body fat. Nature. Jun 1 2017;546(7656):107-112. doi:10.1038/nature22342

      (3) Liu XM, Chan HC, Ding GL, et al. FSH regulates fat accumulation and redistribution in aging through the Galphai/Ca(2+)/CREB pathway. Aging Cell. Jun 2015;14(3):409-20. doi:10.1111/acel.12331

      (4) Bell P, Vandenberghe LH, Wu D, Johnston J, Limberis M, Wilson JM. A comparative analysis of novel fluorescent proteins as reporters for gene transfer studies. J Histochem Cytochem. Sep 2007;55(9):931-9. doi:10.1369/jhc.7A7180.2007

    1. Author response:

      eLife assessment 

      This important study identifies a novel gastrointestinal enhancer of Ctnnb1. The authors present convincing evidence to support their claim that the dosage of Wnt/β-catenin signaling controlled by this enhancer is critical to intestinal epithelia homeostasis and the progression of colorectal cancers. The study will be of interest to biomedical researchers interested in Wnt signaling, tissue-specific enhancers, intestinal homeostasis, and colon cancer. 

      We greatly appreciate the editors’ and reviewers’ extensive and constructive comments and suggestions. We will do our utmost to revise the manuscript accordingly.

      Public Reviews: 

      Reviewer #1 (Public Review)

      Summary: 

      Ctnnb1 encodes β-catenin, an essential component of the canonical Wnt signaling pathway. In this study, the authors identify an upstream enhancer of Ctnnb1 responsible for the specific expression level of β-catenin in the gastrointestinal tract. Deletion of this promoter in mice and analyses of its association with human colorectal tumors support that it controls the dosage of Wnt signaling critical to the homeostasis in intestinal epithelia and colorectal cancers. 

      Strengths: 

      This study has provided convincing evidence to demonstrate the functions of a gastrointestinal enhancer of Ctnnb1 using combined approaches of bioinformatics, genomics, in vitro cell culture models, mouse genetics, and human genetics. The results support the idea that the dosage of Wnt/β-catenin signaling plays an important role in the pathophysiological functions of intestinal epithelia. The experimental designs are solid and the data presented are of high quality. This study significantly contributes to the research fields of Wnt signaling, tissue-specific enhancers, and intestinal homeostasis. 

      Weaknesses: 

      One weakness of this manuscript is an insufficient discussion on the Ctnnb1 enhancers for different tissues. For example, do specific DNA motifs and transcriptional factors contribute to the tissue-specificity of the neocortical and gastrointestinal enhancers? It is also worth discussing the potential molecular mechanisms controlling the gastrointestinal expression of Ctnnb1 in different species since the identified human and mouse enhancers don't seem to share significant similarities in primary sequences. 

      We agree with the reviewer that the manuscript lacks sufficient discussion of how enhancers control cell-type-specific expression of target genes, which is one of the most important questions in the field of transcription regulation. Equally important are the common and species-specific features of this regulation. In general, motif composition, location, order, and affinity with trans-factors within enhancers are four key elements. We will elaborate on this point in the follow-up revision.

      Reviewer #2 (Public Review): 

      Wnt signaling is the name given to a cell-communication mechanism that cells employ to inform on each other's position and identity during development. In cells that receive the Wnt signal from the extracellular environment, intracellular changes are triggered that cause the stabilization and nuclear translocation of β-catenin, a protein that can turn on groups of genes referred to as Wnt targets. Typically these are genes involved in cell proliferation. Genetic mutations that affect Wnt signaling components can therefore affect tissue expansion. Loss of function of APC is a drastic example: APC is part of the β-catenin destruction complex, and in its absence, β-catenin protein is not degraded and constitutively turns on proliferation genes, causing cancers in the colon and rectum. And here lies the importance of the finding: β-catenin has for long been considered to be regulated almost exclusively by tuning its protein turnover. In this article, a new aspect is revealed: Ctnnb1, the gene encoding for β-catenin, possesses tissue-specific regulation with transcriptional enhancers in its vicinity that drive its upregulation in intestinal stem cells. The observation that there is more active β-catenin in colorectal tumors not only because the broken APC cannot degrade it, but also because transcription of the Ctnnb1 gene occurs at higher rates, is novel and potentially game-changing. As genomic regulatory regions can be targeted, one could now envision that mutational approaches aimed at dampening Ctnnb1 transcription could be a viable additional strategy to treat Wnt-driven tumors. 

      We thank the reviewer for acknowledging the potential significance of the manuscript. We also recognize that targeting genomic regulatory regions to dampen Ctnnb1 transcription could be a promising strategy for treating Wnt-driven tumors, including many colorectal carcinomas. However, we would like to point out that there are significant technical challenges associated with AAV delivery to the GI epithelium, including the hostile environment, immune response, and low delivery efficiency.

      Reviewer #3 (Public Review): 

      The authors of this paper identify an enhancer upstream of the Ctnnb1 gene that selectively enhances expression in intestinal cells. This enhancer sequence drives expression of a reporter gene in the intestine and knockout of this enhancer attenuates Ctnnb1 expression in the intestine while protecting mice from intestinal cancers. The human counterpart of this enhancer sequence is functional and involved in tumorigenesis. Overall, this is an excellent example of how to fully characterize a cell-specific enhancer. The strength of the study is the thorough nature of the analysis and the relevance of the data to the development of intestinal tumors in both mice and humans. A minor weakness is that the loss of this enhancer does not completely compromise the expression of the Ctnnb1 gene in the intestine, suggesting that other elements are likely involved. Adding some discussion on that point would be helpful.

      We are quite encouraged by the reviewer’s positive comments. We agree with the reviewer that other cis-regulatory elements may be involved in the transcription of Ctnnb1 within the GI epithelium. It is also possible that the basal transcription of Ctnnb1 within the GI epithelium is relatively high, and that enhancers can only boost transcription within a certain range. We will discuss these possibilities in the revision.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The manuscript presents a machine-learning method to predict protein hotspot residues. The validation is incomplete, along with the misinterpretation of the results with other current methods like FTMap.

      We believe that validation is complete: The two most common techniques for testing and validating machine-learning methods are to split the dataset into either (1) a training set and a test set with a fixed ratio (e.g., 70% for training and 30% for testing) or (2) multiple subsets/folds; i.e., cross-validation. We did not employ a training set to train the model and a separate test set to evaluate its performance, as Reviewer 2 assumed. Instead, we employed cross-validation, as it helps reduce the variability in performance estimates compared to a single training/test split, and utilizes the entire dataset for training and testing, making efficient use of the limited data. Each fold was used once as a test set and the remaining folds as the training set - this process was repeated for each fold and the model's performance was measured using the F1 score. We had listed the mean validation F1 score in Table 1.

      We have clarified our comparison with FTMap - see reply to point 1 of Reviewer #1 below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The paper describes a program developed to identify PPI-hot spots using the free protein structure and compares it to FTMap and SPOTONE, two webservers that they consider as competitive approaches to the problem. On the positive side, I appreciate the effort in providing a new webserver that can be tested by the community but have two major concerns as follows.

      (1) The comparison to the FTMap program is wrong. The authors misinterpret the article they refer to, i.e., Zerbe et al. "Relationship between hot spot residues and ligand binding hot spots in protein-protein interfaces" J. Chem. Inf. Model. 52, 2236-2244, (2012). FTMap identifies hot spots that bind small molecular ligands. The Zerbe et al. article shows that such hot spots tend to interact with hot spot residues on the partner protein in a protein-protein complex (emphasis on "partner"). Thus, the hot spots identified by FTMap are not the hot spots defined by the authors. In fact, because the Zerbe paper considers the partner protein in a complex, the results cannot be compared to the results of Chen et al. This difference is missed by the authors, and hence the comparison of the FTMap is invalid. I did not investigate the comparison to SPOTONE, and hence have no opinion.

      Brenke et al. (Bioinformatics 2009 25: 621-627), who developed FTMAP, defined hot spots as regions of the binding surface that “contribute a disproportionate amount to the binding free energy”. Kozakov et al. (Proc. Natl. Acad. Sci. 2011:108, 13528-1353) used unbound protein structures as input to FTMap to predict binding hot spots for protein-protein interactions (PPIs), which are defined as regions (so-called consensus sites) on a protein surface that bind multiple probe clusters − the main hot spot is the largest consensus site binding the largest number of probe clusters. 

      Zerbe et al. (J. Chem. Inf. Model. 2012:52, 2236) noted that a consensus “site is expected to be important in any interaction that involves that region of the target independent of any partner protein.” They showed that hot spot residues found by Ala scanning not only overlapped with the probe ligands but also formed consensus sites, as shown in Figure 4. They stated that “A residue can also be identified as a hot spot by alanine scanning if it contributes to creating such a favorable binding environment by being among the residues forming a consensus site on the protein to which it belongs.”

      To clarify the comparison with FTMap in the revised version, we have added the following sentence in the Abstract on p. 3:

      “We explored the possibility of detecting PPI-hot spots using (i) FTMap in the PPI mode, which identifies hot spots on protein-protein interfaces from the free protein structure, and (ii) the interface residues predicted by AlphaFold-Multimer.”

      We have added the following sentences in the Introduction section on p. 4:

      “We explored the possibility of detecting PPI-hot spots using the FTMap server in the PPI mode, which identifies hot spots on protein-protein interfaces from free protein structures.45 These hot spots are identified by consensus sites − regions that bind multiple probe clusters.42,45,59 Such regions are deemed to be important for any interaction involving that region of the target, independent of partner protein.42 PPI-hot spots were identified as residues in van der Waals (vdW) contact with probe ligands within the largest consensus site containing the most probe clusters.”

      and in the Results section on p. 5:

      “Given the free protein structure, PPI-HotspotID and SPOTONE53 predict PPI-hot spots based on a probability threshold (> 0.5). FTMap, in the PPI mode, detects PPI-hot spots as consensus sites/regions on the protein surface that bind multiple probe clusters.59 Residues in vdW contact with probe molecules within the largest consensus site were compared with PPI-HotspotID/SPOTONE predictions.”
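
      The selection rule described above (take the consensus site that binds the most probe clusters, then keep the residues in contact with its probes) can be sketched as follows. This is only an illustration under stated assumptions: the input layout, the function name `residues_in_largest_site`, the residue labels, and the 4.0 Å contact cutoff are hypothetical and are not FTMap's actual output format or the vdW criterion used in the manuscript.

```python
import math

# Hypothetical contact cutoff in angstroms (an assumption for this sketch).
CONTACT_CUTOFF = 4.0


def residues_in_largest_site(consensus_sites, residue_atoms, cutoff=CONTACT_CUTOFF):
    """Return residue IDs in contact with probes of the largest consensus site.

    consensus_sites: list of dicts with keys 'n_clusters' (int) and
                     'probe_atoms' (list of (x, y, z) tuples).
    residue_atoms:   dict mapping residue ID -> list of (x, y, z) tuples.
    """
    # The main hot spot is the consensus site binding the most probe clusters.
    largest = max(consensus_sites, key=lambda site: site["n_clusters"])
    selected = set()
    for res_id, atoms in residue_atoms.items():
        # Keep a residue if any of its atoms lies within the cutoff of any probe atom.
        if any(math.dist(a, p) <= cutoff for a in atoms for p in largest["probe_atoms"]):
            selected.add(res_id)
    return sorted(selected)


# Toy coordinates, invented purely for illustration.
sites = [
    {"n_clusters": 16, "probe_atoms": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]},
    {"n_clusters": 5, "probe_atoms": [(20.0, 0.0, 0.0)]},
]
atoms = {
    "R1": [(3.0, 0.0, 0.0)],   # close to a probe of the largest site
    "R2": [(21.0, 0.0, 0.0)],  # close to the smaller site only
    "R3": [(0.0, 9.0, 0.0)],   # far from every probe
}
print(residues_in_largest_site(sites, atoms))
```

      The residue set returned this way is what would then be compared against the per-residue predictions of a probability-based method.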

      (2) Chen et al. use a number of usual features in a variety of simple machine-learning methods to identify hot spot residues. This approach has been used in the literature for more than a decade. Although the authors say that they were able to find only FTMap and SPOTONE as servers, there are dozens of papers that describe such a methodology. Some examples are given here: (Higa and Tozzi, 2009; Keskin, et al., 2005; Lise, et al., 2011; Tuncbag, et al., 2009; Xia, et al., 2010). There are certainly more papers. Thus, while I consider the web server as a potentially useful contribution, the paper does not provide a fundamentally novel approach.

      Our paper introduces several novel elements in our approach: 

      (1) Most PPI-hot spot prediction methods employ PPI-hotspots where mutations decrease protein binding free energy by > 2 kcal/mol (J. Chem. Inf. Model. 2022, 62, 1052). In contrast, our method incorporates not only PPI-hot spots with such binding free energy changes, but also those whose mutations have been curated in UniProtKB to significantly impair/disrupt PPIs. Because our method employs the largest collection of experimentally determined PPI-hot spots, it could uncover elusive PPI-hot spots not within binding interfaces, as well as potential PPI-hot spots for other protein partners (see point 3 below). 

      (2) Whereas most machine-learning methods for PPI-hot spot prediction focus on features derived from (i) primary sequences or (ii) protein-protein complexes, we introduce novel features such as per-residue free energy contributions derived from unbound protein structures. We further revealed the importance of one of our novel features, namely, the gas-phase energy of the target protein relative to its unfolded state and provided the physical basis for its importance. For example, PPI-hot spots can enhance favorable enthalpic contributions to the binding free energy through hydrogen bonds or van der Waals contacts across the protein’s interface. This makes them energetically unstable in the absence of the protein’s binding partner and solvent; hence providing a rationale for the importance of the gas-phase energy of the target protein relative to its unfolded state.

      (3) As a result of these novel elements, our approach, PPI-HotspotID, could identify many true positives that were not detected by FTMap or SPOTONE (see Results and Figure 1). Previous methods generally predict residues that make multiple contacts across the protein-protein interface as PPI-hot spots. In contrast, PPI-HotspotID can detect not only PPI-hot spots that make multiple contacts across the protein-protein interface, but also those lacking direct contact with the partner protein (see Discussion).

      (4) Unlike most machine-learning methods which require feature customization, data preprocessing, and model optimization, our use of AutoGluon’s AutoTabular module automates data preprocessing, model selection, hyperparameter optimization, and model evaluation. This automation reduces the need for manual intervention.

      We have revised and added the following sentences on p. 9 in the Discussion section to highlight the novelty of our approach: 

      “Here, we have introduced two novel elements that have helped to identify PPI-hot spots using the unbound structure. First, we have constructed a dataset comprising 414 experimentally known PPI-hot spots and 504 nonhot spots, and carefully checked that PPI-hot spots have no mutations resulting in ΔΔGbind < 0.5 kcal/mol, whereas nonhot spots have no mutations resulting in ΔΔGbind ≥ 0.5 kcal/mol or impact binding in immunoprecipitation or GST pull-down assays (see Methods). In contrast, SPOTONE53 employed nonhot spots defined as residues that upon alanine mutation resulted in ΔΔGbind < 2.0 kcal/mol. Notably, previous PPI-hot spot prediction methods did not employ PPI-hot spots whose mutations have been curated to significantly impair/disrupt PPIs in UniProtKB (see Introduction). Second, we have introduced novel features derived from unbound protein structures such as the gas-phase energy of the target protein relative to its unfolded state.”
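
      The labelling criteria quoted above can be written down as a small rule. The sketch below is a hedged reading of those criteria, not the curation procedure actually used: the function name `label_residue`, the "excluded" category for residues that satisfy neither definition, and the handling of values exactly at a threshold are assumptions made for illustration.

```python
HOT_DDG = 2.0     # kcal/mol; hot spot threshold cited in the response
NONHOT_DDG = 0.5  # kcal/mol; nonhot spot threshold cited in the response


def label_residue(ddg_values, curated_disruptive=False, assay_impact=False):
    """Label a residue as 'hot', 'nonhot', or 'excluded' (hypothetical helper).

    ddg_values:         measured ddG_bind values (kcal/mol) for mutations of the residue.
    curated_disruptive: True if a mutation is annotated as impairing/disrupting the PPI.
    assay_impact:       True if a mutation affected binding in immunoprecipitation
                        or GST pull-down assays.
    """
    has_hot_evidence = curated_disruptive or any(d > HOT_DDG for d in ddg_values)
    has_small_effect = any(d < NONHOT_DDG for d in ddg_values)

    # Hot spot: positive evidence and no mutation with a negligible effect.
    if has_hot_evidence and not has_small_effect:
        return "hot"

    # Nonhot spot: every measured mutation is below 0.5 kcal/mol and
    # no assay or curated annotation reports an impact on binding.
    if ddg_values and all(d < NONHOT_DDG for d in ddg_values) \
            and not (curated_disruptive or assay_impact):
        return "nonhot"

    # Ambiguous or conflicting evidence (assumed to be left out of the dataset).
    return "excluded"


print(label_residue([2.5, 3.1]))                 # consistent large effects
print(label_residue([0.1, 0.3]))                 # consistent small effects
print(label_residue([2.5, 0.2]))                 # conflicting measurements
print(label_residue([0.2], assay_impact=True))   # small ddG but assay impact
```

      Under this reading, a residue with ΔΔGbind between 0.5 and 2.0 kcal/mol falls into neither class, which is the difference from a single 2.0 kcal/mol cut.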

      Strengths:

      A new web server was developed for detecting protein-protein interaction hot spots.

      Weaknesses:

      The comparison to FTMap results is wrong. The method is not novel.

      See reply to points 1 and 2 above.

      Reviewer #2 (Public Review):

      Summary:

      The paper presents PPI-hotspot a method to predict PPI-hotspots. Overall, it could be useful but serious concerns about the validation and benchmarking of the methodology make it difficult to predict its reliability.

      Strengths:

      Develops an extended benchmark of hot-spots.

      Weaknesses:

      (1) Novelty seems to be just in the extended training set. Features and approaches have been used before.

      The novelty of our approach extends beyond just the expanded training set, as summarized in our reply to Reviewer #1, point 2 above. To our knowledge, previous studies did not leverage the gas-phase energy of the target protein relative to its unfolded state for detecting PPI-hot spots from unbound structures. Previous studies did not automate the training and validation process. In contrast, we used AutoGluon’s AutoTabular module to automate the training  of (i) individual “base” models, including LightGBM, CatBoost, XGBoost, random forests, extremely randomized trees, neural networks, and K-nearest neighbours, then (ii) multiple “stacker” models. The predictions of multiple “stacker” models were fed as inputs to additional higher layer stacker models in an iterative process called multi-layer stacking. The output layer used ensemble selection to aggregate the predictions of the stacker models. To improve stacking performance, AutoGluon used all the data for both training and validation through repeated k-fold bagging of all models at all layers of the stack, where k is determined by best precision. This comprehensive approach, including repeated k-fold bagging of all models at all layers of the stack, sets our methodology apart from previous studies, including SPOTONE (see Methods). 

      (2) As far as I can tell the training and testing sets are the same. If I am correct, it is a fatal flaw.

      The two most common techniques for testing and validating machine-learning methods are to split the dataset into either (1) a training set and a test set with a fixed ratio (e.g., 70% for training and 30% for testing) or (2) multiple subsets/folds; i.e., cross-validation. We did not employ a training set to train the model and a separate test set to evaluate its performance. Instead, we employed cross-validation, where the model was trained and evaluated multiple times. Each fold was used once as a test set and the remaining folds served as the training set - this process was repeated for each fold. For each test set, we assessed the model's performance using the F1 score. We had listed the mean validation F1 score in Table 1 in the original manuscript. Cross-validation helps reduce the variability in performance estimates compared to a single training/test split. It also utilizes the entire dataset for training and testing, making efficient use of the limited data. We have clarified this on p. 14 in the revised version:

      “AutoGluon was chosen for model training and validation due to its robustness and user-friendly interface, allowing for the simultaneous and automated exploration of various machine-learning approaches and their combinations. Instead of using a single training set to train the model and a separate test set to evaluate its performance, we employed cross-validation, as it utilizes the entire dataset for both training and testing, making efficient use of the limited data on PPI-hot spots and PPI-nonhot spots. AutoGluon-Tabular automatically chose a random partitioning of our dataset into multiple subsets/folds for training and validation. Notably, the training and validation data share insignificant homology, as the average pairwise sequence identity in our dataset is 26%. Each fold was used once as a test set, while the remaining folds served as the training set. For each test set, the model's performance was measured using the F1 score.”
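
      The cross-validation scheme described above (each fold held out once, the model refit on the remaining folds, and the F1 score averaged over folds) can be sketched in plain Python. This is a minimal illustration, not the AutoGluon pipeline: the single-feature threshold classifier, the toy data, and all function names are assumptions chosen to keep the example self-contained.

```python
import random


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples, k, seed=0):
    """Randomly partition sample indices into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_threshold(xs, ys):
    """Toy model: threshold halfway between the class means of one feature."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def cross_validate(xs, ys, k=5, seed=0):
    """Mean validation F1: every fold is the test set exactly once."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        # Fit only on the training folds, so the held-out fold is never seen in training.
        threshold = fit_threshold([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        preds = [1 if xs[i] > threshold else 0 for i in test_idx]
        scores.append(f1_score([ys[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


# Toy one-feature dataset: 10 negatives followed by 10 positives.
xs = [0.1 * i for i in range(10)] + [2.0 + 0.1 * i for i in range(10)]
ys = [0] * 10 + [1] * 10
print(f"mean validation F1 over 5 folds: {cross_validate(xs, ys, k=5):.2f}")
```

      Because the model is refit for every fold, each sample is scored by a model that did not see it during training, even though the whole dataset is eventually used for both roles.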

      (3) Comparisons should state that: SPOTONE is a sequence (only) based ML method that uses similar features but is trained on a smaller dataset. FTmap I think predicts binding sites, I don't understand how it can be compared with hot spots. Suggesting superiority by comparing with these methods is an overreach.

      In the Introduction on page 3, we had already stated that:

      “SPOTONE53 predicts PPI-hot spots from the protein sequence using residue-specific features such as atom type, amino acid (aa) properties, secondary structure propensity, and mass-associated values to train an ensemble of extremely randomized trees. The PPI-hot spot prediction methods have mostly been trained, validated, and tested on data from the Alanine Scanning Energetics database (ASEdb)55 and/or the Structural Kinetic and Energetic database of Mutant Protein Interactions (SKEMPI) 2.0 database.56”

      On p. 4, we have clarified how we used FTMAP to detect hot spots - see reply to Reviewer #1, point 1. 

      “We explored the possibility of detecting PPI-hot spots using the FTMap server in the PPI mode, which identifies hot spots on protein-protein interfaces from free protein structures.45 These hot spots are identified by consensus sites − regions that bind multiple probe clusters.42,45,59 Such regions are deemed to be important for any interaction involving that region of the target, independent of partner protein.42 PPI-hot spots were identified as residues in van der Waals (vdW) contact with probe ligands within the largest consensus site containing the most probe clusters.”

      (4) Training in the same dataset as SPOTONE, and then comparing results in targets without structure could be valuable.

      We think that the dataset used by SPOTONE is not as “clean” as ours since SPOTONE employed nonhot spots defined as aa residues that upon alanine mutation resulted in ΔΔGbind < 2.0 kcal/mol. In contrast, we define nonhot spots as residues whose mutations resulted in ΔΔGbind changes < 0.5 kcal/mol. Moreover, we carefully checked that the nonhot spots have no mutations resulting in ΔΔGbind changes ≥ 0.5 kcal/mol or impact binding in immunoprecipitation or GST pull-down assays (see Methods). We cannot compare results in targets without structure because we require the free protein structure to compute the per-residue free energy contributions.

      (5) The paper presents as validation of the prediction and experimental validation of hotspots in human eEF2. Several predictions were made but only one was confirmed, what was the overall success rate of this exercise?

      We did not test all predicted PPI-hot spots but only the PPI-hot spot with the highest probability of 0.67 (F794) and 7 other predicted PPI-hot spots that were > 12 Å from F794 as well as 4 predicted PPI-nonhot spots. Among the 13 predictions tested, F794 and the 4 predicted nonhot spots were confirmed to be correct. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Remove the comparison to FTMap, and find a more appropriate reference method, even if it requires installing programs rather than using the available web servers.

      We have clarified the comparison to FTMap in the revised manuscript - see our reply above.

    1. Author response:

      eLife assessment

      This useful study examines the neural activity in the motor cortex as a monkey reaches to intercept moving targets, focusing on how tuned single neurons contribute to an interesting overall population geometry. The presented results and analyses are solid, though the investigation of this novel task could be strengthened by clarifying the assumptions behind the single neuron analyses, and further analyses of the neural population activity and its relation to different features of behaviour.

      Thanks for recognizing the content of our research, and please stay tuned for our follow-up studies on neural dynamics during interception.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study addresses the question of how task-relevant sensory information affects activity in the motor cortex. The authors use various approaches to address this question, looking at single units and population activity. They find that there are three subtypes of modulation by sensory information at the single unit level. Population analyses reveal that sensory information affects the neural activity orthogonally to motor output. The authors then compare both single unit and population activity to computational models to investigate how encoding of sensory information at the single unit level is coordinated in a network. They find that an RNN that displays similar orbital dynamics and sensory modulation to the motor cortex also contains nodes that are modulated similarly to the three subtypes identified by the single unit analysis.

      Strengths:

      The strengths of this study lie in the population analyses and the approach of comparing single-unit encoding to population dynamics. In particular, the analysis in Figure 3 is very elegant and informative about the effect of sensory information on motor cortical activity. The task is also well designed to suit the questions being asked and well controlled.

      We appreciate these kind comments.

      It is commendable that the authors compare single units to population modulation. The addition of the RNN model and perturbations strengthen the conclusion that the subtypes of individual units all contribute to the population dynamics. However, the subtypes (PD shift, gain, and addition) are not sufficiently justified. The authors also do not address that single units exhibit mixed modulation, but RNN units are not treated as such.

      We’re sorry for not providing sufficient grounds for introducing the subtypes. We chose PD shift, gain, and addition as pertinent subtypes based on the classical cosine tuning model (Georgopoulos et al., 1982) and with reference to several gain modulation studies (e.g. Pesaran et al. 2010, Bremner and Andersen, 2012). Here, we applied this subtype analysis as a criterion to identify the modulation in the neuronal population rather than to sort neurons into distinct cell types. We will update the Methods in the revised version of the manuscript.
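
      To make the three subtypes concrete, the sketch below starts from the classical cosine tuning form, rate = baseline + gain * cos(theta - PD), and lets target speed act on one parameter at a time. It is an illustration only: the linear dependence on speed, the modulation strength `k`, and the parameter values are assumptions and are not the model fitted in the manuscript.

```python
import math


def cosine_tuning(theta, baseline, gain, preferred_direction):
    """Classical cosine tuning: rate = baseline + gain * cos(theta - PD)."""
    return baseline + gain * math.cos(theta - preferred_direction)


def modulated_rate(theta, speed, kind, baseline=10.0, gain=5.0, pd=0.0, k=0.1):
    """Firing rate under one hypothetical target-speed modulation.

    theta: reach direction in radians.
    speed: signed target speed (arbitrary units).
    kind:  'pd_shift', 'gain', or 'addition'.
    k:     hypothetical modulation strength per unit speed.
    """
    if kind == "pd_shift":   # speed rotates the preferred direction
        return cosine_tuning(theta, baseline, gain, pd + k * speed)
    if kind == "gain":       # speed scales the depth of tuning
        return cosine_tuning(theta, baseline, gain * (1.0 + k * speed), pd)
    if kind == "addition":   # speed offsets the baseline firing rate
        return cosine_tuning(theta, baseline + k * speed, gain, pd)
    raise ValueError(f"unknown modulation kind: {kind}")


# Compare the three modulations at the unmodulated preferred direction (theta = 0).
for kind in ("pd_shift", "gain", "addition"):
    print(kind, round(modulated_rate(0.0, speed=2.0, kind=kind), 3))
```

      A unit with mixed modulation would correspond to speed acting on more than one of these parameters at once, which is consistent with treating the subtypes as a descriptive criterion rather than discrete cell classes.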

      Weaknesses:

      The main weaknesses of the study lie in the categorization of the single units into PD shift, gain, and addition types. The single units exhibit clear mixed selectivity, as the authors highlight. Therefore, the subsequent analyses looking only at the individual classes in the RNN are a little limited. Another weakness of the paper is that the choice of windows for analyses is not properly justified and the dependence of the results on the time windows chosen for single-unit analyses is not assessed. This is particularly pertinent because tuning curves are known to rotate during movements (Sergio et al. 2005 Journal of Neurophysiology).

      Mixed selectivity, or more precisely mixed modulation, is indeed a significant feature of the neuronal population in the present study. The purpose of the subtype analysis was to serve as a criterion for the potential modulation mechanisms. However, the results appear to form a spectrum rather than clusters. The analysis still offers some insight into the distribution of modulation, and we will refine the description in the next version. In the current version, we examined single-unit tuning and the population neural state with sliding windows, focusing on the period around movement onset (MO) because of the emergence of a ring-like structure. We will clarify the choice of windows and assess the dependence of the results on them in the next version. It’s a great suggestion to consider the role of rotating tuning curves in neural dynamics during interception.

      This paper shows sensory information can affect motor cortical activity whilst not affecting motor output. However, it is not the first to do so and fails to cite other papers that have investigated sensory modulation of the motor cortex (Stavinksy et al. 2017 Neuron, Pruszynski et al. 2011 Nature, Omrani et al. 2016 eLife). These studies should be mentioned in the Introduction to capture better the context around the present study. It would also be beneficial to add a discussion of how the results compare to the findings from these other works.

      Thanks for the reminder. We will introduce the relevant research in the next version of the manuscript.

      This study also uses insights from single-unit analysis to inform mechanistic models of these population dynamics, which is a powerful approach, but is dependent on the validity of the single-cell analysis, which I have expanded on below.

      I have clarified some of the areas that would benefit from further analysis below:

      (1) Task:

      The task is well designed, although it would have benefited from perhaps one more target speed (for each direction). One monkey appears to have experienced one more target speed than the others (seen in Figure 3C). It would have been nice to have this data for all monkeys.

      Great suggestion! However, it’s hard to implement as the implanted arrays have been removed.

      (2) Single unit analyses:

      In some analyses, the effects of target speed look more driven by target movement direction (e.g. Figures 1D and E). To confirm target speed is the main modulator, it would be good to compare how much more variance is explained by models including speed rather than just direction. More target speeds may have been helpful here too.

      Nice suggestion! The goodness of fit of the simple model (motor direction only) is much lower than that of the complex model (including target speed). We will update the results in the next version.

      The choice of the three categories (PD shift, gain addition) is not completely justified in a satisfactory way. It would be nice to see whether these three main categories are confirmed by unsupervised methods.

      A good point. We will try unsupervised methods.

      The decoder analyses in Figure 2 provide evidence that target speed modulation may change over the trial. Therefore, it is important to see how the window considered for the firing rate in Figure 1 (currently 100ms pre - 100ms post movement onset) affects the results.

      Thanks for the suggestion and close reading. We will test the decoder in other epochs.

      (3) Decoder:

      One feature of the task is that the reach endpoints tile the entire perimeter of the target circle (Figure 1B). However, this feature is not exploited for much of the single-unit analyses. This is most notable in Figure 2, where the use of a SVM limits the decoding to discrete values (the endpoints are divided into 8 categories). Using continuous decoding of hand kinematics would be more appropriate for this task.

      This is a very reasonable suggestion. In this study, we discretized the reach direction as in previous studies (Li et al., 2018 & 2022) and considered discrete decoding sufficient to show the interaction of sensory and motor variables. In future studies, we will try continuous decoding of hand kinematics.

      (4) RNN:

      Mixed selectivity is not analysed in the RNN, which would help to compare the model to the real data where mixed selectivity is common. Furthermore, it would be informative to compare the neural data to the RNN activity using canonical correlation or Procrustes analyses. These would help validate the claim of similarity between RNN and neural dynamics, rather than allowing comparisons to be dominated by geometric similarities that may be features of the task. There is also an absence of alternate models to compare the perturbation model results to.

      Thank you for these helpful suggestions. We will perform decoding analysis on RNN units to verify whether sensory and motor variables interact as in the real data, and we will also carry out the canonical correlation or Procrustes analysis.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Zhang et al. examine neural activity in the motor cortex as monkeys make reaches in a novel target interception task. Zhang et al. begin by examining the single neuron tuning properties across different moving target conditions, finding several classes of neurons: those that shift their preferred direction, those that change their modulation gain, and those that shift their baseline firing rates. The authors go on to find an interesting, tilted ring structure of the neural population activity, depending on the target speed, and find that (1) the reach direction has consistent positioning around the ring, and (2) the tilt of the ring is highly predictive of the target movement speed. The authors then model the neural activity with a single neuron representational model and a recurrent neural network model, concluding that this population structure requires a mixture of the three types of single neurons described at the beginning of the manuscript.

      Strengths:

      I find the task the authors present here to be novel and exciting. It slots nicely into an overall trend to break away from a simple reach-to-static-target task to better characterize the breadth of how the motor cortex generates movements. I also appreciate the movement from single neuron characterization to population activity exploration, which generally serves to anchor the results and make them concrete. Further, the orbital ring structure of population activity is fascinating, and the modeling work at the end serves as a useful baseline control to see how it might arise.

      Thank you for recognizing our work.

      Weaknesses:

      While I find the behavioral task presented here to be excitingly novel, I find the presented analyses and results to be far less interesting than they could be. Key to this, I think, is that the authors are examining this task and related neural activity primarily with a single-neuron representational lens. This would be fine as an initial analysis since the population activity is of course composed of individual neurons, but the field seems to have largely moved towards a more abstract "computation through dynamics" framework that has, in the last several years, provided much more understanding of motor control than the representational framework has. As the manuscript stands now, I'm not entirely sure what interpretation to take away from the representational conclusions the authors made (i.e. the fact that the orbital population geometry arises from a mixture of different tuning types). As such, by the end of the manuscript, I'm not sure I understand any better how the motor cortex or its neural geometry might be contributing to the execution of this novel task.

      The present study shows sensory modulation of motor tuning in single units and of the neural state during the motor execution period. Unfortunately, the findings are constrained to certain time windows. We are still working on this topic and hope to address related questions in follow-up studies.

      Main Comments:

      My main suggestions to the authors revolve around bringing in the computation-through-dynamics framework to strengthen their population results. The authors cite the Vyas et al. review paper on the subject, so I believe they are aware of this framework. I have three suggestions for improving or adding to the population results:

      (1) Examination of delay period activity: one of the most interesting aspects of the task was the fact that the monkey had a random-length delay period before he could move to intercept the target. Presumably, the monkey had to prepare to intercept at any time between 400 and 800 ms, which means that there may be some interesting preparatory activity dynamics during this period. For example, after 400ms, does the preparatory activity rotate with the target such that once the go cue happens, the correct interception can be executed? There is some analysis of the delay period population activity in the supplement, but it doesn't quite get at the question of how the interception movement is prepared. This is perhaps the most interesting question that can be asked with this experiment, and it's one that I think may be quite novel for the field--it is a shame that it isn't discussed.

      Great idea! We are working on this and are close to completing the puzzle.

      (2) Supervised examination of population structure via potent and null spaces: simply examining the first three principal components revealed an orbital structure, with a seemingly conserved motor output space and a dimension orthogonal to it that relates to the visual input. However, the authors don't push this insight any further. One way to do that would be to find the "potent space" of motor cortical activity by regression to the arm movement and examine how the tilted rings look in that space (this is actually fairly easy to see in the reach direction components of the dPCA plot in the supplement--the rings will be highly aligned in this space). Presumably, then, the null space should contain information about the target movement. dPCA shows that there's not a single dimension that clearly delineates target speed, but the ring tilt is likely evident if the authors look at the highest variance neural dimension orthogonal to the potent space (the "null space")--this is akin to PC3 in the current figures, but it would be nice to see what comes out when you look in the data for it.

      Nice suggestion. Target-speed modulation mainly influences PC3, which is consistent with the ‘null space’ hypothesis. We will try other dimensionality-reduction methods (e.g. dPCA, Manopt) to determine the potent and null spaces.
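
      The potent/null decomposition discussed above can be illustrated with a minimal sketch. It assumes a linear readout from neural state to hand velocity; the readout matrix, the four-neuron state, and all function names are hypothetical and are not taken from the manuscript. In practice the readout would be estimated by regressing hand velocity on neural activity, as the reviewer suggests.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormalize(rows, tol=1e-12):
    """Gram-Schmidt: return an orthonormal basis for the span of `rows`."""
    basis = []
    for r in rows:
        v = list(r)
        for b in basis:
            c = dot(v, b)
            v = [x - c * y for x, y in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        if norm > tol:  # skip rows that add no new direction
            basis.append([x / norm for x in v])
    return basis

def split_potent_null(state, readout):
    """Split one neural state into output-potent and output-null parts."""
    potent = [0.0] * len(state)
    for b in orthonormalize(readout):
        c = dot(state, b)
        potent = [p + c * y for p, y in zip(potent, b)]
    null = [s - p for s, p in zip(state, potent)]
    return potent, null

# Hypothetical readout: 2 outputs (hand velocity x, y) from 4 neurons.
READOUT = [[1.0, 0.0, 1.0, 0.0],
           [0.0, 1.0, 0.0, 1.0]]

def apply_readout(state):
    return [dot(row, state) for row in READOUT]

state = [0.8, -0.3, 0.2, 0.5]  # hypothetical firing-rate vector
potent, null = split_potent_null(state, READOUT)
```

      Under these assumptions, the null component produces zero output through the readout, so any target-speed signal that leaves hand velocity unchanged would have to live there; running PCA on the null components across conditions is one way to look for the ring tilt.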

      (3) RNN perturbations: as it's currently written, the RNN modeling has promise, but the perturbations performed don't provide me with much insight. I think this is because the authors are trying to use the RNN to interpret the single neuron tuning, but it's unclear to me what was learned from perturbing the connectivity between what seems to me almost arbitrary groups of neurons (especially considering that 43% of nodes were unclassifiable). It seems to me that a better perturbation might be to move the neural state before the movement onset to see how it changes the output. For example, the authors could move the neural state from one tilted ring to another to see if the virtual hand then reaches a completely different (yet predictable) target. Moreover, if the authors can more clearly characterize the preparatory movement, perhaps perturbations in the delay period would provide even more insight into how the interception might be prepared.

      We are sorry that we did not clarify the definition of the “none” type, which can be misleading. The 43% of unclassified nodes includes inactive nodes; when only active (task-related) nodes are included, the ratio of unclassified nodes is much lower. By perturbing the connectivity, we intended to explore the interaction between different modulations.

      Thank you for the great advice. We tried moving neural states from one ring to another without changing the directional cluster, but this perturbation didn’t have a significant influence on network performance as expected. We will check this result again and try perturbations in the delay period.

      Reviewer #3 (Public Review):

      Summary:

      This experimental study investigates the influence of sensory information on neural population activity in M1 during a delayed reaching task. In the experiment, monkeys are trained to perform a delayed interception reach task, in which the goal is to intercept a potentially moving target.

      This paradigm allows the authors to investigate how, given a fixed reach endpoint (which is assumed to correspond to a fixed motor output), the sensory information regarding the target motion is encoded in neural activity.

      At the level of single neurons, the authors found that target motion modulates the activity in three main ways: gain modulation (scaling of the neural activity depending on the target direction), shift (shift of the preferred direction of neurons tuned to reach direction), or addition (offset to the neural activity).

      At the level of the neural population, target motion information was largely encoded along the 3rd PC of the neural activity, leading to a tilt of the manifold along which reach direction was encoded that was proportional to the target speed. The tilt of the neural manifold was found to be largely driven by the variation of activity of the population of gain-modulated neurons.

      Finally, the authors studied the behaviour of an RNN trained to generate the correct hand velocity given the sensory input and reach direction. The RNN units were found to similarly exhibit mixed selectivity to the sensory information, and the geometry of the « neural population » resembled that observed in the monkeys.

      Strengths:

      - The experiment is well set up to address the question of how sensory information that is directly relevant to the behaviour but does not lead to a direct change in behavioural output modulates motor cortical activity.

      - The finding that sensory information modulates the neural activity in M1 during motor preparation and execution is non trivial, given that this modulation of the activity must occur in the nullspace of the movement.

      - The paper gives a complete picture of the effect of the target motion on neural activity, by including analyses at the single neuron level as well as at the population level. Additionally, the authors link those two levels of representation by highlighting how gain modulation contributes to shaping the population representation.

      Thanks for your recognition.

      Weaknesses:

      - One of the main premises of the paper is the fact that the motor output for a given reach point is preserved across different target motions. However, as the authors briefly mention in the conclusion, they did not record muscle activity during the task, but only hand velocity, making it impossible to directly verify how preserved muscle patterns were across movements. While the authors highlight that they did not see any difference in their results when resampling the data to control for similar hand velocities across conditions, this seems like an important potential caveat of the paper whose implications should be discussed further or highlighted earlier in the paper.

      Thanks for the suggestion. We will highlight the resampling results as an important control in the next version of the manuscript.

      - The main takeaway of the RNN analysis is not fully clear. The authors find that an RNN trained given a sensory input representing a moving target displays modulation to target motion that resembles what is seen in real data. This is interesting, but the authors do not dissect why this representation arises, and how robust it is to various task design choices. For instance, it appears that the network should be able to solve the task using only the motion intention input, which contains the reach endpoint information. If the target motion input is not used for the task, it is not obvious why the RNN units would be modulated by this input (especially as this modulation must lie in the nullspace of the movement hand velocity if the velocity depends only on the reach endpoint). It would thus be important to see alternative models compared to true neural activity, in addition to the model currently included in the paper. Besides, for the model in the paper, it would therefore be interesting to study further how the details of the network setup (eg initial spectral radius of the connectivity, weight regularization, or using only the target position input) affect the modulation by the motion input, as well as the trained population geometry and the relative ratios of modulated cells after training.

      Great suggestions. It is a pity that we did not dissect why this representation forms and which factors influence it in the current version. We have previously tried several combinations of inputs: in the network that received only motor intention and GO inputs, there were rings but no tilting related to target speed; in the network that received only target location and GO inputs, there were ring-like structures but no clear directional clusters. We will check these results and try alternative models in the next version. In future studies, we will examine the influence of network setup details.

      - Additionally, it is unclear what insights are gained from the perturbations to the network connectivity the authors perform, as it is generally expected that modulating the connectivity will degrade task performance and the geometry of the responses. If the authors wish to make claims about the role of the subpopulations, it could be interesting to test whether similar connectivity patterns develop in networks that are not initialized with an all-to-all random connectivity or to use ablation experiments to investigate whether the presence of multiple types of modulations confers any sort of robustness to the network.

      Thank you for the great suggestions. With the perturbations, we intended to explore the contribution of interactions between certain subpopulations. We tried ablation experiments, but the result was not significant. This is probably because most units had mixed selectivity, so units with only a single type of modulation were too few for bootstrapping, or because random sampling from a single subpopulation (bearing mixed selectivity) could be repeated. We will consider these suggestions carefully in the revised version.

      - The results suggest that the observed changes in motor cortical activity with target velocity result from M1 activity receiving an input that encodes the velocity information. This also appears to be the assumption in the RNN model. However, even though the input shown to the animal during preparation is indeed a continuously moving target, it appears that the only relevant quantity to the actual movement is the final endpoint of the reach. While this would have to be a function of the target velocity, one could imagine that the computation of where the monkeys should reach might be performed upstream of the motor cortex, in which case the actual target velocity would become irrelevant to the final motor output. This makes the results of the paper very interesting, but it would be nice if the authors could discuss further when one might expect to see modulation by sensory information that does not directly affect motor output in M1, and where those inputs may come from. It may also be interesting to discuss how the findings relate to previous work that has found behaviourally irrelevant information is being filtered out from M1 (for instance, Russo et al, Neuron 2020 found that in monkeys performing a cycling task, context can be decoded from SMA but not from M1, and Wang et al, Nature Communications 2019 found that perceptual information could not be decoded from PMd)?

      How and where sensory information modulates M1 are very interesting and open questions. We will discuss this topic further in the next version.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Semenova et al. have studied a large cross-sectional cohort of people living with HIV on suppressive ART, N=115, and performed high dimensional flow cytometry to then search for associations between immunological and clinical parameters and intact/total HIV DNA levels.

      A number of interesting data science/ML approaches were explored on the data and the project seems a serious undertaking. However, like many other studies that have looked for these kinds of associations, there was not a very strong signal. Of course, the goal of unsupervised learning is to find new hypotheses that aren't obvious to human eyes, but I felt in that context, there were (1) results slightly oversold, (2) some questions about methodology in terms mostly of reservoir levels, and (3) results were not sufficiently translated back into meaning in terms of clinical outcomes.

      We appreciate the reviewer’s perspective.  In our revised version of the manuscript, we have attempted to address these concerns by more adequately explaining the limitations of the study and by more thoroughly discussing the context of the findings.  We are not able to associate the findings with specific clinical outcomes for individual study participants but we speculate about the overall biological meaning of these associations across the cohort.  We cannot disagree with the reviewer, but we find the associations statistically significant, potentially reflecting real biological associations, and forming the basis for future hypothesis testing research. 

      Strengths:

      The study is evidently a large and impressive undertaking and combines many cutting-edge statistical techniques with a comprehensive experimental cohort of people living with HIV, notably inclusive of populations underrepresented in HIV science. A number of intriguing hypotheses are put forward that could be explored further. Sharing the data could create a useful repository for more specific analyses.

      We thank the reviewer for this assessment.

      Weaknesses:

      Despite the detailed experiments and methods, there was not a very strong signal for the variable(s) predicting HIV reservoir size. The Spearman coefficients are ~0.3, (somewhat weak, and acknowledged as such) and predictive models reach 70-80% prediction levels, though sometimes categorical variables are challenging to interpret.

      We agree with the reviewer that individual parameters are only weakly correlated with the HIV reservoir, likely reflecting the complex and multi-factorial nature of reservoir/immune cell interactions.  Nevertheless, these associations are statistically significant and form the basis for functional testing in viral persistence.

      There are some questions about methodology, as well as some conclusions that are not completely supported by results, or at minimum not sufficiently contextualized in terms of clinical significance.  On associations: the false discovery rate correction was set at 5%, but the data appear underdetermined, with fewer observations than variables (144 variables > 115 participants), and it isn't always clear if/when variables are related (e.g., inverses of one another, for instance, %CD4 and %CD8).

      When deriving a list of cell populations whose frequency would be correlated with the reservoir, we focused on well-defined cell types for which functional validation exists in the literature to consider them as distinct cell types.  For many of the populations, gating based on combinations of multiple markers leads to recovery of very few cells, and so we excluded some potential combinations from the analysis.  We are also making our raw data available for others to examine and find associations not considered by our manuscript.
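
      For readers unfamiliar with the 5% false discovery rate correction mentioned by the reviewer, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure. This is an assumption for illustration: the text above does not name the correction method the authors used, and the p-values below are invented.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected

# Invented p-values for eight immune parameters tested against reservoir size.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
flags = benjamini_hochberg(p_vals, q=0.05)
```

      In this invented example only the two smallest p-values survive correction, even though five are below 0.05 on their own. Note that the standard procedure assumes independent or positively dependent tests, which is relevant to the reviewer's point about related variables such as %CD4 and %CD8.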

      The modeling of reservoir size was unusual; typically, intact and defective HIV DNA are analyzed on a log10 scale (both for decays and predicting rebound). Also, sometimes in this analysis levels are normalized (presumably to max/min?, e.g. S5), and given the large within-host variation in levels we see in other works, it is not trivial to predict any downstream impact of normalization across population vs within-person.

      We have repeated the analysis using log10 transformed data and the new figures are shown in Figure 1 and S2-S5.
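
      To make the difference between the two scalings concrete, here is a small sketch comparing a log10 transform with min-max normalization on the raw scale. The reservoir values are hypothetical, and min-max is only the reviewer's guess at the normalization used.

```python
import math

def log10_transform(values):
    """Map each positive value to its base-10 logarithm."""
    return [math.log10(v) for v in values]

def min_max(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical intact HIV DNA levels spanning three orders of magnitude.
levels = [10.0, 100.0, 1000.0, 10000.0]

logged = log10_transform(levels)   # evenly spaced on the log scale
scaled = min_max(levels)           # lower values are compressed toward 0
```

      With values spanning several orders of magnitude, min-max scaling on the raw scale pushes all but the largest values close to zero, whereas the log10 transform spaces them evenly. This is one reason the choice of scale can change which associations are detectable.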

      Also, the qualitative characterization of low/high reservoir is not standard and naturally will split by early/later ART if done as above/below median. Given the continuous nature of these data, it seems throughout that predicting above/below median is a little hard to translate into clinical meaning.

      Our ML models included time before ART as a variable in the analysis, and this was not found to be a significant driver of the reservoir size associations, except for the percentage of intact proviruses (see Figure 2C). Furthermore, we analyzed whether any of the reservoir correlated immune variables were associated with time on ART and found that, although some immune variables are associated with time on therapy, this was not the case for most of them (Table S4). We agree that it is challenging to translate above or below median into clinical meaning for this cohort, but we emphasize that this study is primarily a hypothesis generating approach requiring additional validation for the associations observed.  We attempted to predict reservoir size as a continuous variable using the data and this approach was not successful (Figure S13). We believe that a significantly larger cohort will likely be required to generate a ML model that can accurately predict the reservoir as a continuous variable.  We have added additional discussion of this to the manuscript.
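
      The above/below-median labeling discussed here can be sketched as follows. The values are hypothetical log10 reservoir sizes, and the function name is illustrative rather than taken from the authors' code.

```python
from statistics import median

def median_split(values):
    """Label each value 1 if it lies above the cohort median, else 0."""
    cutoff = median(values)
    return [1 if v > cutoff else 0 for v in values]

# Hypothetical log10 reservoir sizes for six participants.
log10_reservoir = [1.2, 1.9, 2.4, 2.5, 2.6, 3.8]
labels = median_split(log10_reservoir)
```

      In this toy cohort, participants at 2.4 and 2.5 receive different labels although they are nearly identical, while 2.5 and 3.8 share a label despite differing more than tenfold on the raw scale. This is the loss of quantitative information that the reviewers raise, and it is the trade-off accepted when continuous prediction performs poorly.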

      Lastly, the work is comprehensive and appears solid, but the code was not shared to see how calculations were performed.

      We now provide a link to the code used to perform the analyses in the manuscript, https://github.com/lesiasemenova/ML_HIV_reservoir.

      Reviewer #2 (Public Review):

      Summary:

      Semenova et al. performed a cross-sectional analysis of host immunophenotypes (using flow cytometry) and the peripheral CD4+ T cell HIV reservoir size (using the Intact Proviral DNA Assay, IPDA) from 115 people with HIV (PWH) on ART. The study mostly highlights the machine learning methods applied to these host and viral reservoir datasets but fails to translate these complex analyses into (clinically, biologically) interpretable findings. For these reasons, the direct translational take-home message from this work is lost amidst a large list of findings (shown as clusters of associated markers), and sentences such as "this study highlights the utility of machine learning approaches to identify otherwise imperceptible global patterns" lead to overinterpretation of their data.

      We have addressed the reviewer’s concern by modifications to the manuscript that enhance the interpretation of the findings in a clinical and biological context.

      Strengths:

      Measurement of host immunophenotyping measures (multiparameter flow cytometry) and peripheral HIV reservoir size (IPDA) from 115 PWH on ART.

      Major Weaknesses:

      (1) Overall, there is little to no interpretability of their machine learning analyses; findings appear as a "laundry list" of parameters with no interpretation of the estimated effect size and directionality of the observed associations. For example, Figure 2 might actually give an interpretation of the form: for each X increase in an immunophenotyping parameter, we saw a Y increase/decrease in the HIV reservoir measure.

      We have added additional text to the manuscript in which we attempt to provide more immunological and clinical interpretation of the associations.  We also have emphasized that these associations are still speculative and will require additional validation.  Nevertheless, our data should provide a rich source of new hypotheses regarding immune system/reservoir interaction that could be tested in future work.

      (2) The correlations all appear to be relatively weak, with most Spearman R in the 0.30 range or so.

      We agree with the reviewer that the associations are mostly weak, consistent with previous studies in this area.  This likely is an inherent feature of the underlying biology – the reservoir is likely associated with the immune system in complex ways and involves stochastic processes that will limit the predictability of reservoir size using any single immune parameter. We have added additional text to the manuscript to make this point clearer.

      (3) The Discussion needs further work to help guide the reader. The sentence: "The correlative results from this present study corroborate many of these studies, and provide additional insights" is broad. The authors should spend some time here to clearly describe the prior literature (e.g., describe the strength and direction of the association observed in prior work linking PD-1 and HIV reservoir size, as well as specify which type of HIV reservoir measures were analyzed in these earlier studies, etc.) and how the current findings add to or are in contrast to those prior findings.

      We have added additional text to the manuscript to help guide the readers through the possible biological significance of the findings and the context with respect to prior literature.

      (4) The most interesting finding is buried on page 12 in the Discussion: "Uniquely, however, CD127 expression on CD4 T cells was significantly inversely associated with intact reservoir frequency." The authors should highlight this in the abstract, and title, and move this up in the Discussion. The paper describes a very high dimensional analysis and the key takeaways are not clear; the more the author can point the reader to the take-home points, the better their findings can have translatability to future follow-up mechanistic and/or validation studies.

      We appreciate the reviewer’s comment.  We have increased the emphasis on this finding in the revised version of the manuscript.

      (5) The authors should avoid overinterpretation of these results. For example in the Discussion on page 13 "The existence of two distinct clusters of PWH with different immune features and reservoir characteristics could have important implications for HIV cure strategies - these two groups may respond differently to a given approach, and cluster membership may need to be considered to optimize a given strategy." It is highly unlikely that future studies will be performing the breadth of parameters resulting here and then use these directly for optimizing therapy.

      Our analyses indicate that membership of study participants in cluster 1 or cluster 2 can be fairly accurately determined by a small number of individual parameters (KLRG1 etc, Figure 4F), and measuring the cells of PWH with the degree of breadth used in this paper would not be necessary to classify PWH into these clusters.  As such, we feel that it is not unrealistic to speculate that this finding could turn out to be clinically useful, if it becomes clear that the clusters are biologically meaningful.

      (6) There are only TWO limitations listed here: cross-sectional study design and the use of peripheral blood samples. (The subsequent paragraph notes an additional weakness which is misclassification of intact sequences by IPDA). This is a very limited discussion and highlights the need to more critically evaluate their study for potential weaknesses.

      We have expanded on the list of limitations discussed in the manuscript. In particular, we now address the size of the cohort, the composition with respect to different genders and demographics, lack of information for the timing of ART and the lack of information regarding intracellular transcriptional pathways.

      (7) A major clinical predictor of HIV reservoir size and decay is the timing of ART initiation. The authors should include these (as well as other clinical covariate data - see #12 below) in their analyses and/or describe as limitations of their study.

      All of the participants that make up our cohort were treated during chronic infection, and the precise timing of ART initiation is unclear in most of these cases.  We have added additional information to explain this in the manuscript and include this in the list of limitations.

      Reviewer #3 (Public Review):

      Summary:

      This valuable study by Semenova and colleagues describes a large cross-sectional cohort of 115 individuals on ART. Participants contributed a single blood sample which underwent IPDA, and 25-color flow with various markers (pre and post-stimulation). The authors then used clustering, decision tree analyses, and machine learning to look for correlations between these immunophenotypic markers and several measures of HIV reservoir volume. They identified two distinct clusters that can be somewhat differentiated based on total HIV DNA level, intact HIV DNA level, and multiple T cell cellular markers of activation and exhaustion.

      The conclusions of the paper are supported by the data but the relationships between independent and dependent variables in the models are correlative with no mechanistic work to determine causality. It is unclear in most cases whether confounding variables could explain these correlations. If there is causality, then the data is not sufficient to infer directionality (ie does the immune environment impact the HIV reservoir or vice versa or both?). In addition, even with sophisticated and appropriate machine learning approaches, the models are not terribly predictive or highly correlated. For these reasons, the study is very much hypothesis-generating and will not impact cure strategies or HIV reservoir measurement strategies in the short term.

      We appreciate the reviewer’s comments regarding the value of our study.  We fully acknowledge that the causal nature and directionality of these associations are not yet clear and agree that the study is primarily hypothesis generating in nature.  Nevertheless, we feel that the hypotheses generated will be valuable to the field.  We have added additional text to the manuscript to emphasize the hypothesis generating nature of this paper.

      Strengths:

      The study cohort is large and diverse in terms of key input variables such as age, gender, and duration of ART. Selection of immune assays is appropriate. The authors used a wide array of bioinformatic approaches to examine correlations in the data. The paper was generally well-written and appropriately referenced.

      Weaknesses:

      (1) The major limitation of this work is that it is highly exploratory and not hypothesis-driven. While some interesting correlations are identified, these are clearly hypothesis-generating based on the observational study design.

      We agree that the major goal of this study was hypothesis generating and that our work is exploratory in nature. Performing experiments with mechanism testing goals in human participants with HIV is challenging.  Additionally, before such mechanistic studies can be undertaken, one must have hypotheses to test. As such we feel our study will be useful for the field in helping to identify hypotheses that could potentially be tested.

      (2) The study's cross-sectional nature limits the ability to make mechanistic inferences about reservoir persistence. For instance, it would be very interesting to know whether the reservoir cluster is a feature of an individual throughout ART, or whether this outcome is dynamic over time.

      We agree with the reviewer’s comment. Longitudinal studies are challenging to carry out with a study cohort of this size, and addressing questions such as the one raised by the reviewer would be of great interest. We believe our study nevertheless has value in identifying hypotheses that could be tested in a longitudinal study.

      (3) A fundamental issue is that I am concerned that binarizing the 3 reservoir metrics in a 50/50 fashion is for statistical convenience. First, by converting a continuous outcome into a simple binary outcome, the authors lose significant amounts of quantitative information. Second, the low and high reservoir outcomes are not actually demonstrated to be clinically meaningful: I presume that both contain many (?all) data points above levels where rebound would be expected soon after interruption of ART. Reservoir levels would also have no apparent outcome on the selection of cure approaches. Overall, dividing at the median seems biologically arbitrary to me.

      The reviewer raises a valid point that the clinical significance of above or below median reservoir metrics is unclear, and that the size of the reservoir has potentially little relation to rebound and cure approaches.  In the manuscript, we attempted to generate models that can predict reservoir size as a continuous variable in Figure S13 and find that this approach performs poorly, while a binarized approach was more successful. As such we have included both approaches in the manuscript.  It is possible that future studies with larger sample sizes and more detailed measurements will perform better for continuous variable prediction.  While this is a fairly large study (n=115) by the standards of HIV reservoir analyses, it is a small study by the standards of the machine learning field, and accurate predictive ML models for reservoir size as a continuous variable will likely require a much larger set of samples/participants.  Nevertheless, we feel our work has value as a template for ML approaches that may be informative for understanding HIV/immune interactions and generates novel hypotheses that could be validated by subsequent studies.

      (4) The two reservoir clusters are of potential interest as high total and intact with low % intact are discriminated somewhat by immune activation and exhaustion. This was the most interesting finding to me, but it is difficult to know whether this clustering is due to age, time on ART, other co-morbidity, ART adherence, or other possible unmeasured confounding variables.

      We agree that this finding is one of the more interesting outcomes of the study. We examined a number of these variables for association with cluster membership, and these data are reported in Figure S8A-D.  Age, years of ART and CD4 Nadir were all clearly different between the clusters.   The striking feature of this clustering, however, is the clear separation between the two groups of participants, as opposed to a continuous gradient of phenotypes.  This could reflect a bifurcation of outcomes for people with HIV, dynamic changes in the reservoir immune interactions over time, or different levels of untreated infection.  It is certainly possible that some other unmeasured confounding variables contribute to this outcome and we have attempted to make this limitation clearer.

      (5) At the individual level, there is substantial overlap between clusters according to total, intact, and % intact between the clusters. Therefore, the claim in the discussion that these 2 cluster phenotypes may require different therapeutic approaches seems rather speculative. That said, the discussion is very thoughtful about how these 2 clusters may develop with consideration of the initial insult of untreated infection and / or differences in immune recovery.

      We agree with the reviewer that this claim is speculative, and we have attempted to moderate the language of the text in the revised version.

      (6) The authors state that the machine learning algorithms allow for reasonable prediction of reservoir volume. It is subjective, but to me, 70% accuracy is very low. This is not a disappointing finding per se. The authors did their best with the available data. It is informative that the machine learning algorithms cannot reliably discriminate reservoir volume despite substantial amounts of input data. This implies that either key explanatory variables were not included in the models (such as viral genotype, host immune phenotype, and comorbidities) or that the outcome for testing the models is not meaningful (which may be possible with an arbitrary 50/50 split in the data relative to median HIV DNA volumes: see above).

      We acknowledge that the predictive power of the models generated from these data is modest and we have clarified this point in the revised manuscript. As the reviewer indicates, this may result from the influence of unmeasured variables and possible stochastic processes. The data may thus demonstrate a limit to the predictability of reservoir size which may be inherent to the underlying biology. As we mention above, this study size (n=115) is fairly small for the application of ML methods, and an increased sample size will likely improve the accuracy of the models. At this stage, the models we describe are not yet useful as predictive clinical tools, but they remain useful for describing the structure of the data and identifying reservoir-associated immune cell types.

      (7) The decision tree is innovative and a useful addition, but does not provide enough discriminatory information to imply causality, mechanism, or directionality in terms of whether the immune phenotype is impacting the reservoir or vice versa or both. Tree accuracy of 80% is marginal for a decision tool.

      The reviewer is correct about these points.  In the revised manuscript, we have attempted to make it clear that we are not yet advocating using this approach as a decision tool, but simply a way to visualize the data and understand the structure of the dataset.  As we discuss above, the models will likely need to be trained on a larger dataset and achieve higher accuracy before use as a decision tool.

      (8) Figure 2: this is not a weakness of the analysis but I have a question about interpretation. If total HIV DNA is more predictive of immune phenotype than intact HIV DNA, does this potentially implicate a prior high burden of viral replication (high viral load &/or more prolonged time off ART) rather than ongoing reservoir stimulation as a contributor to immune phenotype? A similar thought could be applied to the fact that clustering could only be detected when applied to total HIV DNA-associated features. Many investigators do not consider defective HIV DNA to be "part of the reservoir" so it is interesting to speculate why these defective viruses appear to have more correlation with immunophenotype than intact viruses.

      We agree with the reviewer that this observation could reflect prior viral burden and we have added additional text to make this clearer.  Even so, we cannot rule out a model in which defective viral DNA is engaged in ongoing stimulation of the immune system during ART, leading to the stronger association between total DNA and the immune cell phenotypes. We hypothesize that the defective proviruses could potentially be triggering innate immune pattern recognition receptors via viral RNA or DNA, and a higher burden of the total reservoir leads to a stronger apparent association with the immune phenotype.  We have included text in the discussion about this hypothesis.

      (9) Overall, the authors need to do an even more careful job of emphasizing that these are all just correlations. For instance, HIV DNA cannot be proven to have a causal effect on the immunophenotype of the host with this study design. Similarly, immunophenotype may be affecting HIV DNA or the correlations between the two variables could be entirely due to a separate confounding variable.

      We have revised the text of the manuscript to emphasize this point, and we acknowledge that any causal relationships are, at this point, simply speculation. 

      (10) In general, in the intro, when the authors refer to the immune system, they do not consistently differentiate whether they are referring to the anti-HIV immune response, the reservoir itself, or both. More specifically, the sentence in the introduction listing various causes of immune activation should have citations. (To my knowledge, there is no study to date that definitively links proviral expression from reservoir cells in vivo to immune activation as it is next to impossible to remove the confounding possible imprint of previous HIV replication.) Similarly, it is worth mentioning that the depletion of intact proviruses is quite slow such that proviral expression can only be stimulating the immune system at a low level. Similarly, the statement "Viral protein expression during therapy likely maintains antigen-specific cells of the adaptive immune system" seems hard to dissociate from the persistence of immune cells that were reactive to viremia.

      We updated the text of the manuscript to address these points and have added additional citations as per the reviewer’s suggestion.

      (11) Given the many limitations of the study design and the inability of the models to discriminate reservoir volume and phenotype, the limitations section of the discussion seems rather brief.

      We have now expanded the limitations section of the discussion and added additional considerations. We now include a discussion of the study cohort size, composition and the detail provided by the assays.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A few specific comments:

      "This pattern is likely indicative of a more profound association of total HIV DNA with host immunophenotype relative to intact HIV DNA."

      Most studies I have seen (e.g. single cell from Lichterfeld/Yu group) show intact proviruses are generally more activated/detectable/susceptible to immune selection, so I have a hard time thinking defective proviruses are actually more affected by immunotype.

      We hypothesize that this association is actually occurring in the opposite direction: the defective proviruses are having a greater impact on the immune phenotype, due to their greater number and potential ability to engage innate or adaptive immune receptors. We have clarified this point in the manuscript.

      "The existence of two distinct clusters of PWH with different immune features and reservoir characteristics could have important implications for HIV cure strategies - these two groups may respond differently to a given approach, and cluster membership may need to be considered to optimize a given strategy."

      I find this a bit of a reach, given that the definition of 2 categories depended on the total size.

      We have modified the language of this section to reduce the level of speculation.

      "This study is cross-sectional in nature and is primarily observational, so caution should be used interpreting findings associated with time on therapy".

      I found this an interesting statement because ultimately time on ART shows up throughout the analysis as a significant predictor, do you mean something about how time on ART could indicate other confounding variables like ART regimen or something?

      We have rephrased this comment to avoid confusion.  We were simply trying to make the point that we should avoid speculating about longitudinal dynamics from cross sectional data.

      "As expected, the plots showed no significant correlation for intact HIV DNA versus years of ART (Figure 1B), while total reservoir size was positively correlated with the time of ART (Figure 1A, Spearman r = 0.31)."

      Is this expected? Studies with longitudinal data almost uniformly show intact decay, at least for the first 10 or so years of ART, and defective/total stability (or slight decay). Also probably "time on ART" to not confuse with the duration of infection before ART.

      We have updated the language of this section to address this comment.  We have avoided comparing our data with respect to time on ART to longitudinal studies for reasons given above.

      On dimensionality reduction, as this PaCMAP seems a relatively new technique (vs tSNE and UMAP which are more standard, but absolutely have their weaknesses), it does seem important to contextualize. I think it would still be useful to show PCA and assess the % variance of each additional dimension to assess the effective dimensionality. It would be helpful to show a plot of % variance by # components to see if there is a cutoff somewhere, and if PaCMAP is really picking this up to determine the 2 dimensions/2 clusters is ideal. Figure 4B ultimately shows a lot of low/high across those clusters, and since low/high is defined categorically it's hard to know which of those dots are very close to the other categories.

      We have added this analysis to the manuscript – found in Figure S9. The PCA plot indicates that members of the two clusters also separate on PCA although this separation is not as clear as for the PaCMAP plot.
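
      For readers unfamiliar with the "% variance by # components" check the reviewer describes, a minimal sketch follows. It is an illustration under stated assumptions rather than the analysis behind Figure S9: the eigenvalues are hypothetical placeholders for the variances of the principal components, and the function names are ours.

```python
def explained_variance_ratio(eigenvalues):
    """Fraction of total variance carried by each principal component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def components_for_threshold(ratios, threshold):
    """Smallest number of leading components whose cumulative share reaches threshold."""
    cumulative = 0.0
    for count, ratio in enumerate(ratios, start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return count
    return len(ratios)


# Hypothetical component variances, sorted from largest to smallest.
hypothetical_eigenvalues = [4.0, 3.0, 2.0, 1.0]
ratios = explained_variance_ratio(hypothetical_eigenvalues)
needed = components_for_threshold(ratios, 0.65)
```

      Plotting `ratios` against component index gives the scree plot; a sharp drop after the first few components would suggest that a low-dimensional embedding captures most of the structure.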

      Minor comments on writing etc:

      Intro

      -Needs some references on immune activation sequelae paragraph.

      We have added some additional references to this section.

      -"promote the entry of recently infected cells into the reservoir" -- that is only one possible mechanistic explanation, it's not unreasonable but it seems important to keep options open until we have more precise data that can illuminate the mechanism of the overabundance.

      We have modified the text to discuss additional hypotheses.

      -You might also reference Pankau et al Ppath for viral seeding near the time of ART.

      We have added this reference.

      -"Viral protein expression during therapy likely maintains antigen-specific cells of the adaptive immune system" - this was unclear to me, do you mean HIV-specific cells that act against HIV during ART? I think most studies show immunity against HIV (CD8 and CD4) wanes over time during ART.

      The Goonetilleke lab has recently generated data indicating that antiviral T cell responses are remarkably stable over time on ART, but we agree with the reviewer that the idea that ongoing antigen expression in the reservoir maintains these cells is speculative.  We have modified the text to make this point clearer.

      -Overall I think the introduction lacked a little bit of definitional precision: i.e. is the reservoir intact vs replication competent vs all HIV DNA and whether we are talking about PWH on long-term ART and how long we should be imagining? The first years of ART are certainly different than later, in terms of dynamics. The ultimate implications are likely specific for some of these categorizations.

      -"persistent sequelae of the massive disruptions to T cell homeostasis and lymphoid structures that occur during untreated HIV infection" needs a lot more context/referencing. For instance, Peter Hunt showed a decrease in activation after ART a long time ago.

      -Heather Best et al show T cell clonality stays perturbed after ART.

      We have updated the text of the introduction and added references to address the reviewer’s comments.

      Results

      -It would be important to mention the race of participants and any information about expected clades of acquired viruses, this gets mentioned eventually with reference to the Table but the breakdown would be helpful right away.

      We have added this information to the results section.

      -"performed Spearman correlations", may be calculated or tested?

      We have corrected the language for this sentence.

      Comments on figures:

      -Figure 1 data on linear scale (re discussion above) -- hard to even tell if there is a decay (to match with all we know from various long-term ART studies).

      -Figure 4 data is shown on ln (log_e) scale, which is hard to interpret for most people.

      -Figures 4 C,D, and E should have box plots to visually assess the significance.

      -Figure 4B legend says purple/pink but I think the colors are different in the plot, could be about transparency

      -Figure 5 it is now not clear if log_e(?).

      -Figure 6 "HIV reservoir characteristics" might be better to make this more explicit. Do you mean for instance in the 6B title Total HIV DNA per million CD4+ T cells I think?

      We have made these modifications.

      Reviewer #2 (Recommendations For The Authors):

      Minor Weaknesses:

      (1) The Introduction is too long and much of the text is not directly related to the study's research question and design.

      We have streamlined the introduction in the revised manuscript.

      (2) While no differences were seen by age or race, according to the authors, this is unlikely to be useful since the numbers are so small in some of these subcategories. Results from sensitivity analyses (e.g., excluding these individuals) may be more informative/useful.

      We agree that the lower numbers of participants for some subgroupings make it challenging to know for sure if there are any differences based on these variables. We have added text to clarify this. We have added age, race and gender to the LOCO analysis and to the variable inflation importance analysis (Table S5).

      (3) For Figure 4, based on what was described in the Results section of the manuscript, the authors should clarify that the figures show results for TOTAL HIV DNA only (not intact DNA): "Dimension reduction machine learning approaches identified two robust clusters of PWH when using total HIV DNA reservoir-associated immune cell frequencies (Figure 4A), but not for intact or percentage intact HIV DNA (Figure 4B and 4C)".

      We have added this information.

      (4) The statement on page 5, first paragraph, "Interestingly, when we examined a plot of percent intact proviruses versus time on therapy (Figure 1C), we observed a biphasic decay pattern," is not new (Peluso JCI Insight 2020, Gandhi JID 2023, McMyn JCI 2023). Prior studies have clearly demonstrated this biphasic pattern and should be cited here, and the sentence should be reworded with something like "consistent with prior work", etc.

      We have added citations to these studies and rephrased this comment.

      (5) The Cohort and sample collection sections are somewhat thin. Further details on the cohort details should include at the very minimum some description of the timing of ART initiation (is this mostly a chronic-treated cohort?) and important covariate data such as nadir CD4+ T cell count, pre-ART viral load, duration of ART suppression, etc.

      The cohort was treated during chronic infection, and we have clarified this in the manuscript.  Information regarding CD4 nadir and years on ART are included in Table 1.  Unfortunately, pre-ART viral load was not available for most members of this cohort, so we did not use it for analyses. The partial pre-ART viral load data is included with the dataset we are making publicly available.

      Reviewer #3 (Recommendations For The Authors):

      Minor points:

      (1) What is meant by CD4 nadir? Is this during primary infection or the time before ART initiation?

      We have clarified this description in the manuscript.  This term refers to the lowest CD4 count recorded during untreated infection.

      (2) The authors claim that determinants of reservoir size are starting to emerge but other than the timing of ART, I am not sure what studies they are referring to.

      We have updated the language of this section. We intended to refer to studies looking at correlates of reservoir size, and feel that this is a more appropriate term than ‘determinants’.

      (3) The discussion does not tie in the model-generated hypotheses with the known mechanisms that sustain the reservoir: clonal proliferation balanced by death and subset differentiation. It would be interesting to tie in the proposed reservoir clusters with these known mechanisms.

      We have added additional text to the manuscript to address these mechanisms.

      (4) Figure 1: Total should be listed as total HIV DNA.

      We have updated this in the manuscript.

      (5) Figure 1C: Worth mentioning the paper by Reeves et al which raises the possibility that the flattening of intact HIV DNA at 9 years may be spurious due to small levels of misclassification of defective as intact.

      We have added this reference.

      (6) "Total reservoir frequency" should be "total HIV DNA concentration"

      We respectfully feel that “frequency” is a more accurate term than “concentration”, since we are expressing the reservoir as a fraction of the CD4 T cells, while “concentration” suggests a denominator of volume.

      (7) Figure S2-5: label y-axis total HIV DNA.

      We have updated this figure.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the manuscript entitled "Rtf1 HMD domain facilitates global histone H2B monoubiquitination and regulates morphogenesis and virulence in the meningitis-causing pathogen Cryptococcus neoformans" by Jiang et al., the authors employ a combination of molecular genetics and biochemical approaches, along with phenotypic evaluations and animal models, to identify the conserved subunit of the Paf1 complex (Paf1C), Rtf1, and functionally characterize its critical roles in mediating H2B monoubiquitination (H2Bub1) and the consequent regulation of gene expression, fungal development, and virulence traits in C. deneoformans or C. neoformans. Specially, the authors found that the histone modification domain (HMD) of Rtf1 is sufficient to promote H2B monoubiquitination (H2Bub1) and the expression of genes related to fungal mating and filamentation, and restores the fungal morphogenesis and pathogenicity defects caused by RTF1 deletion.

      Strengths:

      The manuscript is well-written and presents the findings in a clear manner. The findings are interesting and contribute to a better understanding of Rtf1-mediated epigenetic regulation of fungal morphogenesis and pathogenicity in a major human fungal pathogen, and potentially in other fungal species, as well.

      Weaknesses:

      A major limitation of this study is the absence of genome-wide information on Rtf1-mediated H2B monoubiquitination (H2Bub1), as well as a lack of detail regarding the function of the Plus3 domain. Although overexpression of HMD in the rtf1Δ mutant restored global H2Bub1 levels, it did not rescue certain critical biological functions, such as growth at 39 °C and melanin production (Figure 4C-D). This suggests that the precise positioning of H2Bub1 is essential for Rtf1's function. A comprehensive epigenetic landscape of H2Bub1 in the presence of HMD or full-length Rtf1 would elucidate potential mechanisms and shed light on the function of the Plus3 domain.

      We thank the reviewer (and other reviewers) for this excellent suggestion. We plan to carry out a CUT&Tag assay to obtain a comprehensive epigenetic landscape of H2Bub1 in the presence of HMD or full-length Rtf1 under conditions where overexpression of HMD failed to rescue the phenotypes of the rtf1Δ mutant, such as growth at 39 °C.

      Reviewer #2 (Public Review):

      Summary:

      The authors set out to determine the role of Rtf1 in Cryptococcal biology, and demonstrate that Rtf1 acts independently of the Paf1 complex to exert regulation of Histone H2B monoubiquitylation (H2Bub1). The biological impact of the loss of H2Bub1 was observed in defects in morphogenesis, reduced production of virulence factors, and reduced pathogenic potential in animal models of cryptococcal infection.

      Strengths:

      The molecular data is quite compelling, demonstrating that the Rtf1-dependent functions require only this histone modifying domain of Rtf1, and are dependent on nuclear localization. A specific point mutation in a residue conserved with the Rtf1 protein in the model yeast demonstrates the conservation of that residue in H2Bub1 modification. Interestingly, whereas expression of the HMD alone suppressed the virulence defect of the rtf1 deletion mutant, it did not suppress defects in virulence factor production.

      Weaknesses:

      The authors use two different species of Cryptococcus to investigate the biological effect of Rtf1 deletion. The work on morphogenesis utilized C. deneoformans, which is well-known to be a robust mating strain. The virulence work was performed in the C. neoformans H99 background, which is a highly pathogenic isolate. The study would be more complete if each of these processes were assessed in the other strain to understand if these biological effects are conserved across the two species of Cryptococcus. H99 is not as robust in morphogenesis, but reproducible results assessing mating and filamentation in this strain have been performed. Similarly, C. deneoformans does produce capsule and melanin.

      This is a fair point raised by the reviewer, and we are going to test whether these biological effects are conserved across the two species. We will assess the effects of RTF1 deletion on hyphal formation during bisexual mating in the C. neoformans H99 background and on capsule and melanin production in the C. deneoformans XL280 background.

      There are some concerns with the conclusions related to capsule induction. The images reported in Figure B are purported to be grown under capsule-inducing conditions, yet the H99 panel is not representative of the induced capsule for this strain. Given the lack of a baseline of induction, it is difficult to determine if any of the strains may be defective in capsule induction. Quantification of a population of cells with replicates will also help to visualize the capsular diversity in each strain population.

      We thank the reviewer for raising this concern. We are going to confirm the conclusions related to capsule induction under multiple capsule-inducing conditions, including Dulbecco’s Modified Eagle’s Medium (DMEM), Littman’s medium, and 10% fetal bovine serum (FBS) agar medium [1].

      The authors demonstrate that for specific mating-related genes, the expression of the HMD recapitulated the wild-type expression pattern. The RNA-seq experiments were performed under mating conditions, suggesting specificity under this condition. The authors raise the point in the discussion that there may be differences in Rtf1 deposition on chromatin in H99, and under conditions of pathogenesis. The data that overexpression of HMD restores H2Bub1 by western is quite compelling, but does not address at which promoters H2Bub1 is modulating expression under pathogenesis conditions, and when full-length Rtf1 is present vs. only the HMD.

      We thank the reviewer for raising these concerns. As mentioned in the response to Reviewer 1, our CUT&Tag assay will provide evidence to address these questions.

      Reviewer #3 (Public Review):

      Summary:

      In this very comprehensive study, the authors examine the effects of deletion and mutation of the Paf1C protein Rtf1 gene on chromatin structure, filamentation, and virulence in Cryptococcus.

      Strengths:

      The experiments are well presented and the interpretation of the data is convincing.

      Weaknesses:

      Yet, one can be frustrated by the lack of experiments that attempt to directly correlate the change in chromatin structure with the expression of a particular gene and the observed phenotype. For example, the authors observed a strong defect in the expression of ZNF2, a known regulator of filamentation, mating, and virulence, in the rtf1 mutant. Can this defect explain the observed phenotypes associated with the RTF1 mutation? Is the observed defect in melanin production associated with altered expression of laccase genes and altered chromatin structure at this locus?

      We completely agree with the reviewer, and as mentioned in our responses to Reviewers 1 and 2, we are going to conduct a CUT&Tag assay to investigate the genetic relationship between Rtf1-mediated H2Bub1 and the expression of particular genes.

      [1] Jang, E.-H., et al., Unraveling Capsule Biosynthesis and Signaling Networks in Cryptococcus neoformans. Microbiology Spectrum, 2022. 10(6): p. e02866-22.

    1. Author response:

      We thank the editor and reviewers for the time they spent reviewing our manuscript entitled ‘Overnight fasting facilitates safety learning by changing the neurophysiological response to relief from threat omission’ which was sent as an original paper for a potential publication in eLife.

      Since we take the reviewers' comments to heart and recognize the very complex picture presented by our previous and current results, we will take more time to rethink the paper. We will use this time to revisit the interpretation of the results of our previous behavioral study, the preregistration plan, and the findings of our current fMRI (replication) study.

      We aim to address the fundamental issues indicated by the reviewers as soon and as clearly as possible.

    1. Author response:

      “Overall, the paper has several strengths, including leveraging large-scale, multi-modal datasets, using computational reasonable tools, and having an in-depth discussion of the significant results.”

      We thank the reviewer for the very supportive comments.

      Based on the comments and questions, we have grouped the concerns and corresponding responses into three categories.

      (1) The scope and data selection

      “The results are somewhat inconclusive or not validated.

      The overall results are carefully designed, but most of the results are descriptive. While the authors are able to find additional evidence either from the literature or explain the results with their existing knowledge, none of the results have been biologically validated. Especially, the last three result sections (signaling pathways, eQTLs, and TF binding) further extended their findings, but the authors did not put the major results into any of the figures in the main text.”

      The goal of this manuscript is to provide a list of putative childhood obesity target genes to yield new insights and help drive further experimentation. Moreover, the outputs from signaling pathways, eQTLs, and TF binding, although noteworthy and supportive of our method, were not particularly novel. In our manuscript we placed our focus on the novel findings from the analyses. We did, however, report the part of the eQTLs analysis concerning ADCY3, which brought new insight to the pathology of obesity, in Figure 4C.

      “The manuscript would benefit from an explanation regarding the rationale behind the selection of the 57 human cell types analyzed. it is essential to clarify whether these cell types have unique functions or relevance to childhood development and obesity.”

      We elected to comprehensively investigate the GWAS-informed cellular underpinnings of childhood development and obesity. By including a diverse range of cell types from different tissues and organs, we sought to capture the multifaceted nature of cellular contributions to obesity-related mechanisms, and open new avenues for targeted therapeutic interventions.

      There are clearly cell types that are already established as being key to the pathogenesis of obesity when dysregulated: adipocytes for energy storage, immune cell types regulating inflammation and metabolic homeostasis, hepatocytes regulating lipid metabolism, pancreatic cell types intricately involved in glucose and lipid metabolism, skeletal muscle for glucose uptake and metabolism, and brain cell types in the regulation of appetite, energy expenditure, and metabolic homeostasis.

      While it is practical to focus on cell types already proven to be associated with or relevant to obesity, this approach has its limitations. It confines our understanding to established knowledge and rules out the potential for discovering novel insights from new cellular mechanisms or pathways that could play significant roles in the pathogenesis of obesity. Therefore, it was essential to reflect known biology against the unexplored cell types to expand our overall understanding and potentially identify innovative targets for treatment or prevention.

      “I wonder whether the used epigenome datasets are all from children. Although the authors use literature to support that body weight and obesity remain stable from infancy to adulthood, it remains uncertain whether epigenomic data from other life stages might overlook significant genetic variants that uniquely contribute to childhood obesity.”

      The datasets utilized in our study were derived from a combination of sources, both pediatric and adult. We recognize that epigenetic profiles can vary across different life stages but our principal effort was to characterize susceptibility BEFORE disease onset.

      “Given that the GTEx tissue samples are derived from adult donors, there appears to be a mismatch with the study's focus on childhood obesity. If possible, identifying alternative validation strategies or datasets more closely related to the pediatric population could strengthen the study's findings.” 

      We thank the reviewer for raising this important point. We acknowledge that the GTEx tissue samples are derived from adult donors, which might not perfectly align with the study's focus on childhood obesity. The ideal strategy would be a longitudinal design that follows individuals from childhood into adulthood to bridge the gap between pediatric and adult data, offering systematic insights into how early-life epigenetic markers influence obesity later in life. In future work, we aim to carry out such efforts, which will represent a substantial time and financial commitment.

      Along the same lines, the Developmental Genotype-Tissue Expression (dGTEx) Project is a new effort to study development-specific genetic effects on gene expression at 4 developmental windows spanning from infant to post-puberty (0-18 years). Donor recruitment began in August 2023 and remains ongoing. Tissue characterization and data production are underway. We hope that with the establishment of this resource, our future research in the field of pediatric health will be further enhanced.

      “Figure 1B: in subplots c and d, the results are either from Hi-C or capture-C. Although the authors use different colors to denote them, I cannot help wondering how much difference between Hi-C and capture-C brings in. Did the authors explore the difference between the Hi-C and capture-C?”

      Thank you for your comment. It is not within the scope of our paper to explore the differences between the Hi-C and Capture-C methods. In the context of our study, both methods serve the same purpose of detecting chromatin loops that bring putative enhancers to sometimes genomically distant gene promoters. Consequently, our focus was on utilizing these methods to identify relevant chromatin interactions rather than comparing their technical differences.

      (2) Details on defining different categories of the regions of interest

      “Some technical details are missing.

      While the authors described all of their analysis steps, a lot of the time, they did not mention the motivation. Sometimes, the details were also omitted.”

      We will add a section to the revision to address the rationale behind the different OCR categories.

      “Line 129: should "-1,500/+500bp" be "-500/+500bp"? 

      A gene promoter was defined as a region 1,500 bases upstream to 500 bases downstream of the TSS. Most transcription factor binding sites are distributed upstream (5’) of the TSS, and the assembly of the transcription machinery occurs up to 1,000 bases 5’ of the TSS. Given our interest in SNPs that can potentially disrupt transcription factor binding, this promoter length allowed us to capture such SNPs in our analyses.
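      As an illustration only, the promoter definition above can be sketched as a strand-aware window around the TSS. The function names, the inclusive coordinate convention, and the handling of the minus strand are assumptions made for this sketch and are not taken from the analysis pipeline.

```python
def promoter_window(tss, strand, upstream=1500, downstream=500):
    """Return (start, end) of the promoter in genomic coordinates.

    Assumption: coordinates are inclusive and 'upstream' is measured
    against the direction of transcription, so the window flips on
    the minus strand.
    """
    if strand == "+":
        return tss - upstream, tss + downstream
    if strand == "-":
        return tss - downstream, tss + upstream
    raise ValueError("strand must be '+' or '-'")


def snp_in_promoter(snp_pos, tss, strand):
    """True if a SNP position falls inside the promoter window (hypothetical helper)."""
    start, end = promoter_window(tss, strand)
    return start <= snp_pos <= end


# A TSS at position 10,000 on each strand.
print(promoter_window(10_000, "+"))
print(promoter_window(10_000, "-"))
```

      With this convention, a SNP located 1,400 bases 5’ of the TSS is counted as a promoter SNP, whereas it would be missed by a symmetric -500/+500 window.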

      “How did the authors define a contact region?”

      Chromatin contact regions identified by Hi-C or Capture-C assays are always reported as pairs of chromatin regions. The Supplementary eMethods provide details on the method of processing and interaction calling from the Hi-C and Capture-C data.

      “The manuscript would benefit from a detailed explanation of the methods used to define cREs, particularly the process of intersecting OCRs with chromatin conformation data. The current description does not fully clarify how the cREs are defined.”

      “In the result section titled "Consistency and diversity of childhood obesity proxy variants mapped to cREs", the authors introduced the different types of cREs in the context of open chromatin regions and chromatin contact regions, and TSS. Figure 2A is helpful in some way, but more explanation is definitely needed. For example, it seems that the authors introduced three chromatin contacts on purpose, but I did not quite get the overall motivation.”

      We apologize for the confusion. Our definition of cREs is consistent throughout the study. To aid the reader, Figure 2A will be moved forward to become Figure 1A in the revision.

      The 3 representative chromatin loops illustrate different ways the chromatin contact regions (pairs of blue regions under blue arcs) can overlap with OCRs (yellow regions under yellow triangles – ATAC peaks) and gene promoters.

      [1] The first chromatin loop has one contact region that overlaps with an OCR at one end and with a gene promoter at the other. This satisfies the definition of a cRE; thus, the area under the yellow ATAC-peak triangle is green.

      [2] The second loop overlaps with an OCR at only one end, and there is no gene promoter nearby, so it does not qualify as a cRE.

      [3] The third chromatin loop has an OCR and a promoter overlapping at one end. We defined this as a special case of cRE formation; thus, the area under the yellow ATAC-peak triangle is green.

      To avoid further confusion for the reader, we will eliminate this variation in the new illustration for the revised manuscript.
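      The three cases above can be sketched as a simple interval-overlap rule. This is a minimal sketch under stated assumptions: all intervals lie on the same chromosome and are half-open (start, end) pairs, case [3] is read as the OCR and a promoter sharing the same loop anchor, and every name below is hypothetical rather than part of the actual pipeline.

```python
def overlaps(a, b):
    """True if two half-open intervals (start, end) share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def classify_ocr(ocr, loop, promoters):
    """Classify one OCR against one chromatin loop.

    loop is a pair of anchor intervals; promoters is a list of intervals.
    """
    anchor_a, anchor_b = loop
    for near, far in ((anchor_a, anchor_b), (anchor_b, anchor_a)):
        if not overlaps(ocr, near):
            continue
        # Case [1]: OCR at one anchor, promoter at the opposite anchor.
        if any(overlaps(p, far) for p in promoters):
            return "cRE"
        # Case [3]: OCR and promoter at the same anchor.
        if any(overlaps(p, near) for p in promoters):
            return "cRE (promoter at same anchor)"
    # Case [2], or the OCR does not touch the loop at all.
    return "not cRE"


loop = ((1_000, 2_000), (50_000, 51_000))
ocr = (1_500, 1_800)
print(classify_ocr(ocr, loop, [(50_200, 52_200)]))  # promoter at far anchor
print(classify_ocr(ocr, loop, []))                  # no promoter nearby
print(classify_ocr(ocr, loop, [(900, 2_900)]))      # promoter at same anchor
```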

      “Figure 2A: The authors used triangles filled differently to denote different types of cREs but I wonder what the height of the triangles implies. Please specify.”

      The triangles illustrate ATAC-seq peaks, and the yellow chromatin regions under them are OCRs. The heights of ATAC-seq peaks are usually quantified as intensity values for OCRs. However, in our study, once an ATAC-seq peak passed the significance threshold of the data pipeline, we considered only its location, regardless of its intensity. To avoid further confusion for the reader, we will eliminate this variation in the new illustration for the revised manuscript.

      “Figure 1B-c. the title should be "OCRs at putative cREs". Similarly in Figure 1B-d.”

      cREs are a subset of OCRs.

      - In the section "Cell type specific partitioned heritability", the authors used "4 defined sets of input genomic regions". Are you corresponding to the four types of regions in Figure 2A? 

      Figure 2A will be moved forward to become Figure 1A in the revision and will be modified to showcase how we define OCRs and cREs.

      “It seems that the authors described the 771 proxies in "Genetic loci included in variant-to-genes mapping" (ln 154), and then somehow narrowed down from 771 to 94 (according to ln 199) because they are cREs. It would be great if the authors could describe the selection procedure together, rather than isolated, which made it quite difficult to understand.”

      In the Methods section entitled “Genetic loci included in variant-to-genes mapping,” we described the process of LD expansion to include 771 proxies from 19 sentinel signals significantly associated with obesity. Not all of these proxies are located within our defined cREs. Figure 2B, now Figure 2A in the revision, illustrates the proportions of these proxies located within the different types of regions, which reduces the proxy list to the 94 located within our defined cREs.

      “Figure 2. What's the difference between the 771 and 758 proxies?”

      13 out of 771 proxies did not fall within any defined regions. The remaining 758 were located within contact regions of at least one cell type regardless of chromatin state.

      (3) Typos

      “In the paragraph "Childhood obesity GWAS summary statistics", the authors may want to describe the case/control numbers in two stages differently. "in stage 1" and "921 cases" together made me think "1,921" is one number.”

      This will be amended in the revision.

      “Hi-C technology should be spelled as Hi-C. There are many places, it is miss-spelled as "hi-C". In Figure 1, the author used "hiC" in the legend. Similarly, Capture-C sometime was spelled as "capture-C" in the manuscript.”

      “At the end of the fifth row in the second paragraph of the Introduction section: "exisit" should be "exist".”

      “In Figure 2A: "Within open chromatin contract region" should be "Within open chromatin contact region".”

      These typos and terminology inconsistencies will be amended in the revision.

    1. Author response:

      Provisional author response to Reviewer #1

      We would like to thank the reviewer for his/her careful evaluation of our manuscript and appreciate his/her appraisal of the strengths of our study. Regarding the weaknesses, we plan to address these as well as possible during the revision of our manuscript.

      We can already state that miR-26b has clear anti-inflammatory effects on human liver slices, which is in line with our results demonstrating that miR-26b plays a protective role in MASH development in mice. The notion that patients with liver cirrhosis have increasing plasma levels of miR-26b seems contradictory at first glance. However, we believe that this increased miR-26b expression is a compensatory mechanism to counteract the MASH/cirrhotic effects. The exact source of this miR-26b remains to be elucidated in future studies.

      The kinase activity analysis revealed that miR-26b particularly affects kinases that play an important role in inflammation and angiogenesis. Strikingly, and in support of these data, these effects could be reversed by LNP treatment. Combined, these results already provide strong mechanistic insights at the molecular and intracellular signalling level. Although the exact target of miR-26b remains elusive and its identification is probably beyond the scope of the current manuscript due to its complexity, we believe that the kinase activity results already provide a solid mechanistic basis.

      Provisional author response to Reviewer #2

      We would like to thank the reviewer for his/her careful evaluation of our manuscript and appreciate his/her appraisal of the strengths of our study. Regarding the weaknesses, we plan to address these as well as possible during the revision of our manuscript. The validation suggestions are particularly valuable, and we plan to address them in the revision by performing additional experiments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, Komarova et al. investigate the clinical prognostic ability of cell-level metabolic heterogeneity quantified via the fluorescence lifetime characteristics of NAD(P)H. Fluorescence lifetime imaging microscopy (FLIM) has been studied as a minimally invasive approach to measure cellular metabolism in live cell cultures, organoids, and animal models. Its clinical translation is spearheaded through macroscopic implementation approaches that are capable of large sampling areas and enable access to otherwise constrained spaces but lack cellular resolution for a one-to-one transition with traditional microscopy approaches, making the interpretation of the results a complicated task. The merit of this study primarily lies in its design by analyzing with the same instrumentation and approach colorectal samples in different research scenarios, namely in vitro cells, in vivo animal xenografts, and tumor tissue from human patients. These conform to a valuable dataset to explore the translational interpretation hurdles with samples of increasing levels of complexity. For human samples, the study specifically investigates the prediction ability of NAD(P)H fluorescence metrics for the binary classification of tumors of low and advanced stage, with and without metastasis, and low and high grade. They find that NAD(P)H fluorescence properties have a strong potential to distinguish between high- and low-grade tumors and a moderate ability to distinguish advanced-stage tumors from low-stage tumors. This study provides valuable results contributing to the deployment of minimally invasive optical imaging techniques to quantify tumor properties and potentially migrate into tools for human tumor characterization and clinical diagnosis.

      Strengths:

      The investigation of colorectal samples under multiple imaging scenarios with the same instrument and approach conforms to a valuable dataset that can facilitate the interpretation of results across the spectrum of sample complexity.

      The manuscript provides a strong discussion reviewing studies that investigated cellular metabolism with FLIM and the metabolic heterogeneity of colorectal cancer in general.

      The authors do a thorough acknowledgement of the experimental limitations of investigating human samples ex vivo, and the analytical limitation of manual segmentation, for which they provide a path forward for higher throughput analysis.

      Weaknesses:

      To substantiate the changes in fluorescence properties at the examined wavelength range (associated with NAD(P)H fluorescence) in relationship to metabolism, the study would strongly benefit from additional quantification of metabolic-associated metrics using currently established standard methods. This is especially interesting when discussing heterogeneity, which is presumably high within and between patients with colorectal cancer, and could help explain the particularities of each sample leading to a more in-depth analysis of the acquired valuable dataset.

      In order to address this issue, we have performed immunohistochemical staining of the available tumor samples for the two standard metabolic markers GLUT3 and LDHA.

      The results are included in the Supplementary Material (Fig. S4), and the Discussion has been extended accordingly.

      Additionally, NAD(P)H fluorescence does not provide a complete picture of the cell/tissue metabolic characteristics. Including, or discussing the implications of including fluorescence from flavins would comprise a more compelling dataset. These additional data would also enable the quantification of redox metrics, as briefly mentioned, which could positively contribute to the prognosis potential of metabolic heterogeneity.

      We agree with the Reviewer that fluorescence from flavins could help to obtain more complete data on cellular metabolic states. However, we were unable to detect sufficiently intense emission from flavins in colorectal cancer cells and tissues. A paragraph about flavins was added to the Discussion, and representative images were added to the Supplementary Material (Figure S5).

      In the current form of the manuscript, there is a diluted interpretation and discussion of the results obtained from the random forest and SHAP analysis regarding the ability of the FLIM parameters to predict clinicopathological outcomes. This is, not only the main point the authors are trying to convey given the title and the stated goals, but also a novel result given the scarce availability of these type of data, which could have a remarkable impact on colorectal cancer in situ diagnosis and therapy monitoring. These data merit a more in-depth analysis of the different factors involved. In this context, the authors should clarify how is the "trend of association" quantified (lines 194 and 199).

      We thank the Reviewer for this suggestion. The section has been updated with SHAP analysis using different parameters (dispersion D of t2, a1, tm and bimodality index BI of t2, a1, tm). It is now clearer that D-a1 is more strongly associated with clinicopathological outcomes than the other variables. We have also added some biological interpretation of these results to the Discussion.

      Reviewer #2 (Public Review):

      Summary:

      In the manuscript "Metabolic heterogeneity of colorectal cancer as a prognostic factor: insights gained from fluorescence lifetime imaging" by Komarova et al., the authors used fluorescence lifetime imaging and quantitative analysis to assess the metabolic heterogeneity of colorectal cancer. Generally, this work is logically well-designed, including in vitro and in vivo animal models and ex vivo patient samples. However, since the key parameter presented in this study, the BI index, is already published in a previous paper by this group (Shirshin et al., 2022), and the quantification method of metabolic heterogeneity has already been well (and even better) described in previous studies (such as the one by Heaster et al., 2019), the novelty of this study is doubted. Moreover, I am afraid that the way of data analysis and presentation in this study is not well done, which will be mentioned in detail in the following sections.

      Strengths:

      (1) Solid experiments are performed and well-organized, including in vitro and in vivo animal models and ex vivo patient samples.

      (2) Attempt and efforts to build the association between the metabolic heterogeneity and prognosis for colorectal cancer.

      Weaknesses:

      (1) The human sample number (from 21 patients) is very limited. I wonder how the limited patient number could lead to reliable diagnosis and prognosis.

      An additional 8 patient tumor samples, collected while the manuscript was under review, were added to the present data. We agree that the number is still too limited to draw conclusions about the prognostic value of cell-level metabolic heterogeneity. At this point, however, we can expect that this parameter will become a metric for prognosis. We will continue this study to collect more samples of colorectal tumors and expand the approach to different cancer types.

      (2) The BI index or similar optical metrics have been well established by this and other groups; therefore, the novelty of this study is doubted.

      The purpose of this research was to quantify and compare cellular metabolic heterogeneity across systems of differing complexity (commercial cell lines, tumor xenografts and patients’ tumors) using previously established FLIM-based metrics. For the first time, using FLIM, it was shown that the heterogeneity of patients’ samples is much higher than that of laboratory models and that it is associated with clinical characteristics of the tumors, namely the stage and the grade. In addition, this study provides evidence that bimodality (BI) in the distribution of metabolic features in the cell population is less important than the width of the spread (the dispersion value D).

      Some corrections have been made in the text on this point.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The following comments should be addressed to strengthen the rigor and clarity of the manuscript.

      (1) The ethical committee that approved the human studies should also be mentioned in the methods section, as was done with the animal studies.

      Information about the ethics committee has been added in the Manuscript.

      The study with the use of patients’ material was approved by the ethics committee of the Privolzhsky Research Medical University (approval № 09 from 30.06.2023).

      (2) The captions in Figures 2 and 3 must be revised. In Figure 2, it seems the last 2 sentences for the description of (C) do not belong there, and instead, the last sentence in the description of (D) may need to be included in (C) instead. Figure 3 is similar.

      The captions were revised.

      (3) From supplement Figure S2 it seems that EpCam and vimentin staining were only done in two of the mouse tumor types. No further mention is made in the results or methods section. Is there any reason this was not performed in the other tumor types? Were the histology and IHC protocols the same for the mouse and human tumors?

      The data on other tumor types and patients’ tumors have been added in Figure S3. Discussion was extended with the following paragraph.

      One of the possible reasons for metabolic heterogeneity could be the presence of stromal cells or diversity of epithelial and mesenchymal phenotypes of cancer cells within a tumor. Immunohistochemical staining of tumors for EpCam (epithelial marker) and vimentin (mesenchymal marker) showed that the fraction of epithelial, EpCam-positive, cells was more than 90% in tumor xenografts and on average 76±10% in patients’ tumors (Figure S3). However, the ratio of EpCam- to vimentin-positive cells in patients’ samples correlated with neither D-a1 nor BI-a1, which means that the presence of cells with a mesenchymal phenotype did not contribute to the metabolic heterogeneity of tumors identified by NAD(P)H FLIM.

      (4) Clarify the design of the experiments: The results come from 50 - 200 cells in each sample (except 30 in the CaCo2 cell culture) that were counted from 5 - 10 images acquired from each sample. There were 21 independent human samples. How many independent samples were included in the cell culture experiments and the mouse tumor models? Why is there an order of magnitude fewer cells included in the CaCo2 group compared to the other groups (Figure 1)? From the image (Figure 1A - CaCo2), it seems to be a highly populated type of sample, yet only 30 cells were quantified. What prevents the inclusion of the same number of cells to be quantified in each group for a more systematic evaluation?

      We thank the Reviewer for this comment.

      Cell culture experiments included two independent replicates for each cell line, the data from which were then combined. In animal experiments, measurements were made in three mice (numbered 1-3 in Figure 2C) for each tumor type. We have analyzed more than 100 additional cells of the CaCo2 cell line; in the revised version, the number of CaCo2 cells is 146.

      The text of the Manuscript was revised accordingly.

      (5) Regarding references: Some claims throughout the text would benefit from an additional reference. For example: line 70 "Metabolic heterogeneity [...] is believed to have prognostic value"; line 121 " [...] the uniformity of cell metabolism in a culture, which is consistent with the general view on standard cell lines [...]". The clinical translational aspect (i.e., paragraph in line 255) warrants the inclusion of the efforts already done with FLIM imaging in the clinical setting both in vivo and ex vivo with point-spectroscopy and macroscopy imaging (e.g., Jo Lab, Marcu Lab, French Lab, and earlier work by Mycek and Richards-Kortum in colorectal cancer to name a few).

      Additional references were added.

      Reviewer #2 (Recommendations For The Authors):

      (1) In the Introduction, line 85, the authors mention that "Specifically, the unbound state of NAD(P)H has a short lifetime (~0.4 ns) and is associated with glycolysis, while the protein-bound state has a long lifetime (~1.7-3.0 ns) and is associated with OXPHOS". I do not think this claim is appropriate. One cannot simply say that the unbound state is associated with glycolysis, nor that the bound state is associated with OXPHOS; both unbound and bound state are associated with almost all the metabolic pathways. Instead, the expression of "glycolytic/ OXPHOS shift", as authors used in other sections of this manuscript, is a more appropriate one in this case.

      The text of the Introduction was revised.

      (2) What are the biological implications of the bimodality index (BI)? Please provide specific insights.

      Bimodal distribution indicates there are two separate and independent peaks in the population data. In the metabolic FLIM data, this indicates that there are two sub-populations of cells with different metabolic phenotypes. Previously, we observed a bimodal distribution in a population of chemotherapy-treated cancer cells, where one sub-population was responsive (shifted metabolism) and the other was non-responsive (unchanged metabolism) [Shirshin et al., PNAS, 2022]. In a naive tumor, a number of factors affect cellular metabolism, including genetic features and the microenvironment, so it is difficult to determine which of them resulted in bimodality. Our data on the correlation of bimodality (BI) with the clinical characteristics of the tumors show no association between them. What really matters is the width of the parameter spread in the population. The early-stage tumors (T1, T2) were metabolically more heterogeneous than the late-stage ones (T3, T4). The degree of heterogeneity was also associated with differentiation state, a stage-independent prognostic factor in colorectal cancer in which a lower grade correlates with a better prognosis. The early-stage (T1, T2) and high-grade (G3) tumors had significantly higher dispersion of NAD(P)H-a1 than the late-stage (T3, T4) and low-grade (G1, G2) ones. In terms of the biological significance of heterogeneity, this means that under the stressful and unfavorable conditions to which tumor cells are exposed, it is the spread of the parameter distribution in the population, rather than the presence of several distinct clusters (modes), that matters for adaptation and survival. A high diversity of cellular metabolic phenotypes provides a survival advantage, and was accordingly observed in the more aggressive (undifferentiated or poorly differentiated) and the least advanced tumors.

      The discussion has been expanded on this account.

      (3) Have you run statistics in Figure 1B? If yes, do you find any significance? The same question also applies to Figures 2C and 3C.

      We performed statistical analysis to compare different cell lines in in vitro and in vivo models, the results obtained are presented in Table S4.

      (4) Line 119, why is the BI threshold set at 1.1?

      When setting the BI threshold at 1.1, we relied on the work by Wang et al., Cancer Informatics, 2009. The authors recommended the 1.1 cutoff as more reliable for selecting bimodally expressed genes. Further, we validated this BI threshold for identifying chemotherapy-responsive and non-responsive sub-populations of cancer cells (Shirshin et al., PNAS, 2022).
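      For readers unfamiliar with the index, the following is a minimal sketch of the bimodality index in the form given by Wang et al. (2009), BI = sqrt(pi * (1 - pi)) * |mu1 - mu2| / sigma, for a two-component normal mixture with a shared standard deviation. It assumes the mixture parameters have already been fitted. The function names and the strict ">" comparison against the 1.1 cutoff are assumptions of this sketch and do not describe the fitting procedure used in the manuscript.

```python
import math

BI_THRESHOLD = 1.1  # cutoff recommended by Wang et al. (2009)


def bimodality_index(pi, mu1, mu2, sigma):
    """Bimodality index of a two-component normal mixture.

    pi    : proportion of cells in the first component (0 < pi < 1)
    mu1/2 : component means (e.g. of NAD(P)H a1 in each sub-population)
    sigma : shared standard deviation of the two components
    """
    if not 0.0 < pi < 1.0:
        raise ValueError("pi must lie strictly between 0 and 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    delta = abs(mu1 - mu2) / sigma  # standardized distance between modes
    return math.sqrt(pi * (1.0 - pi)) * delta


def is_bimodal(pi, mu1, mu2, sigma, threshold=BI_THRESHOLD):
    """Hypothetical helper: flag a distribution whose BI exceeds the cutoff."""
    return bimodality_index(pi, mu1, mu2, sigma) > threshold


# Two equally sized sub-populations whose means lie 3 sigma apart.
print(bimodality_index(0.5, 0.0, 3.0, 1.0))
print(is_bimodal(0.5, 0.0, 3.0, 1.0))
```

      Because the distance between modes is divided by sigma, the index is dimensionless, which is what allows it to be compared across different decay parameters.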

      (5) Line 123, what does the high BI of mean lifetime stand for? Please provide biological implications and insights.

      The sentence was removed because the inclusion of additional CaCo2 cells (n=146) in the quantification of NAD(P)H FLIM data showed no bimodality in this cell culture.

      (6) In the legend for Figure 2C, the authors mention that "the bimodality index (BI-a1) is shown above each box"; however, I do not see such values. It is also true for Figure 3C.

      The legends for Fig. 2 and 3 were corrected.

      (7) In Figure 2, t1-t3 were not explained and mentioned in the main text. What do they mean? Do they mean different time points or different tumors?

      t1-t3 denote different tumors in a group. The figure has been changed so that individual tumors are now indicated by numbers.

      (8) In Figure 3, what do p13, p15 and p16 mean? It is not clearly explained. If they just represent patients numbered 13, 15, and 16, then why are these patients chosen as representatives? Do they represent different stages or are they just chosen randomly?

      Figure 3 was revised. Representative images were changed and a short description for each representative sample was included. In the revised version, representatives have been selected to show different stages and grades.

      (9) In Figure 3, instead of showing the results for each patient, I would suggest that authors show representative results from tumors at different stages; or, at least, clearly indicate the specific information for each patient. I do not think that providing the patient number only without any patient-specific information is helpful.

      Figure 3 was revised.

      (10) The sample number (21 patients) is very limited. I wonder how the limited patient number could lead to reliable diagnosis and prognosis.

      Additional eight samples were added. The text, figures and tables were revised accordingly.

      (11) In Discussion, it would be helpful to compare the BI index used in this study with the previously developed OMI-index (Line 275).

      We believe that the BI index and the OMI index describe different things and are therefore hard to compare. While the BI index describes the degree of metabolic heterogeneity, the OMI index is an integral parameter that includes the redox ratio and the mean fluorescence lifetimes of NAD(P)H and FAD, and rather indicates the metabolic state of a cell. In this sense, it is more relevant to compare it with the conventional redox ratio or the Fluorescence Lifetime Redox Ratio (FLIRR) (H. Wallrabe et al., Segmented cell analyses to measure redox states of autofluorescent NAD(P)H, FAD & Trp in cancer cells by FLIM, Sci. Rep. 2018; 8: 79). The assessment of the heterogeneity of FLIM parameters has previously been reported using the weighted heterogeneity (wH) index (Amy T. Shah et al., In Vivo Autofluorescence Imaging of Tumor Heterogeneity in Response to Treatment, Neoplasia 17, pp. 862–870 (2015)). To the best of our knowledge, this is, to date, the only metric for quantifying metabolic heterogeneity on the basis of FLIM data. A comparison of BI with the wH-index showed that the wH-index gives results similar to those of BI in heterogeneity evaluation, as demonstrated in our earlier paper (E.A. Shirshin et al., Label-free sensing of cells with fluorescence lifetime imaging: The quest for metabolic heterogeneity, PNAS 119 (9) e2118241119 (2022)). Yet the BI provides a dimensionless estimate of the inherent heterogeneity of a sample, and it can therefore be used to compare heterogeneity assessed by different decay parameters and FLIM data analysis methods. The limitation of using the OMI index for FLIM data analysis is the low intensity of the FAD signal, which was the case in our experiments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      We would like to see the major conclusions constrained to better fit the data presented in the manuscript. Speed is only a single performance metric of a very complicated, very diverse system of locomotion.

      If the authors would like to maintain the broader conclusions, the study should be repeated with a number of different performance metrics to shore up the manuscript's results. Particularly with efficiency, speed is not a reliable measure of efficiency to begin with, so this needs to be explored in a more targeted and appropriate manner.

      We agree with Reviewer 1 that we should be more precise about the fitness metrics used and more constrained about the conclusions. Considering the points raised in each paragraph, we’ve modified the text as follows:

      - [line 17] “... to test the necessity of both traits for sustained and effective displacement on the ground.”

      - [starting on line 105] “We generate the robot’s sample using an artificial evolutionary process that selects for better locomotion ability - defined as higher average speed as it is a proxy for organisms with sustained and effective displacement.”

      - [starting on line 287] “We also found that different gravitational environments require different shape structures to optimize locomotion average speed.”

      - [starting on line 311] “This consistency is evidence that a small number of sparsely connected modules is a morphological computation principle for an organism’s optimized average speed.”

      - [starting on line 348] “Beyond that, extending the tests for other important aspects of locomotion behavior - as noise on the ground, energetic costs, and maneuverability - by using other locomotion metrics - as energy efficiency, stability margin, and dissipated power (Paez and Melo, 2014; Aoi et al., 2016 ) - would also be relevant to evaluate the principle’s robustness.”

      - [starting on line 524] “As the robots with the highest average speed are the ones that succeed in maximizing displacement and having robust dynamics (they will not tumble with time), we defined $\bar s$ as the fitness value using it as a proxy of successful directed locomotion. Selecting for bodies that maximize speed is a common locomotion bias in natural selection, as both predators and prey and thus fecundity and mortality depend on it (Alexander, 2006). Other measures - such as energy efficiency - can capture distinct important aspects of the locomotion complexity (Paez and Melo, 2014) and would be worthy of investigating in future work.”

      Paper Premise/Mission Statement: As defined in the abstract and also called out in the text starting on line 59 is "investigate whether symmetry and modularity are features of an organism's shape need [authors italics] to have for better-directed locomotion..."

      If we understood correctly, the reviewer is asking for more precision in the statement. We modified the respective sentence in the following way:

      - [line 62] “... need to have for optimizing average speed on the ground,”

      Reviewer #2 (Recommendations For The Authors):

      i) a lot of details that are in the captions should be moved in the main text;

      Thank you for this comment. We reviewed all the captions and text making modifications to ensure that all the information in the captions is also present in the main text. Below, we highlighted some of the changes:

      - [line 57] “Thus, locomotion on the ground is present in phylogenetically distant species (such as the maned wolf and frogfish in Figure 1A) and depends upon … “

      - [starting on line 64] “Figure 1B shows a schematic representation of symmetry and modularity on the maned wolf and frogfish bodies.”

      - [starting on line 277] “There is a negative correlation between the proportion of feet voxels and the robot’s locomotion transference capability when the robots go to an environment with higher gravity, i.e., water to mars (dark blue in Figure 5C), water to earth (light blue), and mars to earth (red) - with a Spearman correlation coefficients of r = -0.39, r = -0.43, and r = -0.32, respectively, all with p < 1e-08.”

      ii) hypotheses should be spelled out more clearly;

      We checked the experiments and confirmed that each one had a clear hypothesis statement in the original manuscript. Before each section defining the hypothesis and describing the experiment, we added the following statement:

      - [starting on line 119] “ With this sample, we tested the hypotheses about the relationships between locomotion performance and body modularity and symmetry (Figure 1I).”

      iii) performance metrics and other features should be better defined using mathematical terms if possible (for example, instability);

      Thank you for the comment. We added a definition for instability in the text:

      - [starting on line 218] “Nonetheless, locomotion requires a minimum instability - the dynamic possibility of translating the center of mass - in the direction axis to generate the necessary forward displacement (Bruijn et al., 2013; Nagarkar et al., 2021).”

      Despite the different definitions of instability in the literature (Bruijn et al., 2013; Paez and Melo, 2014; Aoi et al., 2016; Nagarkar et al., 2021), we did not find a single mathematical definition that fits our context perfectly.

      Following the reviewer's comment, when necessary we expanded the definition for other features:

      - [starting on line 199] “... the distribution of body weight. As the robots do not have sensory feedback abilities, the weight balance is defined as the body’s movement due to gravity forces (consequences of the weight distribution and surface contact points) (Benda et al., 1994). We hypothesized that the robots with the best directed locomotion ability would tend to have a symmetric body shape. A robot with a low XY shape symmetry (XY shape symmetry < 0.5) has a higher chance of having a poor weight balance, increasing the chance of the body tipping over, thus leading it to a lousy locomotion performance (blue dotted line in Figure 3C). “

      iv)  more details regarding the simulations should be included;

      We thank the reviewer for this comment. If we understood correctly, Reviewer 2 is asking for more details regarding: “a) the adequacy of the spatial resolution, whereby I failed to see a compelling argument regarding the completeness of 64 voxels; b) the realism of the oscillatory patterns, whereby all the voxels are set to oscillate at the same, constant, frequency of 2Hz; and c) the accuracy of simulations in water where added mass effects seem to be neglected.”. We modified the text to better address these concerns:

      a) [starting on line 96] “We choose to first explore exhaustively the 4³ space dimension, as it is the minimal possible space that allows meaningful body plans. We also did control experiments within 6³ and 8³ to check for dimension size effects.”

      - [starting on line 432] “We did control experiments with robots within 6³ and 8³ dimensions to check for dimension size effects - and we found that the results found in 4³ remained valid. We choose to focus our analysis in the 4³ design space because we consider it the minimum coarse-grain to approach the biological question about the contingency of shape outcomes pressured for locomotion. Smaller spaces do not allow sufficient complexity in the body structures, and increasing spatial resolution reduces the extensiveness of the investigated search space.”

      b) [starting on line 451] “… we used a fixed oscillation frequency of 𝑓 = 2 Hz (Kriegman et al.,2020). A fixed frequency value reduces the number of degrees of freedom in the search for solutions, but in return, it narrows the direct connection between the simulated organisms and animals. Exploring different frequency values in future work would be important to investigate the impact of varied oscillatory frequencies in the shape solutions for directed locomotion.”

      c) The environment we call “water” is not an accurate model of aquatic habitats, as we did not simulate essential forces such as drag effects. This choice is explained in the text starting on line 110: “In the water-like environment the bodies have nullifying body weight but do not have drag effects. We did not add drag in our simulations because our aim is to study just the body weight influences in locomotion independently of other forces.”

      v) a full paragraph about limitations should be included in the discussions, focusing on both simulation aspects (for example, the use of simple spring elements in the voxels) and theoretical assumptions (for example, addressing the potential role of non-locomotion-related aspects).

      We thank the reviewer for the comment. We edited some paragraphs of the discussion section to make more explicit some limitations of our work:

      [starting on line 398] “We expect that including other important aspects of an animal's body as a developmental process and sensory functions could influence the shape's outcomes with other layers of principles. Although we based our simulations on an already successful transference of in silico behavior to organisms made of biological tissue (Kriegman et al., 2020), there is an intrinsic gap between spring-mass robots modeling and animal’s bodies that is worthy of exploring to ensure the generality of our results. Other methods, such as the inclusion of rigid body elements in the simulation (possible in Voxelyze), the use of finite element modeling (FEM) (Coevoet et al., 2019), and the construction of physical robots (Aguilar et al., 2016), are important complements to this work. Beyond that, principles on other scales as in the genotypes (Johnston et al., 2022) and in other behavioral phenotypes (Gomez-Marin et al., 2016) could also be investigated.”

      To address the potential role of non-locomotion-related aspects, we revised the section “Discussion - Contingency of evolutionary outcomes”, where we discussed other functional and biological roles:

      [starting on line 354] “Here we investigate how a specific functional cause - optimization of average speed during directed locomotion on the ground - externally defines the phenotypic space of shape possibilities.”

      [starting on line 359] “For simplification purposes, we choose to not explicitly control other important factors of locomotion (i.e., energy consumption, maneuverability) that nonlinearly interact during locomotion. In future studies, it would be important to conduct similar studies on a wider range of factors to study the shape and dynamic principles in different conditions.”

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors developed an extension to the pairwise sequentially Markov coalescent model that allows to simultaneously analyze multiple types of polymorphism data. In this paper, they focus on SNPs and DNA methylation data. Since methylation markers mutate at a much faster rate than SNPs, this potentially gives the method better power to infer size history in the recent past. Additionally, they explored a model where there are both local and regional epimutational processes. Integrating additional types of heritable markers into SMC is a nice idea which I like in principle. However, a major caveat to this approach seems to be a strong dependence on knowing the epimutation rate. In Fig. 6 it is seen that, when the epimutation rate is known, inferences do indeed look better; but this is not necessarily true when the rate is not known. (See also major comment #1 below about the interpretation of these plots.) A roughly similar pattern emerges in Supp. Figs. 4-7; in general, results when the rates have to be estimated don't seem that much better than when focusing on SNPs alone. This carries over to the real data analysis too: the interpretation in Fig. 7 appears to hinge on whether the rates are known or estimated, and the estimated rates differ by a large amount from earlier published ones.

      Overall, this is an interesting research direction, and I think the method may hold more promise as we get more and better epigenetic data, and in particular better knowledge of the epigenetic mutational process. At the same time, I would be careful about placing too much emphasis on new findings that emerge solely by switching to SNP+SMP analysis.

      Major comments:

      - For all of the simulated demographic inference results, only plots are presented. This allows for qualitative but not quantitative comparisons to be made across different methods. It is not easy to tell which result is actually better. For example, in Supp. Fig. 5, eSMC2 seems slightly better in the ancient past, and times the trough more effectively, while SMCm seems a bit better in the very recent past. For a more rigorous approach, it would be useful to have accompanying tables that measure e.g. mean-squared error (along with confidence intervals) for each of the different scenarios, similar to what is already done in Tables 1 and 2 for estimating $r$.

      We believe this comment was addressed in the previous revision (Supp. Tables 6-10) by adding root mean square errors (RMSE) for the demographic estimates, including separate RMSE values for the recent and past portions of the demography.
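      For readers unfamiliar with the metric, the sketch below shows one way an RMSE can be computed over a whole demographic trajectory and separately over its recent and past portions. It is an illustration only: the function names, the time cutoff, and the numbers are hypothetical, and the exact procedure behind the supplementary tables may differ.

```python
from math import sqrt


def rmse(true_values, estimates):
    """Root mean square error between two equally long, non-empty sequences."""
    if len(true_values) != len(estimates) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - e) ** 2 for t, e in zip(true_values, estimates)]
    return sqrt(sum(squared) / len(squared))


def split_rmse(times, true_values, estimates, cutoff):
    """RMSE for time points before the cutoff ('recent') and from it on ('past')."""
    recent = [(t, e) for time, t, e in zip(times, true_values, estimates) if time < cutoff]
    past = [(t, e) for time, t, e in zip(times, true_values, estimates) if time >= cutoff]
    result = {}
    for label, pairs in (("recent", recent), ("past", past)):
        # A segment with no time points has no defined error.
        result[label] = rmse(*zip(*pairs)) if pairs else None
    return result


# Hypothetical example: time in generations, population sizes in individuals.
times = [10, 100, 1000, 10000]
true_size = [1000.0, 1000.0, 5000.0, 5000.0]
estimated_size = [1000.0, 1200.0, 5000.0, 4600.0]

overall = rmse(true_size, estimated_size)
by_period = split_rmse(times, true_size, estimated_size, cutoff=500)
print(overall, by_period)
```

      Reporting the two portions separately makes it possible to see whether a method gains accuracy in the recent past at the expense of the more ancient past, which is the comparison the reviewer asked for.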

      - 434: The discussion downplays the really odd result that inputting the true value of the mutation rate, in some cases, produces much worse estimates than when they are learned from data (SFig. 6)! I can't think of any reason why this should happen other than some sort of mathematical error or software bug. I strongly encourage the authors to pin down the cause of this puzzling behaviour. (Comment addressed in revision. Still, I find the explanation added at 449ff to be somewhat puzzling -- shouldn't the results of the regional HMM scan only improve if the true mutation rate is given?)

      We understand that our results and explanation can appear counter-intuitive. As acknowledged by the reviewer, in the previous round of revision we explained this puzzling behaviour at length: it stems from a discrepancy in how methylation regions are assessed by the dedicated HMM, which differs from the HMM used for the SMC inference. We are happy to clarify further in response to the new question from Reviewer #1:

      If Reviewer #1 means SNP mutations (e.g. A → T), knowing the true mutation rate does not help the HMM to recover the region-level methylation status.

      If Reviewer #1 means epimutations (whether at the region level, the site level, or both), knowing the true epimutation rates could theoretically help the HMM to recover the region-level methylation status. However, at present, our method does not leverage information from epimutation rates to infer the region-level methylation status. Because inferring the epimutation rates is one of the goals of the SMC inference in this study, and because the region-level methylation status is required to infer those rates, we suspect that using epimutation rates to infer the region-level methylation status could be statistically inappropriate (generating a form of circular estimation). Instead, our HMM uses only the proportions of methylated and unmethylated sites (estimated from the genome) to determine whether a region is most likely methylated or unmethylated. We now state this explicitly in the description of the HMM for methylation regions in the Methods section.

      We acknowledge that our HMM for inferring region-level methylation status could be improved, but this would be a complete project and study of its own (due to the underlying complexity of the finite site and the lack of a consensus model for epimutations at evolutionary time scales). We believe our HMM was the best compromise between what was known about methylation and our goals when the study was conducted, and future work on the estimation of methylation regions is definitely worth conducting.
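      To make the idea of a region-level HMM concrete, the following is a generic two-state Viterbi sketch that labels each site as belonging to a methylated ("M") or unmethylated ("U") region. It is not the authors' implementation. The only element taken from the response is that the genome-wide proportion of methylated sites informs the model (here as the initial state probability); the transition and emission parameters `stay_prob` and `emit_match` are hypothetical, and no epimutation rates are used.

```python
from math import log


def viterbi_region_status(sites, prior_methylated, stay_prob=0.9, emit_match=0.8):
    """Most likely region state ('M' or 'U') per site for a 0/1 site sequence.

    sites: non-empty list of observed site states (1 = methylated, 0 = unmethylated).
    prior_methylated: genome-wide proportion of methylated sites, strictly in (0, 1).
    stay_prob, emit_match: hypothetical parameters chosen for illustration.
    """
    if not 0.0 < prior_methylated < 1.0:
        raise ValueError("prior_methylated must lie strictly between 0 and 1")
    states = ("U", "M")
    start = {"M": log(prior_methylated), "U": log(1.0 - prior_methylated)}

    def trans(prev, cur):
        return log(stay_prob if prev == cur else 1.0 - stay_prob)

    def emit(state, obs):
        matches = (state == "M") == (obs == 1)
        return log(emit_match if matches else 1.0 - emit_match)

    # Log-probability of the best path ending in each state at the first site.
    scores = {s: start[s] + emit(s, sites[0]) for s in states}
    backpointers = []
    for obs in sites[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: scores[p] + trans(p, s))
            new_scores[s] = scores[best_prev] + trans(best_prev, s) + emit(s, obs)
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: scores[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical site-level calls along a stretch of genome.
sites = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
prior = sum(sites) / len(sites)
region_path = viterbi_region_status(sites, prior)
print("".join(region_path))
```

      In this toy sequence the isolated unmethylated site inside the methylated run does not trigger a state change, whereas the final run of unmethylated sites does, which illustrates how region status smooths over site-level noise.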

      - As noted at 580, all of the added power from integrating SMPs/DMRs should come from improved estimation of recent TMRCAs. So, another way to study how much improvement there is would be to look at the true vs. estimated/posterior TMRCAs. Although I agree that demographic inference is ultimately the most relevant task, comparing TMRCA inference would eliminate other sources of differences between the methods (different optimization schemes, algorithmic/numerical quirks, and so forth). This could be a useful addition, and may also give you more insight into why the augmented SMC methods do worse in some cases. (Comment addressed in revision via Supp. Table 7.).

      - A general remark on the derivations in Section 2 of the supplement: I checked these formulas as best I could. But a cleaner, less tedious way of calculating these probabilities would be to express the mutation processes as continuous time Markov chains. Then all that is needed is to specify the rate matrices; computing the emission probabilities needed for the SMC methods reduces to manipulating the results of some matrix exponentials. In fact, because the processes are noninteracting, the rate matrix decomposes into a Kronecker sum of the individual rate matrices for each process, which is very easy to code up. And this structure can be exploited when computing the matrix exponential, if speed is an issue.

      We believe this comment was acknowledged in the previous revision (line 649), and we thank the reviewer for this interesting insight.
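      The reviewer's suggestion can be sketched in a few lines. For two non-interacting continuous-time Markov chains with rate matrices A and B, the joint rate matrix is the Kronecker sum A ⊕ B = A ⊗ I + I ⊗ B, and its matrix exponential factorizes as exp(t(A ⊕ B)) = exp(tA) ⊗ exp(tB). The example below checks this identity numerically in pure Python. The two-state chains and all rate values are hypothetical simplifications chosen for illustration, not the models or estimates used in the manuscript.

```python
from math import exp


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def kron_sum(a, b):
    """Kronecker sum A (+) B = A (x) I + I (x) B for square matrices."""
    left = kron(a, identity(len(b)))
    right = kron(identity(len(a)), b)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(left, right)]


def expm(q, t, squarings=10, terms=20):
    """exp(t * q) via a truncated Taylor series with scaling and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    scaled = [[x * scale for x in row] for row in q]
    result, term = identity(n), identity(n)
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, scaled)]
        result = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


# Hypothetical two-state processes (rates per unit of coalescent time).
mu = 0.3                               # symmetric SNP-like mutation rate
meth_rate, demeth_rate = 0.5, 0.2      # methylation and demethylation rates
snp_q = [[-mu, mu], [mu, -mu]]
methyl_q = [[-meth_rate, meth_rate], [demeth_rate, -demeth_rate]]

t = 2.0
joint_p = expm(kron_sum(snp_q, methyl_q), t)            # exponential of the 4x4 joint matrix
product_p = kron(expm(snp_q, t), expm(methyl_q, t))     # product of two 2x2 exponentials
max_gap = max(abs(x - y) for r1, r2 in zip(joint_p, product_p) for x, y in zip(r1, r2))
analytic_p00 = 0.5 + 0.5 * exp(-2 * mu * t)             # closed form for the symmetric chain
print(max_gap)
```

      Because the exponential factorizes, only the small per-process matrices need to be exponentiated, which is the speed advantage the reviewer alludes to.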

      - Most (all?) of the SNP-only SMC methods allow for binning together consecutive observations to cut down on computation time. I did not see binning mentioned anywhere, did you consider it? If the method really processes every site, how long does it take to run?

      We believe this comment was addressed in the previous revision and was added to the manuscript in the Methods section (subsection: SMC optimization function).

      - 486: The assumed site and region (de)methylation rates listed here are several OOM different from what your method estimated (Supp. Tables 5-6). Yet, on simulated data your method is usually correct to within an order of magnitude (Supp. Table 4). How are we to interpret this much larger difference between the published estimates and yours? If the published estimates are not reliable, doesn't that call into question your interpretation of the blue line in Fig. 7 at 533? (Comment addressed in revision.)

      Reviewer #2 (Public Review):

      A limitation in using SNPs to understand recent histories of genomes is their low mutation frequency. Tellier et al. explore the possibility of adding hypermutable markers to SNP based methods for better resolution over short time frames. In particular, they hypothesize that epimutations (CG methylation and demethylation) could provide a useful marker for this purpose. Individual CGs in Arabidopsis tends to be either close to 100% methylated or close to 0%, and are inherited stably enough across generations that they can be treated as genetic markers. Small regions containing multiple CGs can also be treated as genetic markers based on their cumulative methylation level. In this manuscript, Tellier et al develop computational methods to use CG methylation as a hypermutable genetic marker and test them on theoretical and real data sets. They do this both for individual CGs and small regions. My review is limited to the simple question of whether using CG methylation for this purpose makes sense at a conceptual level, not at the level of evaluating specific details of the methods. I have a small concern in that it is not clear that CG methylation measurements are nearly as binary in other plants and other eukaryotes as they are in Arabidopsis. However, I see no reason why the concept of this work is not conceptually sound. Especially in the future as new sequencing technologies provide both base calling and methylating calling capabilities, using CG methylation in addition to SNPs could become a useful and feasible tool for population genetics in situations where SNPs are insufficient.

      We again thank Reviewer #2 for the positive comments.

      Reviewer #3 (Public Review):

      I very much like this approach and the idea of incorporating hypervariable markers. The method is intriguing, and the ability to e.g. estimate recombination rates, the size of DMRs, etc. is a really nice plus. I am not able to comment on the details of the statistical inference, but from what I can evaluate it seems reasonable and in principle the inclusion of highly mutable sites is a nice advance. This is an exciting new avenue for thinking about inference from genomic data. I remain a bit concerned about how well this will work in systems where much less is understood about methylation.

      The authors include some good caveats about applying this approach to other systems, but I think it would be helpful to empiricists outside of thaliana or perhaps mammalian systems to be given some indication of what to watch out for. In maize, for example, there is a nonbimodal distribution of CG methylation (35% of sites are greater than 10% and less than 90%) but this may well be due to mapping issues. The authors solve many of the issues I had concerns with by using gene body methylation, but this is only briefly mentioned on line 659. I'm assuming the authors' hope is that this method will be widely used, and I think it worth providing some guidance to workers who might do so but who are not as familiar with this kind of data.

      We thank Reviewer #3 for the positive comments. We agree with Reviewer #3 concerning the application to data: our approach needs to be carefully thought through before it is applied. Our results clearly show that methylation processes are not yet understood well enough to apply our approach as we initially (maybe naively) designed it. Further investigations need to be conducted and appropriate theoretical models need to be developed before reliable results can be obtained, and we hope that our discussion points this out. However, our approach, the theoretical models and the additional tools contained in this study can help researchers investigate whether or not to use different genomic markers to build a common (potentially more reliable) ancestral history. In this second revision we enhanced the discussion, also clarifying the use of methylation from genic regions to avoid confusion (lines 700-731).

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors):

      In added Supp. Table 7, I don't think these are in log10 units as stated in the caption.

      Well spotted! Indeed, the RMSE is not on a log10 scale; we corrected the caption. We also added that the TMRCA used for the RMSE calculations is in units of generations, to avoid potential confusion.

      Reviewer #3 (Recommendations for The Authors):

      I very much appreciate the authors' attention to previous questions. I would ask that a bit more is spent in the discussion on concerns/approaches empiricists should keep in mind -- I am wary of this being uncritically applied to data from non-model species. It was not clear to me, for example (only mentioned on line 659 in the discussion) that the thaliana data is only using gene-body methylation. This poses potential issues with background selection that the authors acknowledge appropriately, but also assuages many of my concerns about using genome-wide data. I think text with recommendations for data/filtering/etc or at least cautions of assumptions empiricists should be aware of would help.

      We apologize for the confusion at line 659. As written in the other sections of the manuscript, we meant CG sites in genic regions (and not only gene body methylated regions).

      Due to the manuscript’s structure, the data from Arabidopsis thaliana are only described at the very end of the manuscript (line 900+). A brief description can also be found at lines 291-296. We nevertheless added a sentence in the introduction (line 128) for clarity.

      We also agree with the comment made by Reviewer #3 concerning the application to data. In the discussion, we pointed out the risk of applying our approach to ill-understood (or ill-prepared) data and stressed the current need for studies of epimutation processes at evolutionary time scales (i.e. at the Ne time scale) (lines 700-703).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary:

      Clostridium thermocellum serves as a model for consolidated bioprocess (CBP) in lignocellulosic ethanol production, yet faces limitations in solid contents and ethanol titers achieved by engineered strains thus far. The primary ethanol production pathway involves the enzyme aldehyde-alcohol dehydrogenase (AdhE), which forms long oligomeric structures known as spirosomes, previously characterized via the 3.5 Å resolution E. coli AdhE structure using single-particle cryoEM. The present study describes the cryo-EM structure of the C. thermocellum ortholog, sharing 62% sequence identity with E. coli AdhE, resolved at 3.28 Å resolution. Detailed comparative structural analysis, including the Vibrio cholerae AdhE structure, was conducted. Integrating cryoEM data with molecular dynamics simulations indicated that the aldehyde intermediate resides longer in the channel of the extended form, supporting the hypothesis that the extended spirosome represents the active form of AdhE.

      Strengths: 

      The study conducts a comprehensive structural comparative analysis of oligomerization interfaces and the acetaldehyde channel across compact and extended conformations. Structural and computational results suggest the extended spirosome as the most likely active state of AdhE. 

      Weaknesses: 

      The overall resolution of the C. thermocellum structure is similar to the E. coli ortholog, which shares 62% sequence identity, and the oligomerization interfaces and the acetaldehyde channel were previously described. 

      Reviewer #2 (Public Review): 

      Summary: 

      The manuscript by Ziegler et al, entitled "Structural characterization and dynamics of AdhE ultrastructure from Clostridium thermocellum: A containment strategy for toxic intermediates?" presents the atomic resolution cryo-EM structure of C. thermocellum AdhE, showing that it predominantly adopts an extended form while E. coli AdhE predominantly adopts a compact form. With comparative analysis of their C. thermocellum structure and the previous E. coli AdhE structure, they tried to reveal the mechanism by which C. thermocellum and E. coli show different dominant conformations. In addition, they also analyzed the substrate channel by comparative and computational approaches. Lastly, their computational analysis using CryoDRGN reveals conformational heterogeneity in the sample. Although this manuscript suggests a potential mechanism of the different features of AdhEs, this manuscript is very descriptive and does not provide sufficient data to support the authors' conclusions, which may be due to the lack of experimental data to support their findings from the computational analysis.

      Strengths: 

      This manuscript provides the first C. thermocellum (Ct) AdhE structure and comparatively analyzed this structure with E. coli AdhE. 

      Weaknesses: 

      Their main conclusions obtained mostly by computational and comparative analysis are not supported by experimental data. 

      Reviewer #3 (Public Review): 

      This study describes the first structure of Gram-positive bacterial AdhE spirosomes that are in a native extended conformation. All the previous structures of AdhE spirosomes obtained come from Gram-negative bacterial species with native compact spirosomes (E. coli, V. cholerae). In E. coli, AdhE spirosomes can be found in two different conformational states, compact and extended, depending on the substrates and cofactors they are bound to.

      The high-resolution cryoEM structure of the extended C. thermocellum AdhE spirosomes produced in E. coli in an apo state (without any substrate or cofactors) is compared to the E. coli extended and compact AdhE spirosomes structures previously published. The authors have modeled (in Swiss-Model) the structure of compact C. thermocellum AdhE spirosomes, using E. coli compact AdhE spirosome conformation as a template, and performed molecular dynamics simulations. They have identified a channel in which the toxic reaction intermediate aldehyde could transit from the aldehyde dehydrogenase active site to the alcohol dehydrogenase active site, in an analogous manner to E. coli spirosomes. These findings are in line with the hypothesis that the extended spirosomes could correspond to the active form of the enzyme. 

      In this work, the authors speculate that the C. thermocellum AdhE spirosomes could switch from the native extended conformation to a compact conformation, in a way that is inverse of E. coli spirosomes. Although attractive, this hypothesis is not supported by the literature. Amazingly, in some Gram-positive bacterial species (S. pneumoniae, S. sanguinis or C. difficile...), AdhE spirosomes are natively extended and have never been observed in a compact conformation. By contrast, E. coli (and other Gram-negative bacteria) native AdhE spirosomes are compact and are able to switch to an extended conformation in the presence of the cofactors (NAD+, coA, and iron). The data as currently presented are not sufficient to confirm the existence of C. thermocellum AdhE spirosomes in a compact conformation.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Major points: 

      (1) The claim of achieving the highest resolution AdhE structure lacks strong support since the E. coli structure was solved at 3.5 Å, whereas the C. thermocellum was solved at 3.28 Å. Conducting a local resolution analysis could provide insights into distinct structural interpretations, enhancing the strength of the claim.

      We have modified the sentence claiming this as the highest resolution AdhE structure to say, “In this study, we presented and analyzed a high-resolution structure of the AdhE spirosome from C. thermocellum.” We have included the local resolution map in Figure 2C – all structural analysis was performed in regions from the center of the molecule, where the highest resolution information was determined.

      (2) The comparative structural analysis of the oligomerization interface is thorough, yet it could benefit from greater conciseness. Focusing on highlighting major findings would streamline the presentation and enhance clarity. 

      We altered a few places in the comparative structural analysis in response to other reviewers. We also divided the main structure section into two subsections (spirosome interfaces and AdhE active sites) to enhance clarity.

      Reviewer #2 (Recommendations For The Authors): 

      (1) The authors should change the title containing "?". Does it mean that the conclusions that the authors made are still in question?

      We have removed the question mark to indicate that our results point to a channeling mechanism.

      (2) Figure 1B: Clarify Ct Fwd. Is this adding NADH, and Ct Rev adding NAD+? 

      This information is described in the text in lines 98-100. It is also at the bottom of figure 1B.

      (3) Line 131: Please revise accordingly for clarity: "The extended dimer interfaces" → "The extended E. coli dimer interface".

      This has been edited for clarity. We have added the following sentence to indicate which interfaces are being discussed: “Both the E. coli and C. thermocellum extended dimer interfaces bury ~5000 Å². While the compact C. thermocellum compact dimer interface buries a similar surface area of ~4800 Å², the E. coli dimer interface buries ~3800 Å².”

      (4) Line 133-136: Why that does not seem to be the case? These sentences are not clear what the authors exactly mean. 

      We altered the text to say, “One would expect the compact structure in E. coli to have a larger buried surface area due to it being the predominant form when it is examined without additives, but that is not the case; further corroborating that factors other than buried surface area must impact the apo state of the spirosome.” We hope this clarifies our intent.

      (5) Line 138-145: The authors should provide a logic for how the different distribution of the charged residues would change the form of AdhE. It may just be a different distribution that has nothing to do with the conformational change.

      After further analysis of the interface amino acid distribution, we agree that the distribution may have nothing to do with the conformational change. We have changed this section to end with the sentence “Analysis of the residues buried in these interfaces reveals that while many of the residues are identical in the C. thermocellum and E. coli extended structures, there are some differences in amino acid type distribution, although nothing that directly indicates control of conformer state (Supplemental Figure 3).”

      (6) Line 169: Kim et al. → Cho et al.

      We have corrected this error.

      (7) Line 122-235: The whole section is just describing the difference between Ct and Ec AdhE suggesting that this difference may contribute to the conformational difference without any evidence. The author cannot say that the differences in the interface, active sites, cofactor pockets, etc. explain why two AdhE (Ct, Ec) have different domain conformers unless they provide experimental data.

      We did not conclude that any differences we observed structurally were responsible for the conformation change. The purpose of this section was solely to compare the structures to determine if we could find a structural basis for the difference between E. coli and C. thermocellum conformation – we stated a few times throughout the section and in the discussion that there were no immediate structural reasons for this difference in shape. We have added a few sentences in the discussion to address whether Gram-positive vs. Gram-negative is influencing the shape, addressed in reviewer #3 comment #4.

      (8) Line 237: The whole section "Identification..." analyzed the substrate channel by computational analysis. The author should provide experimental evidence that these residues identified are critical for channeling by generating mutants and measuring their activity. 

      We agree that mutagenesis is the next logical step for these results; however, it is outside the scope of this paper, as such a study will not be straightforward. We have included a sentence in the discussion to indicate our plans for further investigation of the channel: “Future mutagenesis studies will be needed to confirm whether the spirosome exists to control the reaction flux in high-reactant conditions.”

      Reviewer #3 (Recommendations For The Authors): 

      (1) The capacity of C. thermocellum AdhE spirosomes to switch from a natively extended conformation to a compact conformation is not demonstrated in this manuscript, as it is now. Because this would be the first time that Gram-positive bacterial AdhE spirosomes are observed in a compact conformation, the authors should provide a clear demonstration of their existence by presenting reliable and good images of C. thermocellum compact spirosomes. 

      We have modified Figure 1A to zoom in on one compact and extended spirosome that we have identified from each C. thermocellum sample. We have included triangles of the same size and shape to indicate the proximity of a turn of a helix, showing that the identified compact spirosomes have a tighter conformation than extended spirosomes.

      (2) The authors should show at least an image of the compact C. thermocellum spirosomes, that they claim to observe in the presence of NADH or in the forward reaction conditions mentioned in Figure 1. The authors have added different reactants to the extended C. thermocellum spirosomes and visualized their conformation by negative stain. An image of each condition tested would be valuable and would nicely complete the distribution of compact versus extended spirosomes presented in Figure 1.

      We have created a new supplemental figure with spirosomes circled for all of the experimental conditions for C. thermocellum (Supplemental figure 1). We have added a reference to supplemental figure 1 in the text to direct the reader to these images.

      (3) The cryoEM classes presented in Figure 8 are not convincing and could correspond to dimers or rosettes of AdhE or to E. coli endogenous AdhE. CryoEM classes showing longer compact C. thermocellum spirosomes should be shown. The percentage of these compact spirosomes visualized in the micrographs should be added and discussed in the text as it would increase confidence in these findings and confirm that C. thermocellum compact spirosomes exist. Heterologous production of C. thermocellum AdhE in E. coli depleted for its endogenous AdhE would be required to definitively prove that these are compact C. thermocellum AdhE spirosomes in the cryoEM. 

      We included the pictures of the theoretical compact spirosomes, as generated from the 8-mer of E. coli AdhE (6AHC) to address the possibility of rosettes. We have now indicated in the text that there were 6.7% of the particles in the compact conformation, which is less than seen by negative stain. We further mentioned that the compact spirosome is less compact than that seen in E. coli. We added a sentence to the discussion about the possibility of contaminating E. coli spirosomes (though this is very unlikely) in our compact spirosome analysis: “While these compact spirosomes could result from expression in E. coli, though this is very unlikely, we also identified compact spirosomes in a native C. thermocellum lysate, which would not have similar contamination issues.”

      (4) The authors should include and discuss in the text previous findings (among which Laurenceau et al., 2015...) describing the differences between Gram-positive and Gram-negative spirosomes. AdhE spirosomes are natively extended in most Gram-positive bacterial species (S. pneumoniae, S. sanguinis or C. difficile...), and have never been observed in a compact conformation. By contrast, E. coli (and other Gram-negative bacteria) native AdhE spirosomes are compact and are able to switch to an extended conformation in the presence of the cofactors (NAD+, CoA, and iron).

      We have added the following sentences to the discussion to address this comment: “This could potentially be due to the differences between Gram-positive and Gram-negative bacteria. In previous studies, compact spirosomes have only been isolated from Gram-negatives while solely extended spirosomes have been isolated from Gram-positives. Furthermore, while the compact spirosomes can transition to extended in the presence of cofactors, the reverse has not been previously observed with an extended spirosome.”

      (5) The authors have spotted some differences between the E. coli and C. thermocellum structures, that they believe could explain the intrinsic capacity of these spirosomes to be natively extended or compact. It would be interesting to confirm this hypothesis by measuring C. thermocellum extended AdhE spirosome activity and comparing it to E. coli extended spirosomes. The impact of mutations in the regions proposed by the authors to be important in the capacity of C. thermocellum AdhE to be extended (especially the GxGxxG motif and the D494 position) would be appreciated to confirm this hypothesis.

      We agree that this would be an interesting avenue of research although it is currently outside the scope of this paper. We are looking into experiments that we can perform where we can track both activity and conformation but have not found an ideal experiment at this time.

      (6) Many statements and result interpretations are overstated in several parts of the manuscript and would need to be rewritten to balance the absence of clear evidence of C. thermocellum compact spirosomes. 

      We have shown that we have identified compact spirosomes, as addressed in multiple comments above. We have adjusted the language of the paper to indicate more uncertainty, which will be followed up in future mutagenesis experiments. However, these mutations are not simple to identify, and this research would require a fairly large study that is better suited to a follow-up manuscript.

      (7) The Figure 7 legend would need to be corrected.

      We are unsure as to what needs to be corrected in the figure 7 legend based on this comment.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Strengths:

      (1) In my assessment, the data sufficiently demonstrates that a modified version of Pertuzumab can bind both the wild-type and S310 mutant forms of ERBB2.

      (2) The engineering strategy employed is rational and effectively combines computational and experimental techniques.

      (3) Given the clinical activity of HER2-targeting ADCs, antibodies unaffected by ERBB2 mutations would be desired.

      Weaknesses:

      (1) There is no data showing that the engineered antibody is as specific as Pertuzumab, i.e. that it does not bind to other (non-ERBB2) proteins.

      Showing the specificity of the engineered antibodies is indeed important. We did not address it in the current ms, but it can be tested in the future.

      (2) There is no data showing that the engineered antibody has the desired pharmacokinetics/pharmacodynamics properties or efficacy in vivo.

      In this ms we did not conduct in vivo experiments. When moving forward, pharmacokinetics/pharmacodynamics properties and efficacy will be tested as well.

      (3) Computational approaches are only used to design a phage-screen library, but not used to prioritize mutations that are likely to improve binding (e.g. based on predicted impact on the stability of the interaction). A demonstration of how computational pre-screening or lead optimization can improve the time-intensive process would be a welcome advance.

      Thank you for this important comment. In the present ms we indeed used a computational approach for prioritizing residues to be mutated, but we did not prioritize the mutations that are likely to improve binding. In the initial library design, we did prioritize the mutations. However, due to experimental limitations in codon selection for the library, we decided to allow all possible residues in each position, knowing that the selection would remove non-binding variants.

      Context:

      The conflict of interest statement is inadequate. Most authors of the study (but not the first author) are employees of Biolojic, a company developing multi-specific antibodies, but the statements do not clarify whether the presented antibodies represent Biolojic IP, whether the company sponsored the research, and whether the company is further developing the specific antibodies presented.

      The Conflict-of-Interest statement will be revised as such: The Biolojic Design authors are employees of Biolojic Design and have stock options in Biolojic Design. The company did not sponsor the research, does not hold IP for the presented antibodies, and is not further developing the presented antibodies.

      Reviewer #2 (Public Review):

      Strengths:

      (1) Deep computational analyses of large datasets of clinical data provide useful information about HER2 mutations and their potential relevance to antibody therapy resistance.

      (2) There is valuable information analyzing the residues within or near the interface between the antigen HER2 and the Pertuzumab antibody (heavy chain). The experimental antibody library screening obtained 90+ clones from 3.86×10¹¹ sequences for further functional validation.

      Weaknesses:

      (1) There is a lack of assessment for antibody variant functions in cancer cell phenotypes in vitro (proliferation, cell death, motility) or in vivo (tumor growth and animal survival). The only assay was the western blotting of phospho-HER3 in Figure 4. However, HER2 levels and phospho-HER2 were not analyzed.

      We indeed did not assess the function of the engineered antibodies in cancer cells. A complete signaling assessment would require functional assessment as well. However, due to the complexity of such assays, papers in this field (for example [1-3]) measure the signaling activation following HER2-HER3 dimerization by measuring pHER3, and we relied on them in this ms.

      (2) There is a misleading impression from the title of computational engineering of a therapeutic antibody and the statement in the abstract "we designed a multi-specific version of Pertuzumab that retains original function while also binding these HER2 variants" for a few reasons:

      a. The primary method used for variant antibody identification for HER2 mutant binding is rather traditional experimental screening based on yeast display instead of the computational design of a multi-specific version of Pertuzumab.

      b. There is insufficient or lack of computational power in the antibody design or prioritization in choosing variant residues for the library construction of 3.86×10¹¹ sequences. It seems random combinations from 6 residues out of 4 groups with 20 amino acid options.

      c. The final version of the tri-binding variant is a combination of screened antibody clones instead of computation design from scratch.

      d. There is incomplete experimental evidence about the therapeutic values of newly obtained antibody clones.

      Thank you for this relevant comment. When addressing relevant residues to be mutated, the number of potential variants is enormous. The computational approach was aimed at identifying the most preferable residues, in which variation can improve binding and is not likely to harm important interactions. Although an initial smaller number of residues could be chosen, we decided to broaden our view and create a larger library, with the aim of combining the computational selection with an experimental selection. This indeed is not a computational design from scratch, but rather an interplay between the computer and the lab that yielded the presented results.

      (3) Figures can be improved with better labeling and organization. Some essential pieces of data such as Supplementary Figure 1B on HER2 mutations in S310 that abrogated its binding to Pertuzumab should be placed in the main figures.

      Thank you for this comment, the relevant figures were moved to the main text, and the labels were revised.

      (4) It is recommended to provide a clear rationale or flowchart overview into the main Figure 1. Figure 2A can be combined with Figure 1 to the list of targeted residues.

      Figures 1 and 2 were divided differently, and the rationale was moved to the main text.

      (5) The quality of Figures such as Figure 2B-C flow data needs to be improved.

      High-quality figures were submitted with the revised ms.

      Reviewer #1 (Recommendations for The Authors):

      Major:

      (1) It should be clarified whether the S310 somatic mutations represent resistance mutations to Pertuzumab (i.e. emerge post-therapy) or are general mutations that activate HER2. This is important because mutations that specifically "evade" the binding of an antibody may be substantially more difficult to overcome than mutations that only by chance occur in the antibody binding site. This concern should be addressed in the introduction and discussion as it changes the interpretation of the data.

      This is a very important note. To the best of our knowledge, these mutations were not identified as resistance mutations that emerged post-therapy. However, as mentioned in the introduction, these mutations form hydrophobic interactions that stabilize HER2 dimerization. Moreover, cells expressing these mutations show hyperphosphorylation of HER2 and an increase in the subsequent activation of signaling pathways. Thus, these mutations do not necessarily evade Pertuzumab binding, but benefit cancer growth. This point was clarified in the introduction of the revised text.

      (2) While the authors claim that S310 germline pathogenic variants exist, I could not find evidence that this is the case. The dbGAP ID does not provide any evidence (either in the form of a citation or prevalence). The variants do not exist in GnomAD. A recent article discussing pathogenic ERBB2 germline variants only mentions S310 as a somatic variant https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8268839/ and I could not find evidence for S310 being a germline variant in the references provided by the author (https://www.nature.com/articles/nbt.3391) - where it is only mentioned as a somatic mutation. I could not find evidence of a cancer predisposition syndrome associated with this variant.

      Thank you for highlighting this matter. We had assumed that the presence of the variant in dbSNP means it is also a germline mutation, which may not be correct. However, we did find some evidence of this mutation as germline in ClinVar, and this was edited in the revised ms. https://www.ncbi.nlm.nih.gov/clinvar/RCV001311879.7.

      (3) The authors should consider experiments that show that the modified Pertuzumab has the same mechanism of action as the original Pertuzumab in preventing dimerization of the ERBB2 homodimer and/or interactions with ERBB3. I cannot recommend a specific approach, but at present it is not clear whether the mechanism or just the effect (phosphorylation of ERBB3) is the same.

      As mentioned above, for the assessment of HER2-HER3 binding and HER3 signaling, in this ms we relied on previous work [1-3] that also measured the signaling activation following HER2-HER3 dimerization by measuring pHER3.

      (4) The authors should perform in vitro experiments to demonstrate that the engineered antibody has similar on-target specificity not only sensitivity. I don't know what the ideal experiments would be, but should probably probe native epitopes. Western blots, immunoprecipitation of cell lysates?

      As mentioned above, showing the specificity of the engineered antibodies is indeed important. We did not address it in the current ms, but it can be tested in future work.

      Minor:

      (1) The introduction should review better the literature on the computational/rational design of antibodies, especially multi-specific - and likely de-emphasize small molecules (and mutations associated with the resistance thereof) as the presented research does not inform the design of mutation-agnostic small molecules.

      Thank you for these comments, the introduction was revised accordingly.

      (2) The authors should better present the fact that the lack of binding of Pertuzumab to HER2 S310 was previously known, thus the whole strategy of searching COSMIC, and computationally predicting their binding impact was unnecessary. Rather it would be helpful to learn how many other COSMIC hotspots could have a similar effect on other clinical antibodies.

      The lack of binding was indeed previously known, as mentioned in the introduction. However, we did not start our analysis targeting HER2 specifically, but we rather found these mutations because they were located in the binding pocket, which enabled our strategy to compensate for these mutations with alteration of the original Pertuzumab. Regarding other potential hotspots, the numbers appeared in Supplementary Table 1, and were moved to the main text.

      Stylistic:

      (1) Avoid using the term "drug" for an antibody.

      The term was changed to “antibody therapeutics” in the revised text.

      (2) Avoid repetition in the introduction.

      Thank you, we revised the introduction with this comment in mind.

      Reviewer #2 (Recommendations For The Authors):

      The quality of Figure 2B-C flow data needs to be improved:

      a. The diagonal populations suggest inappropriate color compensation or indicate cells are derived from unhealthy populations.

      We believe there may be some confusion here. The figures you are referring to are figures of a very diverse library. The selected clones show nice diagonals, as shown in Supplementary Figure 5.

      b. Additional round 3 and round 4 did not seem to improve the enrichment of targeted clones but rather had similar binding profiles to each of the three proteins over and over.

      Two sets of the fourth round of selection were done, each originating from a different sub-population in round 3: (1) clones that bind the S310Y mutation; (2) clones that bind the S310F mutation. The aim of round 4 was to examine these binders against the second mutation and canonical HER2 in the search for multi-specificity. Additional clarification of this point will be added to the main text.

      c. Figure legends are vague with non-specific descriptions of cells and conditions, and unclear statements of "FACS results...".

      The legends were edited in the revised version.

      d. Text fonts are in low resolution.

      High-quality figures were submitted with the revised ms.

      (1) Diwanji, D., et al., Structures of the HER2-HER3-NRG1β complex reveal a dynamic dimer interface. Nature, 2021. 600(7888): p. 339-343.

      (2) Yamashita-Kashima, Y., et al., Mode of action of pertuzumab in combination with trastuzumab plus docetaxel therapy in a HER2-positive breast cancer xenograft model. Oncol Lett, 2017. 14(4): p. 4197-4205.

      (3) Kang, J.C., et al., Engineering multivalent antibodies to target heregulin-induced HER3 signaling in breast cancer cells. MAbs, 2014. 6(2): p. 340-53.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The development of effective computational methods for protein-ligand binding remains an outstanding challenge to the field of drug design. This impressive computational study combines a variety of structure prediction (AlphaFold2) and sampling (RAVE) tools to generate holo-like protein structures of three kinases (DDR1, Abl1, and Src kinases) for binding to type I and type II inhibitors. Of central importance to the work is the conformational state of the Asp-Phe-Gly "DFG motif" where the Asp points inward (DFG-in) in the active state and outward (DFG-out) in the inactive state. The kinases bind to type I or type II inhibitors when in the DFG-in or DFG-out states, respectively.

      It is noted that while AlphaFold2 can be effective in generating ligand-free apo protein structures, it is ineffective at generating holo-structures appropriate for ligand binding. Starting from the native apo structure, structural fluctuations are necessary to access holo-like structures appropriate for ligand binding. A variety of methods, including reduced multiple sequence alignment (rMSA), AF2-cluster, and AlphaFlow may be used to create decoy structures. However, those methods can be limited in the diversity of structures generated and lack a physics-based analysis of Boltzmann weight critical to their relative evaluation.

      To address this need, the authors combine AlphaFold2 with the Reweighted Autoencoded Variational Bayes for Enhanced Sampling (RAVE) method, to explore metastable states and create a Boltzmann ranking. With that variety of structures in hand, grid-based docking methods Glide and Induced-Fit Docking (IFD) were used to generate protein-ligand (kinase-inhibitor) complexes.

      The authors demonstrate that using AlphaFold2 alone, there is a failure to generate DFG-out structures needed for binding to type II inhibitors. By applying the AlphaFold2 with rMSA followed by RAVE (using short MD trajectories, SPIB-based collective variable analysis, and enhanced sampling using umbrella sampling), metastable DFG-out structures with Boltzmann weighting are generated enabling protein-ligand binding. Moreover, the authors found that the successful sampling of DFG-out states for one kinase (DDR1) could be used to model similar states for other proteins (Abl1 and Src kinase). The AF2RAVE approach is shown to result in a set of holo-like protein structures with a 50% rate of docking type II inhibitors.

      Overall, this is excellent work and a valuable contribution to the field that demonstrates the strengths and weaknesses of state-of-the-art computational methods for protein-ligand binding. The authors also suggest promising directions for future study, noting that potential enhancements in the workflow may result from the use of binding site prediction models and free energy perturbation calculations.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript explores the utility of AlphaFold2 (AF2) and the authors' own AF2-RAVE method for drug discovery. As has been observed elsewhere, the predictive power of docking against AF2 structures is quite limited, particularly for proteins like kinases that have non-trivial conformational dynamics. However, using enhanced sampling methods like RAVE to explore beyond AF2 starting structures leads to a significant improvement.

      Strengths:

      This is a nice demonstration of the utility of the authors' previously published RAVE method.

      Weaknesses:

      My only concern is the authors' discussion of induced fit. I'm quite confident the structures discussed are present in the absence of ligand binding, consistent with conformational selection. It seems the authors' own data also argue for an important role for conformational selection. It would be nice to acknowledge this instead of going along with the common practice in drug discovery of attributing any conformational changes to induced fit without thoughtful consideration of conformational selection.

      The reviewer is correct. We aim to highlight the significant role of conformational selection. To clarify this, we have expanded the discussion on conformational selection in the introduction.

      Reviewer #3 (Public Review):

      In this manuscript, the authors aim to enhance AlphaFold2 for protein conformation-selective drug discovery through the integration of AlphaFold2 and physics-based methods, focusing on improving the accuracy of predicting protein structure ensembles and small molecule binding of metastable protein conformations to facilitate targeted drug design.

      The major strength of the paper lies in the methodology, which includes the innovative integration of AlphaFold2 with all-atom enhanced sampling molecular dynamics and induced fit docking to produce protein ensembles with structural diversity. Moreover, the generated structures can be used as reliable crystal-like decoys to enrich metastable conformations of holo-like structures. The authors demonstrate the effectiveness of the proposed approach in producing metastable structures of three different protein kinases and perform docking with their type I and II inhibitors. The paper provides strong evidence supporting the potential impact of this technology in drug discovery. However, limitations may exist in the generalizability of the approach across other structures, especially complex structures such as protein-protein or DNA-protein complexes.

      Proteins undergo thermodynamic fluctuations and can occasionally reach metastable configurations. It can be assumed that other biomolecules, such as proteins and DNA, stabilize these metastable states when forming protein-protein or protein-DNA complexes. Since our method has the potential to identify these metastable states, it shows promise for designing drugs targeting proteins in allosteric configurations induced by other biomolecules.

      The authors largely achieved their aims by demonstrating that the AF2RAVE-Glide workflow can generate holo-like structure candidates with a 50% successful docking rate for known type II inhibitors. This work is likely to have a significant impact on the field by offering a more precise and efficient method for predicting protein structure ensembles, which is essential for designing targeted drugs. The utility of the integrated AF2RAVE-Glide approach may streamline the drug discovery process, potentially leading to the development of more effective and specific medications for various diseases.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Suggestions

      (1) The computational protocol is found to be insufficient to generate precise values of the relative free energies between structures generated. The authors note in the Conclusion that an enhancement in the workflow might result from the addition of free energy calculations. Can the authors comment on the prospects for generating more accurate estimates of the free energy that might be used to qualitatively evaluate poses and the free energy landscape surrounding putative metastable states? What are the principal challenges and what might help overcome them? What would the most effective computational protocol be?

      More accurate estimates of the free energy can theoretically be achieved by increasing the number of umbrella sampling windows and extending the simulation length until the PMF converges. However, there is always a trade-off between PMF accuracy and computational costs, so we have chosen to stick with the current setup. Metadynamics is another method to obtain a more accurate free energy profile, which we have used in previous versions of AlphaFold2-RAVE, but for the specific systems we investigated, it had issues in achieving back and forth movement given the high entropic nature of the activation loop. Research in enhanced sampling methods and dimensionality reduction techniques for reaction coordinates is continually evolving and will play a critical role in alleviating this problem.
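
      As a minimal illustrative sketch (not the protocol used in the manuscript), the link between free energies read off a converged PMF and the Boltzmann weights used to rank metastable states can be written as follows. The function names, the example free-energy values in kcal/mol, and the 300 K temperature are assumptions introduced only for illustration.

```python
import math

# Boltzmann constant in kcal/(mol*K); energy units are an assumption of this sketch.
KB_KCAL_PER_MOL_K = 0.0019872041


def boltzmann_weights(free_energies_kcal, temperature_k=300.0):
    """Return normalized Boltzmann weights for a list of state free energies."""
    kt = KB_KCAL_PER_MOL_K * temperature_k
    # Shift by the minimum so the largest exponent is zero (numerical stability).
    f_min = min(free_energies_kcal)
    unnormalized = [math.exp(-(f - f_min) / kt) for f in free_energies_kcal]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def rank_states(free_energies_kcal, temperature_k=300.0):
    """Return state indices ordered from highest to lowest Boltzmann weight."""
    weights = boltzmann_weights(free_energies_kcal, temperature_k)
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)


if __name__ == "__main__":
    # Hypothetical free energies (kcal/mol) for three metastable states.
    example = [2.0, 0.0, 1.0]
    print(boltzmann_weights(example))
    print(rank_states(example))
```

      Because the weights depend exponentially on the free energy, errors of about 1 kcal/mol in an unconverged PMF change the relative weights several-fold at room temperature, which is why the ranking is more robust than the absolute values.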

      (2) I was surprised that there was not more correlation of a funnel-like shape in Figures S16 and S18, showing a stronger correlation between low RMSD and better docking score. This is true for both the ponatinib and imatinib applications in DDR1 and Abl1. That also seems true for the trimmed results for Src kinase in Figure S19. I was also surprised that there are structures with very large RMSD but docking scores comparable to the best structures of the lowest RMSD. Might something be done to make the docking score a more effective discriminator?

      The docking algorithm and docking score are used to filter out highly improbable docking poses. False positives in predicted docking poses are a common issue across all docking methods as described for instance in:

      Fan, Jiyu, Ailing Fu, and Le Zhang. "Progress in molecular docking." Quantitative Biology 7 (2019): 83-89.

      Ferreira, R.S., Simeonov, A., Jadhav, A., Eidam, O., Mott, B.T., Keiser, M.J., McKerrow, J.H., Maloney, D.J., Irwin, J.J. and Shoichet, B.K., 2010. "Complementarity between a docking and a high-throughput screen in discovering new cruzain inhibitors." Journal of medicinal chemistry, 53(13), pp.4891-4905.

      Moreover, there is always a trade-off between docking accuracy and computational cost. While employing more accurate docking methods may decrease false positives, it can also be resource-intensive. In such scenarios, our approach to enriching holo-structures can be impactful by reducing the number of pocket structures in the input ensembles and significantly enhancing docking efficiency.

      (3) I think that it is fine to identify one structure as "IFD winner" but also feel that its significance is overstressed, especially given that it can be identified only in a retrospective analysis rather than through de novo prediction.

      We agree with the reviewer. We did not intend to emphasize the specific structure "IFD winner". Rather, we aimed to demonstrate that our method can enrich promising candidates for holo-structures. We verified this by showing that our holo-structure candidates performed well in retrospective docking using IFD, which we previously referred to as "IFD winner". We have now revised this term to "holo-model".

      Minor Points

      p. 3 "DymanicBind" should be "DynamicBind"

      p. 3 Change "We chosen" to "We have chosen" or "we chose."

      p. 3 In identifying the Schrödinger software Glide and IFD, I recommend removing the subjective modifier "industry-leading."

      Modifications done.

      Reviewer #2 (Recommendations For The Authors):

      In the view of this reviewer, the writing is 'choppy'.

      We have tried to improve the writing.

      Reviewer #3 (Recommendations For The Authors):

      (1) In Figure 1, the workflow labels (i) to (iv) are not shown on the figures, making it difficult for readers to follow. Consider adding these labels to the figures.

      Modifications done.

      (2) Explain how Boltzmann ranks were calculated based on unbiased MD simulations to guide the enrichment of holo-like structures in metastable states.

      The Methods section is now updated for clarification.

      (3) The authors could clarify how the classical DFG-out decoys in the DDR1 rMSA AF2 ensemble are transferred to Abl1 kinase in the Methods section.

      The Methods section is now updated for clarification.

      (4) The authors can clarify the methodology section by providing more detailed explanations about how the unbiased MD simulations are performed, including which MD simulation software was used and whether energy minimization and equilibrium steps were needed as in conventional MD simulations, and other setup details.

      The Methods section is now updated for clarification.

      (5) The validation of the proposed approach in this work used three kinase proteins. The authors can enhance the discussion section by addressing other types of protein structure prediction that can use the proposed approach in drug discovery, beyond the three kinase proteins tested.

      The proposed approach is theoretically applicable to other types of proteins, such as GPCRs, where both conformational selection and the induced-fit effect are crucial. We have expanded the discussion on the generalization of our protocol in the Conclusion section.

      (6) The authors should add appropriate citations for the software and tools used in the manuscript. For example, a reference should be added for the Glide XP docking experiments that utilized the Maestro software. Double-check all related software citations.

      We have now updated the citations for docking experiments based on the instruction of the Maestro Glide User manual and IFD User manual.

      (7) The authors should consider offering a comprehensive list of software tools and databases utilized in the study to assist in replicating the experiments and further validating the results.

      We have now added a summary of tools used in the Methods section.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The authors present evidence suggesting that MDA5 can substitute as a sensor for triphosphate RNA in a species that naturally lacks RIG-I. The key findings are potentially important for our understanding of the evolution of innate immune responses. Compared to an earlier version of the paper, the strength of evidence has improved but it is still partially incomplete due to a few key missing experiments and controls.

      We would like to thank the editorial team for their positive comments and constructive suggestions on improving our manuscript. We have made further improvements based on the valuable suggestions of the reviewers, and we are pleased to send you the revised manuscript now. After revising the manuscript and further supplementing with experiments, we think that our existing data can support our claims.

      Public Reviews:

      Reviewer #1 (Public Review):

      This study offers valuable insights into host-virus interactions, emphasizing the adaptability of the immune system. Readers should recognize the significance of MDA5 in potentially replacing RIG-I and the adversarial strategy employed by 5'ppp-RNA SCRV in degrading MDA5 mediated by m6A modification in different species, further indicating that m6A is a conserved process in the antiviral immune response.

      However, caution is warranted in extrapolating these findings universally, given the dynamic nature of host-virus interactions. The study provides a snapshot of the complexity of these interactions, but further research is needed to validate and extend these insights, considering potential variations across viral species and environmental contexts. Additionally, it is noted that the main claims put forth in the manuscript are only partially supported by the data presented.

      After meticulous revisions of the manuscript, including adjustments to the title, abstract, results, and discussion, the main claim of our study is now the arms race between the MDA5 receptor and the SCRV virus in a lower vertebrate fish, M. miiuy. This comprises two parts. First, the MDA5 of M. miiuy can detect virus invasion and initiate the host immune response by recognizing the triphosphate structure of SCRV. Second, as an adversarial strategy, the 5’ppp-RNA virus SCRV can exploit the m6A mechanism to degrade MDA5 in M. miiuy. Based on the reviewer's suggestions, we have further supplemented the critical experiments (Figure 3F-3G, Figure 4D, Figure 5G) and provided a more detailed and accurate explanation of the experimental conclusions. We believe that the revised manuscript supports our main claims. In addition, because virus-host coevolution complicates the derivation of universal conclusions, we will further expand our insights in future research.

      Reviewer #2 (Public Review):

      This manuscript by Geng et al. aims to demonstrate that MDA5 compensates for the loss of RIG-I in certain species, such as the teleost fish miiuy croaker. The authors use Siniperca chuatsi rhabdovirus (SCRV) and poly(I:C) to demonstrate that these RNA ligands induce an IFN response in an MDA5-dependent manner in M. miiuy-derived cells. Furthermore, they show that MDA5 requires its RD domain to directly bind to SCRV RNA and to induce an IFN response. They use in vitro synthesized RNA with a 5'triphosphate (or lacking a 5'triphosphate as a control) to demonstrate that MDA5 can directly bind to 5'-triphosphorylated RNA. The second part of the paper is devoted to m6A modification of MDA5 transcripts by SCRV as an immune evasion strategy. The authors demonstrate that the modification of MDA5 with m6A is increased upon infection and that this causes increased decay of MDA5 and consequently a decreased IFN response.

      One critical caveat in this study is that it does not address whether ppp-SCRV RNA induces IRF3-dimerization and type I IFN induction in an MDA5 dependent manner. The data demonstrate that mmiMDA5 can bind to triphosphorylated RNA (Fig. 4D). In addition, triphosphorylated RNA can dimerize IRF3 (4C). However, a key experiment that ties these two observations together is missing.

      Specifically, although Fig. 4C demonstrates that 5'ppp-SCRV RNA induces dimerization (unlike its dephosphorylated or capped derivatives), this does not prove that this happens in an MDA5-dependent manner. This experiment should have been done in WT and siMDA5 MKC cells side-by-side to demonstrate that the IRF3 dimerization that is observed here is mediated by MDA5 and not by another (unknown) protein. The same holds true for Fig. 4J.

      Thank you for the referee's professional suggestions. In fact, we have transfected SCRV RNA into WT and si-MDA5 MKC cells, and subsequently assessed the dimerization of IRF3 and the IFN response (Figure 2P-2Q). The results indicated that knockdown of MDA5 prevents immune activation of SCRV RNA. However, considering the potential for SCRV RNA to activate immunity independent of the triphosphate structure, this experimental observation does not comprehensively establish the MDA5-dependent induction of IRF3 dimer by 5’ppp-RNA. Accordingly, in accordance with the referee's recommendation, we proceeded to investigate the inducible activity of 5'ppp-SCRV on IRF3 dimerization in WT and si-MDA5 MKC cells, revealing that 5'ppp-SCRV indeed elicits immunity in an MDA5-dependent manner (Figure 4D). Additionally, poly(I:C)-HMW, a known ligand for MDA5, demonstrated a residual, albeit attenuated, activation of IRF3 following MDA5 knockdown, potentially attributed to its capacity to stimulate immunity through alternative pathways such as TLR3.

      - Fig 1C-D: these experiments are not sufficiently convincing, i.e. the difference in IRF3 dimerization between VSV-RNA and VSV-RNA+CIAP transfection is minimal.

      We have reconstituted the necessary materials and repeated the pertinent experiments depicted in Fig 1C-1D. The results demonstrate that SCRV-RNA+CIAP and VSV-RNA+CIAP exhibit a mitigating effect on the induction activity of SCRV-RNA and VSV-RNA on IRF3 dimerization, albeit without complete elimination (Figure 1C and 1D). These findings suggest the presence of receptors within M. miiuy and G. gallus capable of recognizing the viral triphosphate structure; however, it is worth noting that RNA derived from SCRV and VSV viruses does not exclusively depend on the triphosphate structure to activate the host's antiviral response.

      Fig. 2N and 2O: why did the authors decide to use overexpression of MDA5 to assess the impact of STING on MDA5-mediated IFN induction? This should have been done in cells transfected with SCRV or polyIC (as in 2D-G) or in infected cells (as in 2H-K). In addition, it is a pity that the authors did not include an siMAVS condition alongside siSTING, to investigate the relative contribution of MAVS versus STING to the MDA5-mediated IFN response. Panel O suggests that the IFN response is completely dependent on STING, which is hard to envision.

      In our previous laboratory investigations, we have substantiated the induction effect of STING on IFN under SCRV infection or poly(I:C) stimulation, as documented in the relevant literature (10.1007/s11427-020-1789-5), which we have referenced in our manuscript (lines 177-178). While we did assess the impact of STING on MDA5-mediated IFN induction in SCRV-infected cells, as indicated in the figure legends, we have revised Figure 2N-2O for improved clarity, and similarly, Figure 1H-1I has also been updated. Furthermore, considering that RNA virus infection can activate the cGAS/STING axis (10.3389/fcimb.2023.1172739) and the significant role of MAVS in sensing RNA virus invasion in the NLR pathway (10.1038/ni.1782), it is challenging to ascertain the respective contributions of STING and MAVS to the immune signaling cascade mediated by MDA5 during RNA virus infection. We intend to explore this aspect further in future research endeavors.

      Fig. 3F and 3G: where are the mock-transfected/infected conditions? Given that ectopic expression of hMDA5 is known to cause autoactivation of the IFN pathway, the baseline ISG levels should be shown (i.e., in the absence of a stimulus or infection). Normalization of the data does not reveal whether this is the case and is therefore misleading.

      Based on the reviewer's suggestions, we have rerun the experiment. We examined the effects of MDA5 and MDA5-ΔRD on antiviral factors in uninfected, SCRV-infected, and poly(I:C)-HMW-stimulated MKC cells. The results showed that overexpression of either MDA5 or MDA5-ΔRD stimulated the expression of antiviral genes. However, when cells were infected with SCRV or stimulated with poly(I:C)-HMW, only the overexpression of MDA5, not MDA5-ΔRD, significantly increased the expression of antiviral genes (Figure 3F-3I).

      Fig. 4F and 4G: can the authors please indicate in the figure which area of the gel is relevant here? The band that runs halfway down the gel? If so, the effects described in the text are not supported by the data (i.e. the 5'OH-SCRV and 5'pppGG-SCRV appear to compete with Bio-5'ppp-SCRV as well as 5'ppp-SCRV).

      Apologies for any confusion. The relevant areas in the gel pertaining to the experimental findings were denoted with asterisks and elaborated upon in the figure legends (Figure 4G, 4H, and 4M). The findings indicated that 5'ppp-SCRV, in contrast to 5'OH-SCRV and 5'pppGG-SCRV, demonstrated the ability to compete with bio-5'ppp-SCRV.

      My concerns about Fig. 5 remain unaltered. The fact that MDA5 is an ISG explains its increased expression and increased methylation pattern. The authors should at the very least mention in their text that MDA5 is an ISG and that their observations may be partially explained by this fact.

      First, as our m6A change analysis pipeline controls for changes in gene expression, these data should represent true changes in m6A modification rather than changes in the expression of m6A-modified transcripts (10.1038/s41598-020-63355-3). Similar studies demonstrated that m6A modification of RIOK3 and CIRBP mRNAs is altered following Flaviviridae infection (10.1016/j.molcel.2019.11.007). The specific calculation method is as follows: the relative m6A level for each transcript was calculated as the percent of input in each condition normalized to that of the respective positive control spike-in. Fold change of enrichment was calculated with mock samples normalized to 1. Therefore, changes in the expression level of MDA5 can partially explain the increase in m6A modification across all MDA5 mRNA in cells, but they cannot account for changes in m6A modification on each MDA5 transcript. We have added the calculation method to the manuscript and cited the relevant literature (Lines 606-608). In addition, we have elaborated on the fact that MDA5 is an ISG in the experimental results (lines 260-261), and emphasized its compatibility with enhanced m6A modification of MDA5 in the discussion section (lines 405-409).
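
      For illustration only, the normalization described above can be sketched in a few lines of Python. This is a minimal sketch and not the authors' pipeline: the function names, the use of arbitrary abundance units for the IP and input signals, and all numeric values are hypothetical assumptions introduced here.

```python
def percent_of_input(ip_signal, input_signal):
    """Percent of the input pool recovered in the m6A immunoprecipitation (IP).

    Assumption: both signals are abundances in the same arbitrary units.
    """
    if input_signal <= 0:
        raise ValueError("input_signal must be positive")
    return 100.0 * ip_signal / input_signal


def relative_m6a(target_pct, spike_in_pct):
    """Target percent-of-input normalized to the positive control spike-in."""
    if spike_in_pct <= 0:
        raise ValueError("spike_in_pct must be positive")
    return target_pct / spike_in_pct


def fold_enrichment(relative_by_condition, mock="mock"):
    """Rescale every condition so that the mock sample equals 1."""
    base = relative_by_condition[mock]
    return {cond: value / base for cond, value in relative_by_condition.items()}


# Hypothetical (IP, input) signals; not data from the study.
samples = {
    "mock": {"target": (2.0, 100.0), "spike_in": (8.0, 100.0)},
    "SCRV": {"target": (12.0, 200.0), "spike_in": (8.0, 100.0)},
}

relative = {
    cond: relative_m6a(
        percent_of_input(*signals["target"]),
        percent_of_input(*signals["spike_in"]),
    )
    for cond, signals in samples.items()
}
fold = fold_enrichment(relative)
print(fold)
```

      In this toy example the transcript input doubles upon infection, yet the fold enrichment rises only because the IP signal grows faster than the input. If the IP and input signals had both doubled, the percent of input, and therefore the fold enrichment, would be unchanged, which is the sense in which the calculation controls for expression level.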

      Reviewer #3 (Public Review):

      In this manuscript, the authors explored the interaction between the pattern recognition receptor MDA5 and 5'ppp-RNA in the miiuy croaker. They found that MDA5 can serve as a substitute for RIG-I in detecting 5'ppp-RNA of Siniperca chuatsi rhabdovirus (SCRV) when RIG-I is absent in miiuy croaker. Furthermore, they observed MDA5's recognition of 5'ppp-RNA in chickens (Gallus gallus), a species lacking RIG-I. Additionally, the authors documented that MDA5's functionality can be compromised by m6A-mediated methylation and degradation of MDA5 mRNA, orchestrated by the METTL3/14-YTHDF2/3 regulatory network in miiuy croaker during SCRV infection. This impairment compromises the innate antiviral immunity of fish, facilitating SCRV's immune evasion. These findings offer valuable insights into the adaptation and functional diversity of innate antiviral mechanisms in vertebrates.

      We extend our sincere appreciation for your professional comments and insightful suggestions on our manuscript, as they have significantly contributed to enhancing its quality.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The interpretation of Figures 1H and 1I, along with the captions, seems unclear. Particularly, understanding the meaning of the X-axis in Figure 1I is challenging. Additionally, the designation of "H2O = 1" on the Y-axis in Figure 1E lacks clarity. It would be helpful if the author could revise and clarify these figures for better comprehension.

      We appreciate your reminder and have corrected and clarified these figures and figure legends (lines 768-772). We have relabeled the Y-axis of Figure 1I as "Relative mRNA expression" instead of "Relative IFN-1 expression" (Figure 1I). In addition, we have added an explanation of "H2O=1" in the legend of Figure 1E.

      (2) The interpretation of Figure 5 in section 2.5 seems incomplete. The author mentioned that both m6A levels and MDA5 expression levels are increased (lines 256-257), prompting questions about the relationship between m6A and MDA5 expression. If higher m6A levels typically lead to MDA5 mRNA instability and lower MDA5 expression, observing both increasing simultaneously appears contradictory. Considering the dynamic changes shown in Figure 5, it would be more appropriate to propose an alteration in both m6A levels and MDA5 expression levels. Given the fluctuating nature of these changes, definitively labeling them as solely "increased" is challenging. Therefore, offering a nuanced interpretation of the results and clarifying this aspect would bolster the study's conclusions.

      While changes in m6A modification and the expression of m6A-modified transcripts are biologically relevant, identifying bona fide m6A alterations during viral infection will allow us to understand how m6A modification of cellular mRNA is regulated. As our m6A change analysis pipeline controls for changes in gene expression, these data should represent true changes in m6A modification rather than changes in the expression of m6A-modified transcripts (10.1038/s41598-020-63355-3). Similar studies demonstrated that m6A modification of RIOK3 and CIRBP mRNAs is altered following Flaviviridae infection (10.1016/j.molcel.2019.11.007). The specific calculation method is as follows: the relative m6A level for each transcript was calculated as the percent of input in each condition normalized to that of the respective positive control spike-in. Fold change of enrichment was calculated with mock samples normalized to 1. Therefore, the upregulation of MDA5 expression can partially explain the increase in m6A modification across all MDA5 mRNA in cells, but it cannot account for changes in m6A modification on each MDA5 transcript. We have added the calculation method to the manuscript and cited the relevant literature. We hope this explanation is satisfactory.

      In addition, although higher m6A levels often lead to unstable MDA5 mRNA and lower MDA5 expression, SCRV can affect MDA5 expression through multiple pathways. For example, since MDA5 is an interferon-stimulated gene, SCRV infection can cause strong expression of interferon and indirectly induce high-level expression of MDA5. Therefore, the increased expression of MDA5 is not contradictory to the simultaneous increase in MDA5 m6A modification (24 h). To further strengthen our experimental conclusions, we added a dual fluorescence experiment. The results indicate that SCRV infection can inhibit the fluorescence activity of MDA5-exon1 reporter plasmids that contain m6A sites but lack the promoter sequence of the MDA5 gene, and that this inhibitory effect can be counteracted by cycloleucine (CL, an amino acid analogue that can inhibit m6A modification) (Figure 5G). This further indicates that SCRV can reduce the expression of MDA5 through the m6A pathway.

      Finally, in light of the fluctuations in MDA5 expression levels, we have changed the subheadings of Results 2.5 section and provided a more comprehensive and precise elucidation of the experimental outcomes. We are grateful for your valuable feedback.

      (3) In the discussion section, it would indeed be advantageous for the author to explore the novelty of this work more comprehensively, moving beyond merely acknowledging the widespread loss of RIG-I and suggesting MDA5 as a compensatory mechanism. Considering the well-established roles of MDA5 and m6A in host-virus interactions, the findings of this study may seem familiar in light of previous research. To enhance the discussion, it would be valuable for the author to delve into the implications of this evolutionary model. For instance, does the compensation or loss of RIG-I impact a species' susceptibility to specific types of viruses? Exploring such questions would provide insight into the broader significance of this compensation model and its potential effects on host-virus interactions, thus adding depth to the study's contribution.

      We appreciate the expert advice provided by the referee. In response, we have expanded our discussion in the relevant section, addressing the potential influence of RIG-I deficiency and MDA5 compensation on the antiviral immune system in vertebrates (lines 371-376). Furthermore, we underscore the significance of exploring the impact of SCRV infection on MDA5 m6A modification, considering its compatibility with MDA5 as an ISG, in elucidating the host response to viral infection (lines 405-409).

      (4) To improve the manuscript, it would be beneficial if the editors could aid the author in refining the language. Many descriptions in the article are overly redundant, and there should be appropriate differentiation between experimental methods and results.

      We appreciate the reviewer’s comment. We have carefully revised the manuscript and removed redundant descriptions in the experimental results and methods.

      Reviewer #3 (Recommendations For The Authors):

      The authors have addressed all of my concerns.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews

      Reviewer 1 summarized that: In this revised version of the manuscript, the authors have made important modifications in the text, inserted new data analyses, and incorporated additional references, as recommended by the reviewers. These modifications have significantly improved the quality of the manuscript.

      We are grateful for the reviewer's positive recognition of our revisions.

      Reviewer 2 noted that:

      (1) The authors do not show if the PVT mediates dPAG to BLA communication with any functional behavioral assay.

      We appreciate the reviewer’s suggestion to include a functional assay to investigate the role of the PVT in mediating communication between the dPAG and BLA. Our primary objective was to confirm the upstream role of the dPAG in processing and relaying naturalistic predatory threat information to the BLA, thereby broadening our current understanding of the dPAG-BLA relationship based on Pavlovian fear conditioning paradigms.

      Given previous anatomical findings indicating the absence of direct monosynaptic projections from the dPAG to the BLA (Cameron et al. 1995, McNally, Johansen, and Blair 2011, Vianna and Brandao 2003), we employed both anterograde and retrograde tracers, supplemented by c-Fos expression analysis following predatory threats, to explore possible routes through which threat signals may be conveyed from the dPAG to the BLA. Our findings indicated significant activity within the midline thalamic regions, particularly the PVT as a mediator of dPAG-BLA interactions, corroborating the possibility of dPAG→BLA information flow.

      Investigating the PVT's functional role appropriately would require single-unit recordings, correlation analysis of PVT neuronal responses with dPAG and BLA neuronal responses, and pathway-specific causal techniques, with other midline thalamic regions serving as controls. Such a comprehensive investigation would constitute an independent study.

      In response to previous feedback, we have carefully revised our manuscript to moderate the emphasis on the PVT's role. The Abstract, Results, and Discussion now refer more broadly to "midline thalamic regions" and “The midline thalamus” (subheading) rather than specifically to the PVT. In the Introduction, we mention that the PVT "may be part of a network that conveys predatory threat information from the dPAG to the BLA." Our conclusions about the functional interaction between the dPAG and BLA, which broaden the view of Pavlovian fear conditioning, are not contingent on confirming a specific intermediary role for the PVT.

      (2) The authors also do not thoroughly characterize the activity of BLA cells during the predatory assay.

      Our previous studies have extensively detailed BLA cell firing characteristics, including their responsiveness to food and/or a robot predator during the predatory assay (Kim et al. 2018, Kong et al. 2021), and compared these findings to other predator studies (Amir et al. 2019, Amir et al. 2015). In the current study, out of 85 BLA cells, 3 were food-specific and 4 responded to both the pellet and the robot, with none of these 7 cells responding to dPAG stimulation.

      Given our earlier findings of the immediate responses of BLA neurons to robot activation, we specifically examined whether robot-responsive BLA neurons receive signals from the dPAG. For this analysis, we excluded all food-related cells (pellet cells and BOTH cells) and focused on the time window immediately after robot activation (within 500 ms after robot onset). This approach enabled us to avoid potential confounds from residual effects of robot-induced immediate BLA responses during the animals’ flight and nest entry behaviors.
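
      For illustration only, the selection and windowing steps described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' analysis code: the cell labels, the dictionary layout, the function names, and the spike times are all hypothetical, and spike times are assumed to be sorted and expressed in seconds.

```python
from bisect import bisect_left


def spikes_in_window(spike_times_s, onset_s, window_s=0.5):
    """Count spikes in [onset, onset + window); spike_times_s must be sorted."""
    first = bisect_left(spike_times_s, onset_s)
    last = bisect_left(spike_times_s, onset_s + window_s)
    return last - first


def robot_window_rates(cells, robot_onsets_s, window_s=0.5):
    """Mean firing rate (Hz) after robot onset, skipping food-related cells."""
    if not robot_onsets_s:
        raise ValueError("at least one robot onset is required")
    excluded = {"pellet", "both"}  # hypothetical labels for food-related cells
    rates = {}
    for name, cell in cells.items():
        if cell["type"] in excluded:
            continue
        counts = [
            spikes_in_window(cell["spikes"], onset, window_s)
            for onset in robot_onsets_s
        ]
        rates[name] = sum(counts) / (len(counts) * window_s)
    return rates


# Hypothetical example: two cells, two robot activations.
cells = {
    "cell_A": {"type": "other", "spikes": [0.9, 1.1, 1.2, 1.6, 5.05, 5.3]},
    "cell_B": {"type": "pellet", "spikes": [1.05, 1.1, 5.2]},
}
rates = robot_window_rates(cells, robot_onsets_s=[1.0, 5.0])
print(rates)
```

      Restricting the count to the first 500 ms keeps spikes that occur later, during flight and nest entry, out of the estimate, which is the confound the paragraph above describes.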

      Furthermore, as previously described, the robot is programmed to move forward a fixed distance and then return, repeatedly triggering foraging behavior. This setup facilitates the analysis of neural changes during food approach and predator avoidance conflicts. However, animals quickly adapt to the robot, reducing freezing and stretch-attend behaviors, making time-stamped analysis of these behaviors unfeasible.

      We would like to highlight that the present study explicitly focused on demonstrating whether BLA neurons that responded to intrinsic dPAG optogenetic stimulation also responded to extrinsic predatory robot activation, and compared their firing characteristics to those BLA neurons that did not respond to dPAG stimulation (Figure 3). This targeted analysis provides insights into the responsiveness of BLA neurons to both intrinsic and extrinsic stimuli, furthering our understanding of the dPAG-BLA interaction in the context of predatory threats.

      Reviewer 3 also raised no concerns and stated that: The series of experiments provide a compelling case for supporting their conclusions. The study brings important concepts revealing dynamics of fear-related circuits particularly attractive to a broad audience, from basic scientists interested in neural circuits to psychiatrists.

      We sincerely thank the reviewer for the positive feedback on our revisions.

      Recommendations for the Authors

      Reviewer 1: There are a few minor concerns that the authors may want to fix:

      (1) Point 5) The sentence: "The complexity of targeting the dPAG, which includes its dorsomedial, dorsolateral, lateral, and ventrolateral subdivisions" is hard to follow because the ventrolateral subdivision is not part of the dPAG. The authors may want to say specific subregions of the PAG instead. It is also unclear why transgenic animals would be needed for these projection-defined manipulations. The combination of retrograde Cre-recombinase virus with inhibitory opsin or chemogenetic approach may be sufficient.

      We appreciate the reviewer’s insightful feedback regarding our description of the dPAG and the use of transgenic mice in future studies. As suggested, we have corrected the manuscript to exclude the 'ventrolateral' subdivision from the dPAG description, now accurately aligning with pioneering studies (Bandler, Carrive, and Zhang 1991, Bandler and Keay 1996, Carrive 1993) that designated dPAG as including the dorsomedial (dmPAG), dorsolateral (dlPAG) and lateral (lPAG) regions, as cited in our revised manuscript.

      We acknowledge the reviewer’s helpful suggestion regarding the use of retrograde Cre-recombinase virus with inhibitory opsins or chemogenetic approaches as viable alternatives. These methods have been incorporated into our discussion (pages 14-15): “While our findings demonstrate that opto-stimulation of the dPAG is sufficient to trigger both fleeing behavior and increased BLA activity, we have not established that the dPAG-PVT circuit is necessary for the BLA’s response to predatory threats. To establish causality and interregional relationships, future studies should employ methods such as pathway-specific optogenetic inhibition (using retrograde Cre-recombinase virus with inhibitory opsins; Lavoie and Liu 2020, Li et al. 2016, Senn et al. 2014) or chemogenetics (Boender et al. 2014, Roth 2016) in conjunction with single unit recordings to fully characterize the dPAG-PVT-BLA circuitry’s (as opposed to other midline thalamic regions for controls) role in processing predatory threat-induced escape behavior. If inactivating the dPAG-PVT circuits reduces the BLA's response to threats, this would highlight the central role of the dPAG-PVT pathway in this defense mechanism. Conversely, if the BLA's response remains unchanged despite dPAG-PVT inactivation, it could suggest the existence of multiple pathways for antipredatory defenses.”

      This revision addresses the critique by clarifying the anatomical description of the dPAG and emphasizing the feasibility of using targeted viral approaches without the necessity for transgenic animals.

      (2) Point 6e) The authors mentioned that "pellet retrieval" was indicated by the animal entering a designated zone 19 cm from the pellet, driven by hunger. Entering the area within 19 cm of the pellet should be labeled as food approach rather than food retrieval, because on many occasions the animals may still be some seconds away from grabbing the pellet.

      We agree and have incorporated the change (pg. 22).

      (3) Point 11) We would strongly recommend that the authors replace the term "looming" with "approaching" to avoid confusion with several previous studies looking at defensive behaviors in response to looming induced by the shadow of an object moving closer to the eyes.

      Done.

      (4) Point 17) The authors mentioned that "A total of three rats were utilized for the robot testing experiments depicted in Fig. 2 G-J." However, the figure indicates a total of 9 ChR2 and 4 controls.

      We apologize for the confusion in our previous author responses. To examine the optical stimulation effects on behavior in Fig. 2G-J, we used a total of 9 ChR2 and 4 EYFP rats. The experimental sequence is detailed in the previously revised manuscript (pg. 20): “For optical stimulation and behavioral experiments, the procedure included 3 baseline trials with the pellet placed 75 cm away, followed by 3 dPAG stimulation trials with the pellet locations sequentially set at 75 cm, 50 cm, and 25 cm. During each approach to the pellet, rats received 473-nm light stimulation (1-2 s, 20-Hz, 10-ms width, 1-3 mW) through a laser (Opto Engine LLC) and a pulse generator (Master-8; A.M.P.I.). Additional testing to examine the functional response curves was conducted over multiple days, with incremental adjustments to the stimulation parameters (intensity, frequency, duration) after confirming that normal baseline foraging behavior was maintained. For these tests, one parameter was adjusted incrementally while the others were held constant (intensity curve at 20 Hz, 2 s; frequency curve at 3 mW, 2 s; duration curve at 20 Hz, 3 mW). If the rat failed to procure the pellet within 3 min, the gate was closed, and the trial was concluded.”
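
      For illustration only, the stimulation parameters quoted above (20 Hz, 10-ms pulse width, up to 2 s) can be turned into an explicit pulse schedule. This is a minimal sketch; the function names are hypothetical and the code does not represent the pulse generator's actual programming interface.

```python
def pulse_train(freq_hz, width_ms, duration_s):
    """Return (on_ms, off_ms) times for each light pulse in one stimulation."""
    if freq_hz <= 0 or width_ms <= 0 or duration_s <= 0:
        raise ValueError("all parameters must be positive")
    period_ms = 1000.0 / freq_hz
    if width_ms > period_ms:
        raise ValueError("pulse width exceeds the pulse period")
    n_pulses = int(round(freq_hz * duration_s))
    return [(i * period_ms, i * period_ms + width_ms) for i in range(n_pulses)]


def duty_cycle(freq_hz, width_ms):
    """Fraction of the stimulation period during which the laser is on."""
    return freq_hz * width_ms / 1000.0


# Parameters taken from the quoted methods text: 20 Hz, 10-ms width, 2 s.
train = pulse_train(freq_hz=20, width_ms=10, duration_s=2)
print(len(train), train[:2], duty_cycle(20, 10))
```

      With these values a 2-s stimulation contains 40 pulses spaced 50 ms apart, and the light is on for 20% of the stimulation period.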

      This clarification ensures that the actual number of animals used is accurately reflected and aligns with the figure data, addressing the reviewer's concern.

      Reviewer 2: The authors made important changes in the text to address study limitations, including citations requested by the Reviewers and additional discussions about how this work fits into the existing literature. These changes have strengthened the manuscript.

      (1) However, the authors did not perform new experiments to address any of the issues raised in the previous round of reviews. For example, they did not make optogenetic manipulations of the pathway including the PVT, and did not add any loss of function experiments. The justification that these experiments are better suited for future reports using mice is not convincing, because hundreds of papers have performed these types of circuit dissection assays in rats.

      We appreciate the reviewer's comments regarding the experimental scope of our study. Our study’s primary objective was to explore the dPAG’s upstream functional role in processing and conveying naturalistic predatory threat information to the BLA, extending our current understanding of the dPAG-BLA relationship based on Pavlovian fear conditioning paradigms. We believe that our findings effectively address this goal.

      Our use of anterograde and retrograde tracers, supplemented by c-Fos expression analysis in response to predatory threats, was primarily conducted to verify the possibility of the dPAG→BLA information flow during predator encounters. This involved exploring potential routes through which threat signals might be conveyed from the dPAG to the BLA, given the lack of direct monosynaptic projections from the dPAG to BLA neurons (Cameron et al. 1995, McNally, Johansen, and Blair 2011, Vianna and Brandao 2003). This methodology helped us identify a potential structure, PVT, for more in-depth future studies. A thorough examination of the PVT's role would require single-unit recordings and causal techniques, incorporating other midline thalamic regions as controls, representing a significant and separate study on its own.

      In response to prior feedback, we have carefully revised our manuscript to generally address the role of "midline thalamic regions" rather than focusing specifically on the PVT. We wish to emphasize that our findings, which illustrate unique functional interactions between the dPAG and BLA in response to predatory imminence, remain compelling and informative even without definitive evidence of the PVT’s involvement.

      Reviewer 3: In the revised version of the manuscript, the authors addressed adequately all the concerns raised by the reviewers. 

      We thank the reviewer for the thoughtful feedback on the earlier version of our manuscript and for reexamining the revisions we have made.

      References

      Amir, A., P. Kyriazi, S. C. Lee, D. B. Headley, and D. Pare. 2019. "Basolateral amygdala neurons are activated during threat expectation." J Neurophysiol 121 (5):1761-1777.

      Amir, A., S. C. Lee, D. B. Headley, M. M. Herzallah, and D. Pare. 2015. "Amygdala Signaling during Foraging in a Hazardous Environment." J Neurosci 35 (38):12994-3005.

      Bandler, R., P. Carrive, and S. P. Zhang. 1991. "Integration of somatic and autonomic reactions within the midbrain periaqueductal grey: viscerotopic, somatotopic and functional organization." Prog Brain Res 87:269-305.

      Bandler, R., and K. A. Keay. 1996. "Columnar organization in the midbrain periaqueductal gray and the integration of emotional expression." Prog Brain Res 107:285-300.

      Boender, A. J., J. W. de Jong, L. Boekhoudt, M. C. Luijendijk, G. van der Plasse, and R. A. Adan. 2014. "Combined use of the canine adenovirus-2 and DREADD-technology to activate specific neural pathways in vivo." PLoS One 9 (4):e95392.

      Cameron, A. A., I. A. Khan, K. N. Westlund, and W. D. Willis. 1995. "The efferent projections of the periaqueductal gray in the rat: a Phaseolus vulgaris-leucoagglutinin study. II. Descending projections." J Comp Neurol 351 (4):585-601.

      Carrive, P. 1993. "The periaqueductal gray and defensive behavior: functional representation and neuronal organization." Behav Brain Res 58 (1-2):27-47.

      Kim, E. J., M. S. Kong, S. G. Park, S. J. Y. Mizumori, J. Cho, and J. J. Kim. 2018. "Dynamic coding of predatory information between the prelimbic cortex and lateral amygdala in foraging rats." Sci Adv 4 (4):eaar7328.

      Kong, M. S., E. J. Kim, S. Park, L. S. Zweifel, Y. Huh, J. Cho, and J. J. Kim. 2021. "'Fearful-place' coding in the amygdala-hippocampal network." Elife 10.

      Lavoie, A., and B. H. Liu. 2020. "Canine Adenovirus 2: A Natural Choice for Brain Circuit Dissection." Front Mol Neurosci 13:9.

      Li, Y., L. Hickey, R. Perrins, E. Werlen, A. A. Patel, S. Hirschberg, M. W. Jones, S. Salinas, E. J. Kremer, and A. E. Pickering. 2016. "Retrograde optogenetic characterization of the pontospinal module of the locus coeruleus with a canine adenoviral vector." Brain Res 1641 (Pt B):274-90.

      McNally, G. P., J. P. Johansen, and H. T. Blair. 2011. "Placing prediction into the fear circuit." Trends Neurosci 34 (6):283-92.

      Roth, B. L. 2016. "DREADDs for Neuroscientists." Neuron 89 (4):683-94.

      Senn, V., S. B. Wolff, C. Herry, F. Grenier, I. Ehrlich, J. Grundemann, J. P. Fadok, C. Muller, J. J. Letzkus, and A. Luthi. 2014. "Long-range connectivity defines behavioral specificity of amygdala neurons." Neuron 81 (2):428-37.

      Vianna, D. M., and M. L. Brandao. 2003. "Anatomical connections of the periaqueductal gray: specific neural substrates for different kinds of fear." Braz J Med Biol Res 36 (5):557-66.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      The author presents the discovery and characterization of CAPSL as a potential gene linked to Familial Exudative Vitreoretinopathy (FEVR), identifying one nonsense and one missense mutation within CAPSL in two distinct patient families afflicted by FEVR. Cell transfection assays suggest that the missense mutation adversely affects protein levels when overexpressed in cell cultures. Furthermore, conditionally knocking out CAPSL in vascular endothelial cells leads to compromised vascular development. The suppression of CAPSL in human retinal microvascular endothelial cells results in hindered tube formation, a decrease in cell proliferation, and disrupted cell polarity. Additionally, transcriptomic and proteomic profiling of these cells indicates alterations in the MYC pathway. 

      Strengths: 

      The study is nicely designed with a combination of in vivo and in vitro approaches, and the experimental results are good quality. 

      We thank the reviewer for the conclusion and positive comments.

      Weaknesses: 

      My reservations lie with the main assertion that CAPSL is associated with FEVR, as the genetic evidence from human studies appears relatively weak. Further careful examination of human genetics evidence in both patient cohorts and the general population will help to clarify. In light of human genetics, more caution needs to be exercised when interpreting results from mice and cell models and how is it related to the human patient phenotype. 

      We thank the reviewer for the careful reading and constructive suggestions. We added several experiments and analyses to address the reviewer's concerns, as follows:

      (1) The pLI score for LOF alleles of CAPSL is based on the general population, in which Europeans account for ~77% and East Asians make up less than 3%. Since the FEVR families in this article all come from China, the pLI score may not be accurate for them. We will continue to collect FEVR pedigrees.

      (2) We evaluated the phenotype of Capsl heterozygous mice at P5, and the results showed no overt difference in vascular progression, vessel density, or branchpoints compared with littermate wild-type controls (Fig.S4). The lack of a pronounced phenotype in heterozygous mice may be due to a difference in sensitivity between humans and mice. A similar example is LRP5, mutations in which are associated with FEVR. Heterozygous mutations in LRP5 have been reported in FEVR patients in multiple populations (PMID: 16929062, 33302760, 27486893, 35918671, 36411543). However, heterozygous Lrp5 knockout mice exhibited no visible angiogenic phenotype (PMID: 18263894). A corresponding description was added to the manuscript at page 6.

      (3) We further assessed the angiogenic phenotype at P21, when angiogenesis is almost complete, and the results revealed no difference between Ctrl and CapsliECKO/iECKO mice (Fig.S5). A corresponding description was added to the manuscript at page 7.

      (4) We evaluated the expression of MYC downstream genes in vivo using lung tissue from P35 Ctrl and CapsliECKO/iECKO mice (Fig.S8). Consistent with the in vitro results from HRECs, CapsliECKO/iECKO mice showed downregulated expression of MYC targets. A corresponding description was added to the manuscript at page 11.

      Reviewer #2 (Public Review): 

      Summary: 

      This work identifies two variants in CAPSL in two-generation familial exudative vitreoretinopathy (FEVR) pedigrees, and using a knockout mouse model, they link CAPSL to retinal vascular development and endothelial proliferation. Together, these findings suggest that the identified variants may be causative and that CAPSL is a new FEVR-associated gene. 

      Strengths: 

      The authors' data provides compelling evidence that loss of the poorly understood protein CAPSL can lead to reduced endothelial proliferation in mouse retina and suppression of MYC signaling in vitro, consistent with the disease seen in FEVR patients. The study is important, providing new potential targets and mechanisms for this poorly understood disease. The paper is clearly written, and the data generally support the author's hypotheses. 

      We thank the reviewer for the conclusion and positive comments.

      Weaknesses: 

      (1) Both pedigrees described appear to suggest that heterozygosity is sufficient to cause disease, but authors have not explored the phenotype of Capsl heterozygous mice. Do these animals have reduced angiogenesis similar to KOs? Furthermore, while the p.R30X variant protein does not appear to be expressed in vitro, a substantial amount of p.L83F was detectable by western blot and appeared to be at the normal molecular weight. Given that the full knockout mouse phenotype is comparatively mild, it is unclear whether this modest reduction in protein expression would be sufficient to cause FEVR - especially as the affected individuals still have one healthy copy of the gene. Additional studies are needed to determine if these variants alter protein trafficking or localization in addition to expression, and if they can act in a dominant negative fashion. 

      We thank the reviewer for the suggestion. We evaluated the phenotype of Capsl heterozygous mice at P5 (Fig.S4), and the results showed no overt difference in angiogenesis compared with littermate control mice.

      We transfected the CAPSL wild-type plasmid, the p.R30X mutant plasmid, and the p.L83F mutant plasmid into 293T cells to assess whether the intracellular localization of the mutant CAPSL proteins changed (Fig.S1). The results showed that the point mutation did not affect the localization of the mutated protein, and a corresponding description was added to the manuscript at page 5.

      (2) The manuscript nicely shows that loss of CAPSL leads to suppressed MYC signaling in vitro. However, given that endothelial MYC is regulated by numerous pathways and proteins, including FOXO1, VEGFR2, ERK, and Notch, and reduced MYC signaling is generally associated with reduced endothelial proliferation, this finding provides little insight into the mechanism of CAPSL in regulating endothelial proliferation. It would be helpful to explore the status of these other pathways in knockdown cells but as the authors provide only GSEA results and not the underlying data behind their RNA seq results, it is difficult for the reader to understand the full phenotype. Volcano plots or similar representations of the underlying expression data in Figures 6 and 7 as well as supplemental datasets showing the differentially regulated genes should be included. In addition, while the paper beautifully characterizes the delayed retinal angiogenesis phenotype in CAPSL knockout mice, the authors do not return to that model to confirm their in vitro findings. 

      We thank the reviewer for the suggestion. Although endothelial MYC can be regulated by the FOXO1, VEGFR2, ERK, and Notch signaling pathways, these pathways are not enriched in the RNA seq data of CAPSL-depleted HRECs. This suggests that the downregulated MYC targets may not be influenced by the signaling pathways mentioned above. RNA-seq raw data have been uploaded to the Genome Sequence Archive (https://ngdc.cncb.ac.cn/gsa/browse/HRA010305), and proteomic profiling raw data have been uploaded to the PRIDE Archive (https://www.ebi.ac.uk/pride/archive) under the assigned accession number PXD051696. A corresponding description was added to the manuscript at page 20-21. The differentially regulated genes underlying Figures 6 and 7 are listed in Datasets S1 and S2.

      (3) In Figure S2D, the result of this vascular leak experiment is unconvincing as no dye can be seen in the vessels. What are the kinetics for biocytin tracers to enter the bloodstream after IP injection? Why did the authors choose the IP instead of the IV route for this experiment? Differences in the uptake of the eye after IP injection could confound the results, especially in the context of a model with vascular dysfunction as here. 

      We thank the reviewer for the suggestion. In Figure S2D (now Fig.S6D), the image we used to show vascular leakage was not representative, and we have replaced it with more representative images. We are sorry that we cannot state the kinetics with which biocytin tracers enter the bloodstream after IP injection. Because the experiment was carried out at P5, IV injection was not feasible in these neonatal mice. We followed the methods described in a previous study involving mice of the same age (PMID:35361685).

      (4) In Figure 5, it is unclear how filopodia and tip cells were identified and selected for quantification. The panels do not include nuclear or tip cell-specific markers that would allow quantification of individual tip cells, and in Figure 5C it appears that some filopodia are not highlighted in the mutant panel. 

      We thank the reviewer for the comments. In Figure 5, we used HRECs to examine cell proliferation, migration, and polarity in vitro, and therefore there is no distinction between tip cells and stalk cells. The quantification of filopodia/lamellipodia was performed as in previous studies (PMID: 30783090, PMID: 28805663). Briefly, a wound scratch was made on confluent layers of transfected HRECs, and 9 hours after initiating cell migration by the scratch, cells were fixed and stained with phalloidin. Cells at the edge of the wound were considered leader cells and were quantified for the number of filopodia/lamellipodia.

      Reviewer #3 (Public Review): 

      Summary: 

      This manuscript by Liu et al. presents a case that CAPSL mutations are a cause of familial exudative vitreoretinopathy (FEVR). Attention was initially focused on the CAPSL gene from whole exome sequence analysis of two small families. The follow-up analyses included studies in which CAPSL was manipulated in endothelial cells of mice and multiple iterations of molecular and cellular analyses. Together, the data show that CAPSL influences endothelial cell proliferation and migration. Molecularly, transcriptomic and proteomic analyses suggest that CAPSL influences many genes/proteins that are also downstream targets of MYC and may be important to the mechanisms. 

      Strengths: 

      This multi-pronged approach found a previously unknown function for CAPSLs in endothelial cells and pointed at MYC pathways as high-quality candidates in the mechanism. 

      Weaknesses: 

      Two issues shape the overall impact for me. First, the unreported population frequency of the variants in the manuscript makes it unclear if CAPSL should be considered an interesting candidate possibly contributing to FEVR, or possibly a cause. Second, it is unclear if the identified variants act dominantly, as indicated in the pedigrees. The studies in mice utilized homozygotes for an endothelial cell-specific knockout, leaving uncertainty about what phenotypes might be observed if mice heterozygous for a ubiquitous knockout had instead been studied. 

      In my opinion, the following scientific issues are specific weaknesses that should be addressed: 

      (1) Please state in the manuscript the number of FEVR families that were studied by WES. Please also describe if the families had been selected for the absence of known mutations, and/or what percentage lack known pathogenic variants. 

      We thank the reviewer for thoughtful comments. 120 FEVR families were studied by WES and we added corresponding description in the manuscript at page 4.

      (2) A better clinical description of family 3104 would enhance the manuscript, especially the father. It is unclear what "manifested with FEVR symptoms, according to the medical records" means. Was the father diagnosed with FEVR? If the father has some iteration of a mild case, please describe it in more detail. If the lack of clinical images in the figure is indicative of a lack of medical documentation, please note this in the manuscript. 

      We thank the reviewer for thoughtful comments. The father of family 3104 has also been identified as a carrier of this heterozygous variant, manifested with FEVR symptoms, according to the medical records. Nevertheless, clinical examination images are presently unavailable. We added corresponding description in the manuscript at page 5.

      (3) The TGA stop codon can in some instances also influence splicing (PMID: 38012313). Please add a bioinformatic assessment of splicing prediction to the assays and report its output in the manuscript. 

      We thank the reviewer for the thoughtful comments. We predicted the effect of the c.88C>T variant of CAPSL on splicing using MaxEntScan (http://hollywood.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq.html) and SpliceTool (https://rddc.tsinghua-gd.org/ai) (Fig.S2). Both tools were used to predict whether the TGA stop codon of the c.88C>T variant leads to the formation of a cryptic donor splice site.
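
      The relationship between the nucleotide change c.88C>T, the protein change p.R30X, and the TGA stop codon discussed above can be checked with simple codon arithmetic. The sketch below is illustrative only and is not part of the authors' analysis: the function names are hypothetical, and the reference codon CGA is an assumption inferred from the stated change (an arginine codon that becomes TGA after a C>T substitution at its first base), not a sequence quoted from the manuscript.

```python
# Illustrative sketch: codon arithmetic for a coding-sequence substitution.
# ASSUMPTION: the reference codon "CGA" is inferred from the reported change
# (Arg -> TGA stop via C>T); it is not taken from the manuscript.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def locate_cds_position(cds_pos):
    """Map a 1-based coding-sequence position to
    (1-based codon number, 0-based offset within that codon)."""
    if cds_pos < 1:
        raise ValueError("coding-sequence positions are 1-based")
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3


def apply_substitution(codon, offset, ref, alt):
    """Return the codon after replacing `ref` with `alt` at `offset`."""
    if codon[offset] != ref:
        raise ValueError("reference base does not match the codon")
    return codon[:offset] + alt + codon[offset + 1:]


# c.88 falls on the first base of codon 30.
codon_number, offset = locate_cds_position(88)

# Assumed arginine codon CGA; C>T at the first base yields TGA.
mutant_codon = apply_substitution("CGA", offset, "C", "T")
is_stop = mutant_codon in STOP_CODONS
```

      Under these assumptions, position 88 maps to the first base of codon 30, which is consistent with the p.R30X designation. Whether the new TGA affects splicing is a separate question that the prediction tools named above address.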

      (4) More details regarding utilizing a "loxp-flanked allele of CAPSL" are needed. Is this an existing allele, if so, what is the allele and citation? If new (as suggested by S1), the newly generated CAPSL mutant mouse strain needs to be entered into the MGI database and assigned an official allele name - which should then be utilized in the manuscript and who generated the strain (presumably a core or company?) must be described. 

      We added a detailed description of the Capsl floxed allele to the Method section on page 14-15: “Capslloxp/+ model was generated using the CRISPR/Cas9 nickase technique by Viewsolid Biotechology (Beijing, China) in C57BL/6J background and named Capslem1zxj. The genomic RNA (gRNA) sequence was as follows: Capsl-L gRNA: 5’-CTATCCCAA TTGTGCTCCTGG-3’; Capsl-R gRNA: 5’-TGGGACTCATGGTTCTAGAGG-3’. ”

      (5) The statement in the methods "All mice used in the study were on a C57BL/6J genetic background," should be better defined. Was the new allele generated on a pure C57BL/6J genetic background, or bred to be some level of congenic? If congenic, to what generation? If unknown, please either test and report the homogeneity of the background, or consult with nomenclature experts (such as available through MGI) to adopt the appropriate F?+NX type designation. This also pertains to the Pdgfb-iCreER mice, which reference 43 describes as having been generated in an F2 population of C57BL/6 X CBA and did not designate the sub-strain of C57BL/6 mice. It is important because one of the explanations for missing heritability in FEVR may be a high level of dependence on genetic background. From the information in the current description, it is also not inherently obvious that the mice studied did not harbor confounding mutations such as rd1 or rd8. 

      We thank the reviewer for the suggestion. We added the following description to the “Mouse model and genotyping” method section on page 14. “Capslloxp/+ model was generated using the CRISPR/Cas9 nickase technique by Viewsolid Biotechology (Beijing, China) in C57BL/6J background and named Capslem1zxj. The genomic RNA (gRNA) sequence was as follows: Capsl-L gRNA: 5’-CTATCCCAA TTGTGCTCCTGG-3’; Capsl-R gRNA: 5’-TGGGACTCATGGTTCTAGAGG-3’. Pdgfb-iCreER[43] transgenic mice on a mixed background of C57BL/6 and CBA was obtained from Dr. Marcus Fruttiger and backcrossed to background for 6 generations. Capslloxp/+ mice were bred with Pdgfb-iCreER[43] transgenic mice to generate Capslloxp/loxp, Pdgfb-iCreER mice.” Sanger sequencing was performed on experimental mice to determine whether they harbor confounding mutations in Pde6b (rd1) or Crb1 (rd8). The results showed that the mice did not harbor these mutations (Fig.S9), and a corresponding description was added to the manuscript at page 15.

      (6) In my opinion, more experimental detail is needed regarding Figures 2 and 3. How many fields, of how many retinas and mice were analyzed in Figure 2? How many mice were assessed in Figure 3? 

      We thank the reviewer for the thoughtful comments. This information is already presented in the manuscript; please refer to the “Methods-Quantification of retinal parameters” section for experimental details.

      (7) I suggest adding into the methods whether P-values were corrected for multiple tests. 

      We thank the reviewer for the suggestion. Statistical analysis was performed using an unpaired Student’s t-test for comparisons between two groups, or one-way ANOVA followed by Dunnett’s multiple comparison test for comparisons among multiple groups. This description was added to the “Methods-Image acquisition and statistical analysis” section to make it clear.
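
      For readers unfamiliar with the two-group test named above, the following is a minimal sketch of the unpaired Student's t statistic with pooled variance, written with the Python standard library only. It is not the authors' analysis code. The function name is hypothetical and the input values are placeholder numbers chosen for easy arithmetic, not measurements from the study. The sketch returns only the t statistic and degrees of freedom; obtaining a p-value, or running one-way ANOVA with Dunnett's test for more than two groups, would require a statistics package.

```python
# Minimal sketch of an unpaired (two-sample) Student's t statistic with
# pooled variance. Placeholder data only; NOT values from the study.
from math import sqrt
from statistics import mean, variance


def unpaired_student_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for two independent groups,
    assuming equal variances (classic Student's t-test)."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical placeholder groups (e.g. "control" vs "knockout").
placeholder_control = [1.0, 2.0, 3.0, 4.0, 5.0]
placeholder_knockout = [2.0, 3.0, 4.0, 5.0, 6.0]

t_stat, dof = unpaired_student_t(placeholder_control, placeholder_knockout)
```

      With these placeholder inputs the group means differ by 1 and the pooled standard error is about 1, so the statistic is approximately -1 with 8 degrees of freedom.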

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors): 

      In summary, addressing the reviewers' concerns as outlined below could bolster the evidence from "solid" to "convincing" and further strengthen the study's impact. 

      (1) Analysis of the phenotype in CAPSL heterozygous mice, as highlighted by all 3 reviews. 

      We thank the editor for thoughtful comments. The phenotype analysis of Capsl heterozygous mice was added to Fig.S4, with the corresponding description provided at page 6.

      (2) Analysis of Capsl KO mice to determine if the pathways identified in vitro are modified (as suggested by reviewers 1 & 2). 

      We thank the editor for the suggestion. In Fig.S7, RT-qPCR was performed on lung tissues from Capsl Ctrl and KO mice to validate the expression of MYC targets in vivo. The results indicated that the downstream targets of MYC signaling were also downregulated in vivo, consistent with the in vitro findings.

      (3) Additional description of the genetic pedigrees and variants to address the points raised by reviewer #3. 

      We thank the editor for the suggestion. The father of family 3104 has also been identified as a carrier of this heterozygous variant and, according to the medical records, manifested FEVR symptoms. However, clinical examination data are presently unavailable. We added a corresponding description to the manuscript at page 5.

      (4) Validation of the identified protein variants, especially L83F, which appears to be expressed at a near-normal level. Are these proteins mislocalized, do the variants interfere with sites of known or predicted protein-protein interactions, could they act in a dominant-negative fashion by aggregation with co-expressed WT protein, etc.? Given the comparatively weak genetic data, additional validation is required to establish the plausibility of CAPSL as a FEVR gene. 

      We thank the editor for the suggestion. As a substantial amount of p.L83F was detectable at the normal molecular weight, we further investigated whether this variant affects protein localization. Immunocytochemistry results (Fig.S1) indicated that this variant does not affect the subcellular localization of the protein.

      (5) Improved description of experimental details and statistical analyses as outlined by reviewer #3. 

      We thank the editor for the suggestion. More detailed information about the Capsl mice was added to the manuscript at page 14-15. The experimental details regarding Figure 2 and Figure 3 are already presented in the “Methods-Quantification of retina parameters” section of the manuscript at page 19-20. Statistical analysis was performed using an unpaired Student’s t-test for comparisons between two groups, or one-way ANOVA followed by Dunnett’s multiple comparison test for comparisons among multiple groups. This description was added to the “Methods-Image acquisition and statistical analysis” section at page 21 to make it clear.

      Reviewer #1 (Recommendations For The Authors): 

      My reservations lie with the main assertion that CAPSL is associated with FEVR, as the genetic evidence from human studies appears relatively weak. My concerns are as follows: 

      (1) The molecular characterization of the identified mutations suggests a loss of function (LOF). Notably, in one family, both the father and son exhibit the FEVR phenotype and share the LOF mutation, suggesting a dominant mode of inheritance. However, the prevalence of the LOF allele of CAPSL in the general population is high, and its pLI score is 0, according to the GNOMAD database. This raises doubts about the LOF variant of CAPSL being causative for FEVR. 

      We thank the reviewer for the recommendation. The pLI score for LOF alleles of CAPSL is based on the general population, in which Europeans account for ~77% and East Asians make up less than 3%. Since the FEVR families in this article all come from China, the pLI score may not be accurate for them. We will continue to collect FEVR pedigrees and screen for CAPSL mutations.

      (2) In the conditional knockout study, a delay in vascular development is observed in the retina up to P14. What does the phenotype look like in adult mice, and does it replicate the human FEVR phenotype? 

      We thank the reviewer for the recommendation. We further assessed the phenotype at P21, when angiogenesis is almost complete, and the results showed no difference between Ctrl and CapsliECKO/iECKO mice (Fig.S5). A corresponding description was added to the manuscript at page 7.

      (3) The conditional knockout mice lack both alleles of CAPSL. The phenotype resulting from the knockout of a single allele needs investigation to align with observed human phenotypes and genetic data. 

      We thank the reviewer for the recommendation. Capsl heterozygous mice at P5 showed no overt difference in vascular progression, vessel density, or branchpoints compared with littermate wild-type controls (Fig.S4). The lack of a pronounced phenotype in heterozygous mice may be due to a difference in sensitivity between humans and mice. A similar example is LRP5, mutations in which are associated with FEVR. Heterozygous mutations in LRP5 have been reported in FEVR patients in multiple populations. However, heterozygous Lrp5 mice exhibited no visible angiogenic phenotype (PMID: 18263894).

      (4) The MYC pathway has been identified as influenced by CAPSL. Is MYC downregulation observed in the mouse model in vivo? 

      We thank the reviewer for the recommendation. MYC expression was assessed at both the mRNA and protein levels (Figure S8), and a corresponding description was added to the manuscript at page 11.

      Reviewer #2 (Recommendations For The Authors): 

      Minor comments: 

      (1) While the authors note that little is known about CAPSL protein function, more introductory detail about the protein (structure, domains, intracellular localization, etc.) and additional discussion of potential mechanisms would aid the reader in interpreting the findings and model.

      We thank the reviewer for the recommendation. The CAPSL protein is distributed in both the nucleus and the cytoplasm (https://www.proteinatlas.org/). Immunocytochemistry analysis confirmed that CAPSL protein is present in both the nucleus and the cytoplasm (Fig.S1). A corresponding description was added to the manuscript at page 5.

      (2) Pg 7 states that Capsl knockout mainly leads to "...defects in retinal vascular ECs rather than other vascular cells.". Consider rephrasing to describe "other vasculature-associated cells", as no vascular cells outside the retina were examined in the manuscript. 

      We thank the reviewer for recommendation. We rephrased the "...defects in retinal vascular ECs rather than other vascular cells." into "...defects in retinal vascular ECs rather than other vasculature-associated cells" at page 8.

      (3) The manuscript is well written but contains numerous typos. E.g. "" (Pg 14), "MCY signaling axis" (figure 6 legend), "shCAPAL" (figure 5 K). Please correct these, and search carefully for others. 

      We are sorry for these careless mistakes. We have checked the manuscript and corrected them.

      Reviewer #3 (Recommendations For The Authors): 

      The following are somewhat grammatical, but significant issues, that I feel should be addressed before making the pre-print final: 

      (1) Perhaps the largest issue with the manuscript to me is whether CAPSL is an interesting candidate (as stated repeatedly) or causative of FEVR. Within the scope of what is feasible, this is a challenging problem. Since the publication of the pre-print, it would be great if another group independently reported the detection of mutations specifically in FEVR patients. That lacking, meaningful additions to the manuscript that I'd recommend are the inclusion of a paragraph on caveats of the study and reporting the allele frequencies based on public databases. As the authors know the data better than anyone and will have invested thought into the implications, they are the ones best positioned to alert the field to the study's limitations - amongst them- the factors that might practically distinguish whether CAPSL is a candidate or cause.

      We thank the reviewer for recommendation. We will collect more samples from FEVR families and screen for other mutation sites within the CAPSL gene in further studies.

      (2) It is unclear why the modeling with mice did not attempt to recapitulate the observations in humans, i.e., why were heterozygotes for a ubiquitous knockout not studied? Any data with heterozygotes, or ubiquitous alleles (which would be easier to generate than the strain studied) should be shared in the manuscript. If no such data exists, this reviewer would find it a worthwhile new experiment to add, but it is appreciated that new experiments are sometimes beyond the scope of what is possible. At the least, this would be worthwhile to discuss in the requested caveats paragraph of the discussion. 

      We thank the reviewer for the recommendation. We evaluated the phenotype of Capsl heterozygous mice at P5, and the results showed no overt difference in vascular progression, vessel density, or branchpoints compared with littermate wild-type controls (Fig.S4). The lack of a pronounced phenotype in heterozygous mice may be due to a difference in sensitivity between humans and mice. For example, heterozygous Lrp5 mice exhibited no visible angiogenic phenotype (PMID: 18263894). A corresponding description was added to the manuscript at page 6.

      (3) The statement in the Abstract "which provides invaluable information for genetic counseling and prenatal diagnosis of FEVR" should be toned down, better supported, or rephrased. This appears to be the 18th disease-associated gene for FEVR, with variants identified in 4 patients of the same ethnicity. In my opinion, the word "invaluable" is currently overstated. 

      We thank the reviewer for recommendation. We have changed "which provides invaluable information for genetic counseling and prenatal diagnosis of FEVR" into "which provides valuable information for genetic counseling and prenatal diagnosis of FEVR" in the abstract.

      (4) The transcriptomic and proteomic data should be deposited into a public repository and accession numbers added to the manuscript. 

      We thank the reviewer for the recommendation. We have uploaded the transcriptomic raw data to the Genome Sequence Archive (https://ngdc.cncb.ac.cn/gsa/browse/HRA010305) and the proteomic raw data to the PRIDE Archive (https://www.ebi.ac.uk/pride/archive).

      (5) The links to MYC are over-stated in the title "through the MYC axis", the abstract "CAPSL function causes FEVR through MYC axis", and the discussion "we demonstrated that the defects in CAPSL affect EC function by down-regulating the MYC signaling cascade". The links to MYC are entirely by association, there were no experiments testing that the transcriptomic and proteomic changes observed were determinative of the CAPSL-mediated phenotype. It seems appropriate to conjecture that these changes are important, but the above statements all need to be altered and conjectures need to be clearly identified as such. 

      We are sorry for overstating the link between the CAPSL-mediated phenotype and the MYC axis in the abstract and discussion sections, and we have altered the statements in these sections to make them more logical. For example, in the abstract we changed “This study also reveals that compromised CAPSL function causes FEVR through MYC axis, shedding light on the potential involvement of MYC signaling in the pathogenesis of FEVR.” into “This study also reveals that compromised CAPSL function causes FEVR may through MYC axis, shedding light on the potential involvement of MYC signaling in the pathogenesis of FEVR.” In the discussion, we changed “…cause FEVR through inactivating MYC signaling, expanding FEVR-involved signaling pathway and providing a potential therapeutic target for the intervention of FEVR” to “…cause FEVR may through inactivating MYC signaling, expanding FEVR-involved signaling pathway and providing a potential therapeutic target for the intervention of FEVR”.

      (6) Finally, I suggest that the following grammatical issues in the pre-print be corrected before making the pre-print final: 

      We have checked the manuscript and corrected these mistakes.

      (a) p2. Suggest rewriting the sentence "Nevertheless, the molecular mechanisms by which CAPSL regulates cell processes and signaling cascades have yet to be elucidated." The preceding sentences only state that CAPSL is a candidate in another disease - the word "nevertheless" seems to reflect a logic that isn't described. 

      We have checked the manuscript and corrected these mistakes.

      (b) p5. Please correct the grammar "We, generated an inducible" 

      We corrected this mistake.

      (c) p5. Suggest rephrasing "impairing CAPSL expression." The word "expression" is often used in reference to transcription. To avoid confusion, something such as "eliminating or reducing protein abundance" might be better. 

      We corrected this mistake.

      (d) p6. Please correct the grammar "As expected, the radial vascular growth, as well as vessel density and vascular branching, are dramatically reduced in..." - note subject-verb agreement issue 

      We corrected this mistake.

      (e) Figure 3 legend - correct "(A) Hyloaid vessels"

      We corrected this mistake.

  2. Jul 2024
    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      Fita-Torró et al. study the toxic effects of the intermediary lipid degradation product trans-2-hexadecenal (t-2-hex) on yeast mitochondria and suggest a mechanism by which Hfd1 safeguards Tom40 from lipidation by t-2-hex and its consequences, such as mitochondrial protein import inhibition, cellular proteostasis deregulation, and stress-responses. 

      The authors aimed to dissect a mechanism for t-2-hex's apoptotic consequences in yeast, and they suggest it is via lipidation of Tom40, but under the tested conditions everything seems lipidated. Thus, it is unclear whether Tom40 is the crucial causal target. They also do not provide many biochemical experiments to investigate this phenomenon further functionally. Tom40 is one possible and perhaps, given the cellular consequences, a reasonable candidate, but it is not validated beyond in vitro lipidation by exogenous t-2-hex. 

      In the revised version of our manuscript, we have now included extensive new experimentation, which shows that protein import at the TOM complex is a physiologically important target of the pro-apoptotic lipid t-2-hex and that enzymes such as the Hfd1 dehydrogenase sensitively regulate this inhibition. In vitro chemoproteomic experiments have now been performed at a more physiological t-2-hex concentration of 10µM, which is lower than the concentrations reported for human cell models. Consistently, several TOM and TIM subunits are enriched in these in vitro lipidation studies (new Fig. 8B). Tom40 lipidation alone is not sufficient to explain t-2-hex toxicity, as a cysteine-free version of Tom40 does not confer tolerance to the apoptotic lipid (new Fig. 8D). Importantly, however, the loss of function of the non-essential accessory subunits Tom70 or Tom20 confers t-2-hex tolerance (new Fig. 8D), indicating that pre-protein import at the TOM complex is a physiological target of t-2-hex, most likely dependent on lipidation of more Tom subunits than just the essential Tom40 pore. Moreover, we now show that mitochondrial protein import is inhibited by the lipid at low physiological doses of 10µM and that this inhibition is modulated by the gene dose of the t-2-hex-degrading Hfd1 enzyme (new Fig. 5G).

      Strengths: 

      The effects of lipids and their metabolic intermediates on protein function are understudied; thus, the authors' research contributing to elucidating direct effects of a single lipid is appreciated. It is particularly unknown by which mechanism t-2-hex causes cell death in yeast. The authors elegantly use modulation of the levels of the enzyme Hfd1, which endogenously catabolizes t-2-hex, as an approach to studying t-2-hex stress. Understanding the cause and consequences of this stress is relevant for understanding fundamental regulation mechanisms, and also to human health since the human homolog of Hfd1, ALDH3A2, is mutated in Sjögren-Larsson Syndrome. The application of a variety of global transcriptomic, functional genomic, and chemoproteomic approaches to study t-2-hex stress targets in the yeast model is laudable.

      Weaknesses: 

      -  The extent of the contribution of Tom40 lipidation to the general t-2-hex stress phenotype is unclear. Is Tom40 lipidation alone enough to cause the phenotype? An alteration of the cysteine residue in question could help answer this key question. 

      Deletion of all four cysteine residues in Tom40 is not sufficient to confer resistance to t-2-hex stress. This result had been included in the original manuscript, but was somewhat hidden in the Discussion. The revised manuscript now includes t-2-hex tolerance assays for the Tom40 cysteine-free mutant in new Figure 8. Thus, cysteine lipidation of Tom40 alone is not sufficient to explain t-2-hex toxicity. This most likely implies other lipidation targets within the TOM and TIM complexes, as indicated by our in vitro lipidation studies. We therefore included the non-essential adaptor proteins Tom70 and Tom20 of the TOM complex and tested the tolerance of the respective deletion mutants in t-2-hex tolerance assays. As shown in new Figure 8, the absence of Tom70 and Tom20 function significantly increases tolerance to t-2-hex, and the tom20 mutant accumulates less Aim17 pre-protein upon t-2-hex stress, indicating that the TOM complex is a physiologically important target of the pro-apoptotic lipid, which most likely acts via lipidation of more subunits than the Tom40 import channel.

      -  It is unclear whether the exogenously applied amounts of t-2-hex (concentrations chosen between 25-200 uM) are physiologically relevant in yeast cells. For comparison, Chipuk et al. (2012) used at most 1 uM on mitochondria of human cells, while Jarugumilli et al. (2018) considered 25 uM a 'lower dose' on human cells. Since the authors saw responses below 10 uM (Fig. 3B) and at the lowest selected concentration of 25 uM (Fig. 8), why were no lower, likely more specific, concentrations applied for the global transcriptomic and chemoproteomic experiments? Key experiments have to be repeated with the lower concentrations. 

      We have now performed several experiments with lower t-2-hex concentrations. A new chemoproteomic study with 10µM t-2-hex-alkyne has been conducted and the new results added to the supplementary information, combining 10µM and 100µM in vitro lipidation studies (Suppl. Table 6). Many subunits of the TOM and TIM complexes are consistently and significantly enriched in both chemoproteomic experiments. These new data are summarized in revised Figure 8. Additionally, we have performed in vivo pre-protein assays with lower t-2-hex concentrations. As shown in new Figure 5, Aim17 mitochondrial import is already inhibited by t-2-hex doses as low as 10µM in a wild type strain, and this inhibition is enhanced in a hfd1 mutant and alleviated in a Hfd1 overexpressor. It is important to note that a dose of 10µM of external t-2-hex addition is significantly lower than doses applied to human cell cultures such as in Jarugumilli et al. (2018). This shows that mitochondrial protein import is a sensitive and physiologically relevant t-2-hex target in our yeast models and that t-2-hex detoxification by enzymes such as the Hfd1 dehydrogenase sensitively regulates this specific inhibition.

      -  The amount of t-2-hex applied is especially important to consider in light of over 1300 proteins lipidated to an extent equal to or greater than Tom40 (Supp. Table 6). This chemoproteomic experiment (Fig. 8B, Supp. Table 6) is also weakened by the inclusion of only 2 replicates, thus precluding assessment of statistical significance. The selection of targets in Fig. 8B as "among the best hits" is neither immediately comprehensible nor further explained and represents at best cherry-picking. Further evidence based on statistical significance or validation by other means should be provided.

      We performed the chemoproteomic screens as described by Jarugumilli et al. (2018) with 2 replicates of mock treated versus 2 replicates of t-2-hex-alkyne treated cell extracts.  A new chemoproteomic study with 10µM t-2-hex-alkyne has been conducted and the new results added to the supplementary information combining 10µM and 100µM in vitro lipidation studies (Suppl. Table 6). Differential enrichment analysis of the proteomic data was performed with the amica software (Didusch et al., 2022). Proteins were ranked according to their log2 fold induction comparing lipid- and mock-treated samples with a threshold of ≥1.5, and the adjusted p-value was calculated. Several TOM and TIM subunits were consistently identified as differentially enriched proteins, which is summarized in new Figure 8B.
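
      The ranking step described in this response (log2 fold induction of lipid- versus mock-treated samples, a threshold of ≥1.5, and an adjusted p-value) can be illustrated with a minimal sketch. This is not the amica implementation: the protein intensities and raw p-values below are invented for illustration, the Benjamini-Hochberg procedure is assumed as the multiple-testing correction, and the `alpha` cutoff of 0.05 is an assumption that the response does not state.

```python
import math

# Hypothetical example data: two lipid-treated and two mock replicates per
# protein, plus a raw p-value. None of these numbers come from the study.
EXAMPLE_PROTEINS = [
    {"name": "Tom40", "treated": [800, 820], "mock": [200, 210], "pvalue": 0.004},
    {"name": "Tom70", "treated": [1200, 1100], "mock": [150, 170], "pvalue": 0.001},
    {"name": "Pgk1", "treated": [5000, 5200], "mock": [4800, 5100], "pvalue": 0.60},
    {"name": "Tim23", "treated": [400, 380], "mock": [100, 110], "pvalue": 0.03},
]


def log2_fold_change(treated, mock):
    """log2 ratio of the mean treated intensity to the mean mock intensity."""
    mean_treated = sum(treated) / len(treated)
    mean_mock = sum(mock) / len(mock)
    return math.log2(mean_treated / mean_mock)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def rank_enriched(proteins, lfc_threshold=1.5, alpha=0.05):
    """Keep proteins passing both cutoffs, sorted by descending log2 fold change.

    lfc_threshold mirrors the ">= 1.5" cutoff quoted in the response;
    alpha is an assumed significance level.
    """
    padj = benjamini_hochberg([p["pvalue"] for p in proteins])
    hits = []
    for protein, q in zip(proteins, padj):
        lfc = log2_fold_change(protein["treated"], protein["mock"])
        if lfc >= lfc_threshold and q <= alpha:
            hits.append({"name": protein["name"], "log2fc": lfc, "padj": q})
    return sorted(hits, key=lambda h: h["log2fc"], reverse=True)


if __name__ == "__main__":
    for hit in rank_enriched(EXAMPLE_PROTEINS):
        print(f'{hit["name"]}\tlog2FC={hit["log2fc"]:.2f}\tpadj={hit["padj"]:.3f}')
```

      With only two replicates per condition, the raw p-values fed into such a ranking depend heavily on the variance model used, so dedicated differential-abundance tools are a better choice than this sketch for real data.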

      - The authors unfortunately also underuse the possible contribution of mass spectrometry technology to additionally determine the extent and localization of lipidation on a global scale (especially relevant since Cohen et al. (2020) suggest site-specific mechanisms).

      We agree that site-specific modifications of t-2-hex will be most likely important in the inhibition or other type of regulation of specific target proteins. Our collective data show that in the case of the inhibition of mitochondrial protein import, several lipidation events on TOM and TIM are involved. Dissection of individual cysteine lipidations on those subunits will be interesting, but we feel that this is out of the scope of the present work.

      - The general novelty of studying t-2-hex stress is lowered in light of existing literature in humans (see e.g. Chipuk et al., 2012; Cohen et al., 2020; Jarugumilli et al., 2018), and in yeast by the same authors (Manzanares-Estreder et al., 2017), and, as the authors comment themselves, a significant part of the manuscript may represent rather a confirmation of the already described consequences of t-2-hex stress.

      We do not agree and we have not commented that our present study is a mere confirmation of t-2-hex stress previously applied in yeast and human models. In humans, t-2-hex has been identified as an efficient pro-apoptotic lipid, which causes mitochondrial dysfunction via direct lipidation of Bax, however the studies of Jarugumilli et al. (2018) revealed that many other direct t-2-hex targets exist, which remained uninvestigated to date. This work continues our previous studies (Manzanares-Estreder et al., 2017), where we show that t-2-hex is a universal proapoptotic lipid applicable in yeast models and contributes important novel findings, such as the massive transcriptional response resembling proteostatic defects caused by t-2-hex, mitochondrial protein import as a physiologically important and direct target of t-2-hex, the function of detoxifying enzymes such as Hfd1 in modulating lipid-mediated inhibition of mitochondrial protein import and general proteostasis. Additionally, we provide transcriptomic, chemoproteomic and functional genomic data to the scientific community, which will be a rich source for future studies on yet undiscovered pro-apoptotic mechanisms employed by t-2-hex. 

      Reviewer #2 (Public Review): 

      This study elucidates the toxic effects of the lipid aldehyde trans-2-hexadecenal (t-2-hex). The authors show convincingly that t-2-hex induces a strong transcriptional response, leads to proteotoxic stress, and causes the accumulation of mitochondrial precursor proteins in the cytosol. 

      The data shown are of high quality and well controlled. The genetic screen for mutants that are hyper- and hypo-sensitive to t-2-hex is elegant and interesting, even if the mechanistic insights from the screen are rather limited. The last part of the study is less convincing. The authors show evidence that t-2-hex affects subunits of the TOM complex. However, they do not formally demonstrate that the lipidation of a TOM subunit is responsible for the toxic effect of t-2-hex. A t-2-hex-resistant TOM mutant was not identified. Moreover, it is not clear whether the concentrations of t-2-hex in this study are physiological. This is, however, a critical aspect. The literature is full of studies claiming the toxic effects of compounds such as H2O2; even if such studies are technically sound, they are misleading if non-physiological concentrations of such compounds were used.

      Nevertheless, this is an interesting study of high quality. A few specific aspects should be addressed.

      We have now performed t-2-hex toxicity assays using several mutants in Tom subunits: the cysteine-free mutant of the essential Tom40 core channel and deletion mutants in the accessory subunits Tom70 and Tom20 (new Figure 8). Thus, cysteine lipidation of Tom40 alone is not sufficient to explain t-2-hex toxicity. This most likely implies other lipidation targets within the TOM and TIM complexes, as indicated by our in vitro lipidation studies. Indeed, as shown in new Figure 8, the absence of Tom70 and Tom20 function significantly increases tolerance to t-2-hex, indicating that the TOM complex is a physiologically important target of the pro-apoptotic lipid, which most likely acts via lipidation of more subunits than the Tom40 import channel.

      We have now performed several experiments with lower t-2-hex concentrations. A new chemoproteomic study with 10µM t-2-hex-alkyne has been conducted and the new results added to the supplementary information, combining 10µM and 100µM in vitro lipidation studies (Suppl. Table 6). Many subunits of the TOM and TIM complexes are consistently and significantly enriched in both chemoproteomic experiments. These new data are summarized in revised Figure 8.

      Additionally, we have performed in vivo pre-protein assays with lower t-2-hex concentrations. As shown in new Figure 5, Aim17 mitochondrial import is already inhibited by t-2-hex doses as low as 10µM in a wild type strain, and this inhibition is enhanced in a hfd1 mutant and alleviated in a Hfd1 overexpressor. It is important to note that a dose of 10µM of external t-2-hex addition is significantly lower than doses applied to human cell cultures such as in Jarugumilli et al. (2018). This shows that mitochondrial protein import is a sensitive and physiologically relevant t-2-hex target in our yeast models and that t-2-hex detoxification by enzymes such as the Hfd1 dehydrogenase sensitively regulates this specific inhibition.

      Reviewer #3 (Public Review): 

      Summary: The authors investigate the effect of the lipid aldehyde trans-2-hexadecenal (t-2-hex) in yeast using multiple omic analyses that show that a large range of cellular functions across all compartments are affected, e.g. transcriptomic changes affect 1/3 of all genes. The authors provide additional analyses, from which they build a model in which mitochondrial protein import is blocked by modification of Tom40.

      Strengths: Global analyses (transcriptomic and functional genomics approach) to obtain an unbiased overview of changes upon t-2-hex treatment. 

      Weaknesses: It is not clear why the authors decided to focus on mitochondria, as only 30 genes assigned to the GO term "mitochondria" are increasing, and the follow-up analysis using SATAY also does not show a predominance of mitochondrial proteins (only 4 genes are identified as hits). The provided additional experimental data do not support the main claims, as neither is protein import investigated nor is there experimental evidence that lipidation of Tom40 occurs in vivo and impacts protein translocation.

      30 mitochondrial gene functions are very strongly (>10 fold) up-regulated by t-2-hex. However, when genes up-regulated (>2 log2FC) or down-regulated (<-2 log2FC) by t-2-hex were selected and subjected to GO category enrichment analysis, we found that “Mitochondrial organization” was the most numerous GO group activated by t-2-hex, while it was “Ribosomal subunit biogenesis” for t-2-hex repression (new data in Suppl. Tables 1 and 2). 

      In the revised version of our manuscript, we have now included extensive new experimentation, which shows that protein import at the TOM complex is a physiologically important target of the pro-apoptotic lipid t-2-hex and that enzymes such as the Hfd1 dehydrogenase sensitively regulate this inhibition. In vitro chemoproteomic experiments have now been performed at a more physiological t-2-hex concentration of 10µM, which is lower than the concentrations reported for human cell models. Consistently, several TOM and TIM subunits are enriched in these in vitro lipidation studies (new Fig. 8B). Tom40 lipidation alone is not sufficient to explain t-2-hex toxicity, as a cysteine-free version of Tom40 does not confer tolerance to the apoptotic lipid (new Fig. 8D). Importantly, however, the loss of function of the non-essential accessory subunits Tom70 or Tom20 confers t-2-hex tolerance (new Fig. 8D), indicating that pre-protein import at the TOM complex is a physiological target of t-2-hex, most likely dependent on lipidation of more Tom subunits than just the essential Tom40 pore. Moreover, we now show that mitochondrial protein import is inhibited by the lipid at low physiological doses of 10µM and that this inhibition is modulated by the gene dose of the t-2-hex-degrading Hfd1 enzyme (new Fig. 5G).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Private recommendations for the authors 

      - On the existing data from Supp. Table 6, the authors may include a global assessment of whether or not the protein included a cysteine (the likely site for lipidation). 

      Although free cysteines in target proteins are the most frequent sites of modification by LDEs such as t-2-hex, other amino acids such as lysines or histidines can be lipidated by these lipid derivatives. Therefore we would like to exclude this information from our chemoproteomic data.

      - What determines whether a gene is labeled in Fig. 6B other than fold change? Why is MAC1 with the highest FC not shown? 

      We analyzed the potential anti-apoptotic SATAY hits with a log2 < -0.75 according to expected detoxification pathways (heat shock response, pleiotropic drug response), to their function in the ER (the intracellular site where t-2-hex is generated) or in mitochondria (the major t-2-hex target identified so far). This is now better described in the text. As for the potential pro-apoptotic SATAY hits, we analyzed gene functions with a log2 > 1.5 and marked the predominant groups “Cytosolic ribosome and translation” and “Amino acid metabolism”. In any case, the interested reader has all SATAY data available in supplemental tables 4 and 5 to find alternative gene functions with a potential role in cellular adaptation to t-2-hex.

      - Supplementary Table numbering should be double-checked.

      Ok, numbering has been double-checked.

      Reviewer #2 (Recommendations For The Authors): 

      Major points 

      (1) Identification of the t-2-hex target. Neither Tom70, Tom20 nor the cysteine in Tom40 is essential. If one of these components is critical for the t-2-hex-mediated toxicity, mutants should be t-2-hex-resistant. This is a straight-forward, simple, and critical experiment. 

      We have now performed t-2-hex toxicity assays in the cysteine-free Tom40 mutant and in tom20 and tom70 deletion mutants. As shown in new Figure 8, cysteine lipidation of Tom40 alone is not sufficient to explain t-2-hex toxicity. However, the absence of Tom70 and Tom20 function significantly increases tolerance to t-2-hex, indicating that the TOM complex is a physiologically important target of the pro-apoptotic lipid, which most likely acts via lipidation of more subunits than the Tom40 import channel.

      (2) The authors claim that t-2-hex blocks the TOM complex. Since in vitro import assays with yeast mitochondria are a well established and simple technique, the authors should isolate mitochondria from their cells and perform import experiments. It is expected that those mitochondria show reduced import rates, however, swelling of these mitochondria to mitoplasts should suppress the import defect. 

      We agree that our study does not investigate a direct effect of t-2-hex on the import capacity of purified mitochondria. However, we determine the in vivo accumulation of several mitochondrial precursor proteins, an approach that is widely used to assay the efficiency of mitochondrial protein import. For example, the recent hallmark paper discovering the mitoCPR protein import surveillance pathway exclusively uses epitope-tagged mitochondrial precursors to determine the regulation of mitochondrial protein import (Weidberg and Amon, Science 2018 360(6385)). Additionally, our new results that mutants in the accessory TOM subunits Tom20 and Tom70 are hyperresistant to t-2-hex (Figure 8D) and that deletion of TOM20 decreases the t-2-hex-induced pre-protein accumulation (Suppl. Figure 1) identify the TOM complex, and hence protein import at the outer mitochondrial membrane, as a physiologically important t-2-hex target.

      (3) The first part of the study is very strong. The last figure is also of good quality, however, it is not clear whether the effects on TOM subunits are really causal for the observed t-2-hex effect on gene expression. The authors might cure this by improved data or by avoiding bold statements such as: 'Hfd1 associates with the Tom70 subunit of the TOM complex and t-2-hex covalently lipidates the central Tom40 channel, which altogether indicates that transport of mitochondrial precursor proteins through the outer mitochondrial membrane is directly inhibited by the pro-apoptotic lipid and thus represents a hotspot for pro- and anti-apoptotic signaling.' (Abstract). 

      We now show that several TOM and TIM subunits are lipidated in vitro by physiological low t-2-hex concentrations, that loss of function of accessory subunits Tom20 or Tom70 rescues t-2-hex toxicity (new Figure 8) and that the gene dose of Hfd1 determines the degree of mitoprotein import block (new Figure 5). These data identify the TOM complex as a physiologically important target of the pro-apoptotic lipid. The Abstract has been modified accordingly.

      (4) If the t-2-hex levels are in a physiological range, one would expect that overexpression of Hfd1 prevents the t-2-hex-induced import arrest.

      We have now confirmed that overexpression of Hfd1 indeed prevents inhibition of mitochondrial protein import by t-2-hex. As shown in new Figure 5, Aim17 mitochondrial import is already inhibited by t-2-hex doses as low as 10µM in a wild type strain, and this inhibition is enhanced in a hfd1 mutant and alleviated in a Hfd1 overexpressor.

      (5) The authors claim that Fmp52 is a t-2-hex-detoxifying enzyme, but do not show evidence. They should rewrite this sentence and be more cautious, or they should show that increased Fmp52 levels indeed deplete t-2-hex from mitochondria.  

      We show that loss of Fmp52 function leads to a strong t-2-hex sensitivity. Fmp52 belongs to the NAD-binding short-chain dehydrogenase/reductase (SDR) family and localizes to highly purified mitochondrial outer membranes (Zahedi et al, 2006). These are the indications that suggest that Fmp52 participates in the enzymatic detoxification of t-2-hex in addition to Hfd1. The Results section has been modified accordingly.

      Minor points: 

      (6) Aim17 was recently identified as a characteristic constituent of cytosolic protein aggregates named MitoStores (Krämer et al., 2023, EMBO J). The authors might test whether the cytosolic Aim17 protein colocalizes with the Hsp104-GFP granules that accumulate upon t-2-hex exposure as shown in Fig. 4A. 

      We agree that determining the fate of unimported mitochondrial precursors upon t-2-hex stress would be interesting. We have made some attempts to co-visualize Aim17-dsRed and Hsp104-GFP upon t-2-hex treatment, but we still have some technical issues. While we clearly see that Aim17 accumulates in cytoplasmic foci upon prolonged t-2-hex exposure, we are not able to determine colocalization with Hsp104, in great part because t-2-hex causes mitochondrial fragmentation, which leads to the appearance of Aim17-stained foci in the cytosol independently of protein aggregates. Since so far we are not able to localize Aim17 unambiguously in Hsp104-containing aggregates (MitoStores) upon lipid stress, we would like to move the manuscript forward without those experiments.

      (7) In Fig. 1A, the figures of the different lines are difficult to distinguish. Lines of one color with different intensities would be better suited. 

      We have been working before with dose-response profiles generated by the destabilized luciferase system and found that the color-coded representation of the plots is the most effective way to represent the data, see for example Fita-Torró et al. Mol Ecol. 2023 32(13):3557-3574, Pascual-Ahuir et al. BBA 2019 1862(4):457-471, Rienzo et al., Mol Cell Biol. 2015 35(21):3669-83, and several other publications. Therefore we want to keep the format of the Figure.

      (8) A title page should be added to each of the supplemental data files with short descriptions of the information that is provided in the columns of the tables.

      Explanatory title pages have now been added to the supplemental data files.

      Reviewer #3 (Recommendations For The Authors): 

      Figure 5A: The authors aim to assess protein import, however, their experimental set-up is not suited and does not allow conclusions about protein translocation into mitochondria. The authors monitor protein steady state levels, which does not reflect import capacity. For this e.g. pulse-chase experiments coupled to coIP or in organello import assays with radiolabeled substrate proteins would be required. In addition, the authors lack a non-treated control to show that no precursor accumulates in the absence of CCCP and t-2-hex. At the moment, the conclusion of blocked import cannot be made, as there are many other explanations for the observed steady state levels, e.g. the TAP tag interfered with the import competence of the precursor or t-2-hex could impact on MPP function (in particular as Figure 8B shows that also intra-mitochondrial proteins undergo modification by t-2-hex). 

      We agree that our study does not investigate a direct effect of t-2-hex on the import capacity of purified mitochondria. However, we determine the in vivo accumulation of several mitochondrial precursor proteins, an approach that is widely used to assay the efficiency of mitochondrial protein import. For example, the recent hallmark paper discovering the mitoCPR protein import surveillance pathway exclusively uses epitope-tagged mitochondrial precursors to determine the regulation of mitochondrial protein import (Weidberg and Amon, Science 2018 360(6385)). Figure 5 contains several non-treated control experiments, which show that no (or, in the case of Ilv6, fewer) precursors of Tap-tagged Aim17, Cox5a, Ilv6, or Sdh4 accumulate in the absence of CCCP or t-2-hex. This is shown in Figure 5A for untreated cells and in Figure 5B and new Figure 5G for solvent (DMSO) treated cells. This demonstrates that the Tap-tag does not interfere with the import of the respective precursors. Additionally, our new results that mutants in the accessory TOM subunits Tom20 and Tom70 are hyperresistant to t-2-hex (Figure 8D) identify the TOM complex, and hence protein import at the outer mitochondrial membrane, as a physiologically important t-2-hex target.

      Figure 8: The conclusion that Tom40 is directly lipidated comes from an in vitro assay, with the conclusion that Tom40 is the main target, because it is the only Tom protein with a cysteine (Tom70 as not being part of the Tom core is excluded, however, lack of Tom70 function would also have detrimental consequences for mitochondrial protein import). However, there is no experiment showing a modification of Tom40 and a consequence for protein import. The proposed model is therefore very far-fetched and several aspects are speculation but not supported by experimental data. To propose such a model, the authors need to show experimental evidence, e.g. by generating a yeast strain in which the cysteines in Tom40 are replaced by, e.g., serine residues, and then assessing whether protein import (e.g. in pulse-chase assays) is no longer affected upon addition of t-2-hex.

      Deletion of all four cysteine residues in Tom40 is not sufficient to confer resistance to t-2-hex stress. This result had been included in the original manuscript, but was somewhat hidden in the Discussion. The revised manuscript now includes t-2-hex tolerance assays for the Tom40 cysteine-free mutant in new Figure 8D. Thus, cysteine lipidation of Tom40 alone is not sufficient to explain t-2-hex toxicity. This most likely implies other lipidation targets within the TOM and TIM complexes, as indicated by our in vitro lipidation studies. We therefore included the non-essential adaptor proteins Tom70 and Tom20 of the TOM complex and tested the tolerance of the respective deletion mutants in t-2-hex tolerance assays. As shown in new Figure 8D, the absence of Tom70 and Tom20 function significantly increases tolerance to t-2-hex, indicating that the TOM complex is a physiologically important target of the pro-apoptotic lipid, which most likely acts via lipidation of more subunits than the Tom40 import channel.

      Figure 8A: The pulldown experiments lack positive (other Tom subunits) and negative controls and were performed with (large) tags on all proteins, which can easily result in false positive interactions. The conclusion that Hfd1 interacts with Tom70 and Tom22 cannot be made. Also, the conclusion if an interaction is robust or not cannot be made as the pull-down lacks control fractions, it is also not clear how much of the eluate was loaded. Finally, Hfd1-HA was not expressed from its endogenous promoter, likely resulting in over-expression, which again strongly hampers conclusions about bona fide interaction partners. 

      We agree that our pulldown studies are done in an artificial context, such as the Hfd1 overexpression needed to reach detectable protein levels and the use of Tap-fusion proteins. However, the conclusion that Tom70 is a potential interactor of Hfd1 can be made based on the following observations: Hfd1-HA is preferentially pulled down from total protein extracts containing Tom70-Tap, but not from extracts containing no Tap-protein, and significantly less from extracts containing Tom22-Tap, another TOM-associated subunit. The pulldown assay has now been repeated several times, and the efficiency of Hfd1 pulldown has been quantified and statistically analyzed with respect to the quantity of purified Tom protein, which is shown in modified Figure 8A.

      Figure 4A and C: Depletion of proteasomal activity results in larger aggregates in Figure 4A. However, the addition of t-2-hex blocks proteasomal activity (Figure 4C). How can proteasome inhibition result in bigger aggregates if the proteasomal activity is lost upon t-2-hex addition?

      The negative effect of t-2-hex on proteasomal activity is most likely an indirect effect caused by protein aggregation (Bence et al., Science 2001 292-1552) and occurs in wild type and rpn4 mutant cells with reduced proteasomal activity (Fig. 4C). t-2-hex causes cytosolic protein aggregation in wild type cells, which is aggravated (more and larger protein aggregates) in rpn4 mutants because of their lower levels of active proteasome (Fig. 4A). The observed protein aggregates will further diminish proteasomal activity, which is confirmed in Fig. 4C. 

      Figure 1B: The authors use a reporter to determine HFD1 expression that consists of the promoter region of HFD1 fused to luciferase. These fusion constructs have been shown to often not reflect the bona fide expression levels of genes (Yoneda et al., J Cell Sci 2004). qPCR analysis of transcript levels should be included to support the induction of HFD1. 

      We agree that the live cell luciferase reporters used here are not suitable for the determination of absolute mRNA levels. However, the aim of these reporter experiments is to quantify the inducibility of different genes (HFD1, GRE2) dependent on increasing stress doses. These dose-response profiles cannot be obtained by qPCR analysis, while the destabilized reporters are an excellent tool for this and have been used to accurately describe numerous dynamic stress responses (for example: Dolz-Edo et al. 2013 MCB 33:2228-40, Rienzo et al. 2015 MCB 35:3669-83, Pascual-Ahuir et al. 2019 BBA 1862:457-471). Additionally, the induction of HFD1 mRNA levels by salt (NaCl) and oxidative (menadione) stress determined by qPCR has been published before (Manzanares-Estreder et al. 2017 Oxid Med Cell Longevity 2017:2708345).

      The authors conclude from Figure 1 that entry into apoptotic cell death is modulated by efficient t-2-hex detoxification. However, this is based on growth curves and no analysis of apoptotic cell death is performed. The data show that the addition of hexadecenal results in a growth arrest, that is overcome likely upon degradation of t-2-hex (depending on the amount of Hfd1). 

      We agree that our experiments measure growth inhibition and not specifically apoptotic cell death. The text has been changed accordingly.  

      Figure 4A: Microscopy images show between 1-2 yeast cells. Either more cells need to be shown or quantifications of the aggregates are required. In addition, it is not clear if the control received the same DMSO concentration as the treated cells and also the time point for the control is not specified. 

      We have now quantified the number of aggregates across cell populations in new Figure 4A in DMSO, t-2-hex and t-2-hex-H2 treated wt and rpn4 mutants. These data show specific aggregate induction by t-2-hex and not by DMSO or the saturated t-2-hex-H2 control, which is aggravated in rpn4 mutants and avoided by CHX pretreatment.

      Figure 5: Western blots in figure 5A, B, D, E and F lack a loading control. Without this, conclusions about increases in protein abundance cannot be made.

      We have now included additional panels with the loading controls for the Western blots in new figure 5, except figure 5B, where the appearance or not of the pre-protein can be compared to the amount of mature protein in the same blot.

      Figure 2B: Complex II assembly factors SDH5,6,9 are described here as ETC complexes. As the proteins are not part of the mature complex II, the heading should be modified into ETC complexes and ETC assembly.

      Figure 2B has been revised and the classification of ETC proteins changed accordingly.

    1. Author response

      Reviewer #1 (Public Review):

      The authors use neural recordings from three different brain areas to assess whether the type of evidence accumulation dynamics in those regions are (1) similar to one another, and (2) similar to best-fitting evidence accumulation dynamics to behavioral choice alone. This is an important theoretical question because it relates to the 'linking hypothesis' that relates neurophysiological data to psychological phenomena. Although the standard evidence accumulation dynamic in describing choice has been the gradual accumulation of evidence, the authors find that those dynamics are not represented equally in all brain regions. Such results suggest that more nuanced computational models are needed to explain how brain areas interact to produce decisions, and the focus of theoretical development should shift away from explaining behavioral patterns alone and more toward explaining both brain and behavioral interactions. Given that the authors simply test the assumption that the same dynamics that best explain behavior should also explain neural data, they accomplish their objective using a sophisticated methodology and find evidence *against* this assumption: they find that each region was best described by a distinct accumulation model, which all differed from the model that best described the rat's choices.

      I thought this was an excellent paper with a clear scientific objective, direct analysis to achieve that objective, and a very strong methodological approach to leave little doubt that the conclusions they drew from their analyses were as reasonable and accurate as possible.

      We thank the reviewer for their time and appreciate their generous comments.

      Reviewer #2 (Public Review):

      The neural dynamics underlying decision-making have long been studied across different species (e.g., primates and rodents) and brain areas (e.g., parietal cortex, frontal eye fields, striatum). The key question is to what extent neural firing rates covary with evidence accumulation processes as proposed by evidence accumulation models. It is often assumed that the evidence-accumulation process at the neural level should mirror the evidence-accumulation process at the behavioral level. The current paper shows that the neural dynamics of three rat brain regions (the FOF, ADS, and PCC) all show signatures of evidence accumulation, but in distinct ways. Especially the role of the FOF appears to be distinct, due to its dependence on early evidence compared to the other regions. This sheds new light and a new interpretation of the role of the FOF in decision-making - previously, it has been described as a region encoding the choice that is currently being committed to; this new analysis suggests it is instead strongly influenced by early evidence.

      A major strength of the paper is that the results are achieved through joint modelling of the behavioral and neural data, combined with information on the physical stimulus at hand. Joint models were shown to provide more information on the underlying processes compared to behavioral or neural models alone. Especially the inclusion of the neural data seemed to have greatly improved the quality of inferences. This is a key contribution that illustrates that the sophisticated modelling of multiple sources of data at the same time, pays off in terms of the quality of inferences. Yet, it should be added here, that due to the nature of the task, the behavioral data contained only choices, and not response times, which tend to contain more information regarding the evidence accumulation process than choice alone. It would be interesting to additionally discuss how choice decision times can be modeled with the proposed modelling framework.

      We thank the reviewer for their generous views on our work. We agree that adding decision times, which could readily be added to our framework, will likely further constrain the inference of the latent model. We are currently pursuing such topics using this framework and appropriate data. We have altered a passage in our Discussion, where we note the various extensions of our model one could pursue, to include response time within the set of behavioral measurements one might include.

      A main limitation of the paper is that it does not appear to address a seemingly logical follow-up question: If these three brain regions individually accumulate evidence in distinct manners, how do these multiple brain regions then each contribute to a final choice? The joint models fit each region's data separately, so how well does each region individually 'explain' or 'predict' behavior, and how does the combined neural activity of the regions lead to manifest behavior? I would be very interested in the authors' perspectives on these questions.

      We could not share the reviewer's view and interest in this question with any more excitement than we already do! Unfortunately, the experiments necessary for answering this question in a satisfying way have not yet been performed (e.g. simultaneous multi-region population recordings). Additionally, our analysis approach, as presented currently, would require some technical alterations to deal with data at that scale. Both efforts are underway, but we feel that the current manuscript describes the basic modeling framework one would need to address these questions if and when such data exist. We have added some text to the Discussion to highlight these exciting future directions:

      “An exciting future application of our modeling framework is to model multiple, independent accumulators in several brain regions which collectively give rise to the animal’s behavior. Such a model would provide incredible insight into how the brain collectively gives rise to behavioral choices.”

      There are some remaining questions regarding the specific models used, that I was hoping the authors could clarify. Specifically, in equations 10-11, I was wondering to what extent there might be a collinearity issue. Equation 10 proposes that the firing rates of neurons can vary across time due to two mechanisms: (1) The dependence of the firing rate on the accumulated evidence, and (2) a time-varying trial average (as detailed in Equation 11). If firing rates of the neuron indeed covary with the accumulated evidence and therefore increase across time, how can the effects of mechanisms 1 and 2 be disentangled? Relatedly, the independent noise models model each neuron separately and thereby include many more parameters, each informed by less data. Is it possible that the relatively poor cross-validation of the independent noise model may be a consequence of the overfitting of the independent noise model?

      Thank you for this important observation. Please see our response to the essential revisions above which addresses this issue. In short, although it is true that firing rates increase with time (with accumulating evidence) they do so in a way that depends on the stimulus, and so just as often as they increase with time, they decrease.

      Regarding the poor cross-validation of the independent noise model, we apologize for the confusion here: both the shared and independent noise models have exactly the same number of parameters. They differ only in that the latent process for a trial contains a unique noise instantiation per trial for the independent noise model and the same instantiation for the shared model. See above for our response to this issue, and how the manuscript was modified in light of this confusion.
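
      The distinction between the two noise models can be sketched with a toy accumulator. This is only an illustration, not the authors' model: it assumes that "shared" means every neuron in a trial is driven by one noise realization and "independent" means each neuron receives its own, and the function name, arguments and Gaussian noise are hypothetical choices. Both variants take the same single noise parameter, `sigma`.

```python
import random
from itertools import accumulate


def simulate_latents(evidence, sigma, n_neurons, shared, rng):
    """Toy accumulator: cumulative sum of evidence plus Gaussian noise.

    evidence  : list of net evidence per time bin (hypothetical input)
    sigma     : noise standard deviation, the only noise parameter
    shared    : True  -> one noise realization drives all neurons
                False -> each neuron gets its own realization
    Returns one latent trajectory per neuron.
    """
    n_bins = len(evidence)
    if shared:
        noise = [rng.gauss(0.0, sigma) for _ in range(n_bins)]
        noises = [noise] * n_neurons
    else:
        noises = [[rng.gauss(0.0, sigma) for _ in range(n_bins)]
                  for _ in range(n_neurons)]
    return [list(accumulate(e + z for e, z in zip(evidence, noise)))
            for noise in noises]
```

      Under these assumptions the two variants differ only in how the noise is drawn, not in how many parameters describe it, which is the point made in the response above.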

      Another related question is how robust the parameter recovery properties of these models are under a wider range of data-generating parameter settings. I greatly appreciate the inclusion of a parameter recovery study (Figure S1C) using a single synthetic dataset, but it could be made even stronger by simulating multiple datasets with a wider range of parameter settings. Such a simulation study would help understand how robust and reliable the estimated parameters of all models are. Similarly, it would be helpful if also the \theta_{y} parameters are shown, which aren't shown in Figure S1C.

      We agree that understanding the model fitting behavior under a wider set of parameter settings is valuable. We fit our model to additional sets of parameter settings and included an additional supplemental figure (Figure 1 — figure supplement 2) to illustrate these results. In short, we found that parameter recovery was robust across the different parameter settings we tested. We also updated Figure S1C with the neural parameters. We included the following in the Results to note that parameter recovery was robust:

      “We verified that our method was able to recover the parameters that generated synthetic physiologically-relevant spiking and choices data (Figure 1 — figure supplement 1), and that parameter recovery was robust across a range of parameter values (Figure 1 — figure supplement 2).”

      An aspect of the paper that initially confused me is that the models fit on the choice data and stimulus information alone make different predictions for the evidence accumulation dynamics in different regions (e.g., Figure 5A, 6A) and also lead to different best-fitting parameters in different regions (Figure S9A). It took me a while to realize that this is due to the data being pooled across different rats and sessions - as such, the behavioral choice data are not the same across regions, and neither are the resulting fitted models. This could easily be clarified by adding a few notes in the captions of the relevant figures.

      Thanks for pointing this out. We agree that this tends to be a point of confusion, and we have added clarification prior to Fig 3, where the choice model is first introduced:

      “We stress that because of this, each fitted choice model uses different behavioral choice data, and thus the fitted parameters vary from fitted model to fitted model.”

      Combined, this manuscript represents an interesting and welcome contribution to an ongoing debate on the neural dynamics of decision-making across different brain regions. It also introduced new joint modelling techniques that can be used in the field and raised new questions on how the concurrent activity of neurons across different brain regions combined leads to behavior.

      We appreciate the very generous views on our work!

    1. Author response:

      eLife assessment

      This useful study reports on the discovery of an antimicrobial agent that kills Neisseria gonorrhoeae. Sensitivity is attributed to a combination of DedA-assisted uptake of oxydifficidin into the cytoplasm and the presence of an oxydifficidin-sensitive RplL ribosomal protein. Due to the narrow scope, the broader antibacterial spectrum remains unclear and therefore the evidence supporting the conclusions is incomplete with key methods and data lacking. This work will be of interest to microbiologists and synthetic biologists.

      General comment about narrow scope: The broader antibacterial spectrum of oxydifficidin has been reported previously (S B Zimmerman et al., 1987). The main focus of this study is on its previously unreported potent anti-gonococcal activity and mode of action. While it is true that broad-spectrum antibiotics have historically played a role in effectively controlling a wide range of infections, we and others believe that narrow-spectrum antibiotics have an overlooked importance in addressing bacterial infections. Their advantage lies in their ability to target specific pathogens without markedly disrupting the human microbiota.

      We are troubled by the statement that our paper is narrow in scope and that evidence supporting our conclusions is incomplete. We do not feel the reviews as presented substantiate drawing this conclusion about our work.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Kan et al. report the serendipitous discovery of a Bacillus amyloliquefaciens strain that kills N. gonorrhoeae. They use TnSeq to identify that the anti-gonococcal agent is oxydifficidin and show that it acts at the ribosome and that one of the dedA gene products in N. gonorrhoeae MS11 is important for moving the oxydifficidin across the membrane.

      Strengths:

      This is an impressive amount of work, moving from a serendipitous observation through TnSeq to characterize the mechanism by which Oxydifficidin works.

      Weaknesses:

      (1) There are important gaps in the manuscript's methods.

      The requested additions to the methods describing bacterial sequencing and anti-gonococcal activity screening will be made. However, we do not think the absence of these generic methods reduces the significance of our findings.

      (2) The work should evaluate antibiotics relevant to N. gonorrhoeae.

      (1) It is not clear to us why reevaluating the activity of well characterized antibiotics against known gonorrhoeae clinical strains would add value to this manuscript. The activity of clinically relevant antibiotics against antibiotic-resistant N. gonorrhoeae clinical isolates is well described in the literature. Our use of antibiotics in this study was intended to aid in the identification of oxydifficidin’s mode of action. This is true for both Tables 1 and 2.

      (2) If the reviewer insists, we would be happy to include MIC data for the following clinically relevant antibiotics: ceftriaxone (cephalosporin/beta-lactam), gentamicin (aminoglycoside), azithromycin (macrolide), and ciprofloxacin (fluoroquinolone).

      (3) The genetic diversity of dedA and rplL in N. gonorrhoeae is not clear, neither is it clear whether oxydifficidin is active against more relevant strains and species than tested so far.

      (1) We thank the reviewer for this suggestion. We aligned the DedA sequence from strain MS11 with DedA proteins from 220 N. gonorrhoeae strains that have high-quality assemblies in NCBI. The result showed that there are no amino acid changes in this protein. Using the same method, we observed several single amino acid changes in RplL. This included changes at A64, G25 and S82 in 4 strains with one change per strain. These sites differ from R76 and K84, where we identified changes that provide resistance to oxydifficidin. Notably, in a similar search of representative Escherichia, Chlamydia, Vibrio, and Pseudomonas NCBI deposited genomes, we did not identify changes in RplL at position R76 or K84.
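
      The comparison described above, a reference protein checked against aligned homologs for amino acid changes, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name is hypothetical, the sequences are assumed to be already aligned to equal length with `-` marking gaps, and only substitutions (not insertions or deletions) are reported.

```python
def substitutions(reference, aligned):
    """List substitutions such as 'R3K' between two aligned sequences.

    Positions are 1-based and counted along the ungapped reference.
    Columns where either sequence has a gap ('-') are skipped, so
    insertions and deletions are not reported.
    """
    if len(reference) != len(aligned):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    position = 0
    for ref_aa, aa in zip(reference, aligned):
        if ref_aa == "-":
            continue  # insertion relative to the reference
        position += 1
        if aa != "-" and aa != ref_aa:
            changes.append(f"{ref_aa}{position}{aa}")
    return changes
```

      Applied to each strain in turn, an empty list would correspond to "no amino acid changes", while entries at particular positions would flag sites to compare against the resistance-associated positions R76 and K84.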

      (2) While the usefulness of screening more clinically relevant antibiotics against clinical isolates as suggested in comment 2 was not clear to us, we agree that screening these strains for oxydifficidin activity would be beneficial. We have ordered Neisseria gonorrhoeae strain AR1280, AR1281 (CDC), and Neisseria meningitidis ATCC 13090. They will be tested when they arrive.

      Reviewer #2 (Public Review):

      Summary:

      Kan et al. present the discovery of oxydifficidin as a potential antimicrobial against N. gonorrhoeae, including multi-drug resistant strains. The authors show the role of DedA flippase-assisted uptake and the specificity of RplL in the mechanism of action for oxydifficidin. This novel mode of action could potentially offer a new therapeutic avenue, providing a critical addition to the limited arsenal of antibiotics effective against gonorrhea.

      Strengths:

      This study underscores the potential of revisiting natural products for antibiotic discovery against pathogens of current concern and highlights a new target mechanism that could inform future drug development. Indeed, there is a growing body of recent research utilizing AI and predictive computational informatics to revisit potential antimicrobial agents and metabolites from cultured bacterial species. The discovery of oxydifficidin interaction with RplL and its DedA-assisted uptake mechanism opens new research directions in understanding and combating antibiotic-resistant N. gonorrhoeae. Methodologically, the study is rigorous, employing various experimental techniques such as genome sequencing, bioassay-guided fractionation, LCMS, NMR, and Tn-mutagenesis.

      Weaknesses:

      The scope is somewhat narrow, focusing primarily on N. gonorrhoeae. This limits the generalizability of the findings and leaves questions about its broader antibacterial spectrum. Moreover, while the study demonstrates the in vitro effectiveness of oxydifficidin, there is a lack of in vivo validation (i.e., animal models) for assessing pre-clinical potential of oxydifficidin. Potential SNPs within dedA or RplL raise concerns about how quickly resistance could emerge in clinical settings.

      (1) Spectrum/narrow scope: The broader antibacterial spectrum of oxydifficidin has been reported previously (S B Zimmerman et al., 1987). The focus of this study is on its previously unreported potent anti-gonococcal activity and its mode of action. While it is true that broad-spectrum antibiotics have historically played a role in effectively controlling a wide range of infections, we and others believe that narrow-spectrum antibiotics have an overlooked importance in addressing bacterial infections. Their advantage lies in their ability to target specific pathogens without markedly disrupting the human microbiota.

      (2) Animal models: We acknowledge the reviewer’s insight regarding the importance of in vivo validation to enhance oxydifficidin’s pre-clinical potential. However, due to the labor-intensive process needed to isolate oxydifficidin, obtaining a sufficient quantity for animal studies is beyond the scope of this study. Our future work will focus on optimizing the yield of oxydifficidin and developing a topical mouse model for subsequent investigations.

      (3) Potential SNPs: Please see our response to Reviewer #1’s comment 3. We acknowledge that potential SNPs within dedA and rplL raise concerns regarding clinical resistance, which is a common issue for protein-targeting antibiotics. Yet, as pointed out in the manuscript, obtaining mutants in the lab was a very low yield endeavor.

      Reviewer #3 (Public Review):

      Summary:

      The authors have shown that oxydifficidin is a potent inhibitor of Neisseria gonorrhoeae. They were able to identify the target of action to rplL and showed that resistance could occur via mutation in the DedA flippase and RplL.

      Strengths:

      This was a very thorough and clearly argued set of experiments that supported their conclusions.

      Weaknesses:

      There was no obvious weakness in the experimental design. Although it is promising that the DedA mutations resulted in attenuation of fitness, it remains an open question whether secondary rounds of mutation could overcome this selective disadvantage which was untried in this study.

      We thank the reviewer for the positive comment. We agree that investigating factors that could compensate for the fitness attenuation caused by DedA mutation would enhance our understanding of the role of DedA.

    1. Author response:

      We thank you for the opportunity to provide a concise response. The criticisms are accurately summarized in the eLife assessment:

      the study fails to engage prior literature that has extensively examined the impact of variance in offspring number, implying that some of the paradoxes presented might be resolved within existing frameworks.

      The essence of our study is to propose the adoption of the Haldane model of genetic drift, based on the branching process, in lieu of the Wright-Fisher (WF) model, based on sampling, usually binomial.  In addition to some extensions of the Haldane model, we present 4 paradoxes that cannot be resolved by the WF model. The reviews suggest that some of the paradoxes could be resolved by the WF model, if we engage prior literature sufficiently.

      We certainly could not review all the literature on genetic drift, as it must run to thousands of papers. Nevertheless, the literature we do not cover is based on the WF model and therefore has the general properties that all modifications of the WF model share. (We should note that all such modifications share the sampling aspect of the WF model. To model such sampling, N is imposed from outside the model rather than generated within it. Most importantly, these modifications are mathematically valid but biologically untenable, as will be elaborated below. Thus, in concept, the WF and Haldane models are fundamentally different.)

      In short, our proposal is general with the key point that the WF model cannot resolve these (and many other) paradoxes.  The reviewers disagree (apparently only partially) and we shall be specific in our response below.

      We shall first present the 4th paradox, which is about multi-copy gene systems (such as rRNA genes and viruses, see the companion paper). Viruses evolve both within and between hosts. In both stages, there are severe bottlenecks. How does one address the genetic drift in viral evolution? How can we model the effective population sizes both within and between hosts? The inability of the WF model to deal with such multi-copy gene systems may explain the difficulties in accounting for the SARS-CoV-2 evolution. Given the small number of virions transmitted between hosts, drift is strong, as we have shown using the Haldane model (Ruan, Luo, et al. 2021; Ruan, Wen, et al. 2021; Hou, et al. 2023).

      As the reviewers suggest, it is possible to modify the WF model to account for some of these paradoxes. However, the modifications are often mathematically convenient but biologically dubious. Much of the debate is about the progeny number, K. (We shall use a haploid model for this purpose, but diploidy does not pose a problem, as stated in the main text.) The modifications relax the constraint of V(k) = E(k) inherent in the WF sampling. One would then ask how V(k) can differ from E(k) under WF sampling, even though this is mathematically feasible (but biologically dubious). Kimura and Crow (1963) may have been the first to offer a biological explanation. If one reads it carefully, Kimura's modification is to make the WF model like the Haldane model. Then, why don't we use the Haldane model in the first place, with its two parameters, E(k) and V(k), instead of the one-parameter WF model?
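
      The contrast between the one-parameter and two-parameter descriptions of K can be illustrated with a short sketch. It is not taken from the manuscript: the function names are hypothetical, WF sampling is modeled as each offspring choosing a parent uniformly at random in a population of constant size, and the two-parameter case uses an assumed toy distribution in which an individual leaves 0, 1 or 2 offspring.

```python
import random


def wf_offspring_counts(n, rng):
    """One WF generation: each of n offspring picks a parent uniformly."""
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return counts


def mean_and_variance(values):
    """Population mean and variance of a list of numbers."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)
    return m, v


def three_outcome_moments(p0, p2):
    """Exact E(K) and V(K) when K is 0, 1 or 2 with P(0)=p0, P(2)=p2."""
    d = p2 - p0
    return 1 + d, p0 + p2 - d * d


rng = random.Random(1)
# Under WF sampling the variance of K is tied to its mean (both near 1).
wf_mean, wf_var = mean_and_variance(wf_offspring_counts(5000, rng))
# In the toy branching model E(K) and V(K) can be set separately.
low_var = three_outcome_moments(0.05, 0.05)   # E(K) = 1 with a small V(K)
high_var = three_outcome_moments(0.45, 0.45)  # E(K) = 1 with a larger V(K)
```

      In this sketch the WF counts have a variance close to their mean by construction, whereas the toy branching distribution keeps E(K) = 1 while V(K) = 2·p0 is chosen freely, which is the sense in which a second parameter is available.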

      The Haldane model is conceptually simpler. It allows the variation in population size, N, to be generated from within the model, rather than artificially imposed from outside of the model.  This brings us to the first paradox, the density-dependent Haldane model. When N is increasing exponentially as in bacterial or yeast cultures, there is almost no drift when N is very low and drift becomes intense as N grows to near the carrying capacity.  We do not see how the WF model can resolve this paradox, which can otherwise be resolved by the Haldane model.

      The second and third paradoxes are about how far mathematical models of population genetics can be detached from biological mechanisms. The second paradox, about sex chromosomes, is rooted in the realization that V(k) ≠ E(k). Since E(k) is the same between sexes but V(k) is different, how does the WF sampling give rise to V(k) ≠ E(k)? We are asking the biological question that troubled Kimura and Crow (1963), alluded to above. The third paradox is acknowledged by two reviewers. Genetic drift manifested in the fixation probability of an advantageous mutation is 2s/V(k). It is thus strange that the fundamental parameter of drift in the WF model, N (or Ne), is missing. In the Haldane model, drift is determined by V(k) with N being a scaling factor; hence 2s/V(k) makes perfect biological sense.
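
      The 2s/V(k) expression can be checked against an exactly solvable toy case. The sketch below is an illustration under explicit assumptions, not the authors' derivation: fixation is approximated by survival of a branching process started from one mutant, K takes the values 0, 1 or 2, and the function names are hypothetical. For this distribution the extinction probability q solves q = p0 + p1·q + p2·q², whose root below 1 is p0/p2 when s > 0.

```python
def offspring_probs(s, v):
    """Return P(K=0), P(K=1), P(K=2) such that E(K) = 1 + s and V(K) = v."""
    p2 = (v + s * s + s) / 2
    p0 = p2 - s
    p1 = 1 - p0 - p2
    if min(p0, p1, p2) < 0:
        raise ValueError("no 0/1/2 offspring distribution has these moments")
    return p0, p1, p2


def survival_probability(s, v):
    """Exact survival probability of the branching process for s > 0."""
    p0, _, p2 = offspring_probs(s, v)
    return 1 - p0 / p2  # one minus the extinction root p0/p2


def haldane_approximation(s, v):
    """The small-s approximation 2s / V(K)."""
    return 2 * s / v
```

      For this toy distribution the exact survival probability is 2s/(V + s + s²), so it approaches 2s/V(k) as s becomes small, and halving V(k) roughly doubles it without any reference to N.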

      We now answer the obvious question: If the model is fundamentally about the Haldane model, why do we call it the WF-Haldane model? The reason is that most results obtained by the WF model are pretty good approximations and the branching process may not need to constantly re-derive the results.  At least, one can use the WF results to see how well they fit into the Haldane model. In our earlier study (Chen, et al. (2017); Fig. 3), we show that the approximations can be very good in many (or most) settings.

      We would like to use the modern analogy of gas-engine cars vs. electric-motor ones. The Haldane model and the WF model are as fundamentally different concepts as the driving mechanisms of gas-powered vs electric cars.  The old model is now facing many problems and the fixes are often not possible.  Some fixes are so complicated that one starts thinking about simpler solutions. The reservations are that we have invested so much in the old models which might be wasted by the switch. However, we are suggesting the integration of the WF and Haldane models. In this sense, the WF model has had many contributions which the new model gratefully inherits. This is true with the legacy of gas-engine cars inherited by EVs.

      The editors also issue the instruction: while the modified model yields intriguing theoretical predictions, the simulations and empirical analyses are incomplete to support the authors' claims. 

      We are thankful to the editors and reviewers for the thoughtful comments and constructive criticisms. We also appreciate the publishing philosophy of eLife that allows exchanges, debates and improvements, which are the true spirits of science publishing.

      References for the provisional author responses

      Chen Y, Tong D, Wu CI. 2017. A New Formulation of Random Genetic Drift and Its Application to the Evolution of Cell Populations. Mol. Biol. Evol. 34:2057-2064.

      Hou M, Shi J, Gong Z, Wen H, Lan Y, Deng X, Fan Q, Li J, Jiang M, Tang X, et al. 2023. Intra- vs. Interhost Evolution of SARS-CoV-2 Driven by Uncorrelated Selection-The Evolution Thwarted. Mol. Biol. Evol. 40.

      Kimura M, Crow JF. 1963. The measurement of effective population number. Evolution:279-288.

      Ruan Y, Luo Z, Tang X, Li G, Wen H, He X, Lu X, Lu J, Wu CI. 2021. On the founder effect in COVID-19 outbreaks: how many infected travelers may have started them all? Natl. Sci. Rev. 8:nwaa246.

      Ruan Y, Wen H, He X, Wu CI. 2021. A theoretical exploration of the origin and early evolution of a pandemic. Sci Bull (Beijing) 66:1022-1029.

      Review comments

      eLife assessment 

      This study presents a useful modification of a standard model of genetic drift by incorporating variance in offspring numbers, claiming to address several paradoxes in molecular evolution.

      It is unfortunate that the study fails to engage prior literature that has extensively examined the impact of variance in offspring number, implying that some of the paradoxes presented might be resolved within existing frameworks.

      We do not believe that the paradoxes can be resolved.

      In addition, while the modified model yields intriguing theoretical predictions, the simulations and empirical analyses are incomplete to support the authors' claims. 

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors present a theoretical treatment of what they term the "Wright-Fisher-Haldane" model, a claimed modification of the standard model of genetic drift that accounts for variability in offspring number, and argue that it resolves a number of paradoxes in molecular evolution. Ultimately, I found this manuscript quite strange.

      The notion of effective population size as inversely related to the variance in offspring number is well known in the literature, and not exclusive to Haldane's branching process treatment. However, I found the authors' point about variance in offspring changing over the course of, e.g. exponential growth fairly interesting, and I'm not sure I'd seen that pointed out before.

      Nonetheless, I don't think the authors' modeling, simulations, or empirical data analysis are sufficient to justify their claims. 

      Weaknesses: 

      I have several outstanding issues. First of all, the authors really do not engage with the literature regarding different notions of an effective population. Most strikingly, the authors don't talk about Cannings models at all, which are a broad class of models with non-Poisson offspring distributions that nonetheless converge to the standard Wright-Fisher diffusion under many circumstances, and to "jumpy" diffusions/coalescents otherwise (see e.g. Möhle (1998), Sagitov (2003), Der et al (2011), etc.). Moreover, there is extensive literature on effective population sizes in populations whose sizes vary with time, such as Sano et al (2004) and Sjodin et al (2005).

      Of course in many cases here the discussion is under neutrality, but it seems like the authors really need to engage with this literature more. 

      The most interesting part of the manuscript, I think, is the discussion of the Density Dependent Haldane model (DDH). However, I feel like I did not fully understand some of the derivation presented in this section, which might be my own fault. For instance, I can't tell if Equation 5 is a result or an assumption - when I attempted a naive derivation of Equation 5, I obtained E(K_t) = 1 + r/c*(c-n)*dt. It's unclear where the parameter z comes from, for example. Similarly, is equation 6 a derivation or an assumption? Finally, I'm not 100% sure how to interpret equation 7. Is that a variance effective size at time t? Is it possible to obtain something like a coalescent Ne or an expected number of segregating sites or something from this?
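
      The reviewer does not show the steps of the naive derivation. One route that yields the quoted expression, assuming deterministic logistic growth with rate r and carrying capacity c and reading E(K_t) as the per-capita change in population size over a short interval dt, is sketched below; this is a reconstruction, not necessarily the reviewer's own argument.

```latex
\begin{align}
  \frac{dn}{dt} &= r\,n\left(1 - \frac{n}{c}\right), \\
  n(t + dt) &\approx n + r\,n\left(1 - \frac{n}{c}\right)dt, \\
  E(K_t) &= \frac{n(t + dt)}{n} \approx 1 + \frac{r}{c}\,(c - n)\,dt .
\end{align}
```

      Under this reading, E(K_t) exceeds 1 while n < c and falls to 1 at carrying capacity, with no additional parameter such as z appearing.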

      Similarly, I don't understand their simulations. I expected that the authors would do individual-based simulations under a stochastic model of logistic growth, and show that you naturally get variance in offspring number that changes over time. But it seems that they simply used their equations 5 and 6 to fix those values. Moreover, I don't understand how they enforce population regulation in their simulations---is N_t random and determined by the (independent) draws from K_t for each individual? In that case, there's no "interaction" between individuals (except abstractly, since logistic growth arises from a model that assumes interactions between individuals). This seems problematic for their model, which is essentially motivated by the fact that early during logistic growth, there are basically no interactions, and later there are, which increases variance in reproduction. But their simulations assume no interactions throughout! 

      The authors also attempt to show that changing variance in reproductive success occurs naturally during exponential growth using a yeast experiment. However, the authors are not counting the offspring of individual yeast during growth (which I'm sure is quite hard). Instead, they use an equation that estimates the variance in offspring number based on the observed population size, as shown in the section "Estimation of V(K) and E(K) in yeast cells". This is fairly clever; however, I am not sure it is right, because the authors neglect covariance in offspring between individuals. My attempt at this derivation assumes that I_t | I_{t-1} = \sum_{i=1}^{I_{t-1}} K_{i,t-1} where K_{i,t-1} is the number of offspring of individual i at time t-1. Then, for example, E(V(I_t | I_{t-1})) = E(V(\sum_{i=1}^{I_{t-1}} K_{i,t-1})) = E(I_{t-1})V(K_{t-1}) + E(I_{t-1}(I_{t-1}-1))*Cov(K_{i,t-1},K_{j,t-1}). The authors have the first term, but not the second, and I'm not sure the second can be neglected (in fact, I believe it's the second term that's actually important, as early on during growth there is very little covariance because resources aren't constrained, but at carrying capacity, an individual having offspring means that another individual has to have fewer offspring - this is the whole notion of exchangeability, also neglected in this manuscript). As such, I don't believe that their analysis of the empirical data supports their claim.
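
      The weight of the covariance term can be illustrated with a minimal sketch. It conditions on a fixed number n of parents (the expression above additionally averages over I_{t-1}), and it assumes, purely for illustration, that a population at carrying capacity is modeled by distributing a fixed total number of offspring uniformly among the parents (multinomial sampling). The function names are hypothetical.

```python
def variance_of_total(n, var_k, cov_k):
    """Var(sum of n exchangeable K_i) = n*V(K) + n*(n-1)*Cov(K_i, K_j)."""
    return n * var_k + n * (n - 1) * cov_k


def multinomial_moments(n, total):
    """V(K_i) and Cov(K_i, K_j) when `total` offspring are assigned
    uniformly at random among n parents (fixed population size)."""
    p = 1 / n
    return total * p * (1 - p), -total * p * p


# Constrained case: 50 parents, exactly 50 offspring in the next generation.
var_k, cov_k = multinomial_moments(50, 50)
constrained = variance_of_total(50, var_k, cov_k)
# Unconstrained case: same per-individual variance, no covariance.
unconstrained = variance_of_total(50, var_k, 0.0)
```

      In the constrained case the negative covariance cancels the first term and the total variance is zero (up to floating-point error), even though each individual's V(K) is close to 1; dropping the covariance term would instead give n·V(K). This is the situation the reviewer describes at carrying capacity.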

      Thus, while I think there are some interesting ideas in this manuscript, I believe it has some fundamental issues. First, it fails to engage thoroughly with the literature on a very important topic that has been studied extensively. Second, I do not believe their simulations are appropriate to show what they want to show. Finally, I don't think their empirical analysis shows what they want to show.

      References: 

      Möhle M. Robustness results for the coalescent. Journal of Applied Probability. 1998;35(2):438-447. doi:10.1239/jap/1032192859 

      Sagitov S. Convergence to the coalescent with simultaneous multiple mergers. Journal of Applied Probability. 2003;40(4):839-854. doi:10.1239/jap/1067436085 

      Der, Ricky, Charles L. Epstein, and Joshua B. Plotkin. "Generalized population models and the nature of genetic drift." Theoretical population biology 80.2 (2011): 80-99 

      Sano, Akinori, Akinobu Shimizu, and Masaru Iizuka. "Coalescent process with fluctuating population size and its effective size." Theoretical population biology 65.1 (2004): 39-48 

      Sjodin, P., et al. "On the meaning and existence of an effective population size." Genetics 169.2 (2005): 1061-1070 

      Reviewer #2 (Public Review): 

      Summary: 

      This theoretical paper examines genetic drift in scenarios deviating from the standard Wright-Fisher model. The authors discuss Haldane's branching process model, highlighting that the variance in reproductive success equates to genetic drift. By integrating the Wright-Fisher model with the Haldane model, the authors derive theoretical results that resolve paradoxes related to effective population size. 

      Strengths: 

      The most significant and compelling result from this paper is perhaps that the probability of fixing a new beneficial mutation is 2s/V(K). This is an intriguing and potentially generalizable discovery that could be applied to many different study systems. 
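
The 2s/V(K) approximation can be checked numerically in a simple branching process. The sketch below is not the authors' derivation. It assumes a single mutant lineage with mean offspring number 1 + s, uses Poisson and negative binomial offspring distributions as illustrative choices, and takes the survival probability of the lineage as a stand-in for fixation. Survival is computed as one minus the smallest fixed point of the offspring probability generating function; all function names are hypothetical.

```python
from math import exp


def survival_probability(pgf, iterations=200_000, tol=1e-15):
    """Return 1 - q, where q is the smallest root of q = pgf(q) in [0, 1]."""
    q = 0.0
    for _ in range(iterations):
        q_next = pgf(q)
        if abs(q_next - q) < tol:
            q = q_next
            break
        q = q_next
    return 1.0 - q


def poisson_pgf(mean):
    # Offspring variance equals the mean.
    return lambda z: exp(mean * (z - 1.0))


def negative_binomial_pgf(mean, size):
    # Offspring variance equals mean + mean**2 / size.
    return lambda z: (1.0 + (mean / size) * (1.0 - z)) ** (-size)


s = 0.02            # illustrative selective advantage (assumption)
mean = 1.0 + s
cases = {
    "Poisson": (poisson_pgf(mean), mean),
    "negative binomial, size 1": (negative_binomial_pgf(mean, 1.0), mean + mean**2),
}

results = {}
for name, (pgf, variance) in cases.items():
    exact = survival_probability(pgf)
    approx = 2.0 * s / variance
    results[name] = (exact, approx)
    print(f"{name}: V(K)={variance:.3f} exact={exact:.5f} 2s/V(K)={approx:.5f}")
```

Doubling the offspring variance roughly halves the survival probability in this sketch, which is the qualitative behaviour the 2s/V(K) expression describes for small s.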

      The authors also made a lot of effort to connect theory with various real-world examples, such as genetic diversity in sex chromosomes and reproductive variance across different species. 

      Weaknesses: 

      One way to define effective population size is by the inverse of the coalescent rate. This is where the harmonic mean of Ne comes from. If Ne is defined this way, many of the paradoxes mentioned seem to resolve naturally. If we take this approach, one could easily show that a large N population can still have a low coalescent rate depending on the reproduction model. However, the authors did not discuss Ne in light of the coalescent theory. This is surprising given that Eldon and Wakeley's 2006 paper is cited in the introduction, and the multiple mergers coalescent was introduced to explain the discrepancy between census size and effective population size, superspreaders, and reproduction variance. Yet there is no explicit discussion or introduction of the multiple mergers coalescent.

      The Wright-Fisher model is often treated as a special case of the Cannings 1974 model, which incorporates the variance in reproductive success. This model should be discussed. It is unclear to me whether the results here have to be explained by the newly introduced WFH model, or could have been explained by the existing Cannings model. 

      The abstract makes it difficult to discern the main focus of the paper. It spends most of the space introducing "paradoxes". 

      The standard Wright-Fisher model makes several assumptions, including hermaphroditism, non-overlapping generations, random mating, and no selection. It will be more helpful to clarify which assumptions are being violated in each tested scenario, as V(K) is often not the only assumption being violated. For example, the logistic growth model assumes no cell death at the exponential growth phase, so it also violates the assumption about non-overlapping generations. 

      The theory and data regarding sex chromosomes do not align. The fact that \hat{alpha'} can be negative does not make sense. The authors claim that a negative \hat{alpha'} is equivalent to infinity, but why is that? It is also unclear how theta is defined. It seems to me that one should take a first-principles approach, e.g., define theta as pairwise genetic diversity and start by deriving the expected pairwise coalescence time under the MMC model, rather than starting by assuming theta = 4Neu. Overall, the theory in this section is not well supported by the data, and the explanation is insufficient.

      {Alpha and alpha' can both be negative.  X^2 = 0.47 would yield x = -0.7}

      Reviewer #3 (Public Review): 

      Summary: 

      Ruan and colleagues consider a branching process model (in their terminology the "Haldane model") and the most basic Wright-Fisher model. They convincingly show that offspring distributions are usually non-Poissonian (as opposed to what's assumed in the Wright-Fisher model), and can depend on short-term ecological dynamics (e.g., variance in offspring number may be smaller during exponential growth). The authors discuss branching processes and the Wright-Fisher model in the context of 3 "paradoxes": (1) how Ne depends on N might depend on population dynamics; (2) how Ne is different on the X chromosome, the Y chromosome, and the autosomes, and these differences do not match the expectations based on simple counts of the number of chromosomes in the populations; (3) how genetic drift interacts with selection. The authors provide some theoretical explanations for the role of variance in the offspring distribution in each of these three paradoxes. They also perform some experiments to directly measure the variance in offspring number, as well as perform some analyses of published data.

      Strengths: 

      (1) The theoretical results are well-described and easy to follow. 

      (2) The analyses of different variances in offspring number (both experimentally and analyzing public data) are convincing that non-Poissonian offspring distributions are the norm. 

      (3) The point that this variance can change as the population size (or population dynamics) change is also very interesting and important to keep in mind. 

      (4) I enjoyed the Density-Dependent Haldane model. It was a nice example of the decoupling of census size and effective size. 

      Weaknesses: 

      (1) I am not convinced that these types of effects cannot just be absorbed into some time-varying Ne and still be well-modeled by the Wright-Fisher process. 

      (2) Along these lines, there is well-established literature showing that a broad class of processes (a large subset of Cannings' Exchangeable Models) converge to the Wright-Fisher diffusion, even those with non-Poissonian offspring distributions (e.g., Mohle and Sagitov 2001). E.g., equation (4) in Mohle and Sagitov 2001 shows that in such cases the "coalescent Ne" should be (N-1) / Var(K), essentially matching equation (3) in the present paper. 

      (3) Beyond this, I would imagine that branching processes with heavy-tailed offspring distributions could result in deviations that are not well captured by the authors' WFH model. In this case, the processes are known to converge (backward-in-time) to Lambda or Xi coalescents (e.g., Eldon and Wakeley 2006 or again in Mohle and Sagitov 2001 and subsequent papers), which have well-defined forward-in-time processes.

      (4) These results that Ne in the Wright-Fisher process might not be related to N in any straightforward (or even one-to-one) way are well-known (e.g., Neher and Hallatschek 2012; Spence, Kamm, and Song 2016; Matuszewski, Hildebrandt, Achaz, and Jensen 2018; Rice, Novembre, and Desai 2018; the work of Lounès Chikhi on how Ne can be affected by population structure; etc.).

      (5) I was also missing some discussion of the relationship between the branching process and the Wright-Fisher model (or more generally Cannings' Exchangeable Models) when conditioning on the total population size. In particular, if the offspring distribution is Poisson, then conditioned on the total population size, the branching process is identical to the Wright-Fisher model. 

      (6) In the discussion, it is claimed that the last glacial maximum could have caused the bottleneck observed in human populations currently residing outside of Africa. Compelling evidence has been amassed that this bottleneck is due to serial founder events associated with the out-of-Africa migration (see e.g., Henn, Cavalli-Sforza, and Feldman 2012 for an older review - subsequent work has only strengthened this view). For me, a more compelling example of changes in carrying capacity would be the advent of agriculture ~11kya and other more recent technological advances. 
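
Point (5) above can be verified exactly for a small population. The sketch below assumes n parents with independent Poisson offspring numbers and conditions on the total being n. It compares the resulting distribution of one parent's offspring number with the Wright-Fisher (binomial) marginal, then evaluates the (N-1) / Var(K) expression quoted in point (2). Function names and the chosen n and Poisson means are illustrative assumptions.

```python
from math import comb, exp, factorial


def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)


def conditioned_marginal(k, n, lam):
    """P(K_1 = k | K_1 + ... + K_n = n) for i.i.d. Poisson(lam) offspring numbers."""
    # The other n - 1 parents jointly contribute a Poisson((n - 1) * lam) count.
    joint = poisson_pmf(k, lam) * poisson_pmf(n - k, (n - 1) * lam)
    return joint / poisson_pmf(n, n * lam)


def wright_fisher_marginal(k, n):
    """Binomial(n, 1/n) offspring number of one parent in the Wright-Fisher model."""
    return comb(n, k) * (1 / n) ** k * (1 - 1 / n) ** (n - k)


def max_discrepancy(n, lam):
    return max(
        abs(conditioned_marginal(k, n, lam) - wright_fisher_marginal(k, n))
        for k in range(n + 1)
    )


n = 10
for lam in (0.5, 1.0, 2.0):
    print(f"Poisson mean {lam}: max discrepancy = {max_discrepancy(n, lam):.2e}")

# Variance in offspring number under the Wright-Fisher marginal, and the
# corresponding (N - 1) / Var(K) value.
wf = [wright_fisher_marginal(k, n) for k in range(n + 1)]
mean_k = sum(k * p for k, p in enumerate(wf))
var_k = sum((k - mean_k) ** 2 * p for k, p in enumerate(wf))
coalescent_ne = (n - 1) / var_k
print("Var(K) =", var_k, " (N - 1) / Var(K) =", coalescent_ne)
```

The Poisson mean cancels in the conditioning, which is why the match holds for every value tried, and the Wright-Fisher variance 1 - 1/N returns an effective size equal to N.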

      Recommendations for the authors: 

      Reviewing Editor Comments: 

      The reviewers recognize the value of this model and some of the findings, particularly results from the density-dependent Haldane model. However, they expressed considerable concerns with the model and overall framing of this manuscript.

      First, all reviewers pointed out that the manuscript does not sufficiently engage with the extensive literature on various models of effective population size and genetic drift, notably lacking discussion on Cannings models and related works.

      Second, there is a disproportionate discussion on the paradoxes, yet some of the paradoxes might already be resolved within current theoretical frameworks. All three reviewers found the modeling and simulation of the yeast growth experiment hard to follow or lacking justification for certain choices. The analysis approach of sex chromosomes is also questioned. 

      The reviewers recommend a more thorough review of relevant prior literature to better contextualize their findings. The authors need to clarify and/or modify their derivations and simulations of the yeast growth experiment to address the identified caveats and ensure robustness. Additionally, the empirical analysis of the sex chromosome should be revisited, considering alternative scenarios rather than relying solely on the MSE, which only provides a superficial solution. Furthermore, the manuscript's overall framing should be adjusted to emphasize the conclusions drawn from the WFH model, rather than focusing on the "unresolved paradoxes", as some of these may be more readily explained by existing frameworks. Please see the reviewers' overall assessment and specific comments. 

      Reviewer #2 (Recommendations For The Authors): 

      In the introduction -- "Genetic drift is simply V(K)" -- this is a very strong statement. You can say it is inversely proportional to V(K), but drift is often defined based on changes in allele frequency. 

      Page 3 line 86. "sexes is a sufficient explanation."--> "sex could be a sufficient explanation" 

      The strongest line of new results is about 2s/V(K). Perhaps, the paper could put more emphasis on this part and demonstrate the generality of this result with a different example. 

      The math notations in the supplement are not intuitive. e.g., using i_k and j_k as probabilities. I also recommend using E[X] and V[X] for expectation and variance rather than \italic{E(X)} to improve the readability of many equations.

      Eqs. A6 and A7: while I managed to follow them, P_{10}(t) and P_{10} are not defined anywhere in the text.

      Supplement page 7, the term "probability of fixation" is confusing in a branching model. 

      Eq. A28: it is unclear whether Eq. A.1 could be used here directly. Some justification would be nice.

      Supplement page 17. "the biological meaning of negative..". There is no clear justification for this claim. As a reader, I don't have any intuition as to why that is the case.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment:

      Franke et al. explore and characterize the color response properties in the mouse primary visual cortex, revealing specific color opponent encoding strategies across the visual field. The data is solid; however, the evidence supporting some conclusions is incomplete. In its current form, the paper makes a useful contribution to how color is coded in mouse V1. Significance would be enhanced with some additional analyses and a clearer discussion of the limitations of the data presented.

      We thank the reviewers for appreciating our manuscript. We have rewritten the conclusions of the paper to be more conservative and now more explicitly focus on color processing in mouse V1, rather than comparing V1 to the retina. Additionally, we discuss the limitations of our approach in detail in the Discussion section. Finally, we have addressed all comments from the reviewers below.

      Referee 1 (Remarks to the Author):

      In this study, Franke et al. explore and characterize color response properties across primary visual cortex, revealing specific color opponent encoding strategies across the visual field. The authors use awake 2P imaging to define the spectral response properties of visual interneurons in layer 2/3. They find that opponent responses are more pronounced at photopic light levels, and that diversity in color opponent responses exists across the visual field, with green ON/ UV OFF responses more strongly represented in the upper visual field. This is argued to be relevant for the detection of certain features that are more salient when using chromatic space, possibly due to noise reduction. In the revised version, Franke et al. have addressed the potential pitfalls in the discussion, which is an important point for the non-expert reader. Thus, this study provides a solid characterization of the color properties of V1 and is a valuable addition to visual neuroscience research.

      My remaining concerns are based more on the interpretation. I’m still not convinced by the statement "This type of color-opponency in the receptive field center of V1 neurons was not present in the receptive field center of retinal ganglion cells and, therefore, is likely computed by integrating center and surround information downstream of the retina." and I would suggest rewording it in the abstract.

      As discussed previously and now nicely added to the discussion, it is difficult to make a direct comparison given the different stimulus types used to characterize the retina and V1 recordings and the different levels of adaptation in both tissues. I will leave this point to the discussion, which allows for a more nuanced description of the phenomenon. Why do I think this is important? In the introduction, the authors argue that "the discrepancy [of previous studies] may be due to differences in stimulus design or light levels." However, while different light levels can be tested in V1, this cannot be done properly in the retina with 2P experiments. To address this, one would have to examine color-opponency in RGC terminals in vivo, which is beyond the scope of this study. Addressing these latter points directly in the discussion would, in my opinion, only strengthen the study.

      We thank the reviewer for the feedback. We removed the sentence mentioned by the reviewer from the abstract, as well as from the summary of our results in the Introduction. Additionally, we now phrase the interpretation of the retinal results more conservatively and specifically highlight in the Discussion that comparing ex-vivo retinal to in-vivo cortical data is challenging. With these changes, we believe that the focus of the paper is explicitly defined to be on the neuronal representation of color in mouse visual cortex, rather than on the comparison of retinal and cortical color processing.

      Minor:

      In the abstract, the second sentence says that we already know the mechanisms in primates.

      Unfortunately, I do not think this is true. First, primates refers to an order with several species, which might have adaptations to their color-processing. Second, I’m aware of several characterizations in "primates" that have led to convincing models (as referenced), but in my opinion, this is far from a true understanding of the mechanisms, especially since very little is known about foveal color processing due to the difficulties of these experiments. Similarly in the introduction. "Primates" is indirectly defined as a species. Perhaps some rewording is needed here as well, since we know how different cone distributions can be in rodents (see Peichl’s work).

      Thanks. We have reworded the Abstract and Introduction towards indicating that many studies have been performed in primate species, without suggesting that the mechanisms are described.

      The legend in Fig. 2 has a "Fig. ???"

      Fixed.

      Referee 2 (Remarks to the Author):

      Franke et al. characterize the representation of color in the primary visual cortex of mice, highlighting how this changes across the visual field. Using calcium imaging in awake, head-fixed mice, they characterize the properties of V1 neurons (layer 2/3) using a large center-surround stimulation where green and ultra-violet colors were presented in random combinations. Clustering of responses revealed a set of functional cell-types based on their preference to different combinations of green and UV in their center and surround. These functional types were demonstrated to have different spatial distributions across V1, including one neuronal type (Green-ON/UV-OFF) that was much more prominent in the posterior V1 (i.e. upper visual field). Modelling work suggests that these neurons likely support the detection of predator-like objects in the sky.

      Strengths: The large-scale single-cell resolution imaging used in this work allows the authors to map the responses of individual neurons across large regions of the visual cortex. Combining this large dataset with clustering analysis enabled the authors to group V1 neurons into distinct functional cell types and demonstrate their relative distribution in the upper and lower visual fields. Modelling work demonstrated the different capacity of each functional type to detect objects in the sky, providing insight into the ethological relevance of color opponent neurons in V1.

      We thank the reviewer for appreciating our study.

      Weaknesses: While the study presents convincing evidence about the asymmetric distribution of color-opponent neurons in V1, the paper would greatly benefit from a more in-depth discussion of the caveats related to the conclusions drawn about their origin. This is particularly relevant regarding the conclusion drawn about the contribution of color opponent neurons in the retina. The mismatch between retinal color opponency and V1 color opponency could imply that this feature is not solely inherited from the retina, however, there are other plausible explanations that are not discussed here. Direct evidence for this statement remains weak.

      Thanks for this comment. We removed the retinal findings from the abstract, as well as from the summary of our results in the Introduction. In addition, we now phrase the interpretation of the retinal results more conservatively and specifically highlight in the Discussion that comparing ex-vivo retinal to in-vivo cortical data is challenging. With these changes, we believe that the focus of the paper is explicitly defined to be on the neuronal representation of color in mouse visual cortex, rather than on the comparison of retinal and cortical color processing.

      In addition, the paper would benefit from adding explicit neuron counts or percentages to the quadrants of each of the density plots in Figures 2-5. The variance explained by the principal components does not capture the percentage of color opponent cells. Additionally, there appear to be some remaining errors in the figure legend and labels that have not been addressed (e.g. ’??’ in Fig 2 legend).

      Thank you for this suggestion. We believe that adding the numbers or percentages to the figure panels would make them too crowded. Instead, we have now mentioned in the Results section and the legends that the percentages of variance explained by the color (off-diagonal) and luminance axis (diagonal) correlate with the number of neurons located in the color (top left and bottom right) and luminance contrast quadrants (top right and bottom left), respectively. Together with the number of neurons in each plot stated in the legends and the scale bar indicating the number of neurons per gray level, we hope this approach provides clarity for the reader to interpret the panels. Additionally, we have fixed the broken reference in the legend of Fig. 2.

      Overall, this study will be a valuable resource for researchers studying color vision, cortical processing, and the processing of ethologically relevant information. It provides a useful basis for future work on the origin of color opponency in V1 and its ethological relevance.

      General Suggestions:

      -  Please add possible caveats of using the ETA method to the discussion section. For example, it is unclear to what extent ON/OFF cells are being overlooked by using the ETA method.

      We now discuss the limitations of the ETA approach in the Discussion section.

      - The caveats of using the percentage of variance explained in the retina as evidence against V1 solely inheriting color-opponency from retinal output neurons are not adequately addressed. For example, could the mismatch in explained variance of the color axis between V1 and RGCs be explained by a subset of non-color opponent RGCs projecting elsewhere (not dLGN-V1) or that color opponent cells project to a larger number of neurons in V1 than non-color opponent cells? We suggest adding a paragraph to the discussion to address this issue.

      We have removed these conclusions from the paper, more carefully interpret the retinal results and mention that comparing ex-vivo retina data with in-vivo cortical data is challenging.

      - Please clarify how the different response types shown in Figure 5e-f lead to differences in noise detection and thereby differences in predator discriminability. For example, why does Gon/UVoff not respond to the noise scene while Goff/UVoff does?

      We added this to the Results section.

      - Please clarify the relationship between ETA amplitude, neural response probability, and neural response amplitude. For example, do color-opponent cells have equal absolute neural response amplitudes to the different colors?

      Thank you for bringing up this point. The ETA is obtained by summing the stimulus sequences that elicit an event (i.e., response), weighted by the amplitude of the response. Consequently, the absolute amplitude of the ETA correlates with the calcium amplitude. Importantly, the ETA amplitudes of different stimulus conditions are comparable because they were estimated on the same normalized calcium trace. Therefore, comparing the absolute amplitudes of ETAs of color-opponent neurons reveals the response magnitude of the cells to different colors. We have now included this information in the Results section.
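
The amplitude-weighted summation described above can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function name `event_triggered_average`, the representation of events as (frame index, amplitude) pairs, and the fixed look-back `window` are all assumptions, and any normalization the authors apply is not reproduced.

```python
def event_triggered_average(stimulus, events, window):
    """Amplitude-weighted sum of the stimulus history preceding each event.

    stimulus: list of stimulus values, one per frame
    events:   list of (frame_index, amplitude) pairs detected in the response
    window:   number of frames of stimulus history kept before each event
    """
    eta = [0.0] * window
    for frame, amplitude in events:
        if frame < window:
            continue  # not enough stimulus history before this event
        segment = stimulus[frame - window:frame]
        for lag, value in enumerate(segment):
            # Larger events contribute proportionally more to the ETA.
            eta[lag] += amplitude * value
    return eta


# Toy example: a binary (+1 / -1) stimulus and two events of different size.
stimulus = [1, -1, 1, 1, -1, 1]
events = [(3, 2.0), (5, 1.0)]
eta = event_triggered_average(stimulus, events, window=2)
print(eta)
```

Because each stimulus segment is scaled by its event amplitude, the absolute size of the resulting ETA grows with response amplitude, which is what allows amplitudes to be compared across stimulus conditions estimated from the same normalized trace.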

      Abstract: - "more than a third of neurons in mouse V1 are color-opponent in their receptive field center". It is unclear what data supports this statement. Can you please provide a statement in the manuscript that supports this directly using the number of neurons?

      We added the following sentence to the Results section: Nevertheless, a substantial fraction of neurons (33.1%) preferred color-opponent stimuli and scattered along the off-diagonal in the upper left and lower right quadrants, especially for the RF center.

      Figure 2: - There is a ?? in the figure legend. Which figure should this refer to? - please provide explicit neuron counts/percentages for each quadrant in b.

      We fixed the figure reference. We believe that adding the numbers or percentages to the figure panels would make them too crowded. Instead, we have now mentioned in the Results section and the legends that the percentages of variance explained by the color (off-diagonal) and luminance axis (diagonal) correlate with the number of neurons located in the color (top left and bottom right) and luminance contrast quadrants (top right and bottom left), respectively. Together with the number of neurons in each plot stated in the legends and the scale bar indicating the number of neurons per gray level, we hope this approach provides clarity for the reader to interpret the panels.

      Figure 3: - Fig 3: Color scheme makes it very difficult to differentiate the different conditions, especially when printed.

      Thanks, we changed the color scheme.

      - Add explicit neuron counts/percentages for each quadrant in b.

      See above.

      Figure 4: - Add explicit neuron counts/percentages for each quadrant in b.

      See above.

      Figure 5: - Add explicit neuron counts/percentages for each quadrant in c.

      See above.

      Methods: - "we modeled each response type to have a square RF with 10 degrees visual angle in diameter". There appears to be a mismatch between this statement and Figure 5e where 18 degrees is reported.

      Thanks, we fixed that.

      Referee 3 (Remarks to the Author):

      This paper studies chromatic coding in mouse primary visual cortex. Calcium responses of a large collection of cells are measured in response to a simple spot stimulus. These responses are used to estimate chromatic tuning properties - specifically sensitivity to UV and green stimuli presented in a large central spot or a larger still surrounding region. Cells are divided based on their responses to these stimuli into luminance or chromatic sensitive groups. The results are interesting and many aspects of the experiments and conclusions are well done; several technical concerns, however, limit the support for several main conclusions.

      Limitations of stimulus choice: The paper relies on responses to a large (37.5 degree diameter) modulated spot and surround region. This spot is considerably larger than the receptive fields of both V1 cells and retinal ganglion cells (it is twice the area of the average V1 receptive field). As a result, the spot itself is very likely to strongly activate both center and surround mechanisms, and responses of cells are likely to depend on where the receptive fields are located within the spot (and, e.g., how much of the true neural surround samples the center spot vs the surround region). Most importantly, the surrounds of most of the recorded cells will be strongly activated by the central spot. This brings into question statements in the paper about selective activation of center and surround (e.g. page 2, right column). This in turn raises questions about several subsequent analyses that rely on selective center and surround activation.

      Thank you for this comment. A similar point was raised by a reviewer in the first round of revision. We agree with the reviewers that it is critical to discuss both the rationale behind our stimulus design and its limitations to facilitate better interpretation by the reader.

      To be able to record from many V1 neurons simultaneously, we used a stimulus size of 37.5 degree visual angle in diameter, which is slightly larger than center RFs of single V1 neurons (between 20 - 30 degrees visual angle depending on the stimulus, see here). The disadvantage of this approach is that the stimulus is only roughly centered on the neurons’ center RFs. To reduce the impact of potential stimulus misalignment on our results, we used the following steps:

      - For each recording, we positioned the monitor such that the mean RF across all neurons lies within the center of the stimulus field of view.

      - We confirmed that this procedure results in good stimulus alignment for the large majority of recorded neurons within individual recording fields by using a sparse noise stimulus (Suppl. Fig. 1a-c). Specifically, we found that for 83% of tested neurons, more than two thirds of their center RF, determined by the sparse noise stimulus, overlapped with the center spot of the color noise stimulus.

      - For analysis, we excluded neurons without a significant center STA, which may be caused by misalignment of the stimulus.

      Together, we believe these points strongly suggest that the center spot and the surround annulus of the noise stimulus predominantly drive center (i.e. classical RF) and surround (i.e. extraclassical RF), respectively, of the recorded V1 neurons. This is further supported by the fact that color response types identified using an automated clustering method were robust across mice (Suppl. Fig. 6c), indicating consistent stimulus centering.

      Nevertheless, we cannot exclude the possibility that the stimulus was misaligned for a subset of the recorded neurons used in our analysis. We agree with the reviewer that such misalignment might have caused the center stimulus to partially activate the surround. To further address this issue beyond the controls we have already implemented, one could compare the results of our approach with an approach that centers the stimulus on individual neurons. However, we believe that performing these additional experiments is beyond the scope of the current study.

      To acknowledge the experimental limitations of our study and the concerns brought up by the reviewer, we have added the steps we perform to reduce the effects of stimulus misalignment in the Results section and discuss the problem of stimulus alignment in the Discussion in a separate section. With this, we believe our manuscript explains both the rationale behind our stimulus design as well as important limitations of the approach.

      Comparison with retina: A key conclusion of the paper is that the chromatic tuning in V1 is not inherited from retinal ganglion cells. This conclusion comes from comparing chromatic tuning in a previously-collected data set from retina with the present results. But the retina recordings were made using a considerably smaller spot, and hence it is not clear that the comparison made in the paper is accurate. For example, the stimulus used for the V1 experiments almost certainly strongly stimulates both center and surround of retinal ganglion cells. The text focuses on color opponency in the receptive field centers of retinal ganglion cells, but center-surround opponency seems at least as relevant for such large spots. This issue needs to be described more clearly and earlier in the paper.

      Thanks for this comment. We removed the retinal findings from the abstract, as well as from the summary of our results in the Introduction. In addition, we now phrase the interpretation of the retinal results more conservatively and specifically highlight in the Discussion that comparing ex-vivo retinal to in-vivo cortical data is challenging. With these changes, we believe that the focus of the paper is explicitly defined to be on the neuronal representation of color in mouse visual cortex, rather than on the comparison of retinal and cortical color processing.

      Limitations associated with ETA analysis: One of the reviewers in the previous round of reviews raised the concern that the ETA analysis may not accurately capture responses of cells with nonlinear receptive field properties such as On/Off cells. This possibility and whether it is a concern should be discussed.

      Thanks for this comment. We now discuss the limitation of using an ETA analysis in the Discussion section.

      Discrimination performance poor: Discriminability of color or luminance is used as a measure of population coding. The discrimination performance appears to be quite poor - with 500-1000 neurons needed to reliably distinguish light from dark or green from UV. Intuitively I would expect that a single cell would provide such discrimination. Is this intuition wrong? If not, how do we interpret the discrimination analyses?

      Thank you for raising this point. The plots in Fig. 2c (and Figs. 3-5) show discriminability in bits, with the discrimination accuracy in % highlighted by the dotted horizontal lines. For 500 neurons, the discriminability is approx. 0.8 bits, corresponding to 95% accuracy. Even for 50 neurons, the accuracy is significantly above chance level. We now mention in the legends that the dotted lines indicate decoding accuracy in %.

    1. Author response:

      The following is the authors’ response to the current reviews.

      (1) Though we cannot survey all mutants, our observation that 774 genetically diverse adaptive mutants converge at the level of phenotype is important. It adds to growing evidence (see PMID33263280, PMID37437111, PMID22282810, PMID25806684) that the phenotypic basis of adaptation is not as diverse as the genetic basis. This convergence could make evolution more predictable.

      (2) Previous fitness competitions using this specific barcode system have been run for more than 25 generations (PMID33263280, PMID27594428, PMID37861305). We measure fitness per cycle, rather than per generation, so our fitness advantages are comparable to those in the aforementioned studies, including Venkataram and Dunn et al. (PMID27594428).

      (3) Our results remain the same upon removing the ~150 lineages with the noisiest fitness inferences, including those the reviewer mentions (see Figure S7).

      (4) We agree that there are likely more than the 6 clusters that we validated with follow-up studies (see Discussion). The important point is that we see a great deal of convergence in the behavior of diverse adaptive mutants.

      (5) The growth curves requested by the reviewer were included in our original manuscript; several more were added in the revision (see Figures 5D, 5E, 7D, S11B, S11C).


      The following is the authors’ response to the original reviews.

      Public Reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      In their manuscript, Schmidlin, Apodaca, et al try to answer fundamental questions about the evolution of new phenotypes and the trade-offs associated with this process. As a model, they use yeast resistance to two drugs, fluconazole and radicicol. They use barcoded libraries of isogenic yeasts to evolve thousands of strains in 12 different environments. They then measure the fitness of evolved strains in all environments and use these measurements to examine patterns in fitness trade-offs. They identify only six major clusters corresponding to different trade-off profiles, suggesting the vast genotypic landscape of evolved mutants translates to a highly constrained phenotypic space. They sequence over a hundred evolved strains and find that mutations in the same gene can result in different phenotypic profiles.  

      Overall, the authors deploy innovative methods to scale up experimental evolution experiments, and in many aspects of their approach tried to minimize experimental variation. 

      We thank the reviewer for this positive assessment of our work. We are happy that the reviewer noted what we feel is a unique strength of our approach: we scaled up experimental evolution by using DNA barcodes and by exploring 12 related selection pressures. Despite this scaling up, we still see phenotypic convergence among the 774 adaptive mutants we study.

      Weaknesses: 

      (1) One of the objectives of the authors is to characterize the extent of phenotypic diversity in terms of resistance trade-offs between fluconazole and radicicol. To minimize noise in the measurement of relative fitness, the authors only included strains with at least 500 barcode counts across all time points in all 12 experimental conditions, resulting in a set of 774 lineages passing this threshold. This corresponds to a very small fraction of the starting set of ~21 000 lineages that were combined after experimental evolution for fitness measurements. 

      This is a misunderstanding that we clarified in this revision. Our starting set did not include 21,000 adaptive lineages. The total number of unique adaptive lineages in this starting set is much lower than 21,000 for two reasons. 

      First, ~21,000 represents the number of single colonies we isolated in total from our evolution experiments. Many of these isolates possess the same barcode, meaning they are duplicates. Second, and perhaps more importantly, most evolved lineages do not acquire adaptive mutations, meaning that many of the 21,000 isolates are genetically identical to their ancestor. In our revised manuscript, we explicitly stated that these 21,000 isolated lineages do not all represent unique, adaptive lineages. We changed the word “lineages” to “isolates” where relevant in Figure 2 and the accompanying legend. And we have added the following sentence to the figure 2 legend (line 212), “These ~21,000 isolates do not represent as many unique, adaptive lineages because many either have the same barcode or do not possess adaptive mutations.”

      More broadly speaking, several previous studies have demonstrated that diverse genetic mutations converge at the level of phenotype and have suggested that this convergence makes adaptation more predictable (PMID33263280, PMID37437111, PMID22282810, PMID25806684). Most of these studies survey fewer than 774 mutants. Further, our study captures mutants that are overlooked in previous studies, such as those that emerge across subtly different selection pressures (e.g., 4 𝜇g/ml vs. 8 𝜇g/ml flu) and those that are undetectable in evolutions lacking DNA barcodes. Thus, while our experimental design misses some mutants (see next comment), it captures many others. We therefore feel that “our work – showing that 774 mutants fall into a much smaller number of groups” is important because it “contributes to growing literature suggesting that the phenotypic basis of adaptation is not as diverse as the genetic basis” (lines 176 - 178).

      As the authors briefly remark, this will bias their datasets for lineages with high fitness in all 12 environments, as all these strains must be fit enough to maintain a high abundance. 

      We now devote 19 lines of text to discussing this bias (on lines 160 - 162, 278-284, and in more detail on 758 - 767).

      We walk through an example of a class of mutants that our study misses. On lines 759 - 763, we say, “our study is underpowered to detect adaptive lineages that have low fitness in any of the 12 environments. This is bound to exclude large numbers of adaptive mutants. For example, previous work has shown some FLU resistant mutants have strong tradeoffs in RAD (Cowen and Lindquist 2005). Perhaps we are unable to detect these mutants because their barcodes are at too low a frequency in RAD environments, thus they are excluded from our collection of 774.”

      In our revised version, we added more text earlier in the manuscript that explicitly discusses this bias. Lines 278 – 283 now read, “The 774 lineages we focus on are biased towards those that are reproducibly adaptive in multiple environments we study. This is because lineages that have low fitness in a particular environment are rarely observed >500 times in that environment (Figure S4). By requiring lineages to have high-coverage fitness measurements in all 12 conditions, we may be excluding adaptive mutants that have severe tradeoffs in one or more environments, consequently blinding ourselves to mutants that act via unique underlying mechanisms.”
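
      The inclusion rule discussed here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not our analysis pipeline: the function names `passes_coverage` and `filter_lineages`, the barcode labels, and the read counts are all hypothetical, and only the 500-read cutoff comes from the text. The toy input uses three environments rather than twelve.

```python
# Hypothetical sketch of the coverage filter; names and counts are invented for illustration.
MIN_READS = 500  # coverage threshold quoted in the text


def passes_coverage(counts_by_env, min_reads=MIN_READS):
    """Return True if a lineage has at least min_reads in every environment.

    counts_by_env maps an environment name to that lineage's total barcode
    reads in the environment (summed over time points).
    """
    return all(reads >= min_reads for reads in counts_by_env.values())


def filter_lineages(lineages, min_reads=MIN_READS):
    """Keep only lineages that clear the threshold in all environments."""
    return {
        barcode: envs
        for barcode, envs in lineages.items()
        if passes_coverage(envs, min_reads)
    }


# Toy input: one lineage that is abundant everywhere, and one that is
# abundant in two environments but nearly absent from a third.
toy_lineages = {
    "bc_generalist": {"no_drug": 4200, "low_flu": 3100, "high_rad": 2500},
    "bc_tradeoff": {"no_drug": 5000, "low_flu": 6100, "high_rad": 40},
}

kept = filter_lineages(toy_lineages)
print(sorted(kept))
```

      In this toy case the second lineage is dropped solely because of its low count in one environment, which is the kind of strong-tradeoff mutant the filter is expected to miss.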

      Note that while we “miss” some classes of mutants, we “catch” other classes that may have been missed in previous studies of convergence. For example, we observe a unique class of FLU-resistant mutants that primarily emerged in evolution experiments that lack FLU (Figure 3). Thus, we think that the unique design of our study, surveying 12 environments, allows us to make a novel contribution to the study of phenotypic convergence.

      One of the main observations of the authors is phenotypic space is constrained to a few clusters of roughly similar relative fitness patterns, giving hope that such clusters could be enumerated and considered to design antimicrobial treatment strategies. However, by excluding all lineages that fit in only one or a few environments, they conceal much of the diversity that might exist in terms of trade-offs and set up an inclusion threshold that might present only a small fraction of phenotypic space with characteristics consistent with generalist resistance mechanisms or broadly increased fitness. This has important implications regarding the general conclusions of the authors regarding the evolution of trade-offs. 

      We agree and discussed exactly the reviewer’s point about our inclusion threshold in the 19 lines of text mentioned previously (lines 160 - 162, 278-284, and 758 - 767). To add to this discussion, and avoid the misunderstanding the reviewer mentions, we added the following strongly-worded sentence to the end of the paragraph on lines 749 – 767 in our revised manuscript: “This could complicate (or even make impossible) endeavors to design antimicrobial treatment strategies that thwart resistance”. 

      More generally speaking, we set up our study around Figure 1, which depicts a treatment strategy that works best if there exists but a single type of adaptive mutant. Despite our inclusion threshold, we find there are at least 6 types of mutants. This diminishes hopes of designing simple multidrug strategies like Figure 1. Our goal is to present a tempered and nuanced discussion of whether and how to move forward with designing multidrug strategies, given our observations. On one hand, we point out how the phenotypic convergence we observe is promising. But on the other hand, we also point out how there may be less convergence than meets the eye for various reasons including the inclusion threshold the reviewer mentions (lines 749 - 767).

      We have made several minor edits to the text with the goal of providing a more balanced discussion of both sides. For example, we added the words, “may yet” to the following sentences on lines 32 – 36 of the abstract: “These findings, on one hand, demonstrate the difficulty in relying on consistent or intuitive tradeoffs when designing multidrug treatments. On the other hand, by demonstrating that hundreds of adaptive mutations can be reduced to a few groups with characteristic tradeoffs, our findings may yet empower multidrug strategies that leverage tradeoffs to combat resistance.”

      (2) Most large-scale pooled competition assays using barcodes are usually stopped after ~25 generations to avoid noise due to the emergence of secondary mutations.

      The rate at which new mutations enter a population is driven by various factors such as the mutation rate and population size, so choosing an arbitrary threshold like 25 generations is difficult. 

      We conducted our fitness competition following previous work using the Levy/Blundell yeast barcode system, in which the number of generations reported varies from 32 to 40 (PMID33263280, PMID27594428, PMID37861305, see PMID27594428 for detailed calculation of the fraction of lineages biased by secondary mutations in this system). 

      The authors measure fitness across ~40 generations, which is almost the same number of generations as in the evolution experiment. This raises the possibility of secondary mutations biasing abundance values, which would not have been detected by the whole genome sequencing as it was performed before the competition assay. 

      Previous work has demonstrated that in this evolution platform, most mutations occur during the transformation that introduces the DNA barcodes (Levy et al. 2015). In other words, these mutations are already present and do not accumulate during the 40 generations of evolution. Therefore, the observation that we collect a genetically diverse pool of adaptive mutants after 40 generations of evolution is not evidence that 40 generations is enough time for secondary mutations to bias abundance values.

      We have added the following sentence to the main text to highlight this issue (lines 247 - 249): “This happens because the barcoding process is slightly mutagenic, thus there is less need to wait for DNA replication errors to introduce mutations (Levy et al. 2015; Venkataram et al. 2016).”

      We also elaborate on this in the Methods section entitled, “Performing barcoded fitness competition experiments,” where we added a full paragraph to clarify this issue (lines 972 - 980).

      (3) The approach used by the authors to identify and visualize clusters of phenotypes among lineages does not seem to consider the uncertainty in the measurement of their relative fitness. As can be seen from Figure S4, the inter-replicate difference in measured fitness can often be quite large. From these graphs, it is also possible to see that some of the fitness measurements do not correlate linearly (ex.: Med Flu, Hi Rad Low Flu), meaning that taking the average of both replicates might not be the best approach. Because the clustering approach used does not seem to take this variability into account, it becomes difficult to evaluate the strength of the clustering, especially because the UMAP projection does not include any representation of uncertainty around the position of lineages. This might paint a misleading picture where clusters appear well separated and well defined but are in fact much fuzzier, which would impact the conclusion that the phenotypic space is constricted.

      Our noisiest fitness measurements correspond to barcodes that are the least abundant and thus suffer the most from stochastic sampling noise. These are also the barcodes that introduce the nonlinearity the reviewer mentions. We removed these from our dataset by increasing our coverage threshold from 500 reads to 5,000 reads. The clusters did not collapse, which suggests that they were not capturing this noise (Figure S7B).

      More importantly, we devoted 4 figures and 200 lines of text to demonstrating that the clusters we identified capture biologically meaningful differences between mutants (and not noise). We have modified the main text to point readers to figures 5 through 8 earlier, such that it is more apparent that the clustering analysis is just the first piece of our data demonstrating convergence at the level of phenotype.

      (4) The authors make the decision to use UMAP and a gaussian mixed model to cluster and represent the different fitness landscapes of their lineages of interest. Their approach has many caveats. First, compared to PCA, the axis does not provide any information about the actual dissimilarities between clusters. Using PCA would have allowed a better understanding of the amount of variance explained by components that separate clusters, as well as more interpretable components. 

      The components derived from PCA are often not interpretable. It’s not obvious that each one, or even the first one, will represent an intuitive phenotype, like resistance to fluconazole. Moreover, we see many nonlinearities in our data. For example, fitness in a double drug environment is not predicted by adding up fitness in the relevant single drug environments. Also, there are mutants that have high fitness when fluconazole is absent or abundant, but low fitness when mild concentrations are present. These nonlinearities can make the axes in PCA very difficult to interpret and can be missed by PCA altogether, so we prefer other methods.

      Still, we agree that confirming our clusters are robust to different clustering methods is helpful. We have included PCA in the revised manuscript, plotting PC1 vs PC2 as Figure S9 with points colored according to the cluster assignment in figure 4 (i.e. using a gaussian mixture model). It appears the clusters are largely preserved.

      Second, the advantages of dimensional reduction are not clear. In the competition experiment, 11/12 conditions (all but the no drug, no DMSO conditions) can be mapped to only three dimensions: concentration of fluconazole, concentration of radicicol, and relative fitness. Each lineage would have its own fitness landscape as defined by the plane formed by relative fitness values in this space, which can then be examined and compared between lineages. 

      We worry that the idea stems from a priori notions of what the important dimensions should be. The biology of our system is unfortunately not intuitive. For example, it seems like this idea would miss important nonlinearities such as our observation that low fluconazole behaves more like a novel selection pressure than a dialed down version of high fluconazole.

      Third, the choice of 7 clusters as the cutoff for the multiple Gaussian model is not well explained. Based on Figure S6A, BIC starts leveling off at 6 clusters, not 7, and going to 8 clusters would provide the same reduction as going from 6 to 7. This choice also appears arbitrary in Figure S6B, where BIC levels off at 9 clusters when only highly abundant lineages are considered. 

      We agree. We did not rely on the results of BIC alone to make final decisions about how many clusters to include. Another factor we considered was the set of follow-up genotyping and phenotyping studies that confirm biologically meaningful differences between the mutants in each cluster (Figures 5 – 8). We now state this explicitly. Here is the modified paragraph where we describe how we chose a model with 7 clusters, from lines 436 – 446 of the revised manuscript:

      “Beyond the obvious divide between the top and bottom clusters of mutants on the UMAP, we used a gaussian mixture model (GMM) (Fraley and Raftery, 2003) to identify clusters. A common problem in this type of analysis is the risk of dividing the data into clusters based on variation that represents measurement noise rather than reproducible differences between mutants (Mirkin, 2011; Zhao et al., 2008). One way we avoided this was by using a GMM quality control metric (BIC score) to establish how splitting out additional clusters affected model performance (Figure S6). Another factor we considered were follow-up genotyping and phenotyping studies that demonstrate biologically meaningful differences between mutants in different clusters (Figures 5 – 8). Using this information, we identified seven clusters of distinct mutants, including one pertaining to the control strains, and six others pertaining to presumed different classes of adaptive mutant (Figure 4D). It is possible that there exist additional clusters, beyond those we are able to tease apart in this study.”
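
      To make the BIC trade-off concrete, the sketch below shows how the penalty term grows when one more cluster is split out of a Gaussian mixture. It is an illustration under stated assumptions, not our analysis code: the function names are hypothetical, full covariance matrices are assumed, and the two-dimensional feature space in the usage lines is an assumption chosen for simplicity. The sign convention here treats lower BIC as better; some packages report the negated quantity, where higher is better.

```python
import math


def gmm_free_parameters(n_components, n_dims):
    """Count free parameters of a Gaussian mixture with full covariances."""
    weights = n_components - 1  # mixing weights sum to 1
    means = n_components * n_dims
    covariances = n_components * n_dims * (n_dims + 1) // 2
    return weights + means + covariances


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion (lower is better in this convention)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def required_loglik_gain(k_from, k_to, n_dims, n_samples):
    """Log-likelihood gain needed for k_to clusters to match the BIC of k_from."""
    extra = gmm_free_parameters(k_to, n_dims) - gmm_free_parameters(k_from, n_dims)
    return 0.5 * extra * math.log(n_samples)


# Usage: going from 6 to 7 clusters with 774 lineages in an assumed
# two-dimensional feature space.
n_lineages = 774
gain_needed = required_loglik_gain(6, 7, n_dims=2, n_samples=n_lineages)
print(f"extra parameters: {gmm_free_parameters(7, 2) - gmm_free_parameters(6, 2)}")
print(f"log-likelihood gain needed: {gain_needed:.2f}")
```

      A flat BIC curve means each added cluster improves the fit by roughly the amount the penalty demands, which is why BIC alone did not settle the choice between neighbouring cluster counts.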

      This directly contradicts the statement in the main text that clusters are robust to noise, as a more stringent inclusion threshold appears to increase and not decrease the optimal number of clusters. Additional criteria to BIC could have been used to help choose the optimal number of clusters or even if mixed Gaussian modeling is appropriate for this dataset.

      We are under the following impression: If our clustering method was overfitting, i.e. capturing noise, the optimal number of clusters should decrease when we eliminate noise. It increased. In other words, the observation that our clusters did not collapse (i.e. merge) when we removed noise suggests these clusters were not capturing noise.

      Most importantly, our validation experiments, described below, provide additional evidence that our clusters capture meaningful differences between mutants (and not noise).  

      (5) Large-scale barcode sequencing assays can often be noisy and are generally validated using growth curves or competition assays. 

      Some types of bar-seq methods, in particular those that look at fold change across two time points, are noisier than others that look at how frequency changes across multiple timepoints (PMID30391162). Here, we use the less noisy method. We also reduce noise by using a stricter coverage threshold than previous work (e.g., PMID33263280), and by excluding batch effects by performing all experiments simultaneously, since we found this to be effective in our previous work (PMID37237236). 
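
      The difference between the two kinds of estimate can be sketched as follows. This is a simplified illustration, not the fitness inference used in the cited studies: the function names and the toy read counts are hypothetical, and fitness is approximated here as the slope of the log ratio of a lineage to a reference over time.

```python
import math


def log_ratio_slope(times, mutant_counts, reference_counts):
    """Least-squares slope of ln(mutant / reference) against time.

    Uses every time point, so noise in any single sample is averaged out.
    """
    ys = [math.log(m / r) for m, r in zip(mutant_counts, reference_counts)]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    numerator = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    denominator = sum((t - t_mean) ** 2 for t in times)
    return numerator / denominator


def two_point_estimate(times, mutant_counts, reference_counts):
    """Fold-change estimate that uses only the first and last time points."""
    first = math.log(mutant_counts[0] / reference_counts[0])
    last = math.log(mutant_counts[-1] / reference_counts[-1])
    return (last - first) / (times[-1] - times[0])


# Toy counts (invented): the last sample is an outlier, which shifts the
# two-point estimate more than the multi-point slope.
times = [0, 1, 2, 3, 4]
reference = [1000, 1000, 1000, 1000, 1000]
mutant = [100, 135, 182, 246, 250]

print("multi-point slope:", log_ratio_slope(times, mutant, reference))
print("two-point estimate:", two_point_estimate(times, mutant, reference))
```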

      Perhaps also relevant is that the main assay we use to measure fitness has been previously validated (PMID27594428) and no subsequent study using this assay validates using the methods suggested above (see PMID37861305, PMID33263280, PMID31611676, PMID29429618, PMID37192196, PMID34465770, PMID33493203). Similarly, bar-seq has been used, without the suggested validation, to demonstrate that the way some mutant’s fitness changes across environments is different from other mutants (PMID33263280, PMID37861305, PMID31611676, PMID33493203, PMID34596043). This is the same thing that we use bar-seq to demonstrate. 

      For all of the reasons above, we are hesitant to perform additional experiments solely to confirm that bar-seq itself is a valid way to infer fitness. It seems this is already accepted as a standard in our field. However, please see below.

      Having these types of results would help support the accuracy of the main assay in the manuscript and thus better support the claims of the authors. 

      While we don’t agree that fitness measurements obtained from this bar-seq assay generally require validation, we do agree that it is important to validate whether the mutants in each of our 6 clusters indeed are different from one another in meaningful ways.

      Our manuscript has 4 figures (5 - 8) and over 200 lines of text dedicated to validating whether our clusters capture reproducible and biologically meaningful differences between mutants. In the revised manuscript, we added additional validation experiments, such that three figures (Figures 5, 7 and S11) now involve growth curves, as the reviewer requested. 

      Below, we walk through the different types of validation experiments that are present in our manuscript, including those that were added in this revision.

      (1) Mutants from different clusters have different growth curves: In our original manuscript, we measured growth curves corresponding to a fitness tradeoff that we thought was surprising. Mutants in clusters 4 and 5 both have fitness advantages in single drug conditions. While mutants from cluster 4 also are advantageous in the relevant double drug conditions, mutants from cluster 5 are not! We validated these different behaviors by studying growth curves for a mutant from each cluster (Figures 7 and S11), finding that mutants from different clusters have different growth curves. In the revised manuscript, we added growth curves for 6 additional mutants (3 from cluster 1 and 3 from cluster 3), demonstrating that only the cluster 1 mutants have a tradeoff in high concentrations of fluconazole (see Figure 5D & 5E). In sum, this work demonstrates that mutants from different clusters have predictable differences in their growth phenotypes.

      (2) Mutants from different clusters have different evolutionary origins: In our original manuscript, we came up with a novel way to ask whether the clusters capture different types of adaptive mutants. We asked whether the mutants in each cluster originate from different evolution experiments. They often do (see pie charts in Figures 5, 6, 7, 8). In the revised manuscript, we extended this analysis to include mutants from cluster 1. Cluster 1 is defined by high fitness in low fluconazole that declines with increasing fluconazole. In our revised manuscript, we show that cluster 1 lineages were overwhelmingly sampled from evolutions conducted in our lowest concentration of fluconazole (see pie chart in new Figure 5A). No other cluster’s evolutionary history shows this pattern (compare to pie charts in figures 6, 7, and 8).

      These pie charts also provide independent confirmation supporting the fitness tradeoffs observed for each cluster in figure 4E. For example, mutants in cluster 5 appear to have a tradeoff in a particular double drug condition (HRLF), and the pie charts confirm that they rarely originate from that evolution condition. This differs from cluster 4 mutants, which do not have a fitness tradeoff in HRLF, and are more likely to originate from that environment (see purple pie slice in figure 7). Additional cases where results of evolution experiments (pie charts) confirm observed fitness tradeoffs are discussed in the manuscript on lines 320 – 326, 594 – 598, 681 – 685.

      (3) Mutants from each cluster often fall into different genes: We sequenced many of these mutants and show that mutants in the same gene are often found in the same cluster. For example, all 3 IRA1 mutants are in cluster 6 (Fig 8), both GPB2 mutants are in cluster 4 (Figs 7 & 8), and 35/36 PDR mutants are in either cluster 2 or 3 (Figs 5 & 6). 

      (4) Mutants from each cluster have behaviors previously observed in the literature: We compared our sequencing results to the literature and found congruence. For example, PDR mutants are known to provide a fitness benefit in fluconazole and are found in clusters that have high fitness in fluconazole (lines 485 - 491). Previous work suggests that some mutations to PDR have different tradeoffs than others, which corresponds to our finding that PDR mutants fall into two separate clusters (lines 610 - 612). IRA1 mutants were previously observed to have high fitness in our “no drug” condition and are found in the cluster that has the highest fitness in the “no drug” condition (lines 691 - 696). Previous work even confirms the unusual fitness tradeoff we observe where IRA1 and other cluster 6 mutants have low fitness only in low concentrations of fluconazole (lines 702 - 704).

      (5) Mutants largely remain in their clusters when we use alternate clustering methods:  In our original manuscript, we performed various different re-clustering and/or normalization approaches on our data (Fig 6, S5, S7, S8, S10). The clusters of mutants that we observe in figure 4 do not change substantially when we re-cluster the data. In our revised manuscript, we added another clustering method: principal component analysis (PCA) (Fig S9).  Again, we found that our clusters are largely preserved.

      While these experiments demonstrate meaningful differences between the mutants in each cluster, important questions remain. For example, a long-standing question in biology centers on the extent to which every mutation has unique phenotypic effects versus the extent to which scientists can predict the effects of some mutations from other similar mutations. Additional studies on the clusters of mutants discovered here will be useful in deepening our understanding of this topic and more generally of the degree of pleiotropy in the genotype-phenotype map.

      Reviewer #2 (Public Review): 

      Summary: 

      Schmidlin & Apodaca et al. aim to distinguish mutants that resist drugs via different mechanisms by examining fitness tradeoffs across hundreds of fluconazole-resistant yeast strains. They barcoded a collection of fluconazole-resistant isolates and evolved them in different environments with a view to having relevance for evolutionary theory, medicine, and genotype-phenotype mapping.

      Strengths: 

      There are multiple strengths to this paper, the first of which is pointing out how much work has gone into it; the quality of the experiments (the thought process, the data, the figures) is excellent. Here, the authors seek to induce mutations in multiple environments, which is a really large-scale task. I particularly like the attention paid to isolates which are resistant to low concentrations of FLU. So often these are overlooked in favour of those conferring MIC values >64/128 etc. What was seen is different genotype and fitness profiles. I think there's a wealth of information here that will actually be of interest to more than just the fields mentioned (evolutionary medicine/theory).

      We are grateful for this positive review. This was indeed a lot of work! We are happy that the reviewer noted what we feel is a unique strength of our manuscript: that we survey adaptive isolates across multiple environments, including low drug concentrations.  

      Weaknesses: 

      Not picking up low fitness lineages - which the authors discuss and provide a rationale as to why. I can completely see how this has occurred during this research, and whilst it is a shame I do not think this takes away from the findings of this paper. Maybe in the next one! 

      We thank the reviewer for these words of encouragement and will work towards catching more low fitness lineages in our next project.

      In the abstract the authors focus on 'tradeoffs' yet in the discussion they say the purpose of the study is to see how many different mechanisms of FLU resistance may exist (lines 679-680), followed up by "We distinguish mutants that likely act via different mechanisms by identifying those with different fitness tradeoffs across 12 environments". Whilst I do see their point, and this is entirely feasible, I would like a bit more explanation around this (perhaps in the intro) to help lay-readers make this jump. The remainder of my comments on 'weaknesses' are relatively fixable, I think: 

      We have expanded the introduction, in particular lines 129 – 157 of the revised manuscript, to walk readers through the connection between fitness tradeoffs and molecular mechanisms. For example, here is one relevant section of new text from lines 131 - 136: “The intuition here is as follows. If two groups of drug resistant mutants have different fitness tradeoffs, it could mean that they provide resistance through different underlying mechanisms. Alternatively, both could provide drug resistance via the same mechanism, but some mutations might also affect fitness via additional mechanisms (i.e. they might have unique “side-effects” at the molecular level) resulting in unique fitness tradeoffs in some environments.”

      In the introduction I struggle to see how this body of research fits in with the current literature, as the literature cited is a hodge-podge of bacterial and fungal evolution studies, which are very different! For example, the authors state "previous work suggests that mutants with different fitness tradeoffs may affect fitness through different molecular mechanisms" (lines 129-131) and then cite three papers, only one of which is a fungal research output. However, the next sentence focuses solely on literature from fungal research. Citing bacterial work as a foundation is fine, but as you're using yeast for this I think tailoring the introduction more to what is and isn't known in fungi would be more appropriate. It would also be great to then circle back around and mention monotherapy vs combination drug therapy for fungal infections as a rationale for this study. The study seems to be focused on FLU-resistant mutants, which is the first-line drug of choice, but many (yeast) infections have acquired resistance to this and combination therapy is the norm.

      We ourselves are broadly interested in the structure of the genotype-phenotype-fitness map (PMID33263280, PMID32804946). For example, we are interested in whether diverse mutations converge at the level of phenotype and fitness. Figure 1A depicts a scenario with a lot of convergence in that all adaptive mutations have the same fitness tradeoffs.

      The reason we cite papers from yeast, as well as bacteria and cancer, is that we believe general conclusions about the structure of the genotype-phenotype-fitness map apply broadly. For example, the sentence the reviewer highlights, “previous work suggests that mutants with different fitness tradeoffs may affect fitness through different molecular mechanisms” is a general observation about the way genotype maps to fitness. So, we cited papers from across the tree of life to support this sentence.  And in the next sentence, where we cite 3 papers focusing solely on fungal research, we cite them because they are studies about the complexity of this map. Their conclusions, in theory, should also apply broadly, beyond yeast.

      On the other hand, because we study drug resistant mutations, we hope that our dataset and observations are of use to scientists studying the evolution of resistance. We use our introduction to explain how the structure of the genotype-phenotype-fitness map might influence whether a multidrug strategy is successful (Figure 1).

      We are hesitant to rework our introduction to focus more specifically on fungal infections as this is not our primary area of expertise.

      Methods: Line 769 - which yeast? I haven't even seen mention of which species is being used in this study; different yeast employ different mechanisms of adaptation for resistance, so could greatly impact the results seen. This could help with some background context if the species is mentioned (although I assume S. cerevisiae). 

      In the revised manuscript, we have edited several lines (lines 95, 186, and 822) to state that this work was done with Saccharomyces cerevisiae.

      In which case, should aneuploidy be considered as a mechanism? This is mentioned briefly on line 556, but with all the sequencing data acquired this could be checked quickly? 

      We like this idea and we are working on it, but it is not straightforward. The reviewer is correct in that we can use the sequencing data that we already have. But calling aneuploidy with certainty is tough because its signal can be masked by noise. In other words, some regions of the genome may be sequenced more than others by chance.

      Given this is not straightforward, at least not for us, this analysis will likely have to wait for a subsequent paper. 
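
To illustrate why coverage noise complicates this, here is a minimal sketch of a depth-based screen, assuming read depth has already been summarized into fixed-size bins per chromosome. The input format, the function names, and the 0.25 tolerance are hypothetical choices for illustration and are not the authors' pipeline. Using medians rather than means keeps one unevenly sequenced bin from flagging a whole chromosome.

```python
from statistics import median


def chromosome_copy_ratios(bin_depths):
    """Return each chromosome's median bin depth divided by the genome-wide median.

    bin_depths maps a chromosome name to a list of read depths, one per
    fixed-size genomic bin (a hypothetical input format).
    """
    all_bins = [d for depths in bin_depths.values() for d in depths]
    genome_median = median(all_bins)
    return {chrom: median(depths) / genome_median
            for chrom, depths in bin_depths.items()}


def flag_candidate_aneuploidies(ratios, tolerance=0.25):
    """Flag chromosomes whose ratio departs from 1.0 by more than `tolerance`.

    The tolerance is an illustrative assumption, not a validated cutoff.
    """
    return sorted(c for c, r in ratios.items() if abs(r - 1.0) > tolerance)


# Toy data: chrII has roughly doubled depth; chrIII has a single noisy bin.
toy = {
    "chrI": [98, 102, 100, 101],
    "chrII": [198, 205, 201, 196],
    "chrIII": [99, 160, 100, 97],
    "chrIV": [100, 99, 103, 98],
}
ratios = chromosome_copy_ratios(toy)
flagged = flag_candidate_aneuploidies(ratios)
```

In this toy input only chrII is flagged; the noisy bin on chrIII does not move its median. Real data would additionally need corrections (for example for GC content and mappability) before such ratios could be trusted.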

      I think the authors could be bolder and try and link this to other (pathogenic) yeasts. What are the implications of this work on say, Candida infections? 

      Perhaps because our background lies in general study of the genotype-phenotype map, we are hesitant about making bold assertions about how our work might apply to pathogenic yeasts. We are hopeful that our work will serve as a stepping-stone such that scientists from that community can perhaps make (and test) such statements.   

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      I found the ideas and the questions asked in this manuscript to be interesting and ambitious. The setup of the evolution and fitness competition experiments was well poised to answer them, but the analysis of the data is not currently enough to properly support the claims made. I would suggest revising the analysis to address the weaknesses raised in the public review and, if possible, adding some more experimental validations. As you already have genome sequencing data showing the causal mutation for many mutants across the different clusters, it should be possible for you to reconstruct some of the strains and validate their phenotypes and cluster identity.

      Yes, this is possible. We added more validation experiments (see figure 5). We already had quite a few validation experiments (figures 5 - 8 and lines 479 - 718), but we did not clearly highlight the significance of these analyses in our original manuscript. Therefore, we modified the text in our revised manuscript in various places to do so. For example, we now make clearer that we jointly use BIC scores as well as validation experiments to decide how many clusters to describe (lines 436 - 446). We also make clearer that our clustering analysis is only the first step towards identifying groups of mutants with similar tradeoffs by using words and phrases like, “we start by” (line 411) and “preliminarily” (line 448) when discussing the clustering analysis.  We also point readers to all the figures describing our validation experiments earlier (line 443), and list these experiments out in the discussion (lines 738 - 741).
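
As a reminder of how BIC scores weigh fit against model complexity when choosing a cluster count, the following is a minimal sketch. The log-likelihoods, parameter counts, and number of observations are made-up placeholders, not values from the manuscript, and the function names are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def best_by_bic(candidates, n_obs):
    """Pick the cluster count with the lowest BIC.

    candidates maps a cluster count to (log_likelihood, n_params) for the
    model fitted with that many clusters.
    """
    scores = {k: bic(ll, p, n_obs) for k, (ll, p) in candidates.items()}
    return min(scores, key=scores.get), scores


# Placeholder fits for 2, 4, and 6 clusters (illustrative numbers only).
candidates = {2: (-1500.0, 11), 4: (-1400.0, 23), 6: (-1390.0, 35)}
best_k, scores = best_by_bic(candidates, n_obs=500)
```

Here the six-cluster model fits slightly better than the four-cluster one but pays a larger complexity penalty, so four clusters score lowest. A BIC minimum of this kind is only a statistical suggestion, which is consistent with pairing it with validation experiments as described above.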

      Also, please deposit your genome sequencing data in a public database (I am not sure I saw it mentioned anywhere). 

      We have updated line 1088 of the methods section to include this sentence: “Whole genome sequences were deposited in GenBank under SRA reference PRJNA1023288.”

      Reviewer #2 (Recommendations For The Authors):

      I don't think the figures or experiments can be improved upon, they are excellent. There are a few times I feel things are written in a rather confusing way and could be explained better, but also I feel there are places the authors jump from one thing to another really quickly and the reader (who might not be an expert in this area) will struggle to keep up. For example: 

      Explaining what RAD is - it is introduced in the methods, but what it is, is not really explained. 

      Since the introduction is already very long, we chose not to explain radicicol’s mechanism of action here. Instead, we bring this up later on lines 614 – 621 when it becomes relevant.

      More generally, in response to this advice and that from reviewer 1, we also added text to various places in the manuscript to help explain our work more clearly. In particular, we clarified the significance of our validation experiments and various important methodological details (see above). We also better explained the connection between fitness tradeoffs and mechanisms (see above) and added more details about the potential use cases of our approach (lines 142 – 150).

      The abstract states "some of the groupings we find are surprising. For example, we find some mutants that resist single drugs do not resist their combination, and some mutants to the same gene have different tradeoffs than others". Firstly, this sentence is a bit confusing to read but if I've read it as intended, then is it really surprising? It's difficult for organisms (bacteria and fungi) to develop multiple beneficial mutations conferring drug resistance on the same background, hence why combination antifungal drug therapy is often used to treat infections. 

      This is a place where brevity got in the way of clarity. We added a bit of text to make clear why we were surprised. Specifically, we were surprised because not all mutants behave the same. Some resist single drugs AND their combination. Some resist single drugs but not their combination. The sentence in the abstract now reads, “For example, we find some mutants that resist single drugs do not resist their combination, while others do. And some mutants to the same gene have different tradeoffs than others.”

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Responses to recommendations for the authors: 

      Reviewer #1 (Recommendations For The Authors):

      The manuscript would be strengthened with the following key revisions mostly having to do with image quality: 

      (1) It is very difficult in Figure 4B to see which nuclei actually have evidence of mitochondrial transcripts. It might be helpful to provide arrows to specific cells and also to provide some estimate of the percentage of cells with nuclear mt-transcripts as measured by ISH compared to the 3-6% of cortex cell estimate seen in the snRNAseq analysis. 

      As suggested, we have now added arrows to help readers see the signals in nuclei. The detection thresholds of ISH and single-nucleus RNA-seq are likely to differ, and therefore, estimating PT-Mito by ISH would not be reliable.

      (2) The phospho-PKR images provided as evidence of C16 activity (Supplemental Figure 1) are too dim to be very useful. Could brighter images be provided? 

      We have now adjusted the LUTs of images in Supplemental Figure 1.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      This study is convincing because the authors performed time-resolved X-ray crystallography under different pH conditions using active/inactive metal ions and PpoI mutants, in parallel with the activity measurements in solution used in conventional enzymatic studies. Although the reaction mechanism is simple and may be a little predictable, the strength of this study is that they were able to validate that PpoI catalyzes DNA hydrolysis through "a single divalent cation", because time-resolved X-ray studies often observe transient metal ions that are important for catalysis but are not predictable from previous studies with static structures such as enzyme-substrate analog-metal ion complexes. The discussion of this study is well supported by their data. This study visualized the catalytic process and mutational effects on catalysis, providing new insight into the catalytic mechanism of I-PpoI through a single divalent cation. The authors found that His98, a candidate proton acceptor in the previous experiments, also affects the Mg2+ binding for catalysis without a direct interaction between His98 and the Mg2+ ion, suggesting that "Without a proper proton acceptor, the metal ion may be prone for dissociation without the reaction proceeding, and thus stable Mg2+ binding was not observed in crystallo without His98". In the future, this interesting feature observed in I-PpoI should be investigated by biochemical, structural, and computational analyses using other metal-ion dependent nucleases.

      We appreciate the reviewer for the positive assessment as well as all the comments and suggestions.

      Reviewer #2 (Public Review): 

      Summary: 

      Most polymerases and nucleases use two or three divalent metal ions in their catalytic functions. The family of His-Me nucleases, however, uses only one divalent metal ion, along with a conserved histidine, to catalyze DNA hydrolysis. The mechanism has been studied previously but, according to the authors, it remained unclear. By use of time-resolved X-ray crystallography, this work convincingly demonstrated that only one M2+ ion is involved in the catalysis of the His-Me I-PpoI nuclease, and proposed concerted functions of the metal and the histidine.

      Strengths: 

      This work performs mechanistic studies, including the number and roles of metal ion, pH dependence, and activation mechanism, all by structural analyses, coupled with some kinetics and mutagenesis. Overall, it is a highly rigorous work. This approach was first developed in Science (2016) for a DNA polymerase, in which Yang Cao was the first author. It has subsequently been applied to just 5 to 10 enzymes by different labs, mainly to clarify two versus three metal ion mechanisms. The present study is the first one to demonstrate a single metal ion mechanism by this approach. 

      Furthermore, on the basis of the quantitative correlation between the fraction of metal ion binding and the formation of product, as well as the pH dependence, and the data from site-specific mutants, the authors concluded that the functions of Mg2+ and His are a concerted process. A detailed mechanism is proposed in Figure 6. 

      Even though there are no major surprises in the results and conclusions, the time-resolved structural approach and the overall quality of the results represent a significant step forward for the His-Me family of nucleases. In addition, since the mechanism is unique among different classes of nucleases and polymerases, the work should be of interest to readers in DNA enzymology, or even mechanistic enzymology in general.

      Thank you very much for your comments and suggestions.

      Weaknesses: 

      Two relatively minor issues are raised here for consideration: 

      p. 4, last para, lines 1-2: "we next visualized the entire reaction process by soaking I-PpoI crystals in buffer....". This is a little over-stated. The structures being observed are not reaction intermediates. They are mixtures of substrates and products in the enzyme-bound state. The progress of the reaction is limited by the progress of the soaking of the metal ion. Crystallography has just been used as a tool to monitor the reaction (and provide structural information about the product). It would be more accurate to say that "we next monitored the reaction progress by soaking....". 

      We appreciate the clarification regarding the description of our experimental approach. We agree that our structures do not represent reaction intermediates but rather mixtures of substrate and product states within the enzyme-bound environment. We have revised the text accordingly to more accurately reflect our methodology.

      p. 5, the beginning of the section. The authors on one hand emphasized the quantitative correlation between Mg ion density and the product density. On the other hand, they raised the uncertainty in the quantitation of Mg2+ density versus Na+ density, thus they repeated the study with Mn2+ which has distinct anomalous signals. This is a very good approach. However, there is still no metal ion density shown in the key Figure 2A. It will be clearer to show the progress of metal ion density in a figure (in addition to just plots), whether it is Mg or Mn. 

      Thank you for your insightful comments. We recognize the importance of visualizing metal ion density alongside product density data. To address this, we have included Figure S4, which presents the Mg2+/Mn2+ and product densities concurrently.

      Reviewer #1 (Recommendations For The Authors): 

      (1) Figure 6. I understand that pre-reaction state (left panel) and Metal-binding state (two middle panels) are in equilibrium. But can we state that the Metal-binding state (two middle panels) and the product state (right panel) are in equilibrium and connected by two arrows? 

      Thank you for your comments. We agree that the DNA hydrolysis reaction may not be reversible within the I-PpoI active site. To clarify, we removed the backward arrows between the metal-binding state and the product state. In addition, we thank the reviewer for suggesting a name for the middle state and agree that it should be labeled. We added the metal-binding state label in the revised Figure 6 and also added “on the other hand, optimal alignment of a deprotonated water and Mg2+ within the active site, labeled as metal-binding state, leads to irreversible bond breakage (Fig. 6a)” within the text.

      (2) The section on DNA hydrolysis assay (Materials and Methods) is not well described. In this section, the authors should summarize the methods for the experiments in Figure 4 AC, Figure 5BC, Figure S3C, Figure S4EF, and Figure S6AB. The authors presented some graphs for the reactions. For clarity, the author should state in the legends which experiments the results are from (in crystallo or in solution). Please check and modify them. 

      Thank you for the suggestion. We have added four paragraphs to detail the experimental procedures for experiments in these figures. In addition, we have checked all of the figure legends and labeled them as “in crystallo or in solution.” To clarify, we also added “in crystallo” or “solution” in the corresponding panels.

      (3) The authors showed the anomalous signals of Mn2+ and Tl+. The authors should mention which wavelength of X-rays was used in the data collections to calculate the anomalous signals. 

      Thank you for the suggestion. We have included the wavelength of the X-ray in the figure legends that include anomalous maps, which were all determined at an X-ray wavelength of 0.9765 Å.
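
For readers who think in photon energy rather than wavelength, the conversion is E = hc/λ. The short sketch below applies it to the 0.9765 Å wavelength quoted above; the function name is hypothetical and the constant is the standard value of hc expressed in keV·Å.

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV * angstrom


def photon_energy_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in angstroms to photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelength quoted in the response for the anomalous maps.
energy = photon_energy_kev(0.9765)
```

This gives roughly 12.7 keV, which lies above the Mn K absorption edge (about 6.54 keV), as needed for Mn2+ to contribute an anomalous signal at this wavelength.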

      (4) The full names of "His-Me" and "HNH" are necessary for a wide range of readers. 

      Thank you for the suggestion. We have included the full nomenclature for His-Me (histidine-metal) nucleases and HNH (histidine-asparagine-histidine) nuclease.

      (5) The authors should add the side chain of Arg61 in Figure 1E because it is mentioned in the main text. 

      Thank you for the suggestion. We have added Arg61 to Figure 1E.

      (6) Figure 5D. For clarity, the electron densities should cover the Na+ ion. The same request applies to WatN in Figure S3B.

      Thank you for catching this detail. We have added the electron density for the Na+ ion in Figure 5D and WatN in Figure S3B.

      (7) At line 269 on page 8, what is "previous H98A I-PpoI structure with Mn2+"? Is the structure 1CYQ? If so, it is a complex with Mg2+. 

      Thank you for catching this detail. We have edited the text to “previous H98A I-PpoI structure with Mg2+.”

      (8) At line 294 on page 9, "and substrate alignment or rotation in MutT (66)." I think "alignment of the substrate and nucleophilic water" is preferred rather than "substrate alignment or rotation". 

      Thank you for the suggestion. We have edited the text to “alignment of the substrate and nucleophilic water.”

      (9) At line 305 on page 9, "Second, (58, 69-71) single metal ion binding is strictly correlated with product formation in all conditions, at different pH and with different mutants (Figure 3a and Supplementary Figure 4a-c) (58)". The references should be cited in the correct positions. 

      Thank you for catching this typo. We have removed the references.

      (10) At line 347 on page 10, "Grown in a buffer that contained (50 g/L glucose, 200 g/L α-lactose, 10% glycerol) for 24 hrs." Is this sentence correct? 

      Thank you for catching this detail. We have corrected the sentence.

      (11) At line 395 on page 11, "The His98Ala I-PpoI crystals of first transferred and incubated in a pre-reaction buffer containing 0.1M MES (pH 6.0), 0.2 M NaCl, 1 mM MgCl2 or MnCl2, and 20% (w/v) PEG3350 for 30 min." In the experiments using this mutant, does a pre-reaction buffer contain MgCl2 or MnCl2? 

      Thank you for bringing this to our attention. We performed two sets of experiments: 1) metal ion soaking in 1 mM Mn2+, which was performed in the same way as for WT, without Mn2+ in the pre-reaction buffer; and 2) imidazole soaking, for which 1 mM Mn2+ was included in the pre-reaction buffer. We reasoned that Mn2+ would not bind or promote the reaction with His98Ala I-PpoI, but that pre-incubation might help populate Mn2+ within the lattice for better imidazole binding. However, neither Mn2+ nor imidazole was observed. We have added experimental details for both experiments with His98Ala I-PpoI.

      (12) In the figure legends of Figure 1, is the Fo-Fc omit map shown in yellow not in green? Please remove (F) in the legends. 

      We have changed the Fo-Fc map to be shown in violet. We have also removed (f) from the figure legends.

      (13) I found descriptions of "MgCl". Please modify them to "MgCl2". 

      Thank you for catching these details. We have modified all “MgCl” to “MgCl2.”

      (14) References 72 and 73 are duplicated. 

      We have removed the duplicated reference.

      Reviewer #2 (Recommendations For The Authors): 

      p. 9, first paragraph, last three lines: "Thus, we suspect that the metal ion may play a crucial role in the chemistry step to stabilize the transition state and reduce the electronegative buildup of DNA, similar to the third metal ion in DNA polymerases and RNaseH." This point is significant but the statement seems a little uncertain. You are saying that the single metal plays the role of two metals in polymerase, in both the ground state and the transition state. I believe the sentence can be stronger and more explicit. 

      Thank you for raising this point. We suspect the single metal ion in I-PpoI is different from the A-site or B-site metal ion in DNA polymerases and RNaseH, but similar to the third metal ion in DNA polymerases and nucleases. As we stated in the text,

      (1) the metal ion in I-PpoI is not required for substrate alignment. The water molecule and substrate can be observed in place even in the absence of the metal ion. In contrast, the A-site or B-site metal ions in DNA polymerases and RNaseH are required for aligning the substrates.

      (2) Moreover, the appearance of the metal ion is strictly correlated with product formation, similar to the third metal ion in DNA polymerases and RNaseH.

      To emphasize our point, we have revised the sentence as

      “Thus, similar to the third metal ion in DNA polymerases and RNaseH, the metal ion in I-PpoI is not required for substrate alignment but is essential for catalysis. We suspect that the single metal ion helps stabilize the transition state and reduce the electronegative buildup of DNA, thereby promoting DNA hydrolysis.”

      Minor typos: 

      p. 2, line 4 from bottom: due to the relatively low resolution... 

      Thank you for catching this. We have edited the text to “due to the relatively low resolution.”

      Figure 4F: What is represented by the pink color? 

      The structures are color-coded as 320 s at pH 6 (violet), 160 s at pH 7 (yellow), and 20 s at pH 8 (green). We have included the color information in the figure legend and made the labeling clearer in the panel.

      p. 9, first paragraph, last line: ...similar to the third... 

      Thank you for catching this. We have edited the text.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The study answers the important question of whether the conformational dynamics of proteins are slaved by the motion of solvent water or are intrinsic to the polypeptide. The results from neutron scattering experiments, involving isotopic labelling, carried out on a set of four structurally different proteins are convincing, showing that protein motions are not coupled to the solvent. A strength of this work is the study of a set of proteins using spectroscopy covering a range of resolutions. A minor weakness is the limited description of computational methods and analysis of data. The work is of broad interest to researchers in the fields of protein biophysics and biochemistry.

      We thank the editors and reviewers for the positive and encouraging comments.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Zheng et al. study the 'glass' transition that occurs in proteins at ca. 200 K using neutron diffraction and differential isotopic labeling (hydrogen/deuterium) of the protein and solvent. To overcome limitations in previous studies, this work is conducted in parallel with 4 proteins (myoglobin, cytochrome P450, lysozyme and green fluorescent protein) and experiments were performed at a range of instrument time resolutions (1 ns - 10 ps). The authors' data look compelling, and suggest that transitions in the protein and solvent behavior are not coupled and that, contrary to some previous reports, the apparent water transition temperature is a 'resolution effect', i.e. instrument response limited. This is likely to be important in the field, as solvent 'slaving' and the role of the hydration shell in protein dynamics should be reassessed in light of these findings.

      Strengths:

      The use of multiple proteins and instruments with a range of energy resolutions/timescales.

      We thank the reviewer for highlighting our key findings.

      Weaknesses:

      The paper could be organised to better allow the comparison of the complete dataset collected. The extent of hydration clearly influences the protein transition temperature. The authors suggest that "water can be considered here as lubricant or plasticizer which facilitates the motion of the biomolecule." This may be the case, but the extent of hydration may also alter the protein structure.

      Following the reviewer’s suggestion, we studied the secondary structure content and tertiary structure of CYP protein at different hydration levels (h = 0.2 and 0.4) through molecular dynamics simulation. As shown in Table S2 and Fig. S6, the extent of hydration does not alter the protein secondary structure content and overall packing. Thus, this result also suggests that water molecules have more influence on protein dynamics than on protein structure. We added the above results in the revised SI.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript entitled "Decoupling of the Onset of Anharmonicity between a Protein and Its Surface Water around 200 K" by Zheng et al. presents a neutron scattering study that tries to elucidate whether water and protein motions are coupled at the dynamical transition temperature. The origin of the dynamical transition temperature, and specifically its relation to hydration, has been highly debated for decades.

      Strengths:

      The study is rather well conducted, with a lot of efforts to acquire the perdeuterated proteins, and some results are interesting.

      We thank the reviewer for highlighting our key findings.

      Weaknesses:

      The MD data presented appears to be missing description of the methods used.

      If these data support the authors claim that different levels of hydration do not affect the protein structure, careful analysis of the MD simulation data should be presented that show the systems are properly equilibrated under each condition. Additionally, methods are needed to describe the MD parameters and methods used, and for how long the simulations were run.

      We have now added the methods of MD simulation into the revised SI.

      “The initial structure of the protein cytochrome P450 (CYP) for the simulations was taken from the PDB crystal structure (2ZAX). Two protein monomers were placed in a cubic box. 1013 and 2025 water molecules were inserted into the box randomly to reach mass ratios of 0.2 and 0.4 gram water/1 gram protein, respectively, which mimic the experimental conditions. Then 34 sodium counterions were added to keep the system neutral in charge. The CHARMM 27 force field in the GROMACS package was used for CYP, whereas the TIP4P/Ew model was chosen for water. The simulations were carried out over a broad range of temperatures from 360 K to 100 K, with a step of 5 K. At each temperature, after a 5000-step energy minimization, a 10 ns NVT simulation was conducted. After that, a 30 ns NPT simulation was carried out at 1 atm with periodic boundary conditions. As shown in Fig. S7, 30 ns is sufficient to equilibrate the system. The temperature and pressure of the system were controlled by the velocity rescaling method and the method of Parrinello and Rahman, respectively. All bonds of water in all the simulations were constrained with the LINCS algorithm to maintain their equilibrium lengths. In all the simulations, the system was propagated using the leap-frog integration algorithm with a time step of 2 fs. The electrostatic interactions were calculated using the Particle Mesh Ewald (PME) method. A non-bonded pair-list cutoff of 1 nm was used and the pair list was updated every 20 fs. All MD simulations were performed using the GROMACS 4.5.1 software package.”
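
The numbers in this protocol can be cross-checked with a little arithmetic. The sketch below is only a consistency check under stated assumptions: the water molar mass of 18.015 g/mol and all function names are introduced here for illustration and do not come from the SI. It back-calculates the protein mass implied by each hydration level and derives the number of temperatures and integration steps from the quoted settings.

```python
WATER_MOLAR_MASS = 18.015  # g/mol, assumed value for H2O


def implied_protein_mass(n_waters, h):
    """Protein mass (g/mol) implied by n_waters at hydration h (g water / g protein)."""
    return n_waters * WATER_MOLAR_MASS / h


def temperature_schedule(t_start, t_end, step):
    """Temperatures from t_start down to t_end inclusive, in steps of `step` K."""
    return list(range(t_start, t_end - 1, -step))


def n_steps(duration_ns, dt_fs):
    """Number of integration steps for a run of duration_ns at time step dt_fs."""
    return round(duration_ns * 1_000_000 / dt_fs)


mass_h02 = implied_protein_mass(1013, 0.2)  # two monomers at h = 0.2
mass_h04 = implied_protein_mass(2025, 0.4)  # two monomers at h = 0.4
temps = temperature_schedule(360, 100, 5)   # 360 K down to 100 K
nvt_steps = n_steps(10, 2)                  # 10 ns NVT at 2 fs
npt_steps = n_steps(30, 2)                  # 30 ns NPT at 2 fs
pairlist_interval = round(20 / 2)           # pair list refreshed every 20 fs
```

Both hydration levels imply about 91 kDa for the two monomers, so the two water counts are mutually consistent. The schedule contains 53 temperatures, each NPT run spans 15 million steps, and the pair list is refreshed every 10 steps.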

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Response to author's changes:

      See public review: The MD data presented appears to be missing description of the methods used.

      If these data support the authors claim that different levels of hydration do not affect the protein structure, careful analysis of the MD simulation data should be presented that show the systems are properly equilibrated under each condition. Additionally, methods are needed to describe the MD parameters and methods used, and for how long the simulations were run.

      We have now added the methods of MD simulation into the revised SI. Please see Reply 5.

      Reviewer #2 (Recommendations For The Authors):

      The authors answered my questions and substantially improved the manuscript.

      We thank the reviewer for the encouraging comments.

    1. Author response:

      We thank the reviewers for their helpful comments and criticisms of our manuscript and are pleased by the overall positive nature of the comments. For the eLife Version of Record, we plan to carry out the following experiments to address reviewer comments:

      - We will use genetic approaches (e.g., driving p35 in glia to block apoptosis) and molecular markers, such as phospho-Histone H3, to assess whether reduced glial proliferation or increased glial apoptosis contributes to reduced glial cell number.

      - We will assess the ability of glial-specific expression of the Drosophila or Human ifc/DEGS1 transgenes to rescue the ifc lethal phenotype to adulthood.

      - We will replicate key phenotypic findings with additional ifc alleles.

      - We will enhance our characterization of 3xP3 RFP transgenes with respect to glial subtypes both for the insert we used in our study and at least one independent insert.

      - We will edit the text of the manuscript to clarify additional points raised by the reviewers.

      Once we complete the above approaches, we will modify our manuscript accordingly and submit a full response to the reviews to eLife along with the revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      This important study explores the potential influence of physiologically relevant mechanical forces on the extrusion of vesicles from C. elegans neurons. The authors provide compelling evidence to support the idea that uterine distension can induce vesicular extrusion from adjacent neurons. The work would be strengthened by using an additional construct (preferably single-copy) to demonstrate that the observed phenotypes are not unique to a single transgenic reporter. Overall, this work will be of interest to neuroscientists and investigators in the extracellular vesicle and proteostasis fields. 

      We now include supporting data using a single copy alternate fluorescent reporter expressed in touch neurons (Fig. 3H).

      In brief, we examined the induction of exophergenesis in an alternative single-copy transgene strain that expresses mKate fluorescent protein specifically in touch receptor neurons. As compared to the multi-copy transgene that is broadly used in this study and expresses mCherry fluorescent protein specifically in touch receptor neurons, the mKate single-copy transgene is associated with a much lower frequency of exophergenesis. However, increasing uterine distension via blocking egg-laying can increase the exophergenesis of the mKate single-copy transgenic line from 0% to approximately 60% on adult day 1, indicating that the observed response is not tied to a single reporter.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors sought to understand the stage-dependent regulation of exophergenesis, a process thought to contribute to promoting neuronal proteostasis in C. elegans. Focusing on the ALMR neuron, they show that the frequency of exopher production correlates with the timing of reproduction. Using many genetic tools, they dissect the requirements of this pathway to eventually find that occupancy of the uterus acts as a signal to induce exophergenesis. Interestingly, the physical proximity of neurons to the egg zone correlates with exophergenesis frequency. The authors conclude that communication between the uterus and proximal neurons occurs through the sensing of mechanical forces of expansion normally provided by egg occupancy to coordinate exophergenesis with reproduction.

      Strengths: 

      The genetic data presented is thorough and solid, and the observation is novel. 

      Weaknesses: 

      The main weakness of the study is that the detection of exophers is based on the overexpression of a fluorescent protein in touch neurons, and it is not clear whether this process is actually stimulated in wild-type animals, or if neurons have accumulated damaged proteins in relatively young day 2 animals. 

      We now include data using a single-copy alternate fluorescent reporter expressed in touch neurons. Although baseline exopher levels are low in this strain, we demonstrate that inducing egg retention in this background markedly increases exopher generation from a baseline of near zero to ~60% (new Fig. 3H), supporting the conclusion that uterine distension, rather than reporter identity, is associated with early-life exopher elevation. These data also add to our observations indicating that high protein-expressing strains generally produce higher baseline levels of exophers in early adulthood (for example, Melentijevic et al. (PMID 28178240) documented that mCherry RNAi knockdown in the strain primarily studied here can lower exopher levels).

      The second point raised here, regarding the occurrence and physiological role of early-adult exophers in “native” non-stressed neurons is a fascinating question that we are beginning to address in continuing experiments. Readers will appreciate that quantifying relatively rare, “invisible” touch receptor neuron exophergenesis accurately without expressing a fluorescent reporter is technically challenging. Our speculation, outlined now a bit more clearly in the Discussion here, is that certain molecular and organelle debris that cannot readily be degraded in cells during larval development may be stored until release to more capable degradative neighbors or to the coelomocytes for later management, as one component of the early adult transition in proteostasis (see J. Labbadia and R. I. Morimoto, PMID 24592319). Receiving cells may be primed for this at a particular timepoint, possibly analogous to the “bulky garbage” collection of over-sized difficult-to-dispose-of household items that a town will address with specialized action only at specific times. The prediction is that we should be able to detect some mass protein aggregation through early development, and at least partial elimination by adult day 3; this elimination should be impaired when eggs are eliminated. Initial testing is underway.

      Reviewer #2 (Public Review): 

      Summary: 

      This paper reports that mechanical stress from egg accumulation is a biological stimulus that drives the formation of extruded vesicles from the neurons of C. elegans ALMR touch neurons. Using powerful genetic experiments only readily available in the C. elegans system, the authors manipulate oocyte production, fertilization, embryo accumulation, and egg-laying behavior, providing convincing evidence that exopher production is driven by stretch-dependent feedback of fertilized, intact eggs in the adult uterus. Shifting the timing of egg production and egg laying alters the onset of observed exophers. Pharmacological manipulation of egg laying has the predicted effects, with animals retaining fewer eggs having fewer exophers and animals with increased egg accumulation having more. The authors show that egg production and accumulation have dramatic consequences for the viscera, and moving the ALMR process away from eggs prevents the formation of exophers. This effect is not unique to ALMR but is also observed in other touch neurons, with a clear bias toward neurons whose cell bodies are adjacent to the filled uterus. Embryos lacking an intact eggshell with reduced rigidity have impaired exopher production. Acute injection into the uterus to mimic the stretch that accompanies egg production causes a similar induction of exopher release. Together these results are consistent with a model where stretch caused by fertilized embryo accumulation, and not chemical signals from the eggs themselves or egg release, underlies ALMR exopher production seen in adult animals. 

      Strengths: 

      Overall, the experiments are very convincing, using a battery of RNAi and mutant approaches to distinguish direct from indirect effects. Indeed, these experiments provide a model generally for how one would methodically test different models for exopher production. The paper is well-written and easy to understand. I had been skeptical of the origin and purpose of exophers, concerned they were an artefact of imaging conditions, caused by deranged calcium activity under stressful conditions, or as evidence for impaired animal health overall. As this study addresses how and when they form in the animal using otherwise physiologically meaningful manipulations, the stage is now set to address at a cellular level how exophers like these are made and what their functions are. 

      Weaknesses: 

      Not many. The experiments are about as good as could be done. Some of the n's on the more difficult-to-work strains or experiments are comparatively low, but this is not a significant concern because of the number of different, complementary approaches used. The microinjection experiment in Figure 7 is very interesting, but there are missing details that would confirm whether this is a sound experiment. 

      We expanded the description of the microinjection experiment in both the figure legend and the Methods section to enhance clarity and substantiate the approach.

      Reviewer #3 (Public Review): 

      Summary: 

      In this paper, the authors use the C. elegans system to explore how already-stressed neurons respond to additional mechanical stress. Exophers are large extracellular vesicles secreted by cells, which can contain protein aggregates and organelles. These can be a way of getting rid of cellular debris, but as they are endocytosed by other cells can also pass protein, lipid, and RNA to recipient cells. The authors find that when the uterus fills with eggs or otherwise expands, a nearby neuron (ALMR) is far more likely to secrete exophers. This paper highlights the importance of the mechanical environment in the behavior of neurons and may be relevant to the response of neurons exposed to traumatic injury. 

      Strengths: 

      The paper has a logical flow and a compelling narrative supported by crisp and clear figures. 

      The evidence that egg accumulation leads to exopher production is strong. The authors use a variety of genetic and pharmacological methods to show that increasing pressure leads to more exopher production, and reducing pressure leads to lower exopher production. For example, egg-laying defective animals, which retain eggs in the uterus, produce many more exophers, and hyperactive egg-laying is accompanied by low exopher production. The authors even inject fluid into the uterus and observe the production of exophers. 

      Weaknesses: 

      The main weakness of the paper is that it does not explore the molecular mechanism by which the mechanical signals are received or responded to by the neuron, but this could easily be the subject of a follow-up study. 

      We agree that the molecular mechanisms operative are of considerable interest, and our initial pursuit suggests that a comprehensive study will be required for satisfactory elaboration of how mechanical signals are received or responded to by the neuron.

      I was intrigued by this paper, and have many questions. I list a few below, which could be addressed in this paper or which could be the subject of follow-up studies. 

      - Why do such a low percentage of ALMR neurons produce exophers (5-20%)? Does it have to do with the variability of the proteostress? 

      We do not yet understand why some ALMR neurons within the same genotype produce exophers and some do not. We know that in addition to the uterine occupation we report here, proteostasis compromise, feeding status, oxidative stress, and osmotic stress can elevate exopher numbers (PMID 34475208); cell autonomous influences on exopher levels include aggresome-associated biology (PMID 37488107) and expression levels of the mCherry protein (PMID 28178240). Turek reports that social interaction on plates can influence muscle exopher levels (PMID 34288362). Thus, although variable proteostress experienced by neurons is likely a factor, we have not yet experimentally defined specific trigger rules. We suspect that the summation of internal proteostasis crisis and environmental conditions, including particular force vectors/frequency, will underlie the variable exopher production phenomenon.

      - Why does the production of exophers lag the peak in progeny production by 24-48 hours? Especially when the injection method produces exophers right away?

      The progeny production can track well with exopher production (Fig. 1B), although the nature of egg counts (permanent, one time events) vs. exophers (which are slowly degraded) can skew the peak scores apart. We synchronized animals at the L4 stage. 24 hours later was adult day 1, and we measured then and every subsequent 24 hours. The daily progeny count reflects the total number of progeny produced every 24 hours; exopher events were scored once a day, but exophers can persist such that the daily exopher count can partially reflect slow degradation, with some exophers being counted on two days. We now explain our scoring details better in the Methods section.

      The rapid appearance of exophers, as early as ~10 minutes after sustained injection, is fascinating and probably holds mechanistic implications for exopher biology. For one thing, we can infer that in the mCherry Ag2 background, touch neurons can be poised to extrude exophers, but that the pressure/push acts to trigger or license final expulsion. It is interesting that we needed to administer a sustained injection of two minutes to find an exopher increase (now better emphasized in the expanded Methods section). We speculate that multiple pressure events or a sustained force vector might be critical (like an egg slowly passing through?). Minimally, this assay may help us assign molecular roles to pathway components as we identify them moving forward. 

      - As mentioned in the discussion, it would be interesting to know if PEZO-1/PIEZO is required for uterine stretching to activate exophergenesis. pezo-1 animals accumulate crushed oocytes in the uterus. 

      We have begun to test the hypothesis that PEZO-1 is a signaling component for ALMR exophergenesis, initially using the N and C terminal pezo-1 deletion mutants as in Bai et al. (PMID 32490809). These pezo-1 mutants have a mild decrease in ALMR exophergenesis under normal conditions. However, vulva-less conditions in pezo-1N and pezo-1C increased ALMR exophergenesis from approximately 10% to 60%, similar to the response of wild-type worms to high mechanical stress; these data suggest that PEZO-1 is not a required player in mediating mechanical force-induced ALMR exophergenesis. We are currently testing genetic requirements for other known mechanosensors. We intend a comprehensive investigation of the molecular mechanisms of mechanical signaling in a future study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      -The study would be significantly strengthened by the addition of data detecting regulation of exophergenesis by uterine forces in a more physiological context, in the absence of overexpression of a toxic protein. In other words, is this a process that occurs naturally during reproduction, or is it specific to proteotoxic stress induced by overexpression? Perhaps the authors could repeat key experiments using a single copy transgene, and challenge the animals with exogenous proteotoxic stress if necessary.

      We now include data using a single copy alternate fluorescent reporter expressed in touch neurons. Although baseline exopher levels are low in this strain, we demonstrate that inducing egg retention in this background markedly increases exopher generation from a baseline of near zero to ~60% (Fig. 3H), supporting that uterine distention, rather than reporter identity or over-expression alone, drives early life exopher elevation.

      Also noteworthy is that we find exophergenesis in the single-copy transgenic line is only approximately 0.3% on adult day 2 (average in three trials, data not shown), which is much lower than the 5-20% exophergenesis rate typically observed in the multi-copy high expression mCherry transgenic line. Therefore, consequences of overexpression of mCherry likely potentiate exophergenesis.

      -The authors mention that exophergenesis has been described in muscle cells. Is this also dependent on the proximity to the uterus? It would have been interesting to include data on other cell types in the vicinity of the reproductive system.

      Yes, in interesting work on exophers produced by muscle, Turek et al. reported that muscle exopher events are mostly located in a region proximal to the uterus. Moreover, this work also documented that sterile hermaphrodites are associated with approximately 0% muscle exophergenesis, and egg retention in the uterus strongly increases muscle exophergenesis (PMID: 34288362).  

      -Is exophergenesis also induced by other forms of mechanical stress? For example, swimming.

      We have looked at crude treatments such as centrifugation or vortexing without observing changes in exopher levels. Our preliminary work indicates that swimming can increase exophergenesis, and this effect depends on the presence of eggs in the uterus. We appreciate the question, and expect to include documentation of alternative pressure screening in our planned future paper on molecular mechanisms.

      -In Figure 1E, the profile of exopher production for the control condition at 25°C is very similar to the profile observed at 20°C in Figure 1B. However, the profile of progeny production at 25°C is known to have an earlier peak of progeny production. Perhaps egg retention is differently correlated with progeny production at this temperature? The authors could easily test this.

      Overall, exophers (which degrade with time) and progeny counts (a fixed number) have slightly different temporal features, anchored in part by how long exophers or their “starry night” debris persist. Most exophers start to degrade within 1-6 hours (PMID: 36861960), but exopher debris can persist for more than 24 hours. An exopher event observed on day 1 may thus also be recorded at the day 2 time point, which leads to a higher frequency of exopher events on day 2 as compared to day 1.

      We have previously published on the impact of temperature on exopher number (Supplemental Figure 2 in PMID 34475208). In brief, increasing culture temperature for animals that are raised at a constant lifetime temperature modestly increases exopher number; a greater increase in exophers is observed under conditions in which animals were switched to a higher temperature in adult life, suggesting that changes in temperature (a mandatory part of the ts mutant studies) engage complex biology that modulates exopher production. Our previous data show that in a temperature shift to 25°C, the peak of exophers was at adult day 1. Here, Fig. 1B is at constant temperature, 20°C; Fig. 1E has a temperature shift of 15-25°C. That egg retention might be temperature-influenced is a plausible hypothesis, but given the complexities of temperature shifts for some mutants, we elected to defer drill-down on the temperature-exopher-egg relationship. 

      -It is not clear how to compare panels A and B in Figure 3. In panel A the males are present throughout the adult life of the hermaphrodites whereas in panel B the males are added in later life. Therefore, the effect of later-life mating on progeny production is not shown and the title of panel A in the legend is misleading. The authors need to perform a progeny count in the same conditions of mating presented in Figure 3B to allow direct comparison.

      As Reviewer 1 suggested, we performed a new progeny count now presented in new Fig. 3A, which more appropriately matches the study presented in Fig. 3B; legends adjusted.

      -On page 12, the authors state that the baseline of exophergenesis in rollers is 71%, but then attribute the 71% in Figure 4F to exophergenesis specifically in ALMR that is posterior to AVM. The authors need to clarify this point.

      Good catch on our error. The baseline of exophergenesis in rollers is ~40%, and we corrected the main text.

      -Considering the conclusion of Figure 2 that blocking embryonic events past the 4-cell stage does not impact exopher production, it would have been interesting to compare the uterine length for emb-8 and for mex-3, since it is quite intriguing that the former suppresses exopher production while the latter has no effect.

      We repeated the emb-8 and mex-3 RNAi for these studies and encountered variability in outcome for 2 cell stage disruption via emb-8 RNAi, which is consistent with the range of published endpoints for emb-8 RNAi. We elected to include these emb-8 findings in the figure legend 2G, but removed the RNAi data from the main text figure. mex-3 uterine measures are added to revised panels 5H, 6I.

      Reviewer #2 (Recommendations For The Authors): 

      -Leaving the worms in halocarbon oil for too long (e.g. 10 min) can desiccate and kill them. Did the authors take them out of the oil before analyzing exopher production? The authors refer to these as 'sustained injections' without much description beyond that. As the worms are very small, the flow rate needed for a sustained injection over 2 minutes must be very low - so low that the needle is in danger of being clogged. Do the authors have an estimate of how much fluid was injected or the overall flow rate? I realize the flow rate measured outside of the worm may not compare directly to that of a pressurized worm, but such estimates would be instructive, particularly if they can be related to the relative volume of the eggs the injection is trying to mimic.

      After injection or mock injection, we removed the animal from the oil and flipped it if necessary to observe the ALMR neuron on the NGM-agar plate. We have now expanded the description of the experimental details of injection, including the estimated flow rate, in the revised Methods section.

      - The authors describe the ALMR neurons as "proteostressed", but I am not clear on whether these neurons were treated in a unique procedure to induce such a state or if the authors are merely building on other observations that egg-laying adults are dedicating significant resources to egg production, so they must be proteostressed. If they are not inducing a proteostressed state in their experiments, the authors should refrain from describing their neurons and effects as depending on such a state.

      We revised the text to feature more explicitly the published evidence that the ALMR neurons we track with mCherryAg2 bz166 are likely proteostressed. Overexpression of mCherry in bz166 is associated with enlargement of lysosomes and formation of large mCherry foci that often correspond to LAMP::GFP-positive structures in ALMR neurons (PMID: 28178240; PMID: 37488107). Marked changes in ultrastructure reflect TN stress in this background. These cellular features are not seen in wild type animals. We previously published that over-expression of mCherry, polyQ74, polyQ128, or Ab1-42 (all of which enhance proteostress) increases exophers (PMID: 28178240). Likewise, most genetic compromise of different proteostasis branches (heat shock chaperones, proteasome, and autophagy) promotes exophergenesis, supporting exophergenesis as a response to proteostress. In sum, the mCherryAg2 bz166 neurons appear markedly more stressed than those of a non-over-expressing line and produce more exophers. RNAi knockdown of the mCherry lowers exopher levels (PMID: 28178240).

      In response to reviewer comment, we added a study with a single copy mKate reporter (new data Fig. 3H). We find a very low baseline of exophers in this background. This would support that high autonomous compromise associated with over-expression influences exopher levels. Interestingly, however, we found that ALMR neurons expressing mKate under a single-copy transgene still exhibit excessive exopher production (>60%) under high mechanical stress (Fig. 3H). These data are consistent with ideas that mechanical stresses can enhance exopher production, and may markedly lower the threshold for exophergenesis in close-to-native stress level neurons.

      - The authors should include more details on the source and use of the RNAi, for example, if the clones were from the Ahringer RNAi library, made anew for this study, or both.

      We now add this information in the methods section.

      - I would be curious if the authors would similarly see an induction in exopher production after acute vulval muscle silencing with histamine. I'm not suggesting this experiment, but it may offer a way to induce exophers in a more controlled manner.

      This is a great suggestion that we will try in future studies.

      - I am not sure if Figure 5 needs to be a main figure in the paper or if it would be more appropriate as a supplement.

      We considered this suggestion, but we think that the strikingly strong correlation of uterus length and exopher levels is a major point of the story, and these data establish a metric that we will use moving forward to distinguish whether an exopher modulation disruption is more likely to act by modulation of reproduction or modulation of touch neuron biology. For this reason, we elected to keep Figure 5 in the main text.

      Reviewer #3 (Recommendations For The Authors): 

      -The Statistics section in the methods should be expanded to describe the statistics used in the experiments that aren't nominal, of which there are many.

      We have updated and expanded the statistics section.

      -P.2 Line 49 spelling 'que' should be queue (I remember this by the useless queue of letters lined up after the 'q').

      Corrected 

      -The introduction has a bit too much information about oocyte maturation, not relevant to the study.

      We agree that the information about oocyte maturation is not critical for laying out the related experiments and cut this section to improve focus.

      -p.3 line 22: Some exophers are seen on Day 3, so this should be restated for accuracy.

      Corrected

      -p.3 line 26. Explain here why sperm is necessary (oocytes don't mature or ovulate effectively without sperm).

      We added this clarifying explanation.

      -p.3 line 44 Clarify in the spe-44 the oocytes are in the oviduct (not the uterus). Might be helpful to include a DIC image to accompany the helpful diagram in Figure 1D. 

      We added a sentence describing the impact of sperm absence on oocyte maturation, progression into the uterus, and retention in the gonad, with reference to PMID: 17472754. We were able to add a DIC image to the tightly packed Figure 1.

      In Supplemental Figure 6, we now include a field picture of oocyte retention in the sem-2 mutant and upon treatment of lin-39(RNAi).

      -p.5 line 3 in the Figure 1D legend; recommend delete 'light with' which is confusing and just refer to the sperm as dark dots. 

      Corrected

      -p.6 line 22-24 Check for alignment of the statements with Figure 2 (2F is cited, but it should be 2G).

      Corrected

      -p12 line 13-15; Many ALMRs not in the egg zone (70%) did not produce exophers - this is still quite a lot. It would be good to state this section in a more straightforward way (less leading the reader) and if possible to give a possible explanation.

      We modified the text to be less leading: “Thus, although ALMR soma positioning in the egg zone does not guarantee exophergenesis in the mCherryAg2 strain, the neurons that did make exophers were nearly always in the egg zone.”

      -p.15 paragraph 3 - clarify how uterine length was controlled for the overall body length of the worm.

      We did not systematically measure body length, but rather focused on uterine distention. It would be of interest to determine if length of the body correlates with uterine size, and then address how that relationship translates to exopher production but here our attention came to rest on the striking correlation of uterine length and number of exophers.

      -p.17 line 23-25; Could be stated more simply. 

      We adjusted the text: “Moreover, the oocyte retention was similarly efficacious in elevating exopher production to egg retention, increasing ALMR exophergenesis to approximately 80% in the sem-2(rf) mutant (Fig. 6C)”.

      -p.23 Line 4. I think by the time the reader reaches this sentence, the egg-coincident exophergenesis will not be 'puzzling'. 

      Agreed, corrected.

      -p.26, Line 22, Male 'mating', not 'matting'.

      Corrected.

      -Throughout, leave space between number and unit (this is not required for degree or percent, but be consistent). 

      Corrected.

    1. Author Response:

      We thank the reviewers for their careful reading of the manuscript and for their comments. Generally, we agree with the reviewers on the strengths and weaknesses of our manuscript. It is true that this work is a first step towards understanding the molecular mechanisms underlying TNT formation, and that further biochemical and biophysical analyses will be necessary to elucidate CD9 and CD81 roles. It also provides a toolbox for the future identification of important TNT factors, and perhaps biological markers.

      However, we would like to better explain our choice of focusing on CD9 and CD81 in TNTs, given the fact that they are also expressed in EVPs. First, both were among the most abundant integral membrane proteins in TNTs, and overexpression of CD9 was previously shown to increase TNT number. However, a recent work directed by our coauthor E. Rubinstein clearly showed that the absence of CD9, CD81 or even both has minimal impact on the production or composition of EVs in MCF7 (Fan et al, Differential proteomics argues against a general role for CD9, CD81 or CD63 in the sorting of proteins into extracellular vesicles, J. Extracell Vesicles, 2023;12:12352. https://doi.org/10.1002/jev2.12352). This is in line with another recent publication (Tognoli, Commun biol 2023) and with our results showing that the concentration of EVPs was the same when CD9 was overexpressed, i.e. in conditions where TNT number and vesicle transfer were increased. Therefore, it is highly probable that the role of CD9 and CD81 in TNT vs. EVP formation is different, even if we cannot completely exclude a crosstalk between the two pathways.

      Regarding the importance of CD9 and CD81 in TNT formation, our results are consistent with a non-exclusive regulation of the TNTs by these tetraspanins, and/or with partial compensatory mechanisms, mediated by as yet unknown factors, occurring in their absence. Interestingly, to our knowledge, none of the TNT regulators described in the literature has a complete inhibitory effect when knocked out. These results confirm that several pathways can converge to regulate TNTs and are consistent with cellular plasticity. So it is hard to say whether factors like CD9 and CD81, which regulate TNTs and have other functions in cells, are “key” or simply “important”.

      Finally, the model we present in Figure 7 is a schematic working model of possible CD9/CD81 roles, which is obviously simplified for ease of understanding. It is important to note that when we write “no TNT” above an empty space between 2 cells, this describes what is drawn, and corresponds to real conditions where fewer TNTs are detected. It was never our intention to over-interpret our data, but rather to make it clearer with this diagram, and we hope that reading the article will make this clear.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      We thank the reviewer for the time and effort in reviewing our revised manuscript and are grateful for their constructive comments and for acknowledging the significance of our work.

      Summary: 

      Their findings elucidate the mechanisms underlying 2-AA-mediated reduction of pyruvate transport into mitochondria, which impairs the interaction between ERRα and PGC1α, consequently suppressing MPC1 expression and reducing ATP production in tolerized macrophages. While the data presented is intriguing and the paper is well-written, there are several points that warrant consideration. The authors should enhance the clarity, relevance, and impact of their study. 

      Strengths: 

      This paper presents a novel discovery regarding the mechanisms through which PA regulates the bioenergetics of tolerized macrophages. 

      Weaknesses: 

      The relevance of the in vivo model to support the conclusions is questionable. Further clarification is needed on this point. 

      We appreciate the reviewer’s comment. Our conclusion that 2-AA decreases bioenergetics while sustaining bacterial burden is further supported by additional in vivo data that we now present in Fig. S5. To strengthen the relevance of our in vivo data, we performed additional experiments. In this set of in vivo studies, mice received the first exposure to 2-AA by injection of 2-AA only and the second exposure through infection with PA14 or ΔmvfR four days post-2-AA injection. As shown in supplementary Figure S5, the levels of ATP and acetyl-CoA in the spleen of infected animals and the bacterial counts were similar between PA14- and ΔmvfR-infected mice receiving the first 2-AA exposure, and agree with the “one-shot infection” findings presented in Figure 5 with the PA14 or ΔmvfR+2-AA infected mice or those receiving 2-AA only. These results are consistent with our previous findings showing that 2-AA impedes the clearance of PA14 (Bandyopadhaya et al. 2012; Bandyopadhaya et al. 2016; Tzika et al. 2013) and provide compelling evidence that the metabolic alterations identified may favor PA persistence in infected tissues.

      Reviewer #2 (Public Review): 

      We thank the reviewer for the time and effort in reviewing our revised manuscript and are grateful for their constructive comments and for acknowledging the significance of our work.

      Summary: 

      The study tries to connect energy metabolism with immune tolerance during bacterial infection. The mechanism details the role of pyruvate transporter expression via ERRalpha-PGC1 axis, resulting in pro-inflammatory TNF alpha signalling responsible for acquired infection tolerance. 

      Strengths: 

      Overall, the study is an excellent addition to the role of energy metabolism during bacterial infection. The mechanism-based approach in dissecting the roles of metabolic coactivator, transcription factor, mitochondrial transporter, and pro-inflammatory cytokine during acquired tolerance towards infections indicates a detailed and well-written study. The in vivo studies in mice nicely corroborate with the cell line-based data, indicating the requirement for further studies in human infections with another bacterial model system. 

      Weaknesses:

      The authors have involved various mechanisms to justify their findings. However, they have missed out on certain aspects which connect the mechanism throughout the paper. For example, they measured ATP and acetyl-CoA production linked with bacterial re-exposures and added various targets like MPC1, ERR alpha, PGC1 alpha, and TNF alpha. However, they skipped PGC1 alpha levels, ATP and acetyl-CoA in various parts of the paper. Including the details would make the work more comprehensive. 

      We appreciate the reviewer’s comments and apologize for omitting the PGC-1α levels.  Per the reviewer’s suggestion, we have added the PGC-1α transcript levels (Figure 4C) in the section describing 2-AA-mediated dysregulation of the ERRα and MPC1 transcription (lines 243-252). Moreover, we have added Figure S5, which shows additional ATP and acetyl CoA levels in vivo. In our view, ATP and acetyl-CoA levels are shown in all appropriate settings, interrogating the bioenergetics, including in the presence of bacteria and in their absence, where only 2-AA is added. Please see Figures 1 and 5 and the newly added Figure S5.

      The use of public data sets to support their claim on immune tolerance is missing. Including various data sets of similar studies will strengthen the findings independently. 

      If we understand the reviewer’s comment regarding public data sets on immune tolerance correctly, we note that we refer only to our own data because there are no published data from other groups on 2-AA tolerization and because the outcome of the 2-AA effect on the bacterial burden differs from that of LPS. Therefore, this study did not include a comparison with published LPS data.

      Reviewer #1 (Recommendations For The Authors): 

      (1) Animal model: The authors appropriately initiated the study with an in vitro tolerization model involving 2-AA re-exposure, providing foundational insights for further investigation. However, the rationale for the one-shot injection in the in vivo model lacks clarity. To strengthen the relevance of the in vivo data, the authors should consider establishing a model involving bacterial re-exposure, such as a two-challenge paradigm with antibiotic treatment in between. This approach would allow for the examination of peritoneal macrophages harvested from mice, assessing ATP levels, acetyl CoA, TNF production, and bacterial counts. Such an approach would better align the in vivo findings with the in vitro experiments, confirming the role of tolerized macrophages in controlling PA infection in the presence of 2-AA. 

      We thank the reviewer for this comment. Indeed, we have performed a similar two-challenge paradigm study in which the first exposure to 2-AA is achieved by injecting 2-AA, and the second exposure through infection with PA14 or ΔmvfR four days post-2-AA injection. The results of Figure S5 can be directly compared with those of the in vivo studies in Fig 5. As shown in supplementary Figure S5, the levels of ATP and acetyl-CoA in the spleen of infected animals and the enumeration of the bacterial counts agree with the “one-shot infection” presented in Fig 5 (PA14 or ΔmvfR+2-AA). Although the Figure S5 study was not included initially, to simplify data presentation, it was performed in parallel with the Fig 5 study, and thus the two can be directly compared.

      (2) Exogenous ATP treatment: It is crucial to explore whether 2-AA re-exposure suppresses inflammasome activation and whether this suppression can be reversed by exogenous ATP treatment. Specifically, the authors should investigate whether NLRP3 inflammasome activation is inhibited in tolerized macrophages and whether such activation is necessary for host defense. Clarifying these points would provide valuable insights into the mechanisms underlying macrophage tolerization induced by 2-AA. 

      Excellent point. We agree, indeed, this is planned in the near future.

      (3) Figures 4C and D: The authors should exercise care in describing these figures. For instance, line 263 states that "UK5099 had no effect on the PA14 burden in macrophages," which requires correction for accuracy. 

      We apologize and have rephrased this sentence and other sentences referring to Fig 4D and 4E in this section. Please see the highlighted sentences in the results section referring to Fig 4. For example, “The addition of the UK5099 inhibitor strongly enhanced the bacterial intracellular burden in ΔmvfR infected macrophages compared to the non-inhibited ΔmvfR infected cells, reaching a similar burden to those infected with PA14 (Fig. 4D)”.

      (4) ERRα expression: While the study intriguingly demonstrates a decrease in ERRα levels in tolerized macrophages following exposure to 2-AA, the discussion of this finding is lacking. It is worth exploring the possibility of increasing ERRα expression to counteract the tolerization induced by 2-AA and enhance clearance of PA infection. This avenue should be thoroughly discussed in the manuscript's Discussion section, offering insights into potential therapeutic strategies to mitigate the effects of 2-AA on macrophage function. 

      Thank you so much for this additional comment.  We have now included this point in the discussion section (lines 373-376).

      Reviewer #2 (Recommendations For The Authors): 

      Overall, the study is an excellent addition to the role of energy metabolism during bacterial infection. The mechanism-based approach in dissecting the roles of metabolic coactivator, transcription factor, mitochondrial transporter, and pro-inflammatory cytokine during acquired tolerance indicates a detailed and well-written study. However, connecting the mechanisms often was not reflected in some of the experiments, and answering a few concerns/suggestions will undoubtedly improve the study's readability, appeal, and overall impact on a broader audience. 

      (1) The authors should rephrase the title if possible. The title indicates 2AA as a bacterial quorum sensing signal; however, throughout the manuscript, there are no studies associated with actual quorum sensing in bacteria. 

      Thank you for this comment. However, the title indicates 2-AA as a quorum sensing molecule because the synthesis of this signaling molecule is uniquely regulated by quorum sensing. Because of its importance in the virulence of Pseudomonas aeruginosa and its regulation by quorum sensing, we feel that it is appropriate to refer to it as such.

      (2) The authors generalised immunotolerance and memory of 2AA-exposed cells to broad-spectrum microbial exposure by just testing with LPS exposure. I would suggest they test at least 2 more heterologous microbial products known to elicit a response and confirm their claim from Figure 1.

      We appreciate the reviewer’s comment. We do not intend to generalize immunotolerance and memory of 2-AA exposed cells to broad-spectrum microbial exposure. Moreover, since the manuscript is not focused on comparing other bacterial molecules to 2-AA and multiple studies have focused on LPS tolerance, we tested only LPS in the manuscript.

      (3) LPS triggers ATP production through glycolysis in nitric oxide (NO) dependent mechanisms in various immune and non-immune cells. The authors should study the concentrations of NO, Glucose, and Pyruvate levels to clarify the mechanism of energy dynamics and the source of ATP and Acetyl CoA generated/scavenged during primary and secondary exposures to both 2AA and LPS. 

      We agree that a cross-tolerization experiment using 2-AA and LPS would reveal interesting insights into immune response during PA infections. However, this is out of the scope of this article. Please note that 2-AA and LPS tolerization are mechanistically distinct: for example, they rely on different HDAC enzymes, and LPS tolerization predominantly involves changes in H3K27 acetylation (Lauterbach et al. 2019). In contrast, 2-AA tolerization involves H3K18 modifications (Bandyopadhaya, Tsurumi, and Rahme 2017). For this reason, the complexity of such interactions would require a comprehensive set of experiments that are not part of the focus of this study.

      (4) Immunogenic triggers often rapidly alter mitochondrial membrane potential, which alters oxygen consumption rates. However, the authors tend to generalize energy homeostasis and claim the deregulation of OXPHOS-inducing quiescent phenotype depending upon OCR measurements from Figure 1D. The authors must evaluate mitochondrial health and membrane potential during first and second exposure in a time-dependent manner to strengthen their theory of mitochondrial dysfunction. The authors should also check the phenomena in vivo (mice exposed to infection) if possible. 

      Thank you for this suggestion. We now include electron microscopy images of mitochondria isolated from macrophages exposed to 2-AA. Results revealed that 2-AA alters mitochondrial morphology and cristae, supporting the mitochondrial dysfunctionality caused by 2-AA. These results are shown in Figure S4 and lines 185-188.

      (5) Since both MCP1 and MCP2 transporters are known to transport pyruvate to mitochondria, checking both MCP1 and 2 at transcript and protein levels in exposed cells will be essential. I suggest authors use MCP inhibitors or use RNA interference against MCPs to check the effect on tolerance of the cells exposed for a second time. 

      To our understanding, the mitochondrial pyruvate carrier proteins MPC1 and MPC2 form a hetero-oligomeric complex in the inner mitochondrial membrane to facilitate pyruvate import into mitochondria (McCommis and Finck 2015). We also used UK5099, an MPC inhibitor, for enumeration of bacterial load in macrophages in Figure 4 and observed a similar effect to that of 2-AA, suggesting a similar mechanism of action.

      (6) The pyruvate levels of mitochondria in Figure 2A are shallow, and the authors claim statistical significance within a 1.5-fold change. The authors should cross-check the number of mitochondria they are isolating while estimating pyruvate from only mitochondrial fractions. Another point is, correlating mitochondrial pyruvate with the burst of ATP during first exposure in comparison to second exposure, one can argue that the number of mitochondria is variable between the exposures leading to a change in pyruvate amount (mitochondria number increases to compensate for the first exposure and decreases quickly to maintain homeostasis and remains quiescent during a second exposure due to activation of compensatory immune mechanism towards primary exposure). How do authors address the issue? 

      Our electron microscopy studies indicate that, although no reduction in mitochondrial numbers is observed in macrophages after 2-AA exposure, alterations in mitochondrial morphology and cristae are observed. Please also see our answer to point #4.

      (7) The authors claim that ERR alpha regulates MCP1 transcription via activation of ERRalpha-PGC1 alpha axis and tolerization in cells to second exposure is due to impairment of the axis (Figure 3). PGC1 alpha is known to be induced during various metabolic, physiological, and immune-challenge-related stress in a tissue-dependent manner. In this context, one should expect changes in transcript and protein levels of PGC1 alpha. The authors must study PGC1 alpha levels with time-dependent exposures. LPS was shown to induce oscillations in PGC1 alpha levels in a tissue-specific manner. In experiments, authors should verify if such oscillations persist during time-dependent exposure, emphasising mitochondrial uncoupling that might get dampened during re-exposures to microbial challenges. 

      We appreciate the suggestion. We have now included PGC-1α (Figure 4C) transcript levels, which show the same profile as the transcript levels of ERRα and MPC1. Please note that PGC-1α is only one of several ERRα co-activators; therefore, the amount of ERRα protein is the most relevant assessment regarding the activation of the MPC1 transcription.

      (8) The authors claim that ERRalpha induces MCP1 through ChIP data in Figure 3. However, the physical verifications at mRNA levels and mutational/inhibitor-based experiments are missing. The authors should study the alterations of MCP1 mRNA in relation to exposures and inhibitors of ERRalpha and PGC1 alpha to strengthen their work. 

      This is an interesting approach; however, this experiment exceeds the scope of our manuscript. We will certainly consider this suggestion in our future experiments. Thank you.

      (9) Publicly available data sets with LPS exposures should be analyzed for gene sets pertaining to mitochondrial OXPHOS, metabolism, immune response, etc. This will support the authors' work and provide a global overview of transcriptome associated with immune tolerance. 

      We appreciate the reviewer’s comment. For the reasons explained in point #3 and because the effect of 2-AA on the bacterial burden differs from that of LPS, comparison with published LPS data was not considered in this study. We agree that in the future, a comprehensive comparison of whole genome transcriptome studies between LPS and 2-AA may reveal important insights that may also help better understand and potentially classify the immune tolerance triggered by 2-AA.

      (10) In Figure 4, the authors study the role of MCP1 and associated pyruvate-dependent bacterial clearance during tolerization and associate them with a decrease in TNF alpha. I would suggest the addition of an ERR alpha inhibitor in these experiments. It is not clear as to why (mechanism) TNF alpha transcription was affected via pyruvate transport during bacterial exposure. I would suggest that the authors clarify the mechanism of TNF alpha activation/inactivation and its association with energy metabolism during acquired tolerance. 

      This is an excellent suggestion, given that a similar effect of ERRα on TNF-α was observed by other researchers (Chaltel-Lima et al. 2023).  Here, to clarify the mechanism of TNF alpha activation/inactivation and its association with energy metabolism, we elaborate on this aspect in the discussion section.

      Lines 388-393. The text reads:

      Previously, we reported that 2-AA tolerization induces histone deacetylation via HDAC1, reducing H3K18ac at the TNF-α promoter (Bandyopadhaya et al. 2016). The findings with acetyl-CoA reduction, the primary substrate of histone acetylation, and the TNF-α transcription using UK5099 and ATP in 2-AA treated macrophages are in support of the bioenergetics disturbances observed in macrophages and their link to epigenetic modifications we have shown to be promoted by 2-AA (Bandyopadhaya et al. 2016).

      (11) It is surprising that authors specifically target TNF alpha as a pro-inflammatory cytokine during tolerance. Various reports of cytokines and immune modulatory factors play a vital role in immune tolerance upon bacterial exposure. I would suggest authors perform cytokine profiling or check public data sets to specify their reason for choosing TNF alpha. 

      The choice of TNF-α is based on the results obtained in our previous study (Bandyopadhaya et al. 2016).

      Bandyopadhaya, A., M. Kesarwani, Y. A. Que, J. He, K. Padfield, R. Tompkins, and L. G. Rahme. 2012. 'The quorum sensing volatile molecule 2-amino acetophenon modulates host immune responses in a manner that promotes life with unwanted guests', PLoS pathogens, 8: e1003024.

      Bandyopadhaya, A., A. Tsurumi, D. Maura, K. L. Jeffrey, and L. G. Rahme. 2016. 'A quorum-sensing signal promotes host tolerance training through HDAC1-mediated epigenetic reprogramming', Nat Microbiol, 1: 16174.

      Bandyopadhaya, A., A. Tsurumi, and L. G. Rahme. 2017. 'NF-kappaBp50 and HDAC1 Interaction Is Implicated in the Host Tolerance to Infection Mediated by the Bacterial Quorum Sensing Signal 2-Aminoacetophenone', Front Microbiol, 8: 1211.

      Chaltel-Lima, L., F. Domínguez, L. Domínguez-Ramírez, and P. Cortes-Hernandez. 2023. 'The Role of the Estrogen-Related Receptor Alpha (ERRa) in Hypoxia and Its Implications for Cancer Metabolism', Int J Mol Sci, 24.

      Lauterbach, M. A., J. E. Hanke, M. Serefidou, M. S. J. Mangan, C. C. Kolbe, T. Hess, M. Rothe, R. Kaiser, F. Hoss, J. Gehlen, G. Engels, M. Kreutzenbeck, S. V. Schmidt, A. Christ, A. Imhof, K. Hiller, and E. Latz. 2019. 'Toll-like Receptor Signaling Rewires Macrophage Metabolism and Promotes Histone Acetylation via ATP-Citrate Lyase', Immunity, 51: 997-1011 e7.

      McCommis, K. S., and B. N. Finck. 2015. 'Mitochondrial pyruvate transport: a historical perspective and future research directions', Biochem J, 466: 443-54.

      Tzika, A. A., C. Constantinou, A. Bandyopadhaya, N. Psychogios, S. Lee, M. Mindrinos, J. A. Martyn, R. G. Tompkins, and L. G. Rahme. 2013. 'A small volatile bacterial molecule triggers mitochondrial dysfunction in murine skeletal muscle', PloS one, 8: e74528.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      This study by Paoli et al. used a resonant scanning multiphoton microscope to examine olfactory representation in the projection neurons (PNs) of the honeybee with improved temporal resolution. PNs were classified into 9 groups based on their response patterns. Authors found that excitatory responses in the PNs precede the inhibitory responses by ~40 ms, and ~50% of PN responses contain inhibitory components. They built the neural circuit model of the mushroom body (MB) with evolutionarily conserved features such as sparse representation, global inhibition, and a plasticity rule. This MB model fed with the experimental data could reproduce a number of phenomena observed in experiments using bees and other insects, including dynamical representations of odor onset and offset by different populations of Kenyon cells, prolonged representations of after-smell, different levels of odor specificity for early/delay conditioning, and shift of behavioral timing in delay conditioning. The trace conditioning was not modeled and tested experimentally. Also, the experimental result itself is largely confirmatory to preceding studies using other organisms. Nonetheless, the experimental data and the model provide a solid basis for future studies.

      We thank the reviewer for summarizing the value of our study and recognizing its generality and significance. As suggested, in a revised version of the manuscript, we will discuss the implications of our approach for the context of trace conditioning. The model we presented hinges on the learning-induced plasticity of KC-to-MBON synapses recruited during the learning window (i.e., the simulated US arrival). In the case of trace conditioning, the model predicts that the time of the behavioral response should match the expected US arrival. Contrary to this prediction, preliminary analyses on empirical measurements of PER latency upon trace conditioning indicate this is not the case. In a revised version of the manuscript, we will discuss the differences between the predictions of the model and the experimental observations in a trace conditioning paradigm.

      Reviewer #2 (Public Review):

      The study presented by Paoli et al. explores temporal aspects of neuronal encoding of odors and their perception, using bees as a general model for insects. The neuronal encoding of the presence of an odor is not a static representation; rather, its neuronal representation is partly encoded by the temporal order in which parallel olfactory pathways participate and are combined. This aspect is not novel, and its relevance in odor encoding and recognition has been discussed for more than the past 20 years. 

      The temporal richness of the olfactory code and its significance have traditionally been driven by results obtained based on electrophysiological methods with temporal resolution, allowing the identification and timing of the action potentials in the different populations of neurons whose combination encodes the identity of an odor. On the other hand, optophysiological methods that enable spatial resolution and cell identification in odor coding lack the temporal resolution to appreciate the intricacies of olfactory code dynamics. 

      (1) In this context, the main merit of Paoli et al.'s work is achieving an optical recording that allows for spatial registration of olfactory codes with greater temporal detail than the classical method and, at the same time, with greater sensitivity to measure inhibitions as part of the olfactory code. 

      The work clearly demonstrates how the onset and offset of odor stimulation triggers a dynamic code at the level of the first interneurons of the olfactory system that changes at every moment as a natural consequence of the local inhibitory interactions within the first olfactory neuropil, the antennal lobe. This gives rise to the interesting theory that each combination of activated neurons along this temporal sequence corresponds to the perception of a different odor. The extent to which the corresponding postsynaptic layers integrate this temporal information to drive the perception of an odor, or whether this sequence is, in a sense, a journey through different perceptions, is challenging to address experimentally. 

      In their work, the authors propose a computational approach and olfactory learning experiments in bees to address these questions and evaluate whether the sequence of combinations drives a sequence of different perceptions. In my view, it is a highly inspiring piece of work that still leaves several questions unanswered. 

      We thank the reviewer for considering that our work has an inspiring nature. Below we have tried to answer the questions raised by the following comments, and we will include part of these answers in the revised version of our manuscript.

      (2) In my opinion, the detailed temporal profile of the response of projection neurons and their respective probabilities of occurrence provide valuable information for understanding odor coding at the level of neurons transferring information from the antennal lobes to the mushroom bodies. An analysis of these probabilities in each animal, rather than in the population of animals that were measured, would aid in better comprehending the encoding function of such temporal profiles. Being able to identify the involved glomeruli and understanding the extent to which the sequence of patterns and inhibitions is conserved for each odor across different animals, as it is well known for the initial excitatory burst of activity observed in previous studies without the fine temporal detail, would also be highly significant. 

      We thank the reviewer for recognizing the relevance of the findings in understanding the logic of olfactory coding. We agree about the importance of establishing if the different glomerular response profiles are evenly distributed across individuals or have individual biases. In the revised version of the manuscript, we will provide data on the distribution of response profiles for each animal and for different olfactory stimuli. Also, we fully agree on the importance of assessing to what extent such response profiles - largely determined by the local network of AL interneurons - are glomerulus-specific and conserved across individuals.

      In my view, the computational approach serves as a useful tool to inspire future experiments; however, it appears somewhat simplistic in tackling the complexity of the subject. One question that I believe the researchers do not address is to what extent the inhibitions recorded in the projection neurons are integrated by the Kenyon cells and are functional for generating odor-specific patterns at that level. 

      The model we proposed represents, indeed, a simplification of olfactory signal processing throughout the honey bee olfactory circuit. Still, it shows that simple but realistic rules can be sufficient to grasp some fundamental aspects of olfactory coding. However, we agree with the reviewer and believe that such a minimalistic model can provide a basis for designing future experiments in which complexity can be increased by adding relevant features, such as the learning-induced plasticity of PN-to-KC synapses or the divergence of multiple PNs from the same glomerulus to different KCs.

      Concerning the reviewer's question on the involvement of inhibitory inputs in generating odor-specific patterns at the level of the KCs, the short answer is yes: they contribute to the summed input of a target KC, and thus to the odor representation. In designing the model, we considered that a given glomerulus provides maximal input at maximal excitation and minimal input (=0 input) at maximal inhibition. For this reason, an inhibited glomerulus contributes less (to KC action potential probability) than a glomerulus showing baseline activity. This, in turn, contributes less than an excited glomerulus. From the modeling point of view, normalizing the signal between 0 and 1 (i.e., setting maximal inhibition to 0 and maximal excitation to 1) would yield a similar result as with the current approach, where values range from -25% to +30% ΔF/F. We will expand the model's description to clarify this point.
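
      The normalization mentioned above can be sketched as a simple linear rescaling. This is a minimal illustration, not the authors' implementation: the function name `normalize_glomerular_signal` and the example inputs are hypothetical, and the bounds are assumed to be the -25% and +30% values quoted in the response.

```python
def normalize_glomerular_signal(dff, lo=-25.0, hi=30.0):
    """Map a fluorescence change (in percent) linearly onto [0, 1].

    lo: assumed value at maximal inhibition, mapped to 0 input
    hi: assumed value at maximal excitation, mapped to 1 input
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    # Clip so that values outside the assumed range cannot leave [0, 1].
    clipped = min(max(dff, lo), hi)
    return (clipped - lo) / (hi - lo)


# Hypothetical example inputs: inhibited, baseline, and excited glomeruli.
inhibited = normalize_glomerular_signal(-25.0)
baseline = normalize_glomerular_signal(0.0)
excited = normalize_glomerular_signal(30.0)
```

      Because the rescaling is monotonic, the ordering stated in the response is preserved: an inhibited glomerulus contributes less than one at baseline, which in turn contributes less than an excited one.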

      Lastly, the behavioral result indicating a difference in conditioned response latency after early or delayed learning protocol is interesting. However, it does not align with the expected time for the neuronal representation that was theoretically rewarded in the delayed protocol. This final result does not support the authors' interpretation regarding the existence of a smell and an after-smell as separate percepts that can serve as conditioned stimuli.

      Considering that our odor stimulus lasted 5 seconds, glomerular activity is highly variable at odor onset (i.e., within the first 1s) because of short excitatory response profiles and the delayed and slower onset of inhibitory responses. After the initial phase, the neural representation of the stimulus becomes more stable. Consequently, a neural signature learned in the case of delay conditioning, i.e., with the US appearing towards the end of the olfactory stimulation (t = 4 - 5s), may present itself much earlier (t = 1.5s), triggering a behavioral response that largely anticipates the expected US arrival time. 

      In the model, we observe an early decrease in action potential probability even in the case of delay conditioning. This occurs because the synapses recruited during the last second of olfactory stimulation (within the learning window during which CS and US overlap) become inactive. Because odorant-induced activity recruits highly overlapping synaptic populations between 1.5 and 5 s from the onset, a learning-induced inactivation of part of these synapses will result in a reduced action-potential probability in the modeled MBON. Importantly, this event will not be governed by time but by the appearance of the learned synaptic configuration. 
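
      The argument above can be illustrated with a toy calculation. The sketch below is not the authors' model: the Kenyon cell (KC) patterns, the time bins, the unit weights, and the use of a summed synaptic drive as a stand-in for MBON action-potential probability are all assumptions made for illustration.

```python
# Toy KC activity: which Kenyon cells (by index) are active in each time bin (s).
# All patterns are invented; they only mimic a variable onset followed by
# a stable, highly overlapping representation from 1.5 s onward.
kc_activity = {
    0.5: {0, 1, 2, 3},   # odor onset: transient pattern
    1.0: {2, 3, 4, 5},   # representation still changing
    1.5: {6, 7, 8, 9},   # stable phase begins
    2.5: {6, 7, 8, 10},
    3.5: {6, 7, 8, 9},
    4.5: {6, 7, 8, 9},   # learning window (CS and US overlap)
}


def mbon_drive(weights, active):
    """Summed KC-to-MBON input for one time bin."""
    return sum(weights[k] for k in active)


def train(weights, activity, window):
    """Inactivate every synapse whose KC is active inside the learning window."""
    start, end = window
    learned = dict(weights)
    for t, active in activity.items():
        if start <= t <= end:
            for k in active:
                learned[k] = 0.0
    return learned


naive = {k: 1.0 for k in range(11)}
trained = train(naive, kc_activity, window=(4.0, 5.0))

# Loss of MBON drive after learning, per time bin.
drop = {t: mbon_drive(naive, a) - mbon_drive(trained, a)
        for t, a in kc_activity.items()}
first_drop_time = min(t for t, d in drop.items() if d > 0)
```

      In this toy setting the drive first falls at 1.5 s, although learning took place in the 4 to 5 s window, because the reduction follows the reappearance of the learned KC set rather than elapsed time.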

      We will add a new section to the revised version of the manuscript to clarify this concept and perform further analyses to characterize the contribution of different response types to the modeled response latency.

    1. Author response:

      The following is the response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Nitta et al. present a manuscript titled, "Drosophila model to clarify the pathological significance of OPA1 in autosomal dominant optic atrophy." The novelty of this paper lies in its use of human OPA1 (hOPA1) to try to rescue the phenotype of an OPA1 +/- Drosophila DOA model (dOPA). The authors then use this model to investigate the differences between dominant-negative and haploinsufficient OPA1 variants. The value of this paper lies in the study of DN/HI variants rather than the establishment of the Drosophila model per se as this has existed for some time and does have some significant disadvantages compared to existing models, particularly in the extra-ocular phenotype which is common with some OPA1 variants but not in humans. I judge the findings of this paper to be valuable with regards to significance and solid with regards to the strength of the evidence.

      Suggestions for improvements:

      (1) Stylistically the results section appears to have significant discussion/conclusion/inferences in section with reference to existing literature. I feel that this information would be better placed in the separate discussion section. E.g. lines 149-154.

      We appreciate the reviewer’s suggestion to relocate the discussion, conclusions, and inferences, particularly those that reference existing literature, to a separate discussion section. For lines 149–154, we placed them in the discussion section (lines 343–347) as follows. “Our established fly model is the first simple organism to allow observation of degeneration of the retinal axons. The mitochondria in the axons showed fragmentation of mitochondria. Former studies have observed mitochondrial fragmentation in S2 cells (McQuibban et al., 2006), muscle tissue (Deng et al., 2008), segmental nerves (Trevisan et al., 2018), and ommatidia (Yarosh et al., 2008) due to the LOF of dOPA1.”

      For lines 178–181, we also placed them in the discussion section (lines 347–351) as follows. “Our study presents compelling evidence that dOPA1 knockdown instigates neuronal degeneration, characterized by a sequential deterioration at the axonal terminals and extending to the cell bodies. This degenerative pattern, commencing from the distal axons and progressing proximally towards the cell soma, aligns with the paradigm of 'dying-back' neuropathy, a phenomenon extensively documented in various neurodegenerative disorders (Wang et al., 2012). ”

      For lines 213–217, 218–220, and 222–223, we also placed them in the discussion section (lines 363– 391) as follows. “To elucidate the pathophysiological implications of mutations in the OPA1 gene, we engineered and expressed several human OPA1 variants, including the 2708-2711del mutation, associated with DOA, and the I382M mutation, located in the GTPase domain and linked to DOA. We also investigated the D438V and R445H mutations in the GTPase domain and correlated with the more severe DOA plus phenotype. The 2708-2711del mutation exhibited limited detectability via HA-tag probing. Still, it was undetectable with a myc tag, likely due to a frameshift event leading to the mutation's characteristic truncated protein product, as delineated in prior studies (Zanna et al., 2008). Contrastingly, the I382M, D438V, and R445H mutations demonstrated expression levels comparable to the WT hOPA1. However, the expression of these mutants in retinal axons did not restore the dOPA1 deficiency to the same extent as the WT hOPA1, as evidenced in Figure 5E. This finding indicates a functional impairment imparted by these mutations, aligning with established understanding (Zanna et al., 2008). Notably, while the 2708-2711del and I382M mutations exhibited limited functional rescue, the D438V and R445H mutations did not show significant rescue activity. This differential rescue efficiency suggests that the former mutations, particularly the I382M, categorized as a hypomorph (Del Dotto et al., 2018), may retain partial functional capacity, indicative of a LOF effect but with residual activity. The I382M missense mutation within the GTPase domain of OPA1 has been described as a mild hypomorph or a disease modifier. Intriguingly, this mutation alone does not induce significant clinical outcomes, as evidenced by multiple studies (Schaaf et al., 2011; Bonneau et al., 2014; Bonifert et al., 2014; Carelli et al., 2015). 
      A significant reduction in protein levels has been observed in fibroblasts originating from patients harboring the I382M mutation. However, mitochondrial volume remains unaffected, and the fusion activity of mitochondria is only minimally influenced (Kane et al., 2017; Del Dotto et al., 2018). This observation is consistent with findings reported by de la Barca et al. in Human Molecular Genetics 2020, where a targeted metabolomics approach classified I382M as a mild hypomorph. In our current study, the I382M mutation preserves more OPA1 function compared to DN mutations, as depicted in Figures 5E and F. Considering the results from our Drosophila model and previous research, we hypothesize that the I382M mutation may constitute a mild hypomorphic variant. This might explain its failure to manifest a phenotype on its own, yet its contribution to increased severity when it occurs in compound heterozygosity.”

      (2) I do think further investigation is needed as to why a reduction of mitochondria was noticed in the knockdown. There are conflicting reports on this in the literature. My own experience of this is fairly uniform mitochondrial number in WT vs OPA1 variant lines but with an increased level of mitophagy presumably reflecting a greater turnover. There are a number of ways to quantify mitochondrial load e.g. mtDNA quantification, protein quantification for tom20/hsp60 or equivalent. I feel the reliance on ICC here is not enough to draw conclusions. Furthermore, mitophagy markers could be checked at the same time either at the transcript or protein level. I feel this is important as it helps validate the Drosophila model as we already have a lot of experimental data about the number and function of mitochondria in OPA1+/- human/mammalian cells.

      We thank the reviewer for the insightful comments and suggestions regarding our study on the impact of mitochondrial reduction in a knockdown model. We concur with the reviewer’s observation that our initial results did not definitively demonstrate a decrease in the number of mitochondria in retinal axons. Furthermore, we measured mitochondrial quantity by conducting western blotting using anti-COXII and found no reduction in mitochondrial content with the knockdown of dOPA1 (Figure S4A and B). Consequently, we have revised our manuscript to remove the statement “suggesting a decreased number of mitochondria in retinal axons. However, whether this decrease is due to degradation resulting from a decline in mitochondrial quality or axonal transport failure remains unclear.” Instead, we have refocused our conclusion to reflect our electron microscopy findings, which indicate reduced mitochondrial size and structural abnormalities. The reviewer’s observation of consistent mitochondrial numbers in WT versus mutant variant lines and elevated mitophagy levels prompted us to evaluate mitochondrial turnover as a significant factor in our study. Regarding verifying mitophagy markers, we incorporated the mito-QC marker in our experimental design. In our experiments, mito-QC was expressed in the retinal axons of Drosophila to assess mitophagy activity upon dOPA1 knockdown. We observed a notable increase in mCherry positive but GFP negative puncta signals one week after eclosion, indicating the activation of mitophagy (Figure 2D–H). This outcome strongly suggests that dOPA1 knockdown enhances mitophagy in our Drosophila model. The application of mito-QC as a quantitative marker for mitophagy, validated in previous studies, offers a robust approach to analyzing this process. Our findings elucidate the role of dOPA1 in mitochondrial dynamics and its implications for neuronal health.
These results have been incorporated into Figure 2, with the corresponding text updated as follows (lines 159–167): “Given that an increase in mitophagy activity has been reported in mouse RGCs and nematode ADOA models (Zaninello et al., 2022; Zaninello et al., 2020), the mitoQC marker, an established indicator of mitophagy activity, was expressed in the photoreceptors of Drosophila. The mito-QC reporter consists of a tandem mCherry-GFP tag that localizes to the outer membrane of mitochondria (Lee et al., 2018). This construct allows the measurement of mitophagy by detecting an increase in the red-only mCherry signal when the GFP is degraded after mitochondria are transported to lysosomes. Post dOPA1 knockdown, we observed a significant elevation in mCherry positive and GFP negative puncta signals at one week, demonstrating an activation of mitophagy as a consequence of dOPA1 knockdown (Figure 2D–H).”  

      (3) Could the authors comment on the failure of the dOPA1 rescue to return their biomarker, axonal number, to control levels? In Figure 4D, is there a significant difference between the control and rescue? Presumably so, as there is one between the mutant and rescue and that difference looks smaller.

      As the reviewer correctly pointed out, there is a significant difference between the control and rescue groups, which we have now included in the figure. Additionally, we have incorporated the following comments in the discussion section (lines 329–342) regarding this significant difference: “In our study, expressing dOPA1 in the retinal axons of dOPA1 mutants resulted in significant rescue, but it did not return to control levels. There are three possible explanations for this result. The first concerns gene expression levels. The Gal4-line used for the rescue experiments may not replicate the expression levels or timing of endogenous dOPA1. Considering that the optimal functionality of dOPA1 may be contingent upon specific gene expression levels, attaining a wild-type-like state necessitates the precise regulation of these expression levels. The second is a non-autonomous issue. Although dOPA1 gene expression was induced in the retinal axons for the rescue experiments, many retinal axons were homozygous mutants, while other cell types were heterozygous for the dOPA1 mutation. If there is a non-autonomous effect of dOPA1 in cells other than retinal axons, it might not be possible to restore the wild-type-like state fully. The third potential issue is that only one isoform of dOPA1 was expressed. In mouse OPA1, to completely restore mitochondrial network shape, an appropriate balance of at least two different isoforms, l-OPA1 and s-OPA1, is required (Del Dotto et al., 2017). This requirement implies that multiple isoforms of dOPA1 are essential for the dynamic activities of mitochondria.”

      (4) The authors have chosen an interesting if complicated missense variant to study, namely the I382M with several studies showing this is insufficient to cause disease in isolation and appears in high frequency on gnomAD but appears to worsen the phenotype when it appears as a compound het. I think this is worth discussing in the context of the results, particularly with regard to the ability for this variant to partially rescue the dOPA1 model as shown in Figure 5.

      As the reviewer pointed out, the I382M mutation is known to act as a disease modifier. However, in our system, as suggested by Figure 5, I382M appears to retain more activity than DN mutations. Considering previous studies, we propose that I382M represents a mild hypomorph. Consequently, while I382M alone may not exhibit a phenotype, it could exacerbate severity in a compound heterozygous state. We have incorporated this perspective in our revised discussion (lines 375-391).

      “Notably, while the 2708-2711del and I382M mutations exhibited limited functional rescue, the D438V and R445H mutations did not show significant rescue activity. This differential rescue efficiency suggests that the former mutations, particularly the I382M, categorized as a hypomorph (Del Dotto et al., 2018), may retain partial functional capacity, indicative of a LOF effect but with residual activity. The I382M missense mutation within the GTPase domain of OPA1 has been described as a mild hypomorph or a disease modifier. Intriguingly, this mutation alone does not induce significant clinical outcomes, as evidenced by multiple studies (Schaaf et al., 2011; Bonneau et al., 2014; Bonifert et al., 2014; Carelli et al., 2015). A significant reduction in protein levels has been observed in fibroblasts originating from patients harboring the I382M mutation. However, mitochondrial volume remains unaffected, and the fusion activity of mitochondria is only minimally influenced (Kane et al., 2017; Del Dotto et al., 2018). This observation is consistent with findings reported by de la Barca et al. in Human Molecular Genetics 2020, where a targeted metabolomics approach classified I382M as a mild hypomorph. In our current study, the I382M mutation preserves more OPA1 function compared to DN mutations, as depicted in Figures 5E and F. Considering the results from our Drosophila model and previous research, we hypothesize that the I382M mutation may constitute a mild hypomorphic variant. This might explain its failure to manifest a phenotype on its own, yet its contribution to increased severity when it occurs in compound heterozygosity.”

      (5) I feel the main limitation of this paper is the reliance on axonal number as a biomarker for OPA1 function and ultimately rescue. I have concerns because a) this is not a well validated biomarker within the context of OPA1 variants b) we have little understanding of how this is affected by over/under expression and c) if it is a threshold effect e.g. once OPA1 levels reach <x% pathology develops but develops normally when opa1 expression is >x%. I think this is particularly relevant when the authors are using this model to make conclusions on dominant negativity/HI with the authors proposing that if expression of a hOPA1 transcript does not increase opa1 expression in a dOPA1 KO then this means that the variant is DN. The authors have used other biomarkers in parts of this manuscript e.g. ROS measurement and mito trafficking but I feel this would benefit from something else particularly in the latter experiments demonstrated in figure 5 and 6.

      The reviewer raised concerns regarding the adequacy of axonal count as a validated biomarker in the context of OPA1 mutants. In response, we corroborated its validity using markers such as MitoSOX, Atg8, and COXII. Experiments employing MitoSOX revealed that the augmented ROS signals resulting from dOPA1 knockdown were mitigated by expressing human OPA1. Conversely, the mutant variants 2708-2711del, D438V, and R445H did not ameliorate these effects, paralleling the phenotype of axonal degeneration observed. These findings are documented in Figure 5F, and we have incorporated the following text into section lines 248–254 of the results:

      “Furthermore, we assessed the potential for rescuing ROS signals. Similar to its effect on axonal degeneration, wild-type hOPA1 effectively mitigated the phenotype, whereas the 2708-2711del, D438V, and R445H mutants did not (Figure 5F). Importantly, the I382M variant also reduced ROS levels comparably to the wild type. These findings demonstrate that both axonal degeneration and the increase in ROS caused by dOPA1 downregulation can be effectively counteracted by hOPA1. Although I382M retains partial functionality, it acts as a relatively weak hypomorph in this experimental setup.”

      Moreover, utilizing mito-QC, we observed elevated mitophagy in our Drosophila model, with these results now included in Figure 2D–H. Given the complexity of the genetics involved and the challenges in establishing lines, autophagy activity was quantified by comparing the ratio of Atg8a-1 to Atg8a-2 via Western blot analysis. However, no significant alterations were detected across any of the genotypes. Additionally, mitochondrial protein levels derived from COXII confirmed consistent mitochondrial quantities, showing no considerable variance following knockdown. These insights affirm that retinal axon degeneration and mitophagy activation are present in the Drosophila DOA model, although the Western blot analysis revealed no significant changes in autophagy activation. Such findings necessitate caution as this model may not fully replicate the molecular pathology of the corresponding human disease. These Western blot findings are presented in Figure S4, with the following addition made to section lines 255–263 of the results:

      “We also conducted Western blot analyses using anti-COXII and anti-Atg8a antibodies to assess changes in mitochondrial quantity and autophagy activity following the knockdown of dOPA1. Mitochondrial protein levels, indicated by COXII quantification, were evaluated to verify mitochondrial content, and the ratio of Atg8a-1 to Atg8a-2 was used to measure autophagy activation. For these experiments, Tub-Gal4 was employed to systemically knockdown dOPA1. Considering the lethality of a whole-body dOPA1 knockdown, Tub-Gal80TS was utilized to repress the knockdown until eclosion by maintaining the flies at 20°C. After eclosion, we increased the temperature to 29°C for two weeks to induce the knockdown or expression of hOPA1 variants. The results revealed no significant differences across the genotypes tested (Figure S4A–D).”

      In assessing the effects of dominant negative mutations, measurements including ROS levels, the ratio of Atg8a-1 to Atg8a-2, and the quantity of COXII protein were conducted, yet no significant differences were observed (Figure S6). This limitation of the fly model is mentioned in the results, noting the observation of the axonal degeneration phenotype but not alterations in ROS signaling, autophagy activity, or mitochondrial quantity as follows (lines 287–290):

      “We investigated the impacts of dominant negative mutations on mitochondrial oxidation levels, mitochondrial quantity, and autophagy activation levels; however, none of these parameters showed statistical significance (Figure S6).”

      The reviewer also inquired about the effects of overexpressing and underexpressing OPA1 on axonal count and whether these effects are subject to a threshold. In response, we expressed both wild-type and variant forms of human OPA1 in Drosophila in vivo and assessed their protein levels using Western blot analysis. The results showed no significant differences in expression levels between the wild-type and variant forms in the OPA1 overexpression experiments, suggesting the absence of a variation threshold effect. These findings have been newly documented as quantitative data in Figure 5C. Furthermore, we have included a statement in the results section for Figure 6A, clarifying that overexpression of hOPA1 exhibited no discernible impact, as detailed on lines 274–276.

      “The results presented in Figure 5C indicate that there are no significant differences in the expression levels among the variants, suggesting that variations in expression levels do not influence the outcomes.”

      (6) Could the authors clarify which exons in Figure 5 are included in their transcript? My understanding is that transcript NM_015560.3 contains exons 4 and 4b but not 5b. According to Song 2007, this transcript invariably produces s-OPA1 as it contains the exon 4b cleavage site. If this is true, this is a critical limitation of this study and, in my opinion, significantly undermines the likelihood of the proposed explanation of the findings presented in Figure 6. The primary functional location of OPA1 is the IMM, and l-OPA1 is the primary, and probably the only, OPA1 isoform that localizes there, as the additional AA act as an IMM anchor. Given that this is where the GTPase likely oligomerizes, s-OPA1 expressed alone is unlikely to interact with native protein. I am not aware of any evidence that s-OPA1 is involved in oligomerization. Therefore, I do not consider this method, and specifically the expression of a hOPA1 transcript that makes only s-OPA1, to be a reliable indicator of dominant negativity/interference with WT protein function. This could be checked by blotting UAS-hOPA1 protein with an OPA1 antibody specific to human OPA1 and not dOPA1. There are several available on the market, and if the authors see only s-OPA1, this confirms they are not expressing l-OPA1 with their hOPA1 construct.

      As suggested by the reviewer, we performed a Western blot using a human OPA1 antibody to determine if the expressed hOPA1 was producing the l-OPA1 isoform, as shown in band 2 of Figure 5D. The results confirmed the presence of both l-OPA1 and what appears to be s-OPA1 in bands 2 and 4, respectively. These findings are documented in the updated Figure 5D, with a detailed description provided in the manuscript at lines 224-226. Additionally, the NM_015560.3 refers to isoform 1, which includes only exons 4 and 5, excluding exons 4b and 5b. This isoform can express both l-OPA1 and s-OPA1 (refer to Figure 1 in Song et al., J Cell Biol. 2007). We have updated the schematic diagram in the figure to include these exons. The formation of s-OPA1 through cleavage occurs at the OMA1 target site located in exon 5 and the Yme1L target site in exon 5b of OPA1. Isoform 1 of OPA1 is prone to cleavage by OMA1, but a homologous gene for OMA1 does not exist in Drosophila. Although a homologous gene for Yme1L is present in Drosophila, exon 5b is missing in isoform 1 of OPA1, leaving the origin of the smaller band resembling s-OPA1 unclear at this point.

      Reviewer #2 (Public Review):

      The data presented support and extend some previously published data using Drosophila as a model to unravel the cellular and genetic basis of human autosomal dominant optic atrophy (DOA). In humans, mutations in OPA1, a mitochondrial dynamin-like GTPase (amongst others), are the most common cause of DOA. By using Drosophila loss-of-function mutations, RNAi-mediated knockdown and overexpression, the authors could recapitulate some aspects of the disease phenotype, which could be rescued by the wild-type version of the human gene. Their assays allowed them to distinguish between mutations causing human DOA, affecting the optic system and supposed to be loss-of-function mutations, and those mutations supposed to act as dominant negative, resulting in DOA plus, in which other tissues/organs are affected as well. Based on the lack of information in the Materials and Methods section and in several figure legends, it was not in all cases possible to follow the conclusions of the authors.

      We appreciate the reviewer's constructive feedback and the emphasis on enhancing clarity in our manuscript. We recognize the concerns raised about the lack of detailed information in the Materials and Methods section and several figure legends, which may have obscured our conclusions. In response, we have appended the detailed genotypes of the Drosophila strains used in each experiment to a supplementary table. Additionally, we realized that the description of 'immunohistochemistry and imaging' was too brief, previously referenced simply as “immunohistochemistry was performed as described previously (Sugie et al., 2017).” We have now expanded this section to include comprehensive methodological details. Furthermore, we have revised the figure legends to provide clearer and more thorough descriptions.

      Similarly, how the knowledge gained could help to "inform early treatment decisions in patients with mutations in hOPA1" (line 38) cannot be followed.

      To address the reviewer's comments, we have refined our explanation of the clinical relevance of our findings as follows. We believe this revision succinctly articulates the practical application of our research, directly responding to the reviewer’s concerns about linking the study's outcomes to treatment decisions for patients with hOPA1 mutations. By underscoring the model’s value in differential diagnosis and its influence on initiating treatment strategies, we have clarified this connection explicitly, within the constraints of the abstract’s word limit. The revised sentence now reads: "This fly model aids in distinguishing DOA from DOA plus and guides initial hOPA1 mutation treatment strategies."

      Reviewer #3 (Public Review):

      Nitta et al. establish a fly model of autosomal dominant optic atrophy, of which hundreds of different OPA1 mutations are the cause with wide phenotypic variance. It has long been hypothesized that missense OPA1 mutations affecting the GTPase domain, which are associated with more severe optic atrophy and extra-ophthalmic neurologic conditions such as sensorineural hearing loss (DOA plus), impart their effects through a dominant negative mechanism, but no clear direct evidence for this exists particularly in an animal model. The authors execute a well-designed study to establish their model, demonstrating a clear mitochondrial phenotype with multiple clinical analogs including optic atrophy measured as axonal degeneration. They then show that hOPA1 mitigates optic atrophy with the same efficacy as dOPA1, setting up the utility of their model to test disease-causing hOPA1 variants. Finally, they leverage this model to provide the first direct evidence for a dominant negative mechanism for 2 mutations causing DOA plus by expressing these variants in the background of a full hOPA1 complement.

      Strengths of the paper include well-motivated objectives and hypotheses, overall solid design and execution, and a generally clear and thorough interpretation of their results. The results technically support their primary conclusions with caveats. The first is that both dOPA1 and hOPA1 fail to fully restore optic axonal integrity, yet the authors fail to acknowledge that this only constitutes a partial rescue, nor do they discuss how this fact might influence our interpretation of their subsequent results.

      As the reviewer rightly points out, neither dOPA1 nor hOPA1 achieves a complete recovery. Therefore, we acknowledge that this represents only a partial rescue and have added the following explanations regarding this partial rescue in the results and discussion sections.

      Results:

      Significantly → partially (lines 207 and 228)

      Discussion (lines 329–342):

      In our study, expressing dOPA1 in the retinal axons of dOPA1 mutants resulted in significant rescue, but it did not return to control levels. There are three possible explanations for this result. The first concerns gene expression levels. The Gal4-line used for the rescue experiments may not replicate the expression levels or timing of endogenous dOPA1. Considering that the optimal functionality of dOPA1 may be contingent upon specific gene expression levels, attaining a wild-type-like state necessitates the precise regulation of these expression levels. The second is a non-autonomous issue. Although dOPA1 gene expression was induced in the retinal axons for the rescue experiments, many retinal axons were homozygous mutants, while other cell types were heterozygous for the dOPA1 mutation. If there is a non-autonomous effect of dOPA1 in cells other than retinal axons, it might not be possible to restore the wild-type-like state fully. The third potential issue is that only one isoform of dOPA1 was expressed. In mouse OPA1, to completely restore mitochondrial network shape, an appropriate balance of at least two different isoforms, l-OPA1 and s-OPA1, is required (Del Dotto et al., 2017). This requirement implies that multiple isoforms of dOPA1 are essential for the dynamic activities of mitochondria.

      The second caveat is that their effect sizes are small. Statistically, the results indeed support a dominant negative effect of DOA plus-associated variants, yet the data show a marginal impact on axonal degeneration for these variants. The authors might have considered exploring the impact of these variants on other mitochondrial outcome measures they established earlier on. They might also consider providing some functional context for this marginal difference in axonal optic nerve degeneration.

      In response to the reviewer’s comment regarding the modest effect sizes observed, we acknowledge that the magnitude of the reported changes is indeed small. To explore the impact of these variants on additional mitochondrial outcomes as suggested, we employed markers such as MitoSOX, Atg8, and COXII for validation. However, we could not detect any significant effects of the DOA plus-associated variants using these methods. We apologize for the redundancy, but to address Reviewer #1's fifth question, we present experimental results showing that while the increased ROS signals observed upon dOPA1 knockdown were rescued by expressing human OPA1, the mutant variants 2708-2711del, D438V, and R445H did not ameliorate this effect. This outcome mirrors the axonal degeneration phenotype and is documented in Figure 5F. The following text has been added to the results section lines 248–254:

      “Furthermore, we assessed the potential for rescuing ROS signals. Similar to its effect on axonal degeneration, wild-type hOPA1 effectively mitigated the phenotype, whereas the 2708-2711del, D438V, and R445H mutants did not (Figure 5F). Importantly, the I382M variant also reduced ROS levels comparably to the wild type. These findings demonstrate that both axonal degeneration and the increase in ROS caused by dOPA1 downregulation can be effectively counteracted by hOPA1. Although I382M retains partial functionality, it acts as a relatively weak hypomorph in this experimental setup.”

      Moreover, utilizing mito-QC, we observed elevated mitophagy in our Drosophila model, with these results now included in Figure 2D–H. Given the complexity of the genetics involved and the challenges in establishing lines, autophagy activity was quantified by comparing the ratio of Atg8a-1 to Atg8a-2 via Western blot analysis. However, no significant alterations were detected across any of the genotypes. Additionally, mitochondrial protein levels derived from COXII confirmed consistent mitochondrial quantities, showing no considerable variance following knockdown. These insights affirm that retinal axon degeneration and mitophagy activation are present in the Drosophila DOA model, although the Western blot analysis revealed no significant changes in autophagy activation. Such findings necessitate caution as this model may not fully replicate the molecular pathology of the corresponding human disease. These Western blot findings are presented in Figure S4, with the following addition made to section lines 255–263 of the results:

      “We also conducted Western blot analyses using anti-COXII and anti-Atg8a antibodies to assess changes in mitochondrial quantity and autophagy activity following the knockdown of dOPA1. Mitochondrial protein levels, indicated by COXII quantification, were evaluated to verify mitochondrial content, and the ratio of Atg8a-1 to Atg8a-2 was used to measure autophagy activation. For these experiments, Tub-Gal4 was employed to systemically knockdown dOPA1. Considering the lethality of a whole-body dOPA1 knockdown, Tub-Gal80TS was utilized to repress the knockdown until eclosion by maintaining the flies at 20°C. After eclosion, we increased the temperature to 29°C for two weeks to induce the knockdown or expression of hOPA1 variants. The results revealed no significant differences across the genotypes tested (Figure S4A–D).”

      In assessing the effects of dominant negative mutations, measurements including ROS levels, the ratio of Atg8a-1 to Atg8a-2, and the quantity of COXII protein were conducted, yet no significant differences were observed (Figure S6). This limitation of the fly model is mentioned in the results, noting the observation of the axonal degeneration phenotype but not alterations in ROS signaling, autophagy activity, or mitochondrial quantity as follows (lines 287–290):

      “We investigated the impacts of dominant negative mutations on mitochondrial oxidation levels, mitochondrial quantity, and autophagy activation levels; however, none of these parameters showed statistical significance (Figure S6).”

      Despite these caveats, the authors provide the first animal model of DOA that also allows for rapid assessment and mechanistic testing of suspected OPA1 variants. The impact of this work in providing the first direct evidence of a dominant negative mechanism is understated considering how important this question is in the development of genetic treatments for DOA. The authors discuss important points regarding the potential utility of this model in clinical science. Comments on the potential use of this model to investigate variants of unknown significance in clinical diagnosis require further discussion of whether there is indeed precedent for this in other genetic conditions (since the model is nevertheless so evolutionarily removed from humans).

      As suggested by the reviewer, we have expanded the discussion in our study to emphasize in greater detail the significance of the fruit fly model and the MeDUsA software we have developed, elaborating on the model's potential applications in clinical science and its precedents in other genetic disorders. Our text is as follows (lines 299–318):

      “We have previously utilized MeDUsA to quantify axonal degeneration, applying this methodology extensively to various neurological disorders. The robust adaptability of this experimental system is demonstrated by its application in exploring a wide spectrum of genetic mutations associated with neurological conditions, highlighting its broad utility in neurogenetic research. We identified a novel de novo variant in Spliceosome Associated Factor 1, Recruiter of U4/U6.U5 Tri-SnRNP (SART1). The patient, born at 37 weeks with a birth weight of 2934g, exhibited significant developmental delays, including an inability to support head movement at 7 months, reliance on tube feeding, unresponsiveness to visual stimuli, and development of infantile spasms with hypsarrhythmia, as evidenced by EEG findings. Profound hearing loss and brain atrophy were confirmed through MRI imaging. To assess the functional impact of this novel human gene variant, we engineered transgenic Drosophila lines expressing both wild type and mutant SART1 under the control of a UAS promoter.

      Our MeDUsA analysis suggested that the variant may confer a gain-of-toxic-function (Nitta et al., 2023). Moreover, we identified heterozygous loss-of-function mutations in DHX9 as potentially causative for a newly characterized neurodevelopmental disorder. We further investigated the pathogenic potential of a novel heterozygous de novo missense mutation in DHX9 in a patient presenting with short stature, intellectual disability, and myocardial compaction. Our findings indicated a loss of function in the G414R and R1052Q variants of DHX9 (Yamada et al., 2023). This experimental framework has been instrumental in elucidating the impact of gene mutations, enhancing our ability to diagnose how novel variants influence gene function.”

      Recommendations for the Authors:

      Reviewer #1 (Recommendations For The Authors):

      Overall, I enjoyed reading this paper. It is well presented and represents a significant amount of well-executed study. I feel it further characterizes a poorly understood model of OPA1 variants and one which displays significant differences from the human phenotype. However, I feel the authors' experiments with this model are not enough to validate it as a screening tool for dominant negativity. I have therefore suggested the above experiments as a way both to further validate the mitochondrial dysfunction in this model and to ensure that the expressed transcript is able to affect oligomerization, as this is a prerequisite to the authors' conclusions.

      We assessed the extent to which our model reflects mitochondrial dysfunction using COXII, Atg8, and MitoSOX markers. Unfortunately, neither COXII levels nor the ratio of Atg8a-1 to Atg8a-2 showed significant variations across genotypes that would clarify the impact of dominant negative mutations. Nonetheless, MitoSOX and mito-QC results revealed that mitochondrial ROS levels and mitophagy are increased in Drosophila following intrinsic knockdown of dOPA1. These findings are documented in Figures 2, 5, and S6.

      Regarding oligomer formation, the specifics remain elusive in this study. However, the expression of dOPA1K273A, identified as a dominant negative variant in Drosophila, significantly disrupted retinal axon organization, as detailed in Figure S7. From these observations, we hypothesize that oligomerization of wild-type and dominant negative forms in Drosophila results in axonal degeneration. Conversely, co-expression of Drosophila wild-type with human dominant negative forms does not induce degeneration, suggesting that they likely do not interact.

      Reviewer #2 (Recommendations For The Authors):

      Materials and Methods:

      The authors used GMR-Gal4 to express OPA1-RNAi. I) GMR is expressed in most cells in the developing eye behind the morphogenetic furrow. So the defects observed can be due to knock-down in support cells rather than in photoreceptor cells.

      We have added the following sentences in the results (lines 194–196): “The GMR-Gal4 driver does not exclusively target Gal4 expression to photoreceptor cells. Consequently, the observed retinal axonal degeneration could potentially be secondary to abnormalities in support cells external to the photoreceptors.”

      OPA1-RNAi: how complete is the knock-down? Have the authors tested more than one RNAi line?

      We conducted experiments with an additional RNAi line, and similarly observed degeneration in the retinal axons (Figure S2 A and B; lines 178–179).

      The loss-of-function allele, induced by a P-element insertion, gives several eye phenotypes when heterozygous (Yarosh et al., 2008). Does RNAi expression lead to the same phenotypes?

      A previous report indicated that the compound eyes of homozygous mutations of dOPA1 displayed a glossy eye phenotype (Yarosh et al., 2008). Upon knocking down dOPA1 using the GMR-Gal4 driver, we also observed a glossy eye-like rough eye phenotype in the compound eyes. These findings have been added to Figure S3 and lines 192–194.

      There is no description of the way the somatic clones were generated. How were mutant cells in clones distinguished from wild-type cells (e.g. in Fig. 4)?

      In the Methods section, we described the procedure for generating clones and their genotypes as follows (lines 502–505): "The dOPA1 clone analysis was performed by inducing flippase expression in the eyes using either ey-Gal4 with UAS-flp or ey3.5-flp, followed by recombination at the chromosomal location FRT42D to generate a mosaic of cells homozygous for dOPA1s3475." Furthermore, we have created a table detailing these genotypes. In these experiments, it was not possible to differentiate between the clone and WT cells. Accordingly, we have noted in the Results section (lines 201–203): "Note that the mutant clone analysis was conducted in a context where mutant and heterozygous cells coexist as a mosaic, and it was not possible to distinguish between them.”

      Why were flies kept at 29°C? This is rather unusual.

      Increased temperature was demonstrated to induce elevated expression of GAL4 (Kramer and Staveley, Genet. Mol. Res., 2003), which in turn led to an enhanced expression of the target genes. Therefore, experiments involving knockdown assays or Western blotting to detect human OPA1 protein were exclusively conducted at 29°C. However, all other experiments were performed at 25°C, as described in the methods sections: “Flies were maintained at 25°C on standard fly food. For knockdown experiments (Figures 1C–E, 1F–H, 2A–H, 3B–K, 5F, S1, S2 A and B, and S6A), flies were kept at 29°C in darkness.” Furthermore, “We regulated protein expression temporally across the whole body using the Tub-Gal4 and Tub-GAL80TS system. Flies harboring each hOPA1 variant were maintained at a permissive temperature of 20°C, and upon emergence, females were transferred to a restrictive temperature of 29°C for subsequent experiments.”

      Legends:

      It would be helpful to have a description of the genotypes of the flies used in the different experiments. This could also be included as a table.

      We have created a table detailing the genotypes. Additionally, in the legend, we have included a note to consult the supplementary table for genotypes.

      Results:

      Line 141: It is not clear what they mean by "degradation", is it axonal degeneration? And if so, what is the argument for this here?

      In the manuscript, we addressed the potential for mitochondrial degradation; however, recognizing that the expression was ambiguous, the following sentence has been omitted: "Nevertheless, the degradation resulting from mitochondrial fragmentation may have decreased the mitochondrial signal."

      Fig. 2: Axons of which photoreceptors are shown?

      We have added "a set of the R7/8 retinal axons" to the legend of Figure 2.

      Line 167: The authors write that axonal degeneration is more severe after seven days than after eclosion. Is this effect light-dependent? The same question concerns the disappearance of the rhabdomere (Fig. 3G–J).

      We conducted the experiments in darkness, ensuring that the observed degeneration is not light-dependent. This condition has been added to the methods section to clarify the experimental conditions.

      Line 178/179: Based on what results do they conclude that there is degeneration of the "terminals" of the axons?

      Quantification via MeDUsA has enabled us to count the number of axonal terminals, and the decrease we noted has led us to conclude that axonal terminals degenerate. We have published two papers on these findings. We have added the following description to the results section to clarify how we defined degeneration (lines 174–176): "We have assessed the extent of their reduction from the total axonal terminal count, thereby determining the degree of axonal terminal degeneration (Richard JNS 2022; Nitta HMG 2023)."

      Line 189: They write: ".. we observed dOPA1 mutant axons...". How did they distinguish mutants from the controls?

      Fig. 5 and Fig. 6: How did they distinguish genetically mutant cells from genetically control cells in the somatic clones?

      Mutant clone analysis was conducted in a context where mutant and heterozygous cells coexist as a mosaic, and it was not possible to distinguish between them. Accordingly, this point has been added to lines 201–203, “Note that the mutant clone analysis was conducted in a context where mutant and heterozygous cells coexist as a mosaic, and it was not possible to distinguish between them.” and the text in the results section has been modified as follows:

      (Before) “To determine if dOPA1 is responsible for axon neurodegeneration, we observed the dOPA1 mutant axons by expressing full-length versions of dOPA1 in the photoreceptors at one day after eclosion and found that dOPA1 expression significantly rescued the axonal degeneration” →

      (After) “To determine if dOPA1 is responsible for axon neurodegeneration, we quantify the number of the axons in the dOPA1 eye clone fly with the expression of dOPA1 at one day after eclosion and found that dOPA1 expression partially rescued the axonal degeneration”

      Line 225/226: It is not clear to me how their approach "can quantitatively measure the degree of LOF".

      To address the reviewer's question and clarify how our approach quantitatively measures the degree of loss of function (LOF), we revised the statement (lines 238–247):

      "Our methodology distinctively facilitates the quantitative evaluation of LOF severity by comparing the rescue capabilities of various mutations. Notably, the 2708-2711del and I382M mutations demonstrated only partial rescue, indicative of a hypomorphic effect with residual activity. In contrast, the D438V and R445H mutations failed to show significant rescue, suggesting a more profound LOF. The correlation between the partial rescue by the 2708-2711del and I382M mutations and their classification as hypomorphic is significant. Moreover, the observed differences in rescue efficacy correspond to the clinical severities associated with these mutations, namely in DOA and DOA plus disorders. Thus, our results substantiate the model’s ability to quantitatively discriminate among mutations based on their impact on protein functionality, providing an insightful measure of LOF magnitude."

      Discussion:

      Line 251, 252 and line 358: What is "the optic nerve" in the adult Drosophila?

      In humans, the axons of retinal ganglion cells (RGCs) are referred to as the optic nerve, and we posit that the retinal axons in flies are similar to this structure. In the introduction section, where it is described that the visual systems of flies and humans bear resemblance, we have appended the following definition (lines 107–108): “In this study, we defined the retinal axons of Drosophila as analogous to the human optic nerve.”

      Line 344: These bands appear only upon overexpression of the hOPA1 constructs, so this part of the discussion is very speculative.

      We confirmed the bands using an anti-hOPA1 antibody, demonstrating that the myc signal is specific. These results have been added to Figure 5D. Furthermore, the phrase “The upper band was expected as” has been revised to “From a size perspective, the upper band was inferred to represent the full-length hOPA1 including the mitochondria import sequence (MIS).” (lines 464–465)

      I was missing a discussion about the increase of ROS upon loss/reduction of dOPA1 observed by others and described here. Is there an increase of ROS upon expression of any of the constructs used?

      We demonstrated that not only axonal degeneration but also ROS can be suppressed by expressing human OPA1 in the genetic background of dOPA1 knockdown. Additionally, rescue was not possible with any variants except for I382M. Furthermore, we assessed whether there were changes in ROS in the evaluation of dominant negatives, but no significant differences were observed in this experimental system. These findings have been added to the discussion section as follows (lines 318–328). “Our research established that dOPA1 knockdown precipitates axonal degeneration and elevates ROS signals in retinal axons. Expression of human OPA1 within this context effectively mitigated both phenomena; it partially reversed axonal degeneration and nearly completely normalized ROS levels. These results imply that factors other than increased ROS may drive the axonal degeneration observed post-knockdown. Furthermore, while differences between the impacts of DN mutations and loss-of-function mutations were evident in axonal degeneration, they were less apparent when using ROS as a biomarker. The extensive use of transgenes in our experiments might have mitigated the knockdown effects. In a systemic dOPA1 knockdown, assessments of mitochondrial quantity and autophagy activity revealed no significant changes, suggesting that the cellular consequences of reduced OPA1 expression might vary across different cell types.”

      Reviewer #3 (Recommendations For The Authors):

      Consider being more explicit regarding literature that has or has failed to test a direct dominant negative effect by expressing a variant in question in the background of a full OPA1 complement. My understanding is that this is the first direct evidence of this widely held hypothesis. This lends to the main claim promoting the utility of fly as a model in general. The authors might also outline this in the introduction as a knowledge gap they fill through this study.

      In the introduction, we have incorporated a passage that highlights precedents capable of distinguishing between LOF and DN effects, and we note the absence of models capable of dissecting these distinctions within an in vivo organism. This study aims to address this gap, proposing a model that elucidates the differential impacts of LOF and DN within the context of a living model organism, thereby contributing to a deeper understanding of their roles in disease pathology. We added the following sentences in the introduction (lines 71–80).

      “In the quest to differentiate between LOF and DN effects within the context of genetic mutations, precedents exist in simpler systems such as yeast and human fibroblasts. These models have provided valuable insights into the conserved functions of OPA1 across species, as evidenced by studies in yeast models (Del Dotto et al., 2018) and fibroblasts derived from patients harboring OPA1 mutations (Kane et al., 2017). However, the ability to distinguish between LOF and DN effects in an in vivo model organism, particularly at the structural level of retinal axon degeneration, has remained elusive. This gap underscores the necessity for a more complex model that not only facilitates molecular analysis but also enables the examination of structural changes in axons and mitochondria, akin to those observed in the actual disease state.”

      The authors should clarify the language used in the abstract and introduction on the effect of hOPA1 DOA and DOA plus on the dOPA1- phenotype. Currently written as "none of the previously reported mutations known to cause DOA or DOA plus were rescued, their functions seem to be impaired." but presumably the authors mean that these variants failed to rescue the dOPA1 deficient phenotype.

      We thank the reviewer for the constructive feedback. We acknowledge the need for clarity in our description of the effects of hOPA1 DOA and DOA plus mutations on the dOPA1- phenotype in both the abstract and the introduction. The current phrasing, "none of the previously reported mutations known to cause DOA or DOA plus were rescued, their functions seem to be impaired," may indeed be confusing. To address your concern, we have revised this statement to more accurately reflect our findings: "Previously reported mutations failed to rescue the dOPA1 deficiency phenotype." In the Abstract, we have changed the text as follows: "we could not rescue any previously reported mutations known to cause either DOA or DOA plus." → "mutations previously identified did not ameliorate the dOPA1 deficiency phenotype."

      DOA plus is associated with a multiple sclerosis-like illness; as written it suggests that the pathogenesis of sporadic multiple sclerosis and that associated with DOA plus share an underlying pathogenic mechanism. Please use the qualifier "-like illness."

      We have added the term “multiple sclerosis-like illness” wherever “multiple sclerosis” is mentioned.

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      The authors aim to elucidate the mechanisms that regulate the immune response under physiological conditions during cortical development. To achieve this goal, they used a wide range of mutant mice to analyse the consequences of immune activation for the formation of cortical ectopia.

      Strengths:

      The authors demonstrated that Abeta monomers are anti-inflammatory and inhibit microglial activation. This is a novel result that demonstrates the physiological role of APP in cortical development.

      Weaknesses:

      -On the other hand, cortical ectopia has been already described in mouse models in which the amyloid signalling has been disrupted (Herms et al., 2004; Guenette et al., 2006), making the current study less novel.

      We agree these previous studies have implicated amyloid precursor protein in cortical ectopia. However, since these studies use whole-body knockouts, they have not implicated the functional roles of specific cell types.  Nor have they identified the specific mechanisms underlying the formation of this unique class of cortical ectopia. In contrast, our studies show that the disruption of a novel Abeta-regulated signaling pathway in microglia is the primary cause of ectopia formation in this class of ectopia mutants. This is the first time that microglia have been specifically implicated in the development of cortical ectopia. We further show that elevated MMP activity and resulting cortical basement membrane degradation is the underlying mechanism leading to ectopia formation.  This is also the first time that MMP activity and basement membrane degradation (instead of maintenance) have been implicated in cortical ectopia development. As such, our results have provided novel insights into the diverse mechanisms underlying cortical ectopia formation in developmental brain disorders.

      One of the molecules analysed is Ric8a, a GTPase activator involved in neuronal development. Authors used the conditional mutant mice Emx1-Ric8a to delete Ric8a from early progenitors and glutamatergic neurons in the pallium. Emx1-Ric8a mutant mice present cortical ectopias and authors attributed this malformation to the increase in inflammatory response due to Ric8a deletion in microglia. Several discordances do not fit this interpretation:

      -The role of Ric8a in cortical development and function has been already described in several papers, but none of them has been cited in the current manuscript (Kask et al., 2015, 2018; Ruisu et al., 2013; Tonissoo et al., 2006).

      We will include reference to these publications in revision.

      -Ectopia formation in the cortex has been already described in Nestin-Ric8a cKO mice (Kask et al., 2015). In the current manuscript, authors analyzed the same mutant mice (Nestin-Ric8a), but they did not detect any ectopia. Authors should discuss this discordance.

      The expression pattern of nestin-cre is known to vary depending on factors including transgene insertion site, genetic background, and sex. Early studies show, for example, that the nestin gene promoter drives cre expression in many non-neural tissues in another transgenic line in the FVB/N genetic background (Dubois et al., Genesis. 2006 Aug;44(8):355-60. doi: 10.1002/dvg.20226). The specific nestin-cre line used in Kask et al., 2015 has also been shown to be active in brain microglia and to lead to increased microglia pro-inflammatory activity upon breeding to a conditional allele of a cholesterol transporter gene (Karasinska et al., Neurobiol Dis. 2013 Jun:54:445-55; Karasinska et al., J Neurosci. 2009 Mar 18; 29(11): 3579–3589). The ectopia reported in Kask et al., 2015 are also significantly more subtle than what we have observed and apparently not observed in all mutant animals (we observe severe ectopia in every single emx1-cre mutant). We presume the ectopia reported in Kask et al., 2015 may result from a combined deletion of the ric8a gene from microglia and neural cells due to unique combinations of factors affecting nestin-cre expression in a subset of mutants.

      -Authors claim that microglia express Emx1, and therefore, Ric8a is deleted in microglia cells. However, the arguments for this assumption are very weak and the evidence suggests that this is not the case. This is an important point considering that authors want to emphasise the role of Ric8a in microglia activation, and therefore, additional experiments should demonstrate that Ric8a is deleted in microglia in Emx1-Ric8a mutant mice.

      We have observed altered mRNA expression of several genes in purified microglia cultured from the emx1-cre mutants (Supplemental Fig. 8), which indicates that ric8a is deleted from microglia and suggests a role of microglial ric8a deficiency in ectopia formation.  This interpretation is further strengthened by the observation that deletion of ric8a from microglia using a microglia-specific cx3cr1-cre results in similar ectopia (Fig. 2). We also have other data supporting this interpretation, including data showing induction of the expression of a cre reporter in brain microglia by emx1-cre and loss of ric8a gene expression in microglia cells isolated from emx1-cre mutants. We will include these data in revision.

      Reviewer #2 (Public Review):

      Kwon et al. used several conditional KO mice for the deletion of ric8a or app in different cell types. Some of them exhibited pial basement membrane breaches leading to neuronal ectopia in the neocortex.

      They first investigated ric8a, a Guanine Nucleotide Exchange Factor for Heterotrimeric G Proteins. They observed the above-mentioned phenotype when ric8a is deleted from microglia and neural cells (ric8a-emx1-cre or dual deletion with cre combination cx3cr1 (in microglia) and nestin (in neural cells)) but not in microglia alone or neural cells alone (whether it is in CR cells (ric8a-Wnt3a-cre), post-mitotic neurons (nex-cre or dlx5/6-cre), or in progenitors and their progeny (nestin-cre or foxg1-cre). They also show that ric8a KO mutant microglia cells stimulated in vitro by LPS exhibit an increased TNFa, IL6 and IL1b secretion compared to controls (Fig 2). They therefore injected LPS in vivo and observed the neuronal ectopia phenotype in the ric8a-cx3cr1-cre (microglial deletion) cortices at P0 (Fig 2). They suggest that ric8a KO in neuronal cells mimics immune stimulation (but we have no clue how ric8a KO in neural cells would induce immune stimulation).

      We agree we do not currently know the precise mechanisms by which mutant microglia are activated in the mutant brain. However, this does not affect the conclusion that deficiency in the Abeta monomer-regulated APP/Ric8a pathway in microglia is the primary cause of cortical ectopia in these mutants, since we have shown that genetic disruption of this pathway in microglia alone by different means targeting different pathway components, using cell type specific cre, all results in similar cortical ectopia phenotypes. Regarding the source of the immunogens, there are several possibilities which we plan to investigate in future studies. For example, the clearance of apoptotic cells and associated cellular debris is an important physiological process and deficits in this process have been linked to inflammatory diseases throughout life (Doran et al., Nat Rev Immunol. 2020 Apr;20(4):254-267; Boada-Romero et al., Nat Rev Mol Cell Biol. 2020 Jul;21(7):398-414.). In the embryonic cortex, studies have shown that extensive cell death takes place starting as early as E12 (Blaschke et al., Development. 1996 Apr;122(4):1165-74; Blaschke et al., J Comp Neurol. 1998 Jun 22;396(1):39-50). Studies have also shown that radial glia and neuronal progenitors play critical roles in the clearance of apoptotic cells and associated cellular debris in the brain (Lu et al., Nat Cell Biol. 2011 Jul 31;13(9):1076-83; Ginisty et al., Stem Cells. 2015 Feb;33(2):515-25; Amaya et al., J Comp Neurol. 2015 Feb 1;523(2):183-96). Moreover, Ric8a-dependent heterotrimeric G proteins have been found to specifically promote the phagocytic activity of both professional and non-professional phagocytic cells (Billings et al., Sci Signal. 2016 Feb 2;9(413):ra14; Preissler et al., Glia. 2015 Feb;63(2):206-15; Pan et al. Dev Cell. 2016 Feb 22;36(4):428-39; Flak et al. J Clin Invest. 2020 Jan 2;130(1):359-373; Zhang et al., Nat Commun. 2023 Sep 14;14(1):5706). Thus, it is likely that the failure of radial glia to promptly clear apoptotic cells and debris may play a role in triggering microglial activation in ric8a mutants. We have not included discussion of these possibilities since the precise mechanisms remain to be determined. Moreover, they do not impact the conclusion of the current study.

      The authors then turned their attention on APP. They observed neuronal ectopia into the marginal zone when APP is deleted in microglia (app-cxcr3-cre) + intraperitoneal LPS injection (they did not show it, but we have to assume there would not be a phenotype without the injection of LPS) (Fig 3). (The phenotype is similar but not identical to ric8a-cx3cr1-cre + LPS. They suggest that the reason is because they had to inject 3 times less LPS due to enhanced immune sensitivity in this genetic background but it is only a hypothesis). After in vitro stimulation by LPS, app mutant microglia show a reduced secretion of TNFa and IL6 but not IL1b (this is the opposite to ric8a-cx3cr1-cre microglia cells) while peritoneal macrophages in culture show increased secretion of TNFa, IL1, IL6 and IL23 (fig 3 and Suppl. Fig 9).

      We have data showing that app-cx3cr1-cre mutants without LPS injection do not show ectopia and will include them in revision. Indeed, we employ LPS injection precisely because we do not see a phenotype without it. We agree, and have also stated in the text, that the phenotype of the app mutants is not as severe as that of the ric8a mutant. Besides the low LPS dosage used, we also suggest that other app family members may compensate since the ectopia in the app family gene mutants reported previously were only observed in app/aplp1/2 triple knockouts, not even in any of the double knockouts (Herms et al., 2004). These potential causes are also not mutually exclusive. Nonetheless, the microglia specific app mutants clearly show ectopia upon immune stimulation, implicating a role of microglial APP in cortical ectopia formation.

      The distinct response of ric8a and app microglia to LPS results from in vitro culturing of microglia. Indeed, we have shown that, when acutely isolated macrophages are used, these mutants show changes in the same direction (both increased cytokine secretion). The microglia used for analysis in this study have all been cultured in vitro for two weeks before assay. They have thus been under chronic stimulation, exposed to dead cells and debris in the culture dish throughout this period. Depending on the degree of perturbation to inflammation-regulating pathways, such exposures are known to significantly change microglial cytokine expression, sometimes in the opposite direction from that expected. For example, under chronic immune stimulation, while the trem2+/- microglia, which are heterozygous mutant for the anti-inflammatory Trem2, show elevated pro-inflammatory cytokine expression as expected, trem2-/- (null) microglia under the same conditions not only fail to show increases but, for some pro-inflammatory cytokines, actually show decreases in expression (Sayed et al., Proc Natl Acad Sci U S A. 2018 Oct 2;115(40):10172-10177). In several systems, Ric8a-dependent heterotrimeric G proteins have been shown to act downstream of APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9). It is likely that in microglia Ric8a-dependent heterotrimeric G proteins may also mediate only a subset of the signaling downstream of APP. As such, app knockout in microglia may have more severe effects than ric8a knockout on microglial immune activation and lead to changes in the opposite direction compared to ric8a knockout, as has been observed for trem2 null mutation vs heterozygosity discussed above. This may explain the subdued TNF and IL6 secretion by cultured app mutant microglia.

      Amyloid beta (Ab) being one of the molecules binding to APP, the authors showed that Ab40 monomers (they did not test Ab40 oligomers) partially inhibit cytokines (TNFa, IL6, IL1b, MCP-1, IL23a, IL10) secretion in vitro by microglia stimulated by LPS but does not affect secretion by microglia from app-cx3cr1-cre (tested for TNFa, IL6, IL1b, IL23a, IL10) (Fig 4, Suppl fig 10) (but still does it in aplp2-cx3cr1-cre) and does not affect secretion by ric8a-cx3cr1-cre microglia (tested for TNFa and IL6 but still suppress IL1b) (Therefore here is another difference between app and ric8a KO microglia).

      We have tested the effects of Abeta40 oligomers, which induce rather than suppress microglial cytokine secretion, and will include the data in revision. As mentioned above, in several systems, Ric8a-dependent heterotrimeric G proteins have been shown to act downstream of APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9). We assume that this is likely also true in microglia and that Ric8a-dependent heterotrimeric G proteins may mediate only a subset of the signaling downstream of APP. This may explain the difference in the effects of APP and ric8a knockout mutation in abolishing the anti-inflammatory effects of Abeta monomers on IL-1b vs TNF/IL-6. It also suggests that TNF/IL-6 and IL-1b secretion must be regulated by different mechanisms. Indeed, it is well established in immunology that the secretion of IL1b, but not of TNF or IL6, is regulated by inflammasome-dependent mechanisms (see, for example, Broz & Dixit. Nat Rev Immunol. 2016 Jul;16(7):407-20. doi: 10.1038/nri.2016.58).

      The authors injected inhibitors of Akt or Stat3 in the ric8a-emx1-cre cortex and found it suppressed neuronal ectopia (Fig 5, Suppl fig 11). It is not clear whether it suppresses immune stimulation from neuronal cells or immune reaction from microglia cells.

      We agree that, at present, the pharmacological approaches we have taken cannot distinguish between these possibilities. However, whichever possibility turns out to be the case would still implicate a role of excessive microglial activation in the formation of cortical ectopia and support the conclusion of the study. Thus, while potentially worthy of further investigation, this question does not impact the conclusion of this study. Furthermore, as mentioned, we plan to determine the mechanisms of how ric8a mutation in neural cells induces immune activation in future studies. These results will likely enable us to adopt more specific approaches to address this question.

      Finally, the authors examined the activities of MMP2 and MMP9 in the developing cortex using gelatin gel zymography. The activity and protein levels of MMP9 but not MMP2 in the ric8a-emx1-cre cortex were claimed to be significantly increased (Fig 5, Suppl fig 12). Unfortunately, they did not show it in the app-cx3cr1-cre +LPS mouse. They make a connection between ric8a deletion and MMP9 but unfortunately do not make the connection between app deletion and MMP9 (which is at the center of the pathway claimed to be important here). Then they injected BB94, a broad-spectrum inhibitor of MMPs or an inhibitor specific for MMP9 and 13. They both significantly suppress the number and the size of the ectopia in ric8a mutants (Fig5).

      For all the gelatin gel zymography analyses, we quantify protein concentrations in the cortical lysates using the Bio-Rad Bradford assay kit and load the same amounts of proteins per lane. The results across lanes are thus directly comparable. From the quantification, our results clearly show that MMP9, but not MMP2, levels are increased in the mutants (supplemental Figure 12). The data on MMP2 also provide an internal control further supporting the observation of a specific change in MMP9. For this analysis, we focus on the ric8a-emx1-cre mutants since the app-cx3cr1-cre +LPS animals show less severe, more localized ectopia and in most cases only in one of the hemispheres. Any changes in MMP9 are therefore likely to be masked and the experiments unlikely to yield meaningful results. On the other hand, we have clearly shown that the administration of different classes of MMP inhibitors significantly eliminates ectopia in ric8a-emx1-cre mutants. This has strongly implicated a functional contribution of MMPs.

      After reading the manuscript, I still do not know how ric8a in neural cells is involved in the immune inhibition. Is it through the control of Ab monomers? In addition, the authors did not show in vivo data supporting that Ab monomers are the key players here. As the authors said, this is not the only APP interactor. Finally, I still do not know how ric8a is linked to APP in microglia in the model.

      As detailed above, there are several possibilities including potential deficits in the clearance of apoptotic cells and associated debris that may trigger microglial activation in ric8a-emx1-cre mutants. We will investigate these possibilities in future studies. We have not included discussion since their roles remain to be determined. As for the role of Abeta monomers, we have indicated that we currently do not have evidence that in the developing cortex Abeta monomers play a role in inhibiting microglia. We have also indicated in the manuscript that our conclusion is that an Abeta monomer-activated microglial pathway regulates normal brain development, not that Abeta monomers themselves regulate brain development. Regarding the link between Ric8a and APP, the reviewer has missed several major lines of supporting evidence. For example, we have shown that Abeta monomers activate a pathway in microglia that inhibits the secretion of several proinflammatory cytokines including TNF, IL-6, IL-10, and IL-23 (Figure 4 and Supplemental Figures 8-10). This inhibition is abolished when either app or ric8a gene is deleted from microglia. This indicates that app and ric8a act in the same pathway activated by Abeta monomers in microglia. We also show that this Abeta monomer-activated pathway inhibits the transcription of several cytokines in microglia. This inhibition is also abolished when either app or ric8a gene is deleted from microglia. This reinforces the conclusion that app and ric8a act in the same pathway in microglia. Furthermore, cell type specific deletion of app or ric8a from microglia in vivo also results in similar phenotypes of cortical ectopia. Together, these results thus strongly support the conclusion that app and ric8a act in the same pathway activated by Abeta monomers in microglia. This conclusion is also consistent with published findings that Ric8a dependent heterotrimeric G proteins bind to APP and mediate subsets of APP signaling across different species (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).

      While several of the findings presented in this manuscript are of potential interest, there are a number of shortcomings. Here are some suggestions that could improve the manuscript and help substantiate the conclusions:

      (1) As the title suggests it, the focus is on Ab and APP functions in microglia. However, the analysis is more focused on ric8a. The connection between ric8a and APP in this study is not investigated, besides the fact that their deletion induces somewhat similar but not identical phenotypes. Showing a similar phenotype is not enough to conclude that they are working on the same pathway. The authors should find a way to make that connection between ric8a and app in the cells investigated here.

      As discussed above, the reviewer misses several major lines of evidence showing that APP and Ric8a act in the same pathway in microglia. For example, besides the similarity of the ectopia phenotypes, we have shown that Abeta monomers activate a pathway in microglia that inhibits the secretion of several proinflammatory cytokines including TNF, IL-6, IL-10, and IL-23 (Figure 4 and Supplemental Figures 8-10). These inhibitory effects are completely abolished when either app or ric8a gene is deleted from microglia. This indicates that app and ric8a act in the same pathway activated by Abeta monomers in microglia. We also show that this Abeta monomer-activated pathway inhibits the transcription of several cytokine genes in microglia. These effects are again completely abolished when either app or ric8a gene is deleted from microglia. This further reinforces the conclusion that app and ric8a act in the same pathway in microglia. Moreover, we show that the same results hold in macrophages. Together, these results therefore strongly support the conclusion that app and ric8a act in the same pathway in microglia. This conclusion is also consistent with published findings that Ric8a dependent heterotrimeric G proteins bind to APP and mediate APP signaling across different species (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).

      (2) This would help to show the appearance of breaches in the pial basement membrane leading to neuronal ectopia; to investigate laminin debris, cell identity, Wnt pathway for app-cxcr3-cre + LPS injection as you did for ric8a-emx1-cre.

      We will provide further data on the breaches in the pial basement membrane.  We have not observed any changes in cell identity or Wnt pathway activity in ric8a-emx1-cre mutants. The ectopia phenotype in the app-cxcr3-cre + LPS animals is also less severe.  It is therefore likely of limited value to examine potential changes in these areas.

      (3) As a control, this would help to show that app-cxcr3-cre without the LPS injection does not display the phenotype.

      We have the data on app-cx3cr1-cre mutants without LPS injection, which show no ectopia, and will include the data in revision.

      (4) This would help to show the activity and protein levels of MMP9 and MMP2 and perform the rescue experiments with the inhibitors in the app-cx3cr1-cre cortex +LPS.

      As discussed above, we focus our analysis on the ric8a-emx1-cre mutants since app-cx3cr1-cre +LPS animals show less severe, more localized ectopia, in most cases in only one of the hemispheres. Determining potential changes in MMP9 levels and the effects of MMP inhibitors in these animals is therefore not likely to yield useful data. On the other hand, we have shown that MMP9 levels are increased and that administration of different classes of MMP inhibitors eliminates cortical ectopia in ric8a-emx1-cre mutants. This strongly implicates a functional contribution of MMPs.

      (5) Is MMP9 secreted by microglia cells or neural cells?

      Our in situ hybridization data show MMP9 is most highly expressed in macrophage-like cells in the embryonic cortex, suggesting that microglia may be a major source of MMP9. We will incorporate these data in revision.

      (6) The in vitro evidence indicates that one of the multiple APP interactors, ie Ab40 monomers, is less effective in suppressing the expression of some cytokines by microglia cells mutants for ric8a (TNFa and IL6 but still suppress IL1b) or APP (TNFa, IL6, IL1b, IL23a, IL10) when compared to WT. But there are other interactors for APP. In order to support the claim, it seems crucial to have in vivo data to show that Ab40 monomers are the molecules involved in preventing the breach in the pial basement membrane.

      As addressed in detail above, we have indicated that our conclusion is that an Abeta monomer-activated microglial pathway regulates normal brain development, not that Abeta monomers themselves regulate brain development.  We currently do not have evidence that the Abeta monomers play a role in inhibiting microglia in the developing cortex.  There are candidate ligands for the pathway in the developing cortex, the functional study of which, however, is a major undertaking and beyond the scope of the current study.

      (7) In order to claim that this is specific to Ab40 monomers and not oligomers, it is necessary to show that the Ab40 oligomers do not have the same effect in vitro and in vivo. Also, an assay should be done to show that your Ab preparations are pure monomers or oligomers.

      We have tested the effects of Abeta40 oligomers, which induce rather than suppress microglial cytokine secretion, and will include the data in revision. The protocols we use in preparing the monomers and oligomers are standard protocols employed in the field of Alzheimer’s disease research and have been optimized and validated repeatedly over the past several decades.

      (8) Most of the cytokine secretion assays used microglia cells in culture. Two results draw my attention. Ric8a deletion increases TNFa and IL6 secretion after LPS stimulation in vitro on microglia cells while app deletion decreases their secretion. Then later, papers show that the decrease in IL1b induced by Ab on microglia cells is prevented by APP deletion but not ric8a deletion. Those two pieces of data suggest that ric8a and APP might not be in the same pathway. In addition, the phenotype from app-cxcr3-cre + LPS injection and ric8a-cxcr3-cre + LPS injection are not exactly the same. It could be due to the level of LPS as the author suggests or it might not be. More experiments are needed to prove they are in the same pathway.

      As discussed above, the reviewer misses several major lines of evidence, which strongly support the conclusion that APP and Ric8a act in the same pathway activated by Abeta monomers in microglia (see detailed discussion in point 1). The differential response of app and ric8a mutant microglia likely results from chronic immune stimulation during in vitro culturing, which is known to alter microglia cytokine expression (see detailed discussion in point 9 below on how chronic immune stimulation changes microglial cytokine expression). We have demonstrated this by showing that, without culturing, acutely isolated app and ric8a mutant macrophages both display elevated cytokine secretion (Figure 4). Regarding the distinct regulation of TNF/IL-6 and IL-1b by APP and Ric8a, as discussed above, in several systems, Ric8a-dependent heterotrimeric G proteins have been shown to act downstream of APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9). It is likely that this is also the case in microglia and that Ric8a-dependent heterotrimeric G proteins mediate only a subset of the anti-inflammatory signaling activated by APP. As such, this may explain why app, but not ric8a, mutation abolishes the inhibitory effects of Abeta monomers on IL-1b. This also suggests that the secretion of TNF/IL-6 and IL-1b must be regulated by different mechanisms. Indeed, it is well established in immunology that the secretion of IL1b, but not that of TNF or IL6, is regulated by inflammasome-dependent mechanisms (see, for example, Broz & Dixit. Nat Rev Immunol. 2016 Jul;16(7):407-20. doi: 10.1038/nri.2016.58).

      (9) How do the authors reconcile the reduced TNFa and IL6 secretion upon stimulation of app mutant microglia with the model where app is attenuating immune response in vivo? Line 213 says that microglia exhibit attenuated immune response following chronic stimulation but I don't know if 3 hours of LPS in vitro is a chronic stimulation.

      The reviewer has misunderstood. The microglia used in this study have all been cultured in vitro for approximately two weeks before assay. They have thus been under chronic stimulation through exposure to dead cells and debris in the culture dish throughout this period. Depending on the degree of perturbation to inflammation-regulating pathways, such exposure is known to significantly change microglial cytokine expression, sometimes in the opposite direction to that expected. For example, under chronic immune stimulation, trem2+/- microglia, which are heterozygous mutant for the anti-inflammatory Trem2, show elevated pro-inflammatory cytokine expression as expected, whereas trem2-/- (null) microglia under the same conditions show no increases and, for some pro-inflammatory cytokines, actually show decreases in expression (Sayed et al., Proc Natl Acad Sci U S A. 2018 Oct 2;115(40):10172-10177). As mentioned, in several systems, Ric8a-dependent heterotrimeric G proteins have also been shown to bind to APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9). It is likely that Ric8a-dependent heterotrimeric G proteins also mediate only a subset of the anti-inflammatory signaling activated by APP in microglia. As such, app knockout in microglia may have more severe effects than ric8a knockout on microglial immune activation, similar to the relationship between trem2 null mutation and heterozygosity discussed above. This likely explains why TNF and IL6 secretion by cultured app mutant microglia is subdued. In contrast, we find that acutely isolated app mutant macrophages show increased cytokine secretion. This is likely more representative of the response of app mutant microglia in the absence of chronic stimulation.

      (10) Line 119: In their model, the authors suggest that there is a breach in pial basement membrane but that the phenotype is different from the retraction of the radial fibers due to reduced adhesion. So, could the author discuss to what substrate the radial fibers are attached to, in their model where the pial surface is destroyed?

      Radial glial endfeet normally bind to the basement membrane via cell surface receptors including the integrin and the dystroglycan protein complexes. We observe free radial glial endfeet at the breach sites, apparently without attachment to any basement membrane. However, we cannot exclude the possibility that there may be residual basement membrane components not detected by the methodology employed.

      (11) The authors should show that the increased cytokine secretion observed in vitro is also happening in vivo in ric8a-emx1-cre compared to WT mice and compared to ric8a-nestin-cre mice. Or when app is deleted in microglia (app-cxcr3-cre) + LPS injection compared to WT mice +LPS.

      Unfortunately, this is not technically feasible since it is impossible to extract the extracellular (secreted) fractions of cytokines from an embryonic brain without causing cell lysis and the release of the intracellular pool. This, however, does not affect our conclusion that the Abeta monomer-regulated microglia pathway plays a key role in regulating normal brain development, since its genetic disruption, by different approaches, clearly results in brain malformation.

      (12) The authors injected inhibitors of Akt or Stat3 in the ric8a-emx1-cre cortex and found that it suppressed neuronal ectopia (Fig 5, Suppl fig 11). Does it suppress immune stimulation from neuronal cells or immune reaction from microglia cells?

      As discussed above, we agree that at present the pharmacological approaches we have taken are not able to distinguish these two possibilities. However, no matter which possibility is true, it does not affect our conclusion. Furthermore, we also plan to determine in future studies how ric8a mutation in neural cells induces immune activation. These results will likely enable us to adopt specific approaches to address this question.

      (13) Fig 5 and Supplementary fig 12: Please show a tubulin loading control in Fig 5i as you did in suppl fig 12 d (gel zymography). Please provide a gel zymography showing side by side Control, mutant and mutant +DM/S3I treatment. The same request for the MMP9 staining. Please provide statistics for control vs mutant for suppl fig 12c and d.

      For all experiments of the gelatin gel zymography analysis, we quantify protein concentrations in the cortical lysates using the Bio-Rad Bradford assay kit and load the same amounts of proteins per lane. The results across lanes are thus all comparable.  These experiments were also performed several years ago before the pandemic and we unfortunately no longer have the samples.  We will, however, provide the protein quantification information in revision.  The MMP9 staining images for the controls and mutants have also all been taken with the same parameters on the microscope and can be directly compared.  The statistics will be provided as suggested.

      (14) Please provide the name and the source of the MMP9/13 inhibitor used in this study.

      This inhibitor is MMP-9/MMP-13 inhibitor I (CAS 204140-01-2), from Santa Cruz Biotechnology. This information will be included in revision.

      (15) The results show that deletion of ric8a in microglia and neural cells induced pia membrane breaches but no phenotype is apparent in ric8a deletion in microglia or neural cells alone. Then, the results showed that intraperitoneal injection of LPS induced the phenotype in ric8a-cxcr3-cre mutants. It would be beneficial as a control supporting the model to show that the insult induced by LPS injection does not induce the phenotype in the ric8a-foxg1-cre mice.

      We agree it may potentially be useful to show that LPS injection does not induce ectopia in ric8a-foxg1-cre mice.  Unfortunately, since the ric8a-foxg1-cre mutation shows no phenotype, we are no longer in possession of this line.

      Reviewer #1 (Recommendations For The Authors):

      -The information in the abstract and the introduction is only related to app. So, it is very abrupt how authors start the manuscript studying the role of Ric8a, with no information at all about this protein and why the authors want to investigate this role in microglial activation. Later in the manuscript, the authors tried to link Ric8a with app to study the role of app in the inflammatory response and ectopia formation. This link is quite weak as well.

      In the last paragraph of the Introduction, we explain the use of the ric8a mutant and how it leads to discovery of the Abeta monomer-regulated pathway. We will improve the writing in revision to make these points clearer. We will also improve the writing on the potential link of Ric8a to APP by highlighting, especially, the fact that ric8a and app pathway mutants are among a unique group of only three mouse mutants (ric8a, app/aplp1/2, and apbb1/2) that show cortical ectopia exclusively in the lateral cortex, while all other cortical ectopia mutants show the most severe ectopia at the midline.

      -In order to validate the mouse model, double immunofluorescence or immunofluorescence+in situ hybridization should be performed to show that microglia express ric8a and that is eliminated in the Emx1-Ric8a mutant mice.

      As mentioned above, we have additional lines of evidence showing that ric8a is deleted from microglia in emx1-cre mutants. This includes data showing induction of the expression of a cre reporter in brain microglia by emx1-cre and loss of ric8a gene expression in microglia cells isolated from emx1-cre mutants.  We will include these data in revision.

      -In Supplemental Fig. 6, the authors claimed that cell proliferation is normal in Ric8a mutant mice without doing any quantification. They also quantified the angle of mitotic division of progenitors in the ventricular zone, but there are no images for the spindle orientation quantification, and no description of how they did it. In addition, this data is contrary to what has already been published in conditional Ric8a mutant mice (Kask et al., 2015). The Vimentin staining should be improved.

      We will provide quantification of cell proliferation in revision. We will also provide details on the quantification of mitotic spindle orientation. We are not sure why the results differ from those of the other study. We were indeed anticipating deficits in mitotic spindle orientation and spent major efforts on the analysis. However, based on the data, we could not draw that conclusion.

      -Analysis of the MMP9 expression should be done by western blot and not by immunofluorescence. In fact, the MMP9 expression shown in Figure 5g,h, does not correspond with RNA expression shown in gene expression atlas like genepaint or the allen atlas, doubting the specificity of the antibody. The expression of Mmp9 is quite low or absent in the cortex at E13.5-E14.5, making this protein very unlikely to be responsible for laminin degradation during development.

      We perform gelatin gel zymography on MMP2/9, which shows increased MMP9 activity levels in the mutant cortex. This is similar to Western blot analysis (all lanes are loaded with the same amounts of cortical lysates). The immunofluorescence staining, a different type of analysis, was designed as a complementary approach. Regarding RNA expression, please also note that MMP9 is a secreted protein and the protein expression pattern is expected to be different from that of RNA. We also have in situ data showing that, while MMP9 mRNA is indeed low, it is strongly expressed in macrophage-like cells, most prominently in cortical blood vessels at E12-E13 (we will include these data in revision). We suspect that these cells are microglial lineage cells populating the embryonic cortex at this stage (see, for example, Squarzoni et al., Cell Rep. 2014 Sep 11;8(5):1271-9. doi: 10.1016/j.celrep.2014.07.042) and may be a major source of cortical MMP9. As for functional contributions, we agree that we cannot rule out roles played by other MMPs. However, based on the ectopia suppression data, our results clearly indicate a key functional contribution by MMP9/13.

      For MMP9 activity, authors should show the whole membrane with a minimum of three control and three mutant individual samples and with the quantification.

      -The graphs should be improved, including individual values and titles of the Y axes.

      We will include these data in revision (the quantification of MMP9 activity is provided in Supplemental Figure 12d) and improve the graphs as suggested.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review): 

      Summary: 

      Dr. Santamaria's group previously utilized antigen-specific nanomedicines to induce immune tolerance in treating autoimmune diseases. The success of this therapeutic strategy has been linked to expanded regulatory mechanisms, particularly the role of T-regulatory type-1 (TR1) cells. However, the differentiation program of TR1 cells remained largely unclear. Previous work from the authors suggested that TR1 cells originate from T follicular helper (TFH) cells. In the current study, the authors aimed to investigate the epigenetic mechanisms underlying the transdifferentiation of TFH cells into IL-10-producing TR1 cells. Specifically, they sought to determine whether this process involves extensive chromatin remodeling or is driven by preexisting epigenetic modifications. Their goal was to understand the transcriptional and epigenetic changes facilitating this transition and to explore the potential therapeutic implications of manipulating this pathway. 

      The authors successfully demonstrated that the TFH-to-TR1 transdifferentiation process is driven by pre-existing epigenetic modifications rather than extensive new chromatin remodeling. The comprehensive transcriptional and epigenetic analyses provide robust evidence supporting their conclusions. 

      Strengths: 

      (1) The study employs a broad range of bulk and single-cell transcriptional and epigenetic tools, including RNA-seq, ATAC-seq, ChIP-seq, and DNA methylation analysis. This comprehensive approach provides a detailed examination of the epigenetic landscape during the TFH-to-TR1 transition. 

      (2) The use of high-throughput sequencing technologies and sophisticated bioinformatics analyses strengthens the foundation for the conclusions drawn. 

      (3) The data generated can serve as a valuable resource for the scientific community, offering insights into the epigenetic regulation of T-cell plasticity. 

      (4) The findings have significant implications for developing new therapeutic strategies for autoimmune diseases, making the research highly relevant and impactful. 

      We thank the reviewer for providing constructive feedback on the manuscript.

      Weaknesses: 

      (1) While the scope of this study lies in transcriptional and epigenetic analyses, the conclusions need to be validated by future functional analyses. 

      We fully agree with the reviewer’s suggestion. The current study provides a foundational understanding of how the epigenetic landscape of TFH cells evolves as they transdifferentiate into TR1 progeny in response to chronic ligation of cognate TCRs using pMHCII-NPs. Functional validation is indeed the focus of our current studies, where we are carrying out extensive perturbation studies of the TFH-TR1 transdifferentiation pathway in conditional transcription factor gene knock-out mice. In these ongoing studies, genes coding for a series of transcription factors expressed along the TFH-TR1 pathway are selectively knocked out in T cells, to ascertain (i) the specific roles of key transcription factors in the various cell conversion events and transcriptional changes that take place along the TFH-TR1 cell axis; (ii) the roles that such transcription factors play in the chromatin re-modeling events that underpin the TFH-TR1 transdifferentiation process; and (iii) the effects of transcription factor gene deletion on phenotypic and functional readouts of TFH and regulatory T cell function.

      (2) This study successfully identified key transcription factors and epigenetic marks. How these factors mechanistically drive chromatin closure and gene expression changes during the TFH-to-TR1 transition requires further investigation. 

      Agreed. Please see our response to point #1 above.  

      (3) The study provides a snapshot of the epigenetic landscape. Future dynamic analysis may offer more insights into the progression and stability of the observed changes. 

      We have previously shown that the first event in the pMHCII-NP-induced TFH-TR1 transdifferentiation process involves proliferation of cognate TFH cells in the splenic germinal centers. This event is followed by immediate conversion of the proliferated TFH cells into transitional and terminally differentiated TR1 subsets. Although the snapshot provided by our single cell studies reported herein documents the simultaneous presence of the different subsets composing the TFH-TR1 cell pathway upon the termination of treatment, the transdifferentiation process itself is extremely fast, such that proliferated TFH cells already transdifferentiate into TR1 cells after a single pMHCII-NP dose (Sole et al., 2023a). This makes it extremely challenging to pursue dynamic experiments. Notwithstanding this caveat, ongoing studies of cognate T cells post treatment withdrawal, coupled to single cell studies of the TFH-TR1 pathway in transcription factor gene knockout mice exhibiting perturbed transdifferentiation processes, are likely to shed light on the progression and stability of the epigenetic changes reported herein.

      We will revise the manuscript accordingly, to address the three concerns raised by the reviewer, in the context of the ongoing studies mentioned above. 

      Reviewer #2 (Public Review): 

      Summary: 

      This study, based on their previous findings that TFH cells can be converted into TR1 cells, conducted a highly detailed and comprehensive epigenetic investigation to answer whether TR1 differentiation from TFH is driven by epigenetic changes. Their evidence indicated that the downregulation of TFH-related genes during the TFH to TR1 transition depends on chromatin closure, while the upregulation of TR1-related genes does not depend on epigenetic changes. 

      Strengths: 

      (1) A significant advantage of their approach lies in its detailed and comprehensive assessment of epigenetics. Their analysis of epigenetics covers chromatin open regions, histone modifications, DNA methylation, and using both single-cell and bulk techniques to validate their findings. As for their results, observations from different epigenetic perspectives mutually supported each other, lending greater credibility to their conclusions. This study effectively demonstrates that (1) the TFH-to-TR1 differentiation process is associated with massive closure of OCRs, and (2) the TR1-poised epigenome of TFH cells is a key enabler of this transdifferentiation process. Considering the extensive changes in epigenetic patterns involved in other CD4+ T lineage commitment processes, the similarity between TFH and TR1 in their epigenetics is intriguing. 

      (2) They performed correlation analysis to answer the association between "pMHC-NP-induced epigenetic change" and "gene expression change in TR1". Also, they have made their raw data publicly available, providing a comprehensive epigenomic database of pMHC-NP-induced TR1 cells. This will serve as a valuable reference for future research.

      We thank the reviewer for his/her constructive feedback and suggestions for improvement of the manuscript.

      Weaknesses: 

      (1) A major limitation is that this study heavily relies on a premise from the previous studies performed by the same group on pMHC-NP-induced T-cell responses. This significantly limits the relevance of their conclusion to a broader perspective. Specifically, differential OCRs between Tet+ and naïve T cells were limited to only 821, as compared to 10,919 differential OCRs between KLH-TFH and naïve T cells (Figure 2A), indicating that the precursors and T cell clonotypes that responded to pMHC-NP were extremely limited. This limitation should be clearly discussed in the Discussion section. 

      We agree that this study focuses on a very specific, previously unrecognized pathway discovered in mice treated with pMHCII-NPs. Despite this apparent narrow perspective, we now have evidence that this is a naturally occurring pathway that also develops in other contexts (i.e., in mice that have not been treated with pMHCII-NPs). Furthermore, this pathway affords a unique opportunity to further understand the transcriptional and epigenetic mechanisms underpinning T cell plasticity; the findings reported here can help guide/inform not only upcoming translational studies of pMHCII-NP therapy in humans, but also other research in this area. We will discuss the limitations and opportunities that this research provides more explicitly in a revised manuscript to provide a clearer context for the scope and applicability of our findings.

      We acknowledge that, in the bulk ATAC-seq studies, the differences in the number of OCRs found in tetramer+ cells or KLH-induced TFH cells vs. naïve T cells may be influenced by the intrinsic oligoclonality of the tetramer+ T cell pool arising in response to repeated pMHCII-NP challenge (Sole et al., 2023a). However, we note that scATAC-seq studies of the tetramer+ T cell pool found similar differences between the oligoclonal tetramer+ TFH subpool and its (also oligoclonal) tetramer+ TR1 counterparts (i.e., substantially higher number of OCRs in the former vs. the latter relative to naïve T cells). This will be clarified in a revised version of the manuscript.

      (2) This article uses peak calling to determine whether a region has histone modifications, claiming that the regions with histone modifications in TFH and TR1 are highly similar. However, they did not discuss the differences in histone modification intensities measured by ChIP-seq. For example, as shown in Figure 6C, IL10 H3K27ac modification in Tet+ cells showed significantly higher intensity than KLH-TFH, while in this article, it may be categorized as "possessing same histone modification region". This will strengthen their conclusions.

      We appreciate your suggestion to discuss differences in histone modification intensities as measured by ChIP-seq. However, we respectfully disagree with the reviewer’s interpretation of these data.

      Our study primarily focuses on the identification of epigenetic similarities and differences between pMHCII-NP-induced tetramer+ cells and KLH-induced TFH cells relative to naive T cells. The outcome of direct comparisons of histone deposition (ChIP-seq) between these cell types is summarized in the lower part of Figure 4B and detailed in Datasheet 5. Throughout this section, we report the number of differentially enriched regions, their overlap with OCRs shared between tetramer+ TFH and tetramer+ TR1 cells based on scATAC-seq data, and the associated genes. Clearly, most of the epigenetic modifications that TR1 cells inherit from TFH cells had already been acquired by TFH cells upon differentiation from naïve T cell precursors. 

      Regarding the specific point raised by the reviewer on differences in the intensity of the H3K27Ac peaks linked to Il10 in Figure 6C, we note that the genomic tracks shown are illustrative. However, thorough statistical analyses involving signal background for each condition and p-value adjustment did not support differential enrichment for H3K27Ac deposition around the Il10 gene between pMHCII-NP-induced tetramer+ T cells and KLH-induced TFH cells.

      We acknowledge that peak calling alone does not account for intensity variations of histone modifications. However, our analysis includes both qualitative and quantitative assessments to ensure robust conclusions. We will edit the relevant sections of the manuscript to clarify these points and better communicate our methodology and findings to the readers.

      (3) Last, the key findings of this study are clear and convincing, but some results and figures are unnecessary and redundant. Some results are largely a mere confirmation of the relationship between histone marks and chromatin status. I propose to reduce the number of figures and text that are largely confirmatory. Overall, I feel this paper is too long for its current contents. 

      We understand this reviewer’s concern about the potential redundancy of some results and figures. The goal of including these analyses is to provide a comprehensive understanding of the intricate relationships between epigenetic features and transcriptomic differences. We believe that a detailed examination of these relationships is crucial for several reasons: (i) the breadth of the data allows for a thorough exploration of the relationships between histone marks, chromatin accessibility and transcriptional differences. This comprehensive approach helps ensure that our conclusions are robust and well-supported by the data; (ii) some of the results that may appear confirmatory are, in fact, important for validating and reinforcing the consistency of our findings across different contexts. These details intend to provide a nuanced understanding of the interactions between epigenetic features and gene expression; and (iii) by presenting a detailed analysis, we aim to offer a solid foundation for future research in this area. The extensive datasets that are presented in this paper will serve as a valuable resource for others in the field who may seek to build upon our findings.

      That said, we will carefully review the manuscript to identify and streamline any elements that may be overly redundant. We will consider consolidating figures and refining the text to ensure that the paper remains concise and focused while retaining the depth of analysis that we believe is essential.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study of human intelligence has been the focus of cognitive neuroscience research, and finding some objective behavioral or neural indicators of intelligence has been an ongoing problem for scientists for many years. Melnick et al, 2013 found for the first time that the phenomenon of spatial suppression in motion perception predicts an individual's IQ score. This is because IQ is likely associated with the ability to suppress irrelevant information. In this study, a high-resolution MRS approach was used to test this theory. In this paper, the phenomenon of spatial suppression in motion perception was found to be correlated with the visuo-spatial subtest of gF, while both variables were also correlated with the GABA concentration of MT+ in the human brain. In addition, there was no significant relationship with the excitatory transmitter Glu. At the same time, SI was also associated with MT+ and several frontal cortex FCs.

      Strengths:

      (1) 7T high-resolution MRS is used.

      (2) This study combines the behavioral tests, MRS, and fMRI.

      Weaknesses:

      (1) In the intro, it seems to me that the multiple-demand (MD) regions are the key in this study. However, I didn't see any results associated with the MD regions. Did I miss something?

      We thank the reviewer for pointing this out. After careful consideration, we agree with this point. According to the results of Melnick et al. (2013), motion surround suppression (SI) and the time thresholds for small and large gratings, which represent hMT+ functionality, are correlated with the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. This suggests that hMT+ has the potential to become the core of the MD system. However, because our results address only “the GABA-ergic inhibition in human MT predicts visuo-spatial intelligence mediated through the frontal cortex”, they are not yet sufficient to prove that hMT+ is the core node of the MD system, so we have adjusted the explanatory logic of the article. Briefly, we now emphasize the de-redundancy function of hMT+ in visuo-spatial intelligence and the improvement of information processing efficiency, while weakening the significance of hMT+ in the MD system.

      (2) How was the sample size determined? Is it sufficient?

      We thank the reviewer for pointing this out. We used G*Power to determine our sample size. Melnick et al. (2013) reported a medium effect between SI and the Perceptual Reasoning sub-ability (r = 0.47). We used this r value as the correlation coefficient (ρ H1), setting the power at the commonly used threshold of 0.8 and the alpha error probability at 0.05. The required sample size was calculated to be 26, which gives our study reasonable power to yield valid statistical results. Furthermore, compared with earlier within-subject studies such as Schallmo et al. (2018), which used 22 datasets to examine GABA levels in MT+ and the early visual cortex (EVC), our dataset is sufficiently large.
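
      The calculation described above can be approximated with the Fisher z transformation. The following is a minimal sketch, not the G*Power procedure itself: G*Power uses an exact method for the bivariate normal model, so its result can differ from this approximation by a subject or so. The function name and the `two_tailed` switch are illustrative assumptions; the response does not state whether a one- or two-tailed test was specified.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate sample size needed to detect a correlation r.

    Uses the Fisher z approximation: n = ((z_alpha + z_beta) / atanh(r))^2 + 3.
    This is an approximation, not G*Power's exact computation.
    """
    std_normal = NormalDist()
    tail_prob = alpha / 2 if two_tailed else alpha
    z_alpha = std_normal.inv_cdf(1 - tail_prob)  # critical value for the test
    z_beta = std_normal.inv_cdf(power)           # quantile matching the target power
    fisher_z = math.atanh(r)                     # Fisher z of the expected correlation
    return math.ceil(((z_alpha + z_beta) / fisher_z) ** 2 + 3)


# Inputs taken from the response: r = 0.47, alpha = 0.05, power = 0.8.
n_one_tailed = n_for_correlation(0.47, two_tailed=False)
n_two_tailed = n_for_correlation(0.47, two_tailed=True)
print("one-tailed:", n_one_tailed, "two-tailed:", n_two_tailed)
```

      Exposing the tail choice as a parameter makes it easy to see how strongly the required sample size depends on that setting.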

      (3) In Schallmo elife 2018, there was no correlation between GABA concentration and SI. How can we justify the different results here?

      We thank the reviewer for pointing this out. There are several differences between the two studies:

      a. While the earlier study by Schallmo et al. (2018) employed 3T MRS, we utilize 7T MRS, enhancing our ability to detect and measure GABA with greater accuracy.

      b. Schallmo et al. (2018) used the bilateral hMT+ as the MRS measurement region, whereas we used the left hMT+. The reasons for focusing on the left hMT+ are described in our response to Reviewer #1, point (6). Briefly, the use of left MT/V5 as a target was motivated by studies demonstrating that left MT/V5 TMS is more effective at causing perceptual effects (Tadin et al., 2011).

      c. The MRS voxel in Schallmo et al. (2018) was 3 cm isotropic, whereas ours was 2 cm isotropic. The smaller voxel helps us locate hMT+ more precisely and exclude more white matter signal.
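
      To make the voxel-size difference in point (c) concrete, the volumes can be worked out directly from the edge lengths given above. This is plain arithmetic on the two stated sizes; the helper name is an illustrative assumption.

```python
def isotropic_voxel_volume_cm3(edge_cm):
    """Volume of a cubic (isotropic) voxel with the given edge length in cm."""
    return edge_cm ** 3


volume_3cm = isotropic_voxel_volume_cm3(3)  # voxel size attributed to Schallmo et al. (2018)
volume_2cm = isotropic_voxel_volume_cm3(2)  # voxel size used in the present study
ratio = volume_3cm / volume_2cm

print(volume_3cm, volume_2cm, ratio)
```

      A 3 cm voxel therefore samples more than three times the tissue volume of a 2 cm voxel, which is why the smaller voxel can follow the target region more closely.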

      (4) Basically this study contains the data of SI, BDT, GABA in MT+ and V1, Glu in MT+ and V1-all 6 measurements. There should be 6x5/2 = 15 pairwise correlations. However, not all of these results are included in Figure 1 and supplementary 1-3. I understand that it is not necessary to include all figures. But I suggest reporting all values in one Table.

      We thank the reviewer for the good suggestion. We have made a correlation matrix reporting all values in Figure Supplementary 9.
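
      The reviewer's count of 6 x 5 / 2 = 15 pairwise correlations can be sketched as follows. The six measure names follow the reviewer's list, but every number below is a made-up placeholder, not data from the study, and the helper `pearson_r` is written out by hand for illustration only.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Placeholder values for five hypothetical subjects (NOT the study's data).
measures = {
    "SI": [0.2, 0.5, 0.3, 0.8, 0.6],
    "BDT": [9, 12, 11, 15, 12],
    "GABA_hMT": [1.1, 1.5, 1.2, 1.6, 1.5],
    "GABA_V1": [1.3, 1.2, 1.5, 1.1, 1.4],
    "Glu_hMT": [7.9, 8.4, 8.1, 8.0, 8.6],
    "Glu_V1": [8.2, 7.8, 8.5, 8.1, 7.9],
}

# Every unordered pair of measures appears exactly once.
correlation_table = {
    (a, b): pearson_r(measures[a], measures[b])
    for a, b in combinations(measures, 2)
}

for (a, b), r in correlation_table.items():
    print(f"{a} vs {b}: r = {r:.2f}")
```

      Enumerating the pairs with `combinations` guarantees that no pair is skipped or reported twice, which is the point of collecting all values in one table.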

      (5) In Melnick (2013), the IQ scores were measured by the full set of WAIS-III, including all subtests. However, this study only used the visual spatial domain of gF. I wonder why only the visuo-spatial subtest was used not the full WAIS-III?

      We thank the reviewer for pointing this out. The decision was informed by Melnick’s findings, which indicated high correlations between surround suppression (SI) and the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. It is well-established that the hMT+ region of the brain is a sensory cortex involved in visual perception processing (3D perception). Furthermore, motion surround suppression (SI), a specific function of hMT+, aligns closely with this region's activities. Given this context, the Perceptual Reasoning sub-ability was deemed to have the clearest mechanism for further exploration. Consequently, we selected the most representative subtest of Perceptual Reasoning, the Block Design Test, which primarily assesses 3D visual intelligence.

      (6) In the functional connectivity part, there is no explanation as to why only the left MT+ was set to the seed region. What is the problem with the right MT+?

      We thank the reviewer for pointing this out. The main reason is that our MRS ROI is the left hMT+, and we wanted the ROI to be consistent across the different models. The use of left MT/V5 as a target was motivated by studies demonstrating that left MT/V5 TMS is more effective at causing perceptual effects (Tadin et al., 2011).

      (7) In Melnick (2013), the authors also reported the correlation between IQ and absolute duration thresholds of small and large stimuli. Please include these analyses as well.

      We thank the reviewer for the good advice. Including these results helps researchers compare our findings with those of Melnick et al. (2013). We have added the corresponding figures in the revised version (Figure 3f, g).

      Reviewer #2 (Public Review):

      Summary:

      Recent studies have identified specific regions within the occipito-temporal cortex as part of a broader fronto-parietal, domain-general, or "multiple-demand" (MD) network that mediates fluid intelligence (gF). According to the abstract, the authors aim to explore the mechanistic roles of these occipito-temporal regions by examining GABA/glutamate concentrations. However, the introduction presents a different rationale: investigating whether area MT+ specifically, could be a core component of the MD network.

      Strengths:

      The authors provide evidence that GABA concentrations in MT+ and its functional connectivity with frontal areas significantly correlate with visuo-spatial intelligence performance. Additionally, serial mediation analysis suggests that inhibitory mechanisms in MT+ contribute to individual differences in a specific subtest of the Wechsler Adult Intelligence Scale, which assesses visuo-spatial aspects of gF.

      Weaknesses:

      (1) While the findings are compelling and the analyses robust, the study's rationale and interpretations need strengthening. For instance, Assem et al. (2020) have previously defined the core and extended MD networks, identifying the occipito-temporal regions as TE1m and TE1p, which are located more rostrally than MT+. Area MT+ might overlap with brain regions identified previously in Fedorenko et al., 2013, however the authors attribute these activations to attentional enhancement of visual representations in the more difficult conditions of their tasks. For the aforementioned reasons, it is unclear why the authors chose MT+ as their focus. A stronger rationale for this selection is necessary and how it fits with the core/extended MD networks.

      We appreciate the reviewer’s comments. Our reasons for focusing on hMT+ are as follows. According to the results of Melnick et al. (2013), motion surround suppression (SI) and the time thresholds for small and large gratings, which represent hMT+ functionality, are correlated with the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with high correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. In addition, in Fedorenko et al. (2013), the averaged MD activity region appears to overlap with hMT+. Based on these findings, we assumed that hMT+ has the potential to become the core of the MD system.

      (2) Moreover, although the study links MT+ inhibitory mechanisms to a visuo-spatial component of gF, this evidence alone may not suffice to position MT+ as a new core of the MD network. The MD network's definition typically encompasses a range of cognitive domains, including working memory, mathematics, language, and relational reasoning. Therefore, the claim that MT+ represents a new core of MD needs to be supported by more comprehensive evidence.

      We thank the reviewer for pointing this out. After careful consideration, we agree with this point. Because our results address only visuo-spatial intelligence, they are not yet sufficient to prove that hMT+ is the core node of the MD system. We will adjust the explanatory logic of the article, that is, emphasizing the de-redundancy function of hMT+ in visuo-spatial intelligence and the improvement of information processing efficiency, while weakening the significance of hMT+ in the MD system.

      Reviewer #3 (Public Review):

      Summary:

      This manuscript aims to understand the role of GABA-ergic inhibition in the human MT+ region in predicting visuo-spatial intelligence through a combination of behavioral measures, fMRI (for functional connectivity measurement), and MRS (for GABA/glutamate concentration measurement). While this is a commendable goal, it becomes apparent that the authors lack fundamental understanding of vision, intelligence, or the relevant literature. As a result, the execution of the research is less coherent, dampening the enthusiasm of the review.

      Strengths:

      (1) Comprehensive Approach: The study adopts a multi-level approach, i.e., neurochemical analysis of GABA levels, functional connectivity, and behavioral measures to provide a holistic understanding of the relationship between GABA-ergic inhibition and visuo-spatial intelligence.

      (2) Sophisticated Techniques: The use of ultra-high field magnetic resonance spectroscopy (MRS) technology for measuring GABA and glutamate concentrations in the MT+ region is a recent development.

      Weaknesses:

      Study Design and Hypothesis

      (1) The central hypothesis of the manuscript posits that "3D visuo-spatial intelligence (the performance of BDT) might be predicted by the inhibitory and/or excitation mechanisms in MT+ and the integrative functions connecting MT+ with the frontal cortex." However, several issues arise:

      (1.1) The Suppression Index depicted in Figure 1a, labeled as the "behavior circle," appears irrelevant to the central hypothesis.

      We thank the reviewer for pointing this out. In our study, the inhibitory mechanisms in hMT+ are conceptualized through two models: the neurotransmitter model and the behavioral model. The Suppression Index is essential for elucidating the local inhibitory mechanisms within the behavioral model. However, we acknowledge that our initial presentation in the introduction may not have clearly articulated our hypothesis, potentially leading to misunderstandings. We have revised the introduction to better clarify these connections and ensure the relevance of the Suppression Index is comprehensively understood.

      (1.2) The construct of 3D visuo-spatial intelligence, operationalized as the performance in the Block Design task, is inconsistently treated as another behavioral task throughout the manuscript, leading to confusion.

      We thank the reviewer for pointing this out. We acknowledge that our manuscript may have inconsistently presented this construct across different sections, causing confusion. To address this, we ensured a consistent description of 3D visuo-spatial intelligence in both the introduction and the discussion sections. However, we retained “Block Design task score” in the results section to make clear to readers which subtest we used.

      (1.3) The schematics in Figure 1a and Figure 6 appear too high-level to be falsifiable. It is suggested that the authors formulate specific and testable hypotheses and preregister them before data collection.

      We thank the reviewer for pointing this out. We have revised Figure 1a to make it less abstract and more logical. The schematic in Figure 6 represents our theoretical framework of how hMT+ contributes to 3D visuo-spatial intelligence. We believe the elements within this framework are grounded in related theories and supported by the evidence presented in our results and discussion sections, making them specific and testable.

      (2) Central to the hypothesis and design of the manuscript is a misinterpretation of a prior study by Melnick et al. (2013). While the original study identified a strong correlation between WAIS (IQ) and the Suppression Index (SI), the current manuscript erroneously asserts a specific relationship between the block design test (from WAIS) and SI. It should be noted that in the original paper, WAIS comprises Similarities, Vocabulary, Block design, and Matrix reasoning tests in Study 1, while the complete WAIS is used in Study 2. Did the authors conduct other WAIS subtests other than the block design task?

      Thank you for pointing this out. Reviewer #1 also asked this question, so we reproduce our answer here: “The decision was informed by Melnick’s findings, which indicated high correlations between surround suppression (SI) and the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. It is well-established that the hMT+ region of the brain is a sensory cortex involved in visual perception processing (3D perception). Furthermore, motion surround suppression (SI), a specific function of hMT+, aligns closely with this region's activities. Given this context, the Perceptual Reasoning sub-ability was deemed to have the clearest mechanism for further exploration. Consequently, we selected the most representative subtest of Perceptual Reasoning, the Block Design Test, which primarily assesses 3D visual intelligence.”

      (3) Additionally, there are numerous misleading references and unsubstantiated claims throughout the manuscript. As an example of misleading reference, "the human MT ... a key region in the multiple representations of sensory flows (including optic, tactile, and auditory flows) (Bedny et al., 2010; Ricciardi et al., 2007); this ideally suits it to be a new MD core." The two references in this sentence are claims about plasticity in the congenitally blind with sensory deprivation from birth, which is not really relevant to the proposal that hMT+ is a new MD core in healthy volunteers.

      Thank you for pointing this out. We have carefully read the corresponding references, considered the corresponding theories, and agree with these comments. Because our results address only “the GABA-ergic inhibition in human MT predicts visuo-spatial intelligence mediated by reverberation with frontal cortex”, they are not yet sufficient to prove that hMT+ is the core node of the MD system. We will adjust the explanatory logic of the article, that is, emphasizing the de-redundancy function of hMT+ in visuo-spatial intelligence and the improvement of information processing efficiency, while weakening the significance of hMT+ in the MD system. In addition, regarding the potential central role of hMT+ in the MD system, we agree with your view that research on hMT+ as a multisensory integration hub mainly focuses on developmental processes. Meanwhile, in adults, the MST region of hMT+ is considered a multisensory integration area for visual and vestibular inputs, which potentially supports the role of hMT+ in multitasking multisensory systems (Gu et al., J. Neurosci, 26(1), 73–85, 2006; Fetsch et al., Nat. Neurosci, 15, 146–154, 2012). Further research could explore how other intelligence sub-abilities, such as working memory and language comprehension, are facilitated by hMT+'s features.

      Another example of unsubstantiated claim: the rationale for selecting V1 as the control region is based on the assertion that "it mediates the 2D rather than 3D visual domain (Born & Bradley, 2005)". That's not the point made in the Born & Bradley (2005) paper on MT. It's crucial to note that V1 is where the initial binocular convergence occurs in cortex, i.e., inputs from both the right and left eyes to generate a perception of depth.

      Thank you for pointing this out. We acknowledge the inappropriate citation of "Born & Bradley, 2005," which focuses solely on the structure and function of the visual area MT. However, we believe that choosing hMT+ as the domain for 3D visual analysis and V1 as the control region is justified. Cumming and DeAngelis (Annu Rev Neurosci, 24:203–238, 2001) state that binocular disparity provides the visual system with information about the three-dimensional layout of the environment, and that the link between perception and neuronal activity is stronger in the extrastriate cortex (especially MT) than in the primary visual cortex. This supports our choice and emphasizes the relevance of hMT+ in our study. We have corrected the citation in the revised version.

      Results & Discussion

      (1) The missing correlation between SI and BDT is crucial to the rest of the analysis. The authors should discuss whether they replicated the pattern of results from Melnick et al. (2013) despite using only one WAIS subtest.

      We thank the reviewer for the suggestion. We have placed it in the main text (Figure 3e).

      (2) ROIs: can the authors clarify if the results are based on bilateral MT+/V1 or just those in the left hemisphere? Can the authors plot the MRS scan area in V1? I would be surprised if it's precise to V1 and doesn't spread to V2/3 (which is fine to report as early visual cortex).

      We thank the reviewer for the suggestion. We have drawn the V1 ROI MRS scanning area (Figure supplement 1). Using the template, we checked the coverage of V1, V2, and V3. Although the MRS overlap regions extend to V2 (3%) and V3 (32%), the major coverage of the MRS scanning area is in V1, with 65% overlap across subjects.

      (3) Did the authors examine V1 FC with either the frontal regions and/or whole brain, as a control analysis? If not, can the author justify why V1 serves as the control region only in the MRS but not in FC (Figure 4) or the mediation analysis (Figure 5)? That seems a little odd given that control analyses are needed to establish the specificity of the claim to MT+

      We thank the reviewer for the suggestion. We have added the V1 FC-behavior correlation as a control analysis (Figure supplement 7). Only positive correlations in the frontal area were detected, suggesting that in the 3D visuo-spatial intelligence task, V1 plays a role in feedforward information processing. In contrast, hMT+, which showed specific negative correlations in the frontal area, is involved in the inhibition mechanism. These results further emphasize the de-redundancy function of hMT+ in 3D visuo-spatial intelligence.

      Regarding the mediation analysis, since GABA/Glu concentration in V1 has no correlation with BDT score, it is not sufficient to apply mediation analysis.

      (4) It is not clear how to interpret the similarity or difference between panels a and b in Figure 4.

      We thank the reviewer for pointing this out. We have further explained the difference between panels a and b in the revised version. Panel a shows the hMT+ FC that correlates with BDT score, which clearly involves the frontal cortex, whereas panel b shows the hMT+ FC that correlates with SI, which involves fewer regions. The overlapping region is our main interest because it explains how local inhibitory mechanisms work in 3D visuo-spatial intelligence. In addition, we have revised Figure 4 to mark the overlapping region.

      (5) SI is not relevant to the authors' a priori hypothesis, but is included in several mediation analyses. Can the authors do model comparisons between the ones in Figure 5c, d, and Figure S6? In other words, is SI necessary in the mediation model? There seem to be discrepancies between the necessity of SI in Figures 5c/S6 vs. Figure 5d.

      We thank the reviewer for highlighting this point. The relationship between the Suppression Index (SI) and our a priori hypotheses is elaborated in the response to reviewer 3, section (1). SI plays a crucial role in explicating how local inhibitory mechanisms, on the psychological level, function within the context of the 3D visuo-spatial task. Additionally, Figure 5c illustrates the interaction between the frontal cortex and hMT+, showing how the effects from the frontal cortex (BA46) on the Block Design Task are fully mediated by SI. This further underscores the significance of SI in our model.
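
      As an illustration of what "fully mediated" means in a single-mediator model, the sketch below decomposes a total effect into a direct path (c') and an indirect path (a x b) using ordinary least squares. It is not the serial mediation model reported in the paper: the data are synthetic and constructed so that the outcome depends on the predictor only through the mediator, and all variable and function names are illustrative assumptions.

```python
def _center(values):
    """Subtract the mean from every element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def simple_slope(x, y):
    """OLS slope of y regressed on x (with intercept)."""
    xc, yc = _center(x), _center(y)
    return _dot(xc, yc) / _dot(xc, xc)


def two_predictor_slopes(y, x1, x2):
    """OLS slopes of y regressed on x1 and x2 (with intercept)."""
    yc, c1, c2 = _center(y), _center(x1), _center(x2)
    s11, s22, s12 = _dot(c1, c1), _dot(c2, c2), _dot(c1, c2)
    s1y, s2y = _dot(c1, yc), _dot(c2, yc)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Synthetic example (NOT study data): the predictor drives the mediator,
# and the outcome depends only on the mediator.
predictor = [1, 2, 3, 4, 5, 6, 7, 8]                    # stands in for a frontal measure
wiggle = [0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5]   # keeps mediator non-collinear
mediator = [2 * x + e for x, e in zip(predictor, wiggle)]  # stands in for SI
outcome = [3 * m for m in mediator]                        # stands in for BDT score

a = simple_slope(predictor, mediator)                     # path a: predictor -> mediator
b, direct = two_predictor_slopes(outcome, mediator, predictor)  # path b and direct path c'
total = simple_slope(predictor, outcome)                  # total effect c
indirect = a * b

print(f"a={a:.3f} b={b:.3f} indirect={indirect:.3f} direct={direct:.3f} total={total:.3f}")
```

      In this constructed case the direct path is zero and the indirect path equals the total effect, which is the pattern usually described as full mediation. With real data, the indirect effect would normally be tested with bootstrapped confidence intervals rather than read off point estimates.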

      (6) The sudden appearance of "efficient information" in Figure 6, referring to the neural efficiency hypothesis, raises concerns. Efficient visual information processing occurs throughout the visual cortex, starting from V1. Thus, it appears somewhat selective to apply the neural efficiency hypothesis to MT+ in this context.

      We thank the reviewer for highlighting this point. There is no doubt that V1 is involved in efficient visual information processing. However, in our results, V1 GABA has no significant correlation with BDT score, suggesting that efficient processing in V1 might not sufficiently account for the individual differences in 3D visuo-spatial intelligence. Additionally, we will clarify our use of the neural efficiency hypothesis by incorporating it into the introduction of our paper to better frame our argument.

      Transparency Issues:

      (1) Don't think it's acceptable to make the claim that "All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary information". It is the results or visualizations of data analysis, rather than the raw data themselves, that are presented in the paper/supp info.

      We thank the reviewer for pointing this out. We realize that this statement could lead to confusion, and we have deleted it.

      (2) No GitHub link has been provided in the manuscript to access the source data, which limits the reproducibility and transparency of the study.

      We thank the reviewer for pointing this out. We have attached the GitHub link in the revised version.

      Minor:

      "Locates" should be replaced with "located" throughout the paper. For example: "To investigate this issue, this study selects the human MT complex (hMT+), a region located at the occipito-temporal border, which represents multiple sensory flows, as the target brain area."

      We thank the reviewer for pointing this out. We have revised it.

      Use "hMT+" instead of "MT+" to be consistent with the term in the literature.

      We thank the reviewer for pointing this out. We agree and now use "hMT+", consistent with the literature.

      "Green circle" in Figure 1 should be corrected to match its actual color.

      We thank the reviewer for pointing this out. We have revised it.

      The abbreviation for the Wechsler Adult Intelligence Scale should be "WAIS," not "WASI."

      We thank the reviewer for pointing this out. We have revised it.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The figures and tables should be substantially improved.

      We thank the reviewer for pointing this out. We have improved some of the figures’ quality.

      (2) Please explain the sample size, and the difference between Schallmo eLife 2018, and Melnick, 2013.

      We thank the reviewer for pointing this out. These questions are answered in the public review. We reproduce those answers below.

      (2.1) How was the sample size determined? Is it sufficient?

      We thank the reviewer for pointing this out. We used G*Power to determine our sample size. Melnick et al. (2013) reported a medium effect between SI and the Perceptual Reasoning sub-ability (r = 0.47). We used this r value as the correlation coefficient (ρ H1), setting the power at the commonly used threshold of 0.8 and the alpha error probability at 0.05. The required sample size was calculated to be 26, which gives our study adequate power to yield valid statistical results. Furthermore, compared with earlier within-subject studies such as Schallmo et al. (2018), which used 22 subjects to examine GABA levels in MT+ and the early visual cortex (EVC), our dataset is sufficiently large.

      (2.2) In Schallmo elife 2018, there was no correlation between GABA concentration and SI. How can we justify the different results here?

      We thank the reviewer for pointing this out. There are several differences between the two studies:

      a. While the earlier study by Schallmo et al. (2018) employed 3T MRS, we utilize 7T MRS, enhancing our ability to detect and measure GABA with greater accuracy.

      b. Schallmo et al. (2018) used the bilateral hMT+ as the MRS measurement region, whereas we used the left hMT+. The reasons for focusing on the left hMT+ are described in our response to Reviewer #1, point (6). Briefly, the use of left MT/V5 as a target was motivated by studies demonstrating that left MT/V5 TMS is more effective at causing perceptual effects (Tadin et al., 2011).

      c. The MRS voxel in Schallmo et al. (2018) was 3 cm isotropic, whereas ours was 2 cm isotropic. The smaller voxel helps us locate hMT+ more precisely and exclude more white matter signal.

      (3) Table 1 and Table Supplementary 1-3 contain many correlation results. But what are the main points of these values? Which values do the authors want to highlight? Why are only p-values shown with significance symbols in Table Supplementary 2?

      (3.1) what are the main points of these values?

      We thank the reviewer for pointing this out. These correlations represent the relationship between the behavioral tasks (SI/BDT) and resting-state functional connectivity. They indicate that the left hMT+ is involved in the efficient information integration network during the BDT task. In addition, surround suppression in the left hMT+ is associated with several hMT+-frontal connections. Furthermore, the regions that overlap between the two tasks indicate a shared underlying mechanism.

      (3.2) Which values do the authors want to highlight?

      Table 1 and Table Supplementary 1-3 present the preliminary analysis results for Table 2 and Table Supplementary 4-6, so we report all values there. In contrast, in Table 2 and Table Supplementary 4-6, we highlight in bold the significant correlations that survived correction for multiple comparisons.

      (3.3) Why are only p-values shown with significance symbols in Table Supplementary 2?

      Thank you for pointing this out. This was a mistake. We have corrected it and deleted the significance symbols.

      (4) Line 27, it is unclear to me what is "the canonical theory".

      We thank the reviewer for pointing this out. We have revised “the canonical theory” to “the prevailing opinion”.

      (5) Throughout the paper, the authors use "MT+", I would suggest using "hMT+" to indicate the human MT complex, and to be consistent with the human fMRI literature.

      We thank the reviewer for pointing this out. We have revised them and used "hMT+" to be consistent with the human fMRI literature.

      (6) At the beginning of the results section, I suggest including the total number of subjects. It is confusing what "31/36 in MT+, and 28/36 in V1" means.

      We thank the reviewer for pointing this out. We have included the total number of subjects at the beginning of the results section.

      (7) Line 138, "This finding supports the hypothesis that motion perception is associated with neural activity in MT+ area". This sentence is strange because it is a well-established finding in numerous human fMRI papers. I think the authors should be more specific about what this finding implies.

      We thank the reviewer for pointing this out. We have deleted the inappropriate sentence "This finding supports the hypothesis that motion perception is associated with neural activity in MT+ area".

      (8) There are no unit labels for all x- and y-axes in Figure 1. I only see the unit for Conc is mmol per kg wet weight.

      We thank the reviewer for pointing this out. Figure 1 is a schematic and workflow chart, so labels for x- and y-axes are not needed. We believe this comment might pertain to Figure 3. In Figures 3a and 3b, the MRS spectrum does not have a standard y-axis unit because it varies with the individual physical conditions of the scanner; it is widely accepted that no y-axis unit is used. The x-axis unit is ppm, which indicates the chemical shift of different metabolites. In Figure 3c, the BDT represents IQ scores, which do not have a standard unit. Similarly, in Figures 3d and 3e, the Suppression Index does not have a standard unit.

      (9) Although the correlations are not significant in Figure Supplement 2&3, please also include the correlation line, 95% confidence interval, and report the r values and p values (i.e., similar format as in Figure 1C).

      We thank the reviewer for pointing this out. We have revised them.

      (10) There is no need to separate different correlation figures into Figure Supplementary 1-4. They can be combined into the same figure.

      We thank the reviewer for the suggestion. However, each correlation figure in the supplementary figures has its own specific topic and conclusion. The correlation figures in Supplementary Figure 1 indicate that GABA in V1 does not show any correlation with BDT and SI, illustrating that inhibition in V1 is unrelated to both 3D visuo-spatial intelligence and motion suppression processing. The correlations in Supplementary Figure 2 indicate that the excitation mechanism, represented by Glutamate concentration, does not contribute to 3D visuo-spatial intelligence in either hMT+ or V1. Supplementary Figure 3 validates our MRS measurements. Supplementary Figure 4 addresses potential concerns regarding the impact of outliers on correlation significance. Even after excluding two “outliers” from Figures 3d and 3e, the correlation results remain stable.

      (11) Line 213, as far as I know, the study (Melnick et al., 2013) is a psychophysical study and did not provide evidence that the spatial suppression effect is associated with MT+.

      We thank the reviewer for pointing this out. It was a mistake to use this reference, and we have revised it accordingly.

      (12) At the beginning of the results, I suggest providing more details about the motion discrimination tasks and the measurement of the BDT.

      We thank the reviewer for pointing this out. We have included a brief description of the tasks at the beginning of the results section.

      (13) Please include the absolute duration thresholds of the small and large sizes of all subjects in Figure 1.

      We thank the reviewer for the suggestion. We have included these results in Figure 3.

      (14) Figure 5 is too small. The items in plots a and b are barely visible.

      We thank the reviewer for pointing this out. We have increased the size and resolution of Figure 5.

      Reviewer #2 (Recommendations For The Authors):

      Recommendations for improving the writing and presentation.

      I highly recommend editing the manuscript for readability and the use of the English language. I had significant difficulties following the rationale of the research due to issues with the way language was used.

      We thank the reviewer for pointing this out. We apologize for any shortcomings in our initial presentation. We have invited a native English speaker to revise our manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:  

      Reviewer #1 (Public Review):  

      Summary:  

      Heer and Sheffield used 2 photon imaging to dissect the functional contributions of convergent dopamine and noradrenaline inputs to the dorsal hippocampus CA1 in head-restrained mice running down a virtual linear path. Mice were trained to collect water rewards at the end of the track and on test days, calcium activity was recorded from dopamine (DA) axons originating in the ventral tegmental area (VTA, n=7) and noradrenaline axons from the locus coeruleus (LC, n=87) under several conditions. When mice ran laps in a familiar environment, VTA DA axons exhibited ramping activity along the track that correlated with distance to reward and velocity to some extent, while LC input activity remained constant across the track, but correlated invariantly with velocity and time to motion onset. A subset of recordings taken when the reward was removed showed diminished ramping activity in VTA DA axons, but no changes in the LC axons, confirming that DA axon activity is locked to reward availability. When mice were subsequently introduced to a new environment, the ramping to reward activity in the DA axons disappeared, while LC axons showed a dramatic increase in activity lasting 90 s (6 laps) following the environment switch. In the final analysis, the authors sought to disentangle LC axon activity induced by novelty vs. behavioral changes induced by novelty by removing periods in which animals were immobile and established that the activity observed in the first 2 laps reflected novelty-induced signal in LC axons.  

      Strengths:  

      The results presented in this manuscript provide insights into the specific contributions of catecholaminergic input to the dorsal hippocampus CA1 during spatial navigation in a rewarded virtual environment, offering a detailed analysis of the resolution of single axons. The data analysis is thorough and possible confounding variables and data interpretation are carefully considered.  

      Weaknesses:  

      Aspects of the methodology, data analysis, and interpretation diminish the overall significance of the findings, as detailed below.  

      The LC axonal recordings are well-powered, but the DA axonal recordings are severely underpowered, with recordings taken from a mere 7 axons (compared to 87 LC axons).

      Additionally, 2 different calcium indicators with differential kinetics and sensitivity to calcium changes (GCaMP6S and GCaMP7b) were used (n=3, n=4 respectively) and the data pooled. This makes it very challenging to draw any valid conclusions from the data, particularly in the novelty experiment. The surprising lack of novelty-induced DA axon activity may be a false negative. Indeed, at least 1 axon (axon 2) appears to be showing a novelty-induced rise in activity in Figure 3C. Changes in activity in 4/7 axons are also referred to as a 'majority' occurrence in the manuscript, which again is not an accurate representation of the observed data.  

      We appreciate the reviewer's detailed feedback regarding the analysis of VTA axons in our dataset. The relatively low sample size for VTA axons is due to their sparsity in the dCA1 region of the hippocampus and the inherent difficulty in recording from these axons. VTA axons are challenging to capture due to their low baseline fluorescence and long-range axon segments, resulting in a typical yield of only a single axon per field of view (FOV) per animal. In contrast, LC axons are more abundant in dCA1.

      To address the disparity in sample sizes between LC and VTA axons, we down-sampled the LC axons to match the number of VTA axons, repeating this process 1000 times to create a distribution. However, we acknowledge the reviewer's concern that the relatively low sample size for VTA axons might result in insufficient sampling of this population. Increasing the baseline expression of GCaMP to record from VTA axons requires several months, limiting our ability to quickly expand the sample size.
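The down-sampling procedure described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `downsampled_means`, the per-axon metric passed in as `values` (for example, each axon's correlation with velocity), and the use of the mean as the summary statistic are all assumptions made for the example. The one-axon-per-session constraint follows the sampling described elsewhere in this response.

```python
import numpy as np


def downsampled_means(values, session_ids, n_target, n_repeats=1000, seed=0):
    """Repeatedly draw n_target axons, at most one per imaging session,
    and return the mean of the per-axon metric for each draw.

    values      : per-axon metric (hypothetical, e.g. correlation with velocity)
    session_ids : imaging-session label for each axon
    n_target    : number of axons per draw (e.g. the number of VTA axons)
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    session_ids = np.asarray(session_ids)
    sessions = np.unique(session_ids)
    if n_target > len(sessions):
        raise ValueError("n_target exceeds the number of sessions")

    means = np.empty(n_repeats)
    for i in range(n_repeats):
        # Pick n_target distinct sessions, then one axon from each session.
        chosen = rng.choice(sessions, size=n_target, replace=False)
        picks = [rng.choice(np.flatnonzero(session_ids == s)) for s in chosen]
        means[i] = values[picks].mean()
    return means
```

The resulting array approximates the distribution of the statistic expected from an LC sample of the same size as the VTA sample, against which the observed VTA value can then be compared.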

      In response to the reviewer's comments, we have added recordings from 2 additional VTA axons, increasing the sample size from 7 to 9. We re-analyzed all data from the familiar environment with n=9 VTA axons, comparing them to down-sampled LC axons as previously described. However, the additional axons were not recorded in the novel environment. We agree with the reviewer that the lack of novelty-induced DA axon activity may be a false negative. To address this, we have revised the description of our results to include the following sentence:

      “However, 1 VTA ROI showed an increase in activity immediately following exposure to novelty, indicating heterogeneity across VTA axons in CA1, and the lack of a novelty signal on average may be due to a small sample size.”

      Regarding the use of two different GCaMP constructs, we understand the reviewer's concern. We used GCaMP6s and GCaMP7b variants to determine if one would improve the success rate of recording from VTA axons. Given the long duration of these experiments and the low yield, we pooled the data from both GCaMP variants to increase statistical power. However, we recognize the importance of verifying that there are no differences in the signals recorded with these variants.

      With the addition of 2 VTA DA axons expressing GCaMP6s, we now have n=5 GCaMP6s and n=4 GCaMP7b VTA DA axons. This allowed us to compare the activity of the two sensors in the familiar environment. As shown in new Supplementary Figure 2, both sets of axons responded similarly to the variables measured: position in VR, time to motion onset, and animal velocity (although the GCaMP6s expressing axons showed stronger correlations). Since all LC axons recorded expressed GCaMP6s, we also specifically compared VTA GCaMP6s axons to LC GCaMP6s axons (Supp Fig. 3). Our conclusions remained consistent when comparing this subset of VTA axons to LC axons.

      Overall, our paper now includes comparisons of combined VTA axons (n=9) and separately the GCaMP6s-expressing VTA axons (n=5) with LC axons. Both datasets support our initial conclusions that VTA axons signal proximity to reward, while LC axons encode velocity and motion initiation in familiar environments.

      The authors conducted analysis on recording data exclusively from periods of running in the novelty experiment to isolate the effects of novelty from novelty-induced changes in behavior. However, if the goal is to distinguish between changes in locus coeruleus (LC) axon activity induced by novelty and those induced by motion, analyzing LC axon activity during periods of immobility would enhance the robustness of the results.  

      We appreciate the reviewer's insightful suggestion to analyze LC axon activity during periods of immobility to distinguish between changes induced by novelty and those induced by motion. This additional analysis would indeed strengthen our conclusions regarding the LC novelty signal.

      In response to this suggestion, we performed the same analysis as before, but focused on periods of immobility. Our findings indicate that following exposure to novelty, there was a significant increase in LC activity specifically during immobility. This supports the idea that LC axons produce a novelty signal that is independent of novelty-induced behavioral changes. The results of this analysis are now presented in new Supplementary Figure 5b.

      The authors attribute the ramping activity of the DA axons to the encoding of the animals' position relative to reward. However, given the extensive data implicating the dorsal CA1 in timing, and the remarkable periodicity of the behavior, the fact that DA axons could be signalling temporal information should be considered.  

      This is an insightful comment regarding the potential role of VTA DA axons in signaling temporal information. We agree that VTA DA axons could indeed be encoding temporal information, as previous work from our lab has shown that these axons exhibit ramping activity when averaged by time to reward (Krishnan et al., 2022).

      To address this, we have now examined DA axon activity relative to time to reward, as shown in new Supplementary Figure 4. Our analysis confirms that these axons ramp up in activity relative to time to reward. Given the periodicity of our mice's behavior in these experiments, as the reviewer correctly points out, we are unable to distinguish between spatial proximity to reward and time to reward. We have added a sentence to our paper highlighting this limitation and stating that further experiments are necessary to differentiate these two variables.

      Krishnan, L.S., Heer, C., Cherian, C., Sheffield, M.E. Reward expectation extinction restructures and degrades CA1 spatial maps through loss of a dopaminergic reward proximity signal. Nat Commun 13, 6662 (2022).

      The authors should explain and justify the use of a longer linear track (3m, as opposed to 2m in the DAT-cre mice) in the LC axon recording experiments.  

      We appreciate the reviewer's insightful comment regarding the use of a longer linear track (3m, as opposed to 2m in the DAT-cre mice) in the LC axon recording experiments. The choice of a 3m track for LC axon recordings was made to align with a previous experiment from our lab (Dong et al., 2021), in which mice were exposed to a novel 3m track while CA1 pyramidal cell populations were recorded. In that study, we detailed the time course of place field formation within the novel track. Our current hypothesis is that LC axons signal novelty, and we aimed to investigate whether the time course of LC axon activity aligns with the time course of place field formation. This hypothesis, and the potential role of LC axons in facilitating plasticity for new place field formation, is further discussed in the Discussion section of our paper.

      For the VTA axon recordings, we utilized a 2m track, consistent with another recent study from our lab (Krishnan et al., 2022), where reward expectation was manipulated, and CA1 pyramidal cell populations were recorded. By matching the track length to this prior study, we aimed to explore how VTA dopaminergic inputs to CA1 might influence CA1 population dynamics along the track under conditions of varying reward expectations.

      We acknowledge that using different track lengths for LC and VTA recordings introduces a variable that could potentially confound direct comparisons. To address this, we normalized the track lengths for our LC versus VTA comparison analysis. This normalization allowed us to directly compare patterns of activity across the two types of axons by adjusting the data to a common scale, thereby ensuring that any observed differences or similarities are attributable to the intrinsic properties of the axons rather than differences in track lengths. By doing so, we could assess relative changes in activity levels at matched spatial bins.
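One way to picture the track-length normalization described above is to express position as a fraction of the track and average activity within matched fractional bins. The sketch below is an illustration under stated assumptions, not the authors' pipeline: the function name `binned_activity`, the default of 50 bins, and the use of a simple per-bin mean are all hypothetical choices.

```python
import numpy as np


def binned_activity(position_m, activity, track_length_m, n_bins=50):
    """Average activity in spatial bins defined on normalized position (0 to 1),
    so that recordings from tracks of different lengths share a common axis.
    Bins with no samples are returned as NaN.
    """
    position_m = np.asarray(position_m, dtype=float)
    activity = np.asarray(activity, dtype=float)

    # Convert position to a fraction of the track, then to a bin index.
    frac = np.clip(position_m / track_length_m, 0.0, 1.0)
    idx = np.minimum((frac * n_bins).astype(int), n_bins - 1)

    sums = np.bincount(idx, weights=activity, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, sums / counts, np.nan)
```

With this convention, a sample taken halfway along the 2 m track and a sample taken halfway along the 3 m track fall into the same bin, which is what allows activity to be compared at matched relative positions.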

      Although the experiences of the animals on the different track lengths are not identical, our observations suggest that LC and VTA axon signals are not majorly influenced by variations in track length. LC axons are associated with velocity and a pre-motion initiation signal, neither of which is affected by track length. VTA axons, which also correlate with velocity, can be compared to LC axon velocity signals because mice reach maximal velocity very quickly along the track, well before the end of the 2m track. The range of velocities is therefore captured on both track lengths. While VTA axons exhibit ramping activity as they approach the reward zone (a signal potentially modulated by track length), LC axons do not show such ramping-to-reward signals. Thus, a comparison across different track lengths is justified for this aspect of our analysis.

      To further enhance the rigor of our comparisons between axon dynamics recorded on 2m and 3m tracks, we conducted an additional analysis plotting axon activity by time to reward and actual (un-normalized) distance from reward (Supplementary Figure 4). This analysis revealed very similar signals between the two sets of axons, supporting our initial conclusions.

      We thank the reviewer for raising this important point and hope that our detailed explanation and additional analysis address their concern.

      Krishnan, L.S., Heer, C., Cherian, C., Sheffield, M.E. Reward expectation extinction restructures and degrades CA1 spatial maps through loss of a dopaminergic reward proximity signal. Nat Commun 13, 6662 (2022).

      Dong, C., Madar, A. D. & Sheffield, M.E. Distinct place cell dynamics in CA1 and CA3 encode experience in new environments. Nat Commun 12, 2977 (2021).

      Reviewer #2 (Public Review):  

      Summary:  

      The authors used 2-photon Ca2+-imaging to study the activity of ventral tegmental area (VTA) and locus coeruleus (LC) axons in the CA1 region of the dorsal hippocampus in head-fixed male mice moving on linear paths in virtual reality (VR) environments.  

      The main findings were as follows:  

      - In a familiar environment, the activity of both VTA axons and LC axons increased with the mice's running speed on the Styrofoam wheel, with which they could move along a linear track through a VR environment.  

      - VTA, but not LC, axons showed marked reward position-related activity, showing a ramping-up of activity when mice approached a learned reward position.  

      - In contrast, the activity of LC axons ramped up before the initiation of movement on the Styrofoam wheel.  

      - In addition, exposure to a novel VR environment increased LC axon activity, but not VTA axon activity.  

      Overall, the study shows that the activity of catecholaminergic axons from VTA and LC to dorsal hippocampal CA1 can partly reflect distinct environmental, behavioral, and cognitive factors. Whereas both VTA and LC activity reflected running speed, VTA, but not LC axon activity reflected the approach of a learned reward, and LC, but not VTA, axon activity reflected initiation of running and novelty of the VR environment.  

      I have no specific expertise with respect to 2-photon imaging, so cannot evaluate the validity of the specific methods used to collect and analyse 2-photon calcium imaging data of axonal activity.  

      Strengths:  

      (1) Using a state-of-the-art approach to record separately the activity of VTA and LC axons with high temporal resolution in awake mice moving through virtual environments, the authors provide convincing evidence that the activity of VTA and LC axons projecting to dorsal CA1 reflect partly distinct environmental, behavioral and cognitive factors.  

      (2) The study will help a) to interpret previous findings on how hippocampal dopamine and norepinephrine or selective manipulations of hippocampal LC or VTA inputs modulate behavior and b) to generate specific hypotheses on the impact of selective manipulations of hippocampal LC or VTA inputs on behavior.  

      Weaknesses:  

      (1) The findings are correlational and do not allow strong conclusions on how VTA or LC inputs to dorsal CA1 affect cognition and behavior. However, as indicated above under Strengths, the findings will aid the interpretation of previous findings and help to generate new hypotheses as to how VTA or LC inputs to dorsal CA1 affect distinct cognitive and behavioral functions.  

      (2) Some aspects of the methodology would benefit from clarification.  

      First, to help others to better scrutinize, evaluate, and potentially to reproduce the research, the authors may wish to check if their reporting follows the ARRIVE (Animal Research: Reporting of In Vivo Experiments) guidelines for the full and transparent reporting of research involving animals (https://arriveguidelines.org/). For example, I think it would be important to include a sample size justification (e.g., based on previous studies, considerations of statistical power, practical considerations, or a combination of these factors). The authors should also include the provenance of the mice. Moreover, although I am not an expert in 2-photon imaging, I think it would be useful to provide a clearer description of exclusion criteria for imaging data.

      We thank the reviewer for helping us formalize the scientific rigor of our study. There are ten ARRIVE Guidelines and we have addressed most of them in our study already. However, there is an opportunity to add detail. We have listed below all ten points and how we have addressed each one (and point out any new additions):

      (1) Experimental design - we go into great depth explaining the experimental set-up, how we used the autofluorescent blebs as imaging controls, how we controlled for different sample sizes between the two populations, and the statistical tests used for comparisons. We also carefully accounted for animal behavior when quantifying and describing axon dynamics both in the familiar and novel environments.

      (2) Sample size - we state both the number of ROIs and mice for each analysis. We have now also added the number of mice we observed specific types of activity in. 

      (3) Inclusion/exclusion criteria - The following has now been added to the Methods section: Out of the 36 NET-Cre mice injected, 15 were never recorded from for either failing to reach behavioral criteria, or a lack of visible expression in axons. Out of the 54 DAT-Cre mice injected, imaging was never conducted in 36 of them for lack of expression or failing to reach behavioral criteria. Out of the remaining 21 NET-Cre, 5 were excluded for heat bubbles, z-drift, or bleaching, while 10 DAT-Cre were excluded for the same reasons. This was determined by visually assessing imaging sessions, followed by using the registration metrics output by suite2p. This registration metric conducted a PCA on the motion-corrected ROIs and plotted the first PC. If the PC drifted largely, to the point where no activity was apparent, the video was excluded from analysis. 

      (4) Randomization - Already included in the paper is a description of random downsampling of LC axons to make statistical comparisons with VTA axons. LC axons were selected pseudo-randomly (only one axon per imaging session) to match VTA sampling statistics. This randomization was repeated 1000 times and comparisons were made against this random distribution. 

      (5) Blinding-masking - no blinding/masking was conducted as no treatments were given that would require this. We will include this statement in the next version. 

      (6) Outcomes - We defined all outcomes measured, such as those related to animal behavior and axon signaling. 

      (7) Statistical methods - None of the reviewers had any issues regarding our description of statistical methods, which we described in great detail in this version of the paper. 

      (8) Experimental animals - We have now described that DAT-Cre mice were obtained through JAX labs, and NET-Cre mice were obtained from the Tonegawa lab (Wagatsuma et al. 2017). This was absent in the initial version of the paper.

      (9) Experimental procedure - Already listed in great detail in Methods section.

      (10) Results - Rigorously described in detail for behaviors and related axon dynamics.

      Wagatsuma, Akiko, Teruhiro Okuyama, Chen Sun, Lillian M. Smith, Kuniya Abe, and Susumu Tonegawa. “Locus Coeruleus Input to Hippocampal CA3 Drives Single-Trial Learning of a Novel Context.” Proceedings of the National Academy of Sciences 115, no. 2 (January 9, 2018): E310–16. https://doi.org/10.1073/pnas.1714082115.

      Second, why were different linear tracks used for studies of VTA and LC axon activity (from line 362)? Could this potentially contribute to the partly distinct activity correlates that were found for VTA and LC axons?  

      We thank the reviewer for pointing this out and giving us a chance to address it directly. A detailed response to this is written above for a similar comment from reviewer 1.

      Third, the authors seem to have used two different criteria for defining immobility. Immobility was defined as moving at <5 cm/s for the behavioral analysis in Figure 3a, but as <0.2 cm/s for the imaging data analysis in Figure 4 (see legends to these figures and also see Methods, from line 447, line 469, line 498)? I do not understand why, and it would be good if the authors explained this.  

      This is a typo leftover from before we converted velocity from rotational units of the treadmill to cm/s. This has now been corrected.

      (3) In the Results section (from line 182) the authors convincingly addressed the possibility that less time spent immobile in the novel environment may have contributed to the novelty-induced increase of LC axon activity in dorsal CA1 (Figure 4). In addition, initially (for the first 2-4 laps), the mice also ran more slowly in the novel environment (Figure 3aIII, top panel). Given that LC and VTA axon activity were both increasing with velocity (Figure 1F), reduced velocity in the novel environment may have reduced LC and VTA axon activity, but this possibility was not addressed. Reduced LC axon activity in the novel environment could have blunted the novelty-induced increase. More importantly, any potential novelty-induced increase in VTA axon activity could have been masked by decreases in VTA axon activity due to reduced velocity. The latter may help to explain the discrepancy between the present study and previous findings that VTA neuron firing was increased by novelty (see Discussion, from line 243). It may be useful for the authors to address these possibilities based on their data in the Results section, or to consider them in their Discussion.  

      We appreciate the reviewer's insightful comment regarding the potential impact of decreased velocity on novelty responses in LC and VTA axons. The decreased velocity in the novel environment could lead to a diminished novelty response in LC axons and could mask a subtle novelty signal in VTA axons. We have now included the following points in our discussion:

      “In addition, as noted above, on average we did observe a velocity associated signal in VTA axons. When mice were exposed to the novel environment their velocity initially decreased. This would be expected to reduce the average signal across the VTA axon population relative to the higher velocity in the familiar environment. It is possible that this decrease could somewhat mask a subtle novelty induced signal in VTA axons. Therefore, additional experiments should be conducted to investigate the heterogeneity of these axons and their activity under different experimental conditions during tightly controlled behavior.”

      “As discussed above, the slowing down of animal behavior in the novel environment could have decreased LC axon activity and reduced the magnitude of the novelty signal we detected during running. The novelty signal we report here may therefore be an underestimate of its magnitude under matched behavioral settings.”

      However, it is important to note that although VTA axons, on average, showed activity modulated by velocity in a familiar rewarded environment, this relationship was largely due to the activity of two VTA axons that were strongly modulated by velocity, indicating heterogeneity within the VTA axon population in dCA1. We have highlighted this point in the discussion. We also discuss that:

      “It is possible that some VTA DA inputs to dCA1 respond to novel environments, and the small number of axons recorded here are not representative of the whole population.”

      (4) Sensory properties of the water reward, which the mice may be able to detect, could account for reward-related activity of VTA axons (instead of an expectation of reward). Do the authors have evidence that this is not the case? Occasional probe trials, intermixed with rewarded trials, could be used to test for this possibility.  

      Mice receive their water reward through a water spout that is immobile and positioned directly in front of their mouth. Water delivery is triggered by a solenoid when the mice reach the end of the virtual track. Therefore, because the water spout is immobile and the water reward is not delivered until they reach the end of the track, there is nothing for the mice to detect during their run. We have added clarifications about the water spout to the Methods and Results sections, along with appropriate discussion points.

      Additionally, we note that the ramping activity of VTA axons is still present on the initial laps with no reward (Krishnan et al., 2022), indicating that this activity is not directly related to the presence or absence of water but is instead associated with the animal’s reward expectation.

      We thank the reviewer for raising this point and hope that these clarifications address their concern.

      Reviewer #3 (Public Review):  

      Summary:  

      Heer and Sheffield provide a well-written manuscript that clearly articulates the theoretical motivation to investigate specific catecholaminergic projections to dorsal CA1 of the hippocampus during a reward-based behavior. Using 2-photon calcium imaging in two groups of cre transgenic mice, the authors examine the activity of VTA-CA1 dopamine and LC-CA1 noradrenergic axons during reward seeking in a linear track virtual reality (VR) task. The authors provide a descriptive account of VTA and LC activities during walking, approach to reward, and environment change. Their results demonstrate LC-CA1 axons are activated by walking onset, modulated by walking velocity, and heighten their activity during environment change. In contrast, VTA-CA1 axons were most activated during the approach to reward locations. Together the authors provide a functional dissociation between these catecholamine projections to CA1. A major strength of their approach is the methodological rigor of 2-photon recording, data processing, and analysis approaches. These important systems neuroscience studies provide solid evidence that will contribute to the broader field of learning and memory. The conclusions of this manuscript are mostly well supported by the data, but some additional analysis and/or experiments may be required to fully support the author's conclusions.  

      Weaknesses:  

      (1) During teleportation between familiar to novel environments the authors report a decrease in the freezing ratio when combining the mice in the two experimental groups (Figure 3aiii). A major conclusion from the manuscript is the difference in VTA and LC activity following environment change, given VTA and LC activity were recorded in separate groups of mice, did the authors observe a similar significant reduction in freezing ratio when analyzing the behavior in LC and VTA groups separately?  

      In response to the comment regarding the freezing ratios during teleportation between familiar and novel environments, we have analyzed the freezing ratios and lap velocities of DAT-Cre and NET-Cre mice separately (Fig. 3Aiii). Our analysis shows that the mean lap velocities of both groups overlap in the familiar environment and significantly decrease on the first lap of the novel environment (Fig. 3Aiii, top). For subsequent laps, the velocities in both groups are not statistically significantly different from the familiar environment lap velocities.

      Freezing ratios also show a statistically significant decrease on the first lap of the novel environment compared to the familiar environment in both groups (Fig. 3Aiii, bottom). In the NET-Cre mice, the freezing ratios remain statistically lower in subsequent laps, while in the DAT-Cre mice, the following laps show a similar trend but without statistical significance. This lack of statistical significance in the DAT-Cre mice is likely due to their already lower freezing ratios in the familiar environment. Overall, the data demonstrate similar behavioral responses in the two groups of mice during the switch from the familiar to the novel environment.

      (2) The authors satisfactorily apply control analyses to account for the unequal axon numbers recorded in the LC and VTA groups (e.g. Figure 1). However, given the heterogeneity of responses observed in Figures 3c, 4b and the relatively low number of VTA axons recorded (compared to LC), there are some possible limitations to the author's conclusions. A conclusion that LC-CA1 axons, as a general principle, heighten their activity during novel environment presentation, would require this activity profile to be observed in some of the axons recorded in most all LC-CA1 mice.

      We agree with the reviewer’s point. To address this issue, when downsampling LC axons to compare to VTA axons, we matched the sampling statistics of the VTA axons/mice by only selecting one LC axon from each mouse to match the VTA dataset.

      Additionally, we have now included the number of recording sessions and the number of mice in which we observed each type of activity. This information has been added to further clarify and support our conclusions.

      Additionally, if the general conclusion is that VTA-CA1 axons ramp activity during the approach to reward, it would be expected that this activity profile was recorded in the axons of most all VTA-CA1 mice. Can the authors include an analysis to demonstrate that each LC-CA1 mouse contained axons that were activated during novel environments and that each VTA-CA1 mouse contained axons that ramped during the approach to reward?  

      As above, we have now added the number of mice that had each activity type we report in the paper here.  

      (3) A primary claim is that LC axons projecting to CA1 become activated during novel VR environment presentation. However, the experimental design did not control for the presentation of a familiar environment. As I understand, the presentation order of environments was always familiar, then novel. For this reason, it is unknown whether LC axons are responding to novel environments or environmental change. Did the authors re-present the familiar environment after the novel environment while recording LC-CA1 activity?  

      While we did not vary the presentation order of familiar and novel environments, we recorded the activity of LC axons in some mice when exposed to a dark environment (no VR cues) prior to exposure to the familiar environment. Our analysis of this data demonstrates that LC axons are also active following abrupt exposure to the familiar environment.

      We have added a new figure showing this response (Supplementary Figure 5A) and expanded on our original discussion point that LC axon activity generally correlates with arousal, as this result also supports that interpretation.

      We thank the reviewer for highlighting this important consideration. It certainly helps with the interpretation regarding what LC axons generally encode.  

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):  

      In addition to what has been described in the public review, I have the following recommendations:  

      The sample size of DA axon recordings should be increased with the use of a single GCaMP for valid conclusions to be made about the lack of novelty-induced activity in these axons.  

      We have increased the n of VTA GCaMP6s axons in the familiar environment by including two axons that were recorded in the familiar rewarded condition. We have also conducted an analysis comparing GCaMP6s versus GCaMP7b, which is discussed in detail above.

      Regarding the concerns about valid conclusions of novelty-induced activity in VTA axons, we have added a comment in the discussion to tone down our conclusions regarding the lack of a novelty signal in the VTA axons. This valid concern is discussed in detail above.  

      The title is currently very generic, and non-informative. I recommend the use of more specific language in describing the type of behavior under investigation. It is not clear to the reviewer why 'learning' is included here.  

      Original title: “Distinct catecholaminergic pathways projecting to hippocampal CA1 transmit contrasting signals during behavior and learning”

      To make it more specific to the experiments conducted here, we have changed the title to this:

      New title: “Distinct catecholaminergic pathways projecting to hippocampal CA1 transmit contrasting signals during navigation in familiar and novel environments”

      Error noted in Figure 4C legend - remove reference to VTA ROIs.  

      The reference to VTA ROIs has been removed from the figure legend.

      Reviewer #2 (Recommendations For The Authors):  

      (1) The concluding sentence of the Abstract could be more specific: which distinct types of information are reflected/'signaled'/'encoded' by LC and VTA inputs to dorsal CA1?  

      The abstract has been adjusted accordingly. The new sentence is more specific: “These inputs encode unique information, with reward information in VTA inputs and novelty and kinematic information in LC inputs, likely contributing to differential modulation of hippocampal activity during behavior and learning.”

      (2) Line 46/47: The study by Mamad et al. (2017) did not quite show that VTA dopamine input to dorsal CA1 'drives place preference'. To my understanding, the study showed that suppression of VTA dopamine signaling in a specific place caused avoidance of this place and that VTA dopamine signaling modulated hippocampal place-related firing. So, please consider rephrasing.  

      Corrected, thanks for pointing this out.

      (3) Legend to Figure 3AIII: 'Each lap was compared to the first lap in F . . .' Could you clarify if 'F' refers to the 'familiar' environment?  

      The figure legend has been changed accordingly.

      (4) Line 176: '36 LC neurons' - should this not be '36 imaged axon terminals in dorsal CA1' or something along these lines?  

      This reference has been changed to “LC axon ROIs”.

      (5) Line 353: Why was water restriction started before the hippocampal window implant, if behavioral training to run for water reward only started after the implant? Please clarify.

      A sentence was added to the methods to explain that this was done to reduce bleeding and swelling during the hippocampal window implantation.  

      (6) Line 377: '. . . which took 10-14 days (although some mice never reached this threshold).' How many mice did not reach the criterion within 14 days? I think it is not accurate to say the mice 'never' reached the threshold, as they were only tested for a limited period of time.  

      We have added details of how many mice were excluded from each group and the reason why they were excluded.

      (7) Exclusion criteria for imaging data: The authors state (from line 402): 'Imaging sessions with large amounts of drift or bleaching were excluded from analysis (8 sessions for NET mice, 6 sessions for LC Mice).' What exactly were the quantitative exclusion criteria? Were these defined before the onset of the study or throughout the study?  

      Imaging sessions were first qualitatively assessed by looking for disappearance or movement of structures in the Z-plane throughout the imaging FOV. Additionally, following motion correction in suite2p, we used the registration metrics, which plot the first principal component (PC) of the motion-corrected images, to assess for drift, bleaching, or heat bubbles. If this variable increased or decreased greatly throughout a session, to the point where any apparent activity was not visible in the first PC, the dataset was excluded. We have added these exclusion criteria to the methods section.

      Reviewer #3 (Recommendations For The Authors):  

      Please provide a justification or rationale for having two different criteria for immobility (< 5cm/sec) and freezing (<0.2 cm/sec). If VTA and LC axon activities are different between these two velocities, please provide some commentary on this difference.  

      This is a typo left over from before we converted velocity from rotational units to cm/s.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewing editor’s list of items remaining to be addressed followed by our responses/actions:

      (1) The order and organization of supplemental figures and tables is almost impossible to navigate. Please put them in order. 

      All the sections from the previous Supplementary files have been divided into individual Supplementary files so that each can be referenced without confusion from the text. All of the references in the body of the text and the author responses have been updated to reflect this change.

      (2) The question of sample sizes was partially addressed, with authors stating that cell culture work in iPSCs and PGCLCs was done in replicates of 3. Sertoli and granulosa cells were generated from pooled preps - how many individuals, were they littermates? 

      Sertoli and granulosa primary cultures were generated from littermates and each prep used 5 animals (males for Sertoli cells and females for granulosa cells). These changes have been added to the body of the text on pages 39 and 40.

      (3) Authors need to discuss the limitations of doing work in triplicates. Their PCA (Supplement Figure 9) reveals that in several cases samples from the same treatment were not discriminated by PC1 and/or PC2. This is especially true in e and f, the variance of which was explained by PC1 for cell type, but for which treatments showed poor discrimination by PC2. Some discussion of the limitations of sample size should be provided.

      Additional text has been added to what is now Supplementary file 15 to acknowledge this limitation imposed by the limited number of replicates (three) and the ability to resolve the differences in treatments by PCA in subplots e and f. However, we also note that the differences were sufficient to identify significant DMCs/DMRs/DEGs.

      Reviewer 2 also noted a potential weakness that “exposures are more complicated in a whole organism than in an isolated cell line.”

      We note that in our revised manuscript we included wording noting that despite the advantages of using an in vitro approach to deduce underlying molecular mechanisms, results of such in vitro studies “ultimately warrant validation of results discerned from studies of in vitro models to ensure they also reflect functions ongoing in the more complex and heterogeneous environment of the intact animal in vivo.” Thus we have endeavored to acknowledge the reviewer’s point.

      Reviewer #1 (Public Review): 

      Critiques/Comments: 

      (1) A problem with in vitro work is that homogeneous cell lines/cultures are, by nature, absent from the rest of the microenvironment. The authors need to discuss this. 

      [Addressed on pages: 24-25] – We have added two sentences to the second paragraph of the Discussion section in which we now acknowledge this concern, but also point out that in vitro models of this sort also provide an experimental advantage in that they facilitate a deconvolution of the extensive complexity resident within the intact animal. Nevertheless, we acknowledge that this deconvolution requires ultimate validation of findings obtained within an in vitro model system to ensure they accurately recapitulate functions that occur in the intact animal in vivo.

      In response to Reviewer 2’s stated weakness of our study that “The weakness includes the fact that exposures are more complicated in a whole organism than in an isolated cell line,” please note that this added text includes the statement that despite the advantages of using an in vitro approach to deduce underlying molecular mechanisms, results of such in vitro studies “ultimately warrant validation of results discerned from studies of in vitro models to ensure they also reflect functions ongoing in the more complex and heterogeneous environment of the intact animal in vivo.” Thus we have endeavored to acknowledge the reviewer’s point.

      (2) What are n's/replicates for each study? Were the same or different samples used to generate the data for RNA sequencing, methylation beadchip analysis, and EM-seq? This clarification is important because if the same cultures were used, this would allow comparisons and correlations within samples.  

      Addressed on pages: 39-45 and in new Supplementary file 15 – Additional text has been added in the Methods section to indicate that all samples involving cell culture models which include iPSCs and PGCLCs came from a single XY iPS cell line aliquoted into replicates and all primary cultures which included Sertoli and granulosa cells were generated from pooled tissue preps from mice and then aliquoted into replicates. Finally, all experiments in the study were performed on three replicates. Because this experimental design did indeed allow for comparisons among samples, we have added a new Supplementary file 15 which displays PCA plots showing clustering among control and treatment datasets, respectively, as well as distinctions between each cluster representing each experimental condition.

      (3) In Figure 1, it is interesting that the 50 uM BPS dose mainly resulted in hypermethylation whereas 100 uM appears to be mainly hypomethylation. (This is based on the subjective appearance of graphs). The authors should discuss and/or present these data more quantitatively. For example, what percentage of changes were hypo/hypermethylation for each treatment? How many DMRs did each dose induce? For the RNA-seq results, again, what were the number of up/down-regulated genes for each dose?  

      Addressed on pages: 6-7 and in new Supplementary files 1-3  – The experiment shown in Figure 1 was designed to 1) serve as proof of principle that cells maintained in culture could be susceptible to EDC-induced epimutagenesis at all, 2) determine if any response observed would be dose-dependent, and 3) identify a minimally effective dose of BPS to be used for the remaining experiments in this study (which we identified as 1 μM). We agree that it is interesting that the 50 µM dose of BPS induced predominantly hypermethylation changes whereas the 1 µM and 100 µM doses induced predominantly hypomethylation changes, but are not in a position to offer a mechanistic explanation for this outcome at this time. As the results shown satisfied our primary objectives of demonstrating that exposure of cells in culture to BPS could indeed induce DNA methylation epimutations, that this occurs in a dose-dependent manner, and that a dose of as low as 1 µM of BPS was sufficient to induce epimutagenesis, the data obtained satisfied all of the initial objectives of this experiment. That said, in response to the reviewer’s request we have now added text on pages 6-7 alluding to new Supplementary files 1-3 indicating the total number of DMCs and DMRs, as well as the number of DEGs, detected in response to exposure to each dose of BPS shown in Figure 1, as well as stratifying those results to indicate the numbers of hyper- and hypomethylation epimutations and up- and down-regulated DEGs induced in response to each dose of BPS. While, as noted above, investigating the mechanistic basis for the difference in responses induced by the 50 µM versus 1 and 100 µM doses of BPS was beyond the scope of the study presented in this manuscript, we do find this result reminiscent of the “U-shaped” response curves often observed in toxicology studies. Importantly, this result does demonstrate the elevated resolution and specificity of analysis facilitated by our in vitro cell culture model system.

      (4) Also in Figure 1, were there DMRs or genes in common across the doses? How did DMRs relate to gene expression results? This would be informative in verifying or refuting expectations that greater methylation is often associated with decreased gene expression.  

      Addressed on pages: 6-7 and new Supplementary files 1-6 – In general, we observed a coincidence between changes in DNA methylation and changes in gene expression (Supplementary files 1-3). Pertaining directly to the reviewer’s question about the extent to which we observed common DMRs and DEGs across all doses, while we only found 3 overlapping DMRs conserved across all doses tested, we did find an average of 51.25% overlap in DMCs and an average of 80.45% overlap in DEGs across iPSCs exposed to the different doses of BPS shown in Figure 1. In addition, within each dose of BPS tested in iPSCs, we also found that there was an overlap between DMCs and the promoters or gene bodies of many DEGs (Supplementary file 5). Specifically within gene promoters, we observed a correlation between hypermethylated DMCs and decreased gene expression and hypomethylated DMCs and increased gene expression, respectively (Supplementary file 6).

      (5) In Figure 2, was there an overlap in the hypo- and/or hyper-methylated DMCs? Please also add more description of the data in 2b to the legend including what the dot sizes/colors mean, etc. Some readers (including me) may not be familiar with this type of data presentation. Some of this comes up in Figure 4, so perhaps allude to this earlier on, or show these data earlier.  

      Addressed on pages: 8-9 and new Supplementary file 4 – Although we observed an average of 11.05% overlapping DMCs between different pairs of cell types, we did not observe any DMCs that were shared among all four cell types. Indeed, this limited overlap of DMCs among different cell types exposed to BPS was the primary motivation for the analysis described in Figure 2. Thus, instead of focusing solely on direct overlap between specific DMCs, we examined similarities among the different cell types tested in the occurrence of epimutations within different annotated genomic regions. To better describe this, we have now added additional text to page 9. We have also added more detail to the legend for Figure 2 on page 8 to more clearly explain the significance of the dot sizes and colors, explaining that the dot sizes are indicative of the relative number of differentially methylated probes that were detected within each specific annotated genomic region, and that the dot colors are indicative of the calculated enrichment score reflecting the relative abundance of epimutations occurring within a specific annotated genomic region. The relative score is calculated by iterating down the list of DMCs, increasing a running-sum statistic when a DMC falls within the specific annotated genomic region of interest and decreasing the sum when it does not. The magnitude of the increment depends upon the relative occurrence of DMCs within a specific annotated genomic region.
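
      The running-sum score described above can be illustrated with a minimal sketch. It assumes an unweighted, Kolmogorov-Smirnov-style scheme in which the step sizes are set by how many DMCs fall inside the region; the function name, the step sizes, and the example input are hypothetical and are not taken from the software used in the study.

```python
def running_sum_enrichment(in_region):
    """Return the extreme value of a running-sum statistic.

    in_region: list of booleans, one per DMC in ranked order; True means
    the DMC lies inside the annotated genomic region of interest.
    Assumption: unweighted steps of 1/hits (up) and 1/misses (down).
    """
    n = len(in_region)
    hits = sum(in_region)
    if hits == 0 or hits == n:
        raise ValueError("need at least one DMC inside and one outside the region")
    step_up = 1.0 / hits            # increment for a DMC inside the region
    step_down = 1.0 / (n - hits)    # decrement for a DMC outside the region
    running = 0.0
    extreme = 0.0
    for flag in in_region:
        running += step_up if flag else -step_down
        if abs(running) > abs(extreme):
            extreme = running       # keep the largest deviation from zero
    return extreme


# Hypothetical ranked list: region DMCs concentrated near the top.
score = running_sum_enrichment([True, True, False, True, False, False])
```

      A score near +1 means the region's DMCs cluster at the top of the ranked list, a score near -1 means they cluster at the bottom, and a score near 0 means they are spread evenly.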

      (6) iPSCs were derived from male mice MEFs, and subsequently used to differentiate into PGCLCs. The only cell type from an XX female is the granulosa cells. This might be important, and should be mentioned and its potential significance discussed (briefly).  

      Addressed on page: 29 – We have added a new paragraph just before the final paragraph of the Discussion section in which we acknowledge that most of the cell types analyzed during our study were XY-bearing “male” cells and that the manner in which XX-bearing “female” cells might respond to similar exposures could differ from the responses we observed in XY cells. However, we also noted that our assessment of XX-bearing granulosa cells yielded results very similar to those seen in XY Sertoli cells suggesting that, at least for differentiated somatic cell types, there does not appear to be a significant sex-specific difference in response to exposure to a similar dose of the same EDC. That said, we also acknowledged that in cell types in which dosage compensation based on X-chromosome inactivation is not in place, differences between XY- and XX-bearing cells could accrue.

      (7) EREs are only one type of hormone response element. The authors make the point that other mechanisms of BPS action are independent of canonical endocrine signaling. Would authors please briefly speculate on the possibility that other endocrine pathways including those utilizing AREs or other HREs may play a role? In other words, it may not be endocrine signaling independent. The statement that the differences between PGCLCs and other cells are largely due to the absence of ERs is overly simplistic.  

      Addressed on page: 11 and in a new Supplementary file 8  – Previous reports have indicated that BPS does not have the capacity to bind with the androgen receptor (Pelch et al., 2019; Yang et al., 2024). However, there have been reports indicating that BPS can interact with other endocrine receptors including PPARγ and RXRα, which play a role in lipid accumulation and have the potential to be linked to obesity phenotypes (Gao et al., 2020; Sharma et al., 2018). To address the reviewer’s comment we assessed the expression of a panel of hormone receptors including PPARγ, RXRα, and AR in each of the cell types examined in our study and these results are now shown in a new Supplementary file 8. We show that in addition to not expressing either estrogen receptor (ERα or ERβ), germ cells also do not express any of the other endocrine receptors we tested including AR, PPARγ, and RXRα. Thus we now note that these results support our suggestion that the induction of epimutations we observed in germ cells in response to exposure to BPS appears to reflect disruption of non-canonical endocrine signaling. We also note that non-canonical endocrine signaling is well established (Brenker et al., 2018; Ozgyin et al., 2015; Song et al., 2011; Thomas and Dong, 2006). Thus we feel the suggestion that the effects of BPS exposure could conceivably reflect either disruption of canonical or non-canonical signaling in any cell type is well justified and that our data suggest that both of these effects appear to have accrued in the cells examined in our study as suggested in the text of our manuscript.

      (8) Interpretation of data from the GO analysis is similarly overly simplistic. The pathways identified and discussed (e.g. PI3K/AKT and ubiquitin-like protease pathways) are involved in numerous functions, both endocrine and non-endocrine. Also, are the data shown in Figure 6a from all 4 cell types? I am confused by the heatmap in 6c, which genes were significantly affected by treatment in which cell types?  

      Addressed on pages: 19-21 – Per the reviewer’s request, we have added text to indicate that Figure 6a is indeed data from all four cell types examined. We have also modified the text to further clarify that Figure 6c displays the expression of other G protein-coupled receptors which are expressed at similar, if not higher, levels than either ER in all cell types examined, and that these have been shown to have the potential to bind to either 17β-estradiol or BPA in rat models. As alluded to by the reviewer, this is indicative of a wide variety of distinct pathways and/or functions that can potentially be impacted by exposure to an EDC such as BPS. Thus, we have attempted to acknowledge the reviewer’s primary point that BPS may interact with a variety of receptors or other factors involved with a wide variety of different pathways and functions. Importantly, this illustrates the strength of our model system in that it can be used to identify potential impacted target pathways that can then be subsequently pursued further as deemed appropriate.

      (9) In Figure 7, what were the 138 genes? Any commonalities among them? 

      Addressed on page: 22 and in new Supplementary files 13 and 14 – We have now added a new supplemental Excel file (Supplementary file 13) that lists the 138 overlapping conserved DEGs that did not become reprogrammed/corrected during the transition from iPSCs to PGCLCs. In addition, we have added new text on page 22 and a new Supplementary file 14 which displays KEGG analysis of pathways associated with these 138 retained DEGs. We find that these genes are primarily involved with cell cycle and apoptosis pathways which, interestingly, have the potential to be linked to cancer development which is often linked to disruptions in chromatin architecture.

      (10) The Introduction is very long. The last paragraph, beginning line 105, is a long summary of results and interpretations that better fit in a Discussion section.

      Addressed on page: 6 – We have now significantly reduced the length and scope of the final paragraph of the Introduction per the reviewer’s recommendation.

      (11) Provide some details on husbandry: e.g. were they bred on-site? What food was given, and how was water treated? These questions are to get at efforts to minimize exposure to other chemicals.  

      Addressed on page: 37 – We have added text detailing that all mice used in the project were bred onsite, that water was non-autoclaved conventional RO water, and that the mice were fed 5V5R extruded feed, which is highly controlled for the presence of isoflavones and has been certified for use in estrogen-sensitive animal protocols.

      Reviewer #2 (Public Review): 

      Summary: 

      This manuscript uses cell lines representative of germ line cells, somatic cells, and pluripotent cells to address the question of how the endocrine-disrupting compound BPS affects these various cells with respect to gene expression and DNA methylation. They find a relationship between the presence of estrogen receptor gene expression and the number of DNA methylation and gene expression changes. Notably, PGCLCs do not express estrogen receptors and although they do have fewer changes, changes are nevertheless detected, suggesting a noncanonical pathway for BPS-induced perturbations. Additionally, there was a significant increase in the occurrence of BPS-induced epimutations near EREs in somatic and pluripotent cell types compared to germ cells. Epimutations in the somatic and pluripotent cell types were predominantly in enhancer regions whereas those in the germ cell type were predominantly in gene promoters. 

      Strengths: 

      The strengths of the paper include the use of various cell types to address the sensitivity of the lineages to BPS as well as the observed relationship between the presence of estrogen receptors and changes in gene expression and DNA methylation. 

      Weaknesses: 

      The weaknesses include the lack of reporting of replicates, superficial bioinformatic analysis, and the fact that exposures are more complicated in a whole organism than in an isolated cell line. 

      Recommendations for the authors: please note that you control which revisions to undertake from the public reviews and recommendations for the authors. 

      Reviewer #2 (Recommendations For The Authors): 

      Overall, this is an intriguing paper but more transparency in the replicates and methods and a more rigorous bioinformatic treatment of the data are required. 

      Specific comments: 

      (1) End of abstract "These results suggest a unique mechanism by which an EDC-induced epimutated state may be propagated transgenerationally following a single exposure to the causative EDC." This is overly speculative for an abstract. There is only epigenetic inheritance following mitosis or differentiation presented in this study. There is no meiosis and therefore no ability to assess multi- or transgenerational inheritance. 

      Addressed on page: 2 – We have modified the text at the end of the abstract to more precisely reflect our intended conclusions based on our data. In our view, the ability of induced epimutations to transcend meiosis per se is not as relevant to the mechanism of transgenerational inheritance as their ability to transcend major waves of epigenetic reprogramming that normally occur during development of the germ line. In this regard the transition from pluripotent iPSCs to germline PGCLCs has been shown to recapitulate at least the first portion of normal germline reprogramming, and now our data provide novel insight into the fate of induced epimutations during this process. Specifically, we show that a prevalence of epimutations was conserved during the iPSC → germ cell transition but that very few (< 5%) of the specific epimutations present in the BPS-exposed iPSCs were retained when those cells were induced to form PGCLCs. Rather, we observed apparent correction of a large majority of the initially induced epimutations during this transition, but this was accompanied by the apparent de novo generation of novel epimutations in the PGCLCs. We suggest, based on other recent reports in the literature, that this is a result of the BPS exposure inducing changes in the chromatin architecture in the exposed iPSCs such that when the normal germline reprogramming mechanism is imposed on this disrupted chromatin template there is both correction of many existing epimutations and the genesis of many novel epimutations. This observation has the potential to explain the long-standing question of why the prevalence of epimutations persists across multiple generations despite the occurrence of epigenetic reprogramming during each generation. Nevertheless, as noted above, we have modified the text at the end of the abstract to temper this interpretation given that it is still somewhat speculative at this point.

      (2) Doses used in the experiments. One needs to be careful when stating that the dose used is "below FDA's suggested safe environmental level established for BPA" because a different bisphenol is being used here (BPA vs BPS) and the safe level is that which the entire organism experiences. It is likely that cell lines experience a higher effective dose.  

      Addressed on pages: 3, 5, and 26 – We have now made a point of noting that our reference to an EPA-recommended “safe dose” of BPA was for humans and/or intact animals. Changes to this effect have been made in the second and sixth paragraphs of the Introduction section. In addition, we have added text at the end of the fourth paragraph of the Discussion section acknowledging that, as the reviewer suggests, the same dose of an EDC could exert greater effects on cells in a homogeneous culture than on the same cell type within an intact animal given the potential for mitigating metabolic effects in the latter. However, we also note that the ability we demonstrated to quantify the effects of such exposures on the basis of numbers of epimutations (DMCs or DMRs) induced could potentially be used in future studies to study this question by assessing the effects of a specific dose of a specific EDC on a specific cell type when exposed either within a homogeneous culture or within an intact animal.

      (3) Figure 1: In the dose response, what was the overlap in DMCs and DEGs among the 3 doses? Are the responses additive, synergistic, or completely non-overlapping? This is an important point that should be addressed. 

      Addressed on pages: 6-7 and in Supplementary files 1-5 – Please see our response to Reviewer 1 critique #4 above where we address similar concerns. While we do find overlap among different cell types with respect to the DMCs, DMRs, and DEGs displayed in Figure 1, we found the effect to be only partially additive as opposed to synergistic in any apparent manner. The fold increase in DMCs, DMRs, and DEGs resulting from exposure to doses of 1 μM versus 50 μM ranged from 2.5x to 4.4x, which was well below the 50x increase that would have been expected from a strictly additive effect, and the effect increased even less, if at all, in response to exposure to doses of 50 μM versus 100 μM BPS. Finally, as now noted in the Discussion section on page 25, our conclusion is that these results display a limited dose-dependent effect that was partially additive but also plateaued at the highest doses tested.
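
      The comparison between the observed fold increase and the 50x expected under a strictly additive response can be written out as simple arithmetic. The sketch below uses only the figures quoted above (doses of 1 μM and 50 μM, observed fold increases of 2.5x and 4.4x); the function name is hypothetical, and treating "strictly additive" as a response proportional to dose is our reading of the text.

```python
def fraction_of_proportional_response(observed_fold, dose_low_um, dose_high_um):
    """Observed fold increase as a fraction of a dose-proportional increase.

    Assumption: a strictly additive response scales linearly with dose,
    so the expected fold increase equals dose_high / dose_low.
    """
    expected_fold = dose_high_um / dose_low_um
    return observed_fold / expected_fold


# Fold increases quoted in the response for 1 uM versus 50 uM BPS.
low_end = fraction_of_proportional_response(2.5, 1.0, 50.0)
high_end = fraction_of_proportional_response(4.4, 1.0, 50.0)
print(f"{low_end:.1%} to {high_end:.1%} of the dose-proportional expectation")
```

      Under this reading, the observed increase is less than a tenth of what a dose-proportional response would produce, which is consistent with the plateau described above.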

      (4) Methods: How many times was each exposure performed on a given cell type? This information should be in the figure legends and methods. In the case of multiple exposures for a given line, do the biological replicates agree? 

      Addressed on pages: 39-45 and in new Supplementary file 15 –  Please see our response to Reviewer 1 critique #2 where we address similar concerns with newly added text and analysis. We now note repeatedly on pages 39-45 that each analysis was conducted on three replicate samples, and we display the similarity among those replicates graphically in a new Supplementary file 15.

      (5) DNA methylation analyses. Very little analysis is presented on the BeadChip array other than hypermethylated/hypomethylated and genomic regions of DMCs. What is the range of methylation changes? Does it vary between hypo vs. hyper DMCs? How many array experiments were performed (biological replicates) and what stats were used to determine the DMCs? Are there DMCs in common among the various cell types? As an example of a more meaningful analysis, one can plot the %5mC over a given array for comparisons between control and treated cell types. For more granularity, the %5mC can be presented according to the element type (enhancers vs promoters). 

      Addressed on pages: 10 and 39-45 and in new Supplementary files 1-5, 15 –  Please see our response to Reviewer 1 critique #2 above where we address similar concerns regarding the number of biological replicates used in this study. DMCs on the Infinium array are identified using mixed linear models. This general supervised learning framework identifies CpG loci at which differential methylation is associated with known control vs. treated covariates. CpG probes on the array were defined as differentially methylated if they met both the p-value and FDR (≤ 0.05) significance thresholds between treatment and control samples for each cell type analyzed. The range of medians across all samples was 0.0278 to 0.0059 for hypermethylated beta values and -0.0179 to -0.0033 for hypomethylated beta values. As noted above, we did observe an overlap in DMCs between cell types. Thus, we observed an average of 11.05% overlapping DMCs between two or more cell types but we did not observe any DMCs shared between all four cell types. We have added additional text on page 9 and new Supplementary files 1-5 to now more clearly describe that this limited similarity in direct overlap of DMCs was the underlying motivation for the analysis described in Figure 2. Finally, the enrichment dot plots shown in Figure 2 provide the information the reviewer requested regarding the %5mC observed at different annotated genomic element types.
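
      The dual threshold described above (p-value and FDR both ≤ 0.05) can be sketched as follows. The response does not state which FDR procedure was applied, so the Benjamini-Hochberg adjustment here is an assumption, as are the function names and the example p-values; the mixed linear model that produces the per-probe p-values is not reproduced.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_dmcs(pvalues, alpha=0.05):
    """Flag probes whose raw p-value and adjusted p-value are both <= alpha."""
    qvalues = benjamini_hochberg(pvalues)
    return [p <= alpha and q <= alpha for p, q in zip(pvalues, qvalues)]


# Hypothetical per-probe p-values, for illustration only.
flags = call_dmcs([0.01, 0.04, 0.03, 0.20])
```

      With an FDR cutoff in place, a probe can pass the raw p-value threshold and still be rejected once the number of probes tested is taken into account.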

      (6) The investigators correlate the number of DMCs in a given cell type with the presence of estrogen receptors. Does the correlation extend to the methylation difference (delta beta) at the statistically different probes?

      Addressed in a new Supplementary file 7 – We have added a new Supplementary file 7 in which we provide data addressing this question. In brief, we find that the delta betas of probes enriched at enhancer regions and associated with relative proximity to ERE elements in Sertoli cells, granulosa cells, and iPSCs appear very similar to those associated with DMCs not located within these enriched regions. However, when we compared the similarity of the two data sets with goodness of fit tests, we found these relatively small differences were, in fact, statistically significant based on a two-sample Kolmogorov-Smirnov test. These observed significant differences appear to indicate that there is higher variability among the delta betas associated with hypomethylation, but not hypermethylation, changes occurring at DMCs associated with enhancers, potentially suggesting a greater tendency for exposure to BPS to induce hypomethylation rather than hypermethylation changes, at least in these specific regions.
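
      For readers unfamiliar with the two-sample Kolmogorov-Smirnov test, the statistic is the largest vertical gap between the empirical cumulative distributions of the two samples. The sketch below computes only that statistic; the function name and the delta-beta values are hypothetical, and a p-value would normally be obtained from a statistics package rather than from this code.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_a, sample_b):
    """Return the two-sample Kolmogorov-Smirnov statistic D."""
    a_sorted = sorted(sample_a)
    b_sorted = sorted(sample_b)
    if not a_sorted or not b_sorted:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical CDFs only change at observed values, so check each one.
    for value in sorted(set(a_sorted) | set(b_sorted)):
        cdf_a = bisect_right(a_sorted, value) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, value) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical delta betas: DMCs inside versus outside enriched enhancer regions.
inside = [-0.021, -0.015, -0.009, 0.004, 0.012]
outside = [-0.012, -0.008, -0.003, 0.006, 0.010]
d_stat = ks_two_sample_statistic(inside, outside)
```

      Because the statistic is sensitive to differences in spread as well as location, two distributions with similar medians can still differ significantly, which matches the interpretation given above.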

      (7) Methylation changes relative to EREs are presented in multiple figures. Are other sequences enriched in the DMCs? 

      Addressed in a new Supplementary file 11. We profiled the genomic sequence within 500 bp of cell type-specific enriched DMCs that were associated either with enhancer regions in Sertoli, granulosa, or iPS cells or with transcription factor binding sites in PGCLCs to identify higher-abundance motif sequences. We then compared any motifs identified with the JASPAR database to find transcription factors that could potentially bind to these regions. Interestingly, we found that the two most common motifs across all cell types were associated with either the chromatin remodeling transcription factor HMG1A or the pluripotency factor KLF4.

      (8) Please present a correlation plot between the methylation differences and the adjacent DEGs. Again, the absence of consideration of the absolute changes in methylation and gene expression minimizes the impact of the data. 

      Addressed on pages 6, 7, and 17 and in a new Supplementary file 6 – We analyzed the relationship between DMCs at DEG promoter regions and the corresponding change in expression of each DEG. Our data support an inverse relationship, with up-regulated genes showing decreased methylation in promoter regions and down-regulated genes showing increased methylation at promoter regions, although there were some exceptions to this relationship.
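
      One simple way to summarize such a relationship is the fraction of genes whose promoter methylation change and expression change have opposite signs. The sketch below assumes this summary purely for illustration; the function name and the (delta beta, log2 fold change) pairs are hypothetical and are not taken from Supplementary file 6.

```python
def inverse_concordance(pairs):
    """Fraction of genes whose methylation and expression changes have opposite signs.

    pairs: iterable of (promoter_delta_beta, log2_fold_change) tuples.
    Pairs with a zero in either position carry no direction and are skipped.
    """
    informative = [(d, f) for d, f in pairs if d != 0 and f != 0]
    if not informative:
        raise ValueError("no informative pairs")
    opposite = sum(1 for d, f in informative if (d > 0) != (f > 0))
    return opposite / len(informative)


# Hypothetical genes: three follow the inverse pattern, one is an exception.
example_pairs = [(-0.030, 1.2), (0.020, -0.8), (-0.010, 0.5), (0.015, 0.9)]
print(inverse_concordance(example_pairs))
```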

      (9) EM-Seq is mentioned in Figure 7 and in the material and methods. Where is it used in this study? 

      Addressed on page 22 – We now note in the text on page 22 that EM-seq was used during experiments assessing the propagation of BPS-induced epimutations during the iPSC → EpiLC → PGCLC cell state transitions to gather higher-resolution data on DNA methylation differences at the whole-epigenome level.

      References

      Brenker C, Rehfeld A, Schiffer C, Kierzek M, Kaupp UB, Skakkebæk NE, Strünker T. 2018. Synergistic activation of CatSper Ca2+ channels in human sperm by oviductal ligands and endocrine disrupting chemicals. Hum Reprod 33:1915–1923. doi:10.1093/humrep/dey275

      Gao P, Wang L, Yang N, Wen J, Zhao M, Su G, Zhang J, Weng D. 2020. Peroxisome proliferator-activated receptor gamma (PPARγ) activation and metabolism disturbance induced by bisphenol A and its replacement analog bisphenol S using in vitro macrophages and in vivo mouse models. Environ Int 134. doi:10.1016/J.ENVINT.2019.105328

      Ozgyin L, Erdos E, Bojcsuk D, Balint BL. 2015. Nuclear receptors in transgenerational epigenetic inheritance. Prog Biophys Mol Biol. doi:10.1016/j.pbiomolbio.2015.02.012

      Pelch KE, Li Y, Perera L, Thayer KA, Korach KS. 2019. Characterization of Estrogenic and Androgenic Activities for Bisphenol A-like Chemicals (BPs): In Vitro Estrogen and Androgen Receptors Transcriptional Activation, Gene Regulation, and Binding Profiles. Toxicol Sci 172:23–37. doi:10.1093/TOXSCI/KFZ173

      Sharma S, Ahmad S, Khan MF, Parvez S, Raisuddin S. 2018. In silico molecular interaction of bisphenol analogues with human nuclear receptors reveals their stronger affinity vs. classical bisphenol A. Toxicol Mech Methods 28:660–669. doi:10.1080/15376516.2018.1491663

      Song K-H, Lee K, Choi H-S. 2011. Endocrine Disrupter Bisphenol A Induces Orphan Nuclear Receptor Nur77 Gene Expression and Steroidogenesis in Mouse Testicular Leydig Cells. Endocrinology 143:2208–2215. doi:10.1210/endo.143.6.8847

      Thomas P, Dong J. 2006. Binding and activation of the seven-transmembrane estrogen receptor GPR30 by environmental estrogens: A potential novel mechanism of endocrine disruption. J Steroid Biochem Mol Biol 102:175–179. doi:10.1016/j.jsbmb.2006.09.017

      Yang Z, Wang L, Yang Y, Pang X, Sun Y, Liang Y, Cao H. 2024. Screening of the Antagonistic Activity of Potential Bisphenol A Alternatives toward the Androgen Receptor Using Machine Learning and Molecular Dynamics Simulation. Environ Sci Technol 58:2817–2829. doi:10.1021/ACS.EST.3C09779

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      [...] Strengths:

      The authors have generated a novel transgenic mouse line to specifically label mature differentiated oligodendrocytes, which is very useful for tracing the final destiny of mature myelinating oligodendrocytes. Also, the authors carefully compared the distribution of three progenitor cre mouse lines and suggested that Gsh-cre also labeled dorsal OLs, contrary to the previous suggestion that it only marks LGE-derived OPCs. In addition, the author also analyzed the relative contributions of OLs derived from three distinct progenitor domains in other forebrain regions (e.g. Pir, ac). Finally, the new transgenic mouse lines and established multiple combinatorial genetic models will facilitate future investigations of the developmental origins of distinct OL populations and their functional and molecular heterogeneity.

      Weaknesses:

      Since OpalinP2A-Flpo-T2A-tTA2 only labels mature oligodendrocytes but not OPCs, the authors can not suggest that the lack of LGE/CGE-derived-OLs in the neocortex is less likely caused by competitive postnatal elimination, but more likely due to limited production and/or allocation (line 118-9). It remains possible that LGE/CGE-derived OPCs migrate into the cortex but are later eliminated.

      We are glad that the reviewer appreciates our work and are grateful for the positive comments and the constructive suggestion. We agree with the reviewer that our methodology by itself cannot determine whether the lack of LGE/CGE-derived OLs in the neocortex is caused by competitive postnatal elimination. That is why we cited a parallel work by Li et al. (ref [17] in the original manuscript; ref [19] in the revised manuscript), in which in utero electroporation (IUE) failed to label LGE-derived OL lineage cells in both embryonic and early postnatal brains. Although they did not directly explore the CGE using IUE, their fate mapping results using Emx1-Cre; Nkx2.1-Cre; H2B-GFP at P0 and P10 revealed a very low percentage of LGE/CGE-derived OL lineage cells. The lack of adult labeling in our study, together with the lack of developmental labeling in the other study, prompted us to hypothesize that the lack of LGE/CGE-derived OLs in the neocortex is less likely caused by competitive postnatal elimination and more likely due to limited production and/or allocation. In the revised manuscript, we have expanded the discussion to explain this point more clearly.

      Reviewer #2 (Public Review):

      [...] Strengths:

      The strength and novelty of the manuscript lies in the elegant tools generated and used and which have the potential to elegantly and accurately resolve the issue of the contribution of different progenitor zones to telencephalic regions.

      We are glad that the reviewer appreciates our work and are grateful for the overall positive comments.

      Weaknesses:

      (1) Throughout the manuscript (with one exception, lines 76-78), the authors quantified OL densities instead of contributions to the total OL population (as a % of ASPA for example). This means that the reader is left with only a rough estimation of the different contributions.

      We thank the reviewer for this constructive suggestion. We have replaced the density quantification (Figure 2F and 3D in the original manuscript) with contributions to the total OL population (% of ASPA) (Figure 2J and 2N in the revised manuscript).

      (2) All images and quantifications have been confined to one level of the cortex and the potential of the MGE and the LGE/CGE to produce oligodendrocytes for more anterior and more posterior cortical regions remains unexplored.

      The quantifications were not confined to one level of the cortex but were performed in brain sections ranging from Bregma +1.94 to -2.80 mm, as shown in Supplementary Figure 2A-B in the original manuscript. We apologize for not having stated and presented this information clearly enough, and for the confusion it may have caused. In the revised manuscript, we have added relevant descriptions in the “Material and Methods” section (line 199-200) and schematics along with representative images of more anterior and more posterior cortical regions (Supplementary Figure 2A-D).

      (3) Hence, the statement that "In summary, our findings significantly revised the canonical model of forebrain OL origins (Figure 4A) and provided a new and more comprehensive view (Figure 4B )." (lines 111, 112) is not really accurate as the findings are neither new nor comprehensive. Published manuscripts have already shown that (a) cortical OLs are mostly generated from the cortex [Tripathi et al 2011 (https://doi.org/10.1523/JNEUROSCI.6474-10.2011), Winker et al 2018 (https://doi.org/10.1523/JNEUROSCI.3392-17.2018) and Li et al (https://doi.org/10.1101/2023.12.01.569674)] and (b) MGE-derived OLs persist in the cortex [Orduz et al 2019 (https://doi.org/10.1038/s41467-019-11904-4) and Li et al 2024 (https://doi.org/10.1101/2023.12.01.569674)]. Extending the current study to different rostro-caudal regions of the cortex would greatly improve the manuscript.

      As explained in the response to comment (2), our original quantifications included different rostro-caudal regions of the cortex. In the revised manuscript, we have added more schematics and representative images in the Supplementary Figure 2 for better illustration to resolve the concern of comprehensiveness.

      We thank the reviewer for listing and summarizing highly relevant published research along with the parallel study by Li et al. submitted to eLife. We apologize for the omission of the first two references in our original manuscript and have cited them in appropriate places (ref [10] and ref [11] in the revised manuscript). However, we believe these works do not compromise the novelty and significance of our work for the following reasons:

      (1) Tripathi et al. 2011 (ref [10] in the revised manuscript) analyzed OL lineage cells in the corpus callosum and the spinal cord, but not in the cortex and anterior commissure. Their analysis was performed in juvenile mice (P12/13), not in adulthood. Most importantly, their analysis of ventrally derived OL lineage cells relied on lineage tracing using Gsh2Cre, which in fact also labels OLs derived from Gsh2+ dorsal progenitors. In contrast, we analyzed mature OLs in the cortex, corpus callosum and anterior commissure in 2-month-old adult mice. We used intersectional and subtractive strategies to label OLs derived from dorsal, LGE/CGE and MGE/POA origins. Our strategy differentiated the two ventral lineages (LGE/CGE vs. MGE/POA) and avoided mixed labeling of OLs from ventral and dorsal Gsh2+ progenitors.

      (2) Winkler et al. 2018 (ref [11] in the revised manuscript) analyzed OLs derived from dorsal progenitors but only quantified those in the gray matter and the white matter of somatosensory cortex. Their quantification relied on co-staining with Olig2/Sox10, and thereby included both oligodendrocyte precursors (OPCs) and OLs. In contrast, we analyzed mature OLs from three origins and quantified not only neocortical regions (Mo and SS) but also an archicortical region (Pir). Our analysis revealed that although dorsally derived OLs dominate neocortex, ventrally derived OLs, especially the LGE/CGE-derived ones, dominate piriform cortex.

      (3) Orduz et al. 2019 (ref [7] in the original manuscript and the revised manuscript) mainly focused on POA-derived OLs in the somatosensory cortex. Although they performed limited analysis on MGE/POA-derived OPCs at postnatal days 10 and 19, they did not quantify MGE/POA-derived OLs in terms of their density, contribution to the total OL population, or spatial distribution in the cortex. In contrast, we performed systematic quantification of these aspects to demonstrate that MGE/POA-derived OLs make a small but sustained contribution to the cortex, with a distribution pattern distinct from that of dorsally derived OLs.

      (4) Li et al. 2024 (ref [17] in the original manuscript and [19] in the revised manuscript) is a parallel study submitted to eLife. Their independent discoveries and ours nicely complement each other. Using different sets of techniques and experiments but some shared genetic mouse models, we both found that the LGE/CGE made a minimal contribution to neocortical OLs. Their analysis in the prenatal and early postnatal stages, together with our analysis in the adult brain, paints a more comprehensive picture of cortical oligodendrogenesis. The uniqueness of our work is that we performed systematic quantification of all three origins and uncovered their differential contributions to the neocortex, piriform cortex, corpus callosum and anterior commissure.

      In summary, our work developed novel strategies to faithfully trace OLs from the three different origins and performed systematic analysis in the adult brain. Our data uncovered their differential contributions to neocortex, piriform cortex and the two commissural white matter tracts, which significantly differ not only from the canonical view but also from other previous studies in aspects discussed above. We believe our discoveries did significantly revise the canonical model of forebrain OL origins and provided a new and more comprehensive view.

      Reviewer #3 (Public Review):

      [...] Intriguingly, by using an indirect subtraction approach, they hypothesize that both Emx1-negative and Nkx2.1-negative cells represent the progenitors from lateral/caudal ganglionic eminences (LC), and conclude that neocortical OLs are not derived from the LC region. The authors claim that Gsh2 is not exclusive to progenitor cells in the LC region (PMID: 32234482). However, Gsh2 exhibits high enrichment in the LC during early embryonic development. The presence of a small population of Gsh2-positive cells in the late embryonic cortex could originate/migrate from Gsh2-positive cells in the LC at earlier stages (PMID: 32234482). Consequently, the possibility that cortical OLs derived from Gsh2+ progenitors in LC could not be conclusively ruled out. Notably, a population of OLs migrating from the ventral to the dorsal cortical region was detected after eliminating dorsal progenitor-derived OLs (PMID: 16436615).

      The indirect subtraction data for LC progenitors drawn from the OpalinFlp-tdTOM reporter in Emx1-negative and Nkx2.1-negative cells in the OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mouse line present some caveats that could influence their conclusion. The extent of activity from the two Cre lines in the OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mice remains uncertain. The OpalinFlp-tdTOM expression could occur in the presence of either Emx1Cre or Nkx2.1Cre, raising questions about the contribution of the individual Cre lines. To clarify, the authors should compare the tdTOM expression from each individual Cre line, OpalinFlp::Emx1Cre::RC::FLTG or OpalinFlp::Nkx2.1Cre::RC::FLTG, with the combined OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mouse line. This comparison is crucial as the results from the combined Cre lines could appear similar to only one Cre line active.

      Overall, the authors provided intriguing findings regarding the origin and fate of oligodendrocytes from different progenitor cells in embryonic brain regions. However, further analysis is necessary to substantiate their conclusion about the fate of LC-derived OLs convincingly.

      We thank the reviewer for these thoughtful comments. We agree with the reviewer that the presence of Gsh2-positive cells in the late embryonic cortex by itself could not rule out the possibility that they originate/migrate from Gsh2-positive cells in the LC at earlier stages. Staining dorsal-lineage intermediate progenitors with Gsh2, or performing intersectional lineage tracing using Gsh2Cre along with a dorsal-specific Flp driver, would provide more direct evidence on this issue. Nonetheless, as our lineage tracing of LGE/CGE-derived OLs did not employ Gsh2Cre, the doubt about the identity of Gsh2+ cortical progenitors should not affect the interpretation of our data.

      Regarding the subtractional LCOL labeling strategy used in our study, we wonder whether there was a misunderstanding by the reviewer. As stated in our manuscript (line 59-61) and reiterated by the reviewer, OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG labels OLs derived from progenitors that express neither Emx1Cre nor Nkx2.1Cre. As these two progenitor pools do not overlap with each other, their actions are purely additive. If there is any concern about efficiency and specificity, it would be that inadequate Cre-mediated recombination leads to mislabeling of dOLs or MPOLs as LCOLs (i.e., OLs derived from Emx1- or Nkx2.1-expressing progenitors were not successfully “subtracted” and thereby “wrongly” retained RFP expression). Therefore, the bona fide LGE/CGE-derived OLs could only be fewer, not more, than the RFP+ LCOLs labeled by our subtractional strategy, even if either of the Cre lines did not work efficiently enough. In any case, this would not affect our conclusion that LGE/CGE-derived OLs make a minimal contribution to the neocortex, as the “ground truth” contribution by LGE/CGE could only be less, not more, than what we have observed using the current strategy.
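
      The upper-bound argument above can be sketched as simple arithmetic. The example below assumes, as described in this response, that a Flp-expressing OL shows GFP if either Cre recombined in its lineage and retains RFP otherwise. The function names, origin fractions, and Cre efficiencies are hypothetical values chosen only to illustrate the logic.

```python
def reporter_color(flp, emx1_cre, nkx21_cre):
    """Reporter readout of one cell under the subtractive scheme (assumed logic)."""
    if not flp:
        return None  # no Opalin-driven Flp activity, so no reporter
    return "GFP" if (emx1_cre or nkx21_cre) else "RFP"


def observed_rfp_fraction(true_fractions, cre_efficiency):
    """Fraction of OLs scored as RFP+ when Cre recombination is incomplete.

    true_fractions: origin -> true share of OLs, including "LGE/CGE".
    cre_efficiency: origin -> recombination efficiency for the Cre-covered origins.
    """
    # Cells from Cre-covered origins that escape recombination keep RFP.
    escaped = sum(true_fractions[origin] * (1.0 - eff)
                  for origin, eff in cre_efficiency.items())
    return true_fractions["LGE/CGE"] + escaped


# Hypothetical numbers, not measurements from the study.
true_fractions = {"dorsal": 0.80, "MGE/POA": 0.15, "LGE/CGE": 0.05}
imperfect = {"dorsal": 0.95, "MGE/POA": 0.90}
perfect = {"dorsal": 1.0, "MGE/POA": 1.0}

print(observed_rfp_fraction(true_fractions, imperfect))
print(observed_rfp_fraction(true_fractions, perfect))
```

      Under these assumptions, incomplete recombination can only inflate the RFP+ count, so the observed RFP+ fraction is an upper bound on the true LGE/CGE contribution.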

      In support of our conclusion, a parallel study by Li et al. 2024 (ref [17] in the original manuscript; ref [19] in the revised manuscript) also provided independent experimental evidence that “any contribution of oligodendrocyte precursors to the developing cortex from the lateral ganglionic eminence is minimal in scope” (quoted from its eLife assessment). In addition, in their revision, they performed Gsh2 immunostaining in P0 Emx1Cre::HG-loxP mice and found that nearly all Gsh2+ cells in the cortical SVZ were derived from the Emx1+ lineage. We are glad that this additional piece of evidence further clarifies the case, but we still want to emphasize that our subtractional strategy was designed purposefully to avoid the potential uncertainty of Gsh2Cre and to label LGE/CGE-derived OLs more faithfully. Therefore, the validity of our conclusion about the fate of LC-derived OLs is independent of the question of the identity of Gsh2+ cortical progenitors and stands well by itself.

      We hope that these explanations have adequately addressed the reviewer’s concerns. 

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      In Figures 2C, 2D, 2E and 3D, the authors should provide counts of labelled cells as a % of ASPA+ cells. This will give an accurate picture of the contribution of the different progenitor regions to OLs.

      The graphs in Figure 2F are unnecessary since they are simply repeats of C-E but re-arranged.

      We thank the reviewer for the valuable suggestions. Because these two recommendations are related, we made the following changes. We replaced the density quantification in Figure 2F and 3D with % of ASPA (Figure 2J and 2N in the revised manuscript) to give an accurate picture of the contribution of the different progenitor regions to OLs, as suggested by the reviewer. We retained the density counts in Figure 2C-E (Figure 2G-I in the revised manuscript). Together with quantifications of rostral-caudal and laminar distributions presented in Supplementary Figure 2, these data demonstrate that OLs from different origins display distinct spatial distribution patterns.

      At what ages were the quantifications performed in all the figures?

      We apologize for the omission of this information in the original manuscript. All quantifications were performed in 2-month-old adult mice. We have added this information in the “Material and Methods” section of the revised manuscript.

      In 2D, and 3B the GFP should have been activated but the authors do not show it or quantify it presumably because GFP would flood the sections in the presence of Emx1Cre. Nevertheless, since eGFP is shown in the diagram in 2B, the authors should mention why they chose not to show it.

      We thank the reviewer for the helpful comment and the suggestion. We have modified the schematic in Figure 2B and added explanation in the figure legend (line 308-313). We also added a schematic in Supplementary Figure 1A along with images of GFP channel in Supplementary Figure 1D (line 338-350).

      All the main figures and supplementary figures are too small to see properly.

      We are sorry that there was severe compression of images in the combined manuscript file at the conversion step during the initial submission. We apologize for the compromised image quality and have re-uploaded full-size figures as individual files on BioRxiv soon after receiving the reviews. For the revised manuscript, we also take care to upload full-size figures at high resolution as individual files to ensure their quality of presentation.

      Supplementary Figure 2E is unnecessary and perhaps misleading the reader that cortical-derived OLs have a preference for the lower layers whereas the distribution may simply reflect the distribution of OLs in the cortex.

      We thank the reviewer for the helpful comment and the suggestion. We have removed this panel and replaced it with quantifications of relative laminar distributions of the total (ASPA+) OLs along with those from the three different origins (Supplementary Figure 2G in the revised manuscript). Indeed, the preference for the lower layers of dorsally-derived OLs mirrored the distribution of total OLs in the cortex, while the MGE/POA-derived OLs deviate significantly from others and exhibit higher preference towards layer 4.

      Quantification of labelled cells as a % of ASPA should also be performed in Supplementary Figure 3.

      We thank the reviewer for this suggestion. In the revised manuscript, we have included quantifications of labelled cells as % of ASPA for both OpalinFlp::Emx1Cre::Ai65 and OpalinFlp::Nkx2.1Cre::Ai65 (Figure 2J and N). The sum of these two data sets is equivalent to the data from OpalinFlp::Emx1Cre::Nkx2.1Cre::Ai65 shown in Supplementary Figure 3, so we did not perform additional quantifications, to avoid redundant effort.

      Imaging and quantification should be extended to more posterior regions of the cortex to find out whether the contribution is different from the areas already examined.

      We thank the reviewer for the suggestion on imaging and apologize for the confusion about the range of quantification. As explained in the response to comment (2) of weakness, the quantifications were not confined to one level of the cortex but were performed in brain sections ranging from Bregma +1.94 to -2.80 mm, as shown in Supplementary Figure 2A-B in the original manuscript. In the revised manuscript, we have added relevant descriptions in the “Material and Methods” section (line 199-200) and schematics along with representative images of more anterior and more posterior cortical regions (Supplementary Figure 2A-D).

      Reviewer #3 (Recommendations For The Authors):

      (1) The authors should provide Opalin reporter expression data across various brain regions at different developmental stages to clarify the expression pattern of the reporter.

      We appreciate the reviewer’s comment. We chose to perform all quantifications in adult mice because Opalin is a well-established marker for differentiated OLs and the recombinase-dependent reporter expression is cumulative and irreversible. If there were any non-specific labeling at any earlier developmental stage, it would be retained and manifested at the timepoint we examined as well. In other words, the fact that we detected labeling confined to mature OLs, with no non-specific labeling in the current dataset, ensures that no non-OL labeling was present at earlier timepoints. As shown in Figure 1D-F, reporter expression activated by the Opalin driver shows high OL specificity in all analyzed brain regions. This is further corroborated by results from combinatorially labeled samples (Figure 2 and Supplementary Figure 2), in which only OLs, and no other cell types, were labeled in all analyzed brain regions. Following the reviewers’ suggestions, we have added representative images of more rostral and more caudal cortical regions (Supplementary Figure 2B-D), which also show highly specific OL labeling.

      (2) In Figure 1D, please specify the developmental stage of the mice used for staining.

      We apologize for the omission of this information in the original manuscript. All quantifications were performed in 2-month-old adult mice. We have added this information in the “Material and Methods” section (line 199-200) of the revised manuscript.

      (3) The authors should clarify if the Opalin reporter expressed in OPCs and astrocytes at developmental stages of mice, such as P0, P7, and P30.

      We appreciate the reviewer’s comment, but as explained in response to comment (1), Opalin is a well-established marker for differentiated OLs which is not expressed in OPCs or astrocytes. As shown in Figure 1D-E, reporter expression is confined to CC1+ differentiated OLs with no colocalization with Sox9 (astrocyte marker). In support of this observation, only ASPA+ differentiated OLs, and no OPCs or astrocytes, were labeled in any of the combinatorial lineage tracing samples generated using this line combined with progenitor-Cre lines. In addition to marker staining, we also did not observe any RFP+ cells with OPC or astrocyte morphology. As the recombinase-dependent reporter expression is cumulative and irreversible, the fact that no non-specific labeling was observed in the adult brain retrospectively demonstrates the specificity of Opalin-Flp at earlier developmental stages.

      (4) In Figure 1E, authors should address why the efficiency of the tdTomato line is notably lower compared to that of H2B-GFP and whether the stability of reporters could impact the conclusions drawn.

      The difference in reporting efficiency is mainly caused by differences inherent to the two reporting systems. The TRE-RFP reporter is derived from Ai62, composed of a Tet response element and tdTomato inserted into the T1 TIGRE locus. The tdTomato expression is driven by tTA-TRE transcriptional activation. The HG-loxP reporter is derived from HG-Dual, composed of a CAG promoter, a frt-flanked STOP cassette, and H2B-GFP inserted into the Rosa26 locus. The H2B-GFP expression is driven by CAG promoter after Flp-mediated removal of the STOP cassette. A Flp-dependent tdTomato reporter designed in the same way as the HG-FRT reporter would have similar efficiency. In fact, the RC::FLTG reporter can be viewed as such a reporter in the absence of Cre, which did show similarly high efficiency as HG-FRT and supported efficient subtractive labeling of LGE/CGE-derived OLs. We apologize for a typo in the title of the Y-axis of the right panel in the original Figure 1F which may have caused potential misunderstanding. The “RFP+CC1+/CC1” should be “XFP+CC1/CC1”. We have corrected this mistake and revised the figure legend for clearer description of the data (Line 293-302 in the revised manuscript).

      (5) In Figure 2, please clarify the developmental stage of the mice used for staining. Authors should present the eGFP image in addition to tdTOM.

      We apologize for the omission of the age information in the original manuscript. All quantifications were performed in 2-month-old adult mice. We have added this information in the “Material and Methods” section (line 199-200) of the revised manuscript. We thank the reviewer for the suggestion on eGFP image and have presented it in supplementary Figure 1 in the revised manuscript.

      (6) in Figure 2D, authors should display the eGFP image alongside the tdTomato image. It is difficult to assess the efficiency of Emx-Cre and Nkx2.1-Cre.

      We thank the reviewer for the suggestion on the eGFP image and have presented it in Supplementary Figure 1D in the revised manuscript. There are two reasons why we chose to present it in the supplementary figure instead of the main figure. First, we added ASPA staining in the green channel along with quantifications of RFP cells as % of ASPA in Figure 2 in the revised manuscript, following reviewer #2’s suggestion. Second, as pointed out by reviewer #2, GFP would flood the sections in the presence of Emx1Cre and could be quite distracting if it were shown together with RFP.

      We were not entirely sure what exactly the reviewer means by “assess the efficiency of Emx-Cre and Nkx2.1-Cre”, but we believe that the quantifications of RFP cells as % of ASPA clarified the contribution of each origin to the total OLs (Figure 2J and 2N in the revised manuscript).

      (7) Figure 3 depicts the entire brain, replicating the image presented in Figure 2. It would be beneficial to consolidate Figures 2 and 3, as they showcase identical brain scans of different regions.

      We thank the reviewer for the constructive suggestion and have consolidated Figures 2 and 3 in the original manuscript into Figure 2 in the revised manuscript.

    1. Author response

      Reviewer #1 (Public Review):

      […] Weaknesses:

      This work explores an interesting question on regulating myoD+ progenitors and the defects of this process in skeletal muscle differentiation by SRSF2 but spreads out in many directions rather than focusing on the key defects. A number of approaches are used, but they lack the robust mechanistic analysis of the defects that result in muscle differentiation. Specifically, the role of SRSF2 on splicing appears to be a misfit here and does not explain the primary defects in the migration of myoD+ progenitors. There are concerns about the scRNA-seq and many transcripts in muscle biology that are not expressed in muscle cells. Focusing on main defects and additional experimental evidence to clear the fusion vs. precocious differentiation vs. reduced differentiation will strengthen this work.

      (1) The analysis of RNA-seq data (Figure 2) is limited, and it is unclear how it relates to the work presented in this MS. The Go enrichment analysis is combined for both up and down-regulated DEG, thus making it difficult to understand the impact differently in both directions. Stac2 is a predominant neuronal isoform (while Stac3 is the muscle), and the Symm gene is not found in the HGNC or other databases. Could the authors provide the approved name for this gene? The premise of this work is based on defects in ECM processes resulting in the mis-targeting of the muscle progenitors to the nonmuscle regions. Which ECM proteins are differentially expressed?

      The GO enrichment analysis (Figure 2B) indicates that genes involved in skeletal muscle construction and function were significantly dysregulated, with both up-regulated and down-regulated genes observed, consistent with the phenotype analysis presented in Figure 1.

      We agree with the reviewer’s comment that Stac3 is the predominant muscle isoform with high expression in skeletal muscle tissues, while Stac2 is expressed at low levels in these tissues. Therefore, we decided to delete the Stac2 data from Figure 2C and will modify the text accordingly. We apologize for our errors.

      In response to the reviewer's comment regarding the Symm gene not being found in the HGNC or other databases, we carefully re-examined the genes presented in Figure 2C. We discovered that one of the genes is actually Synm, which encodes synemin, an intermediate filament protein. We will correct this in the manuscript.

      scRNA-seq analysis revealed defects in ECM processes in SRSF2-deficient myoblasts, which we believe likely resulted in the mis-targeting of muscle progenitors to non-muscle regions. However, comparing RNA-seq results from whole muscle tissues with scRNA-seq results is challenging.

      (2) Could authors quantify the muscle progenitors dispersed in nonmuscle regions before their differentiation? Which nonmuscle tissues MyoD+ progenitors are seen? Most of the tDT staining in the enlarged sections appears to be punctate without any nuclear staining seen in these cells (Figure 3 B, D E-F). Could authors provide high-resolution images? Also, in the diaphragm cross-sections in mutants, tdT labeling appears to be missing in some areas within the myofibers defined as cavities by the authors (marked by white arrows, Figure 3H). Could this polarized localization of tDT be contributing to specific defects?

      tdT staining revealed a substantial presence of MyoD-derived cells distributed beyond the muscle regions, as shown in Figure 3B. Quantifying the number of MyoD+ progenitors dispersed in non-muscle regions is not meaningful.

      tdT+ cells also include those that previously expressed MyoD but have since differentiated into myotubes and myofibers, which is why much of the tdT+ staining is not nuclear.

      MyoD+ cells deficient in SRSF2 either undergo apoptosis or premature differentiation. Consequently, tdT staining in SRSF2-KO muscles showed many irregularities in the muscle fibers.

      (3) Is there a difference in the levels of tDT in the myoD+ muscle progenitors that are mis-targeted vs the others that are present in the muscle tissues?

      tdT+ cells include those that previously expressed MyoD but have since differentiated into myotubes and myofibers, which are no longer MyoD+ cells. Additionally, tdT+ cells also include those currently expressing MyoD, which are MyoD+ cells.

      The fiber differences between WT and SRSF2-KO mice are easily discernible through tdT staining (Figure 2D and 3D); however, comparing the levels of tdT staining between the two groups is not meaningful.

      (4) scRNA is unsuitable for myotubes and myofibers due to their size exclusion from microfluidics. Could authors explain the basis for scRNA-seq vs SnRNA-seq in this work? How are SKM defined in scRNA-data in Figure 4? As the myofibers are small in KO, could the increased level of late differentiation markers be due to the enrichment of these small myotubes/myofibers in scRNA? A different approach, such as ISH/IF with the myogenic markers at E9.5-10.5, may be able to resolve if these markers are prematurely induced.

      SRSF2 is highly expressed in proliferative myoblasts, but its levels decline once differentiation begins. In our study, we used Myod1-Cre to delete the SRSF2 gene and performed the scRNA-seq analysis to examine the effects of SRSF2 deletion on the proliferation and differentiation of MyoD cells. Our analysis revealed that SRSF2 deletion caused proliferation defects and premature differentiation of MyoD cells (Figure 5G), leading to myofiber abnormalities.

      We determined that snRNA-seq analysis is not suitable for our study.

      Additionally, skeletal muscle cells (SKM) were defined based on the expression of skeletal muscle markers, as shown in Figure 4C.

      (5) TNC is a marker for tenocytes and is absent in skeletal muscle cells. The authors mentioned a downregulation of TNC in the KO SKM-derived clusters. This suggests a contamination of the tenocytes in the control cells. In spite of the downregulation of multiple ECM genes shown by scRNA-seq data, the ECM staining by laminin in KO in Figure 3 appears to be similar to controls.

      Tenascin-C (Tnc) is also part of the extracellular matrix (ECM) family. scRNA-seq analysis revealed that multiple ECM genes were downregulated in SRSF2-KO myoblasts; however, this did not indicate that laminin was downregulated in the SRSF2-KO muscles.

      (6) The expression of many fusion genes, such as myomaker and myomerger, is reduced in KO, suggesting a primary fusion defect vs a primary differentiation defect. Many mature myofiber proteins exhibit an increased expression in disease states, suggesting them as a compensatory mechanism. Authors need to provide additional experimental evidence supporting precocious differentiation as the primary defect.

      Our analysis revealed that the deletion of SRSF2 caused premature differentiation of MyoD cells (Figure 5G), leading to abnormalities of myofiber formation. SRSF2 is highly expressed in proliferative myoblasts, but its expression declines quickly in myotubes. Therefore, it is unlikely that the low expression of SRSF2 in myotubes caused the primary fusion defect.

      (7) The fusion defects in KO are also evident in siRNA knockdown for SRSF2 and Aurka in C2C12, which mostly exhibits mononucleated myocytes in knockdowns. Also, a fusion index needs to be provided.

      SRSF2 knockdown and Aurka knockdown caused differentiation defects, including fusion defects. We quantified the percentages of both MyoG+ and MHC+ cells in the differentiation assay.

      (8) The last section of the role of SRSF2 on splicing appears to be a misfit in this study. Authors describe the Bin1 isoforms in centronuclear myopathy, but exon17 is not involved in myopathy. Is exon17 exclusion seen in other diseases/ splicing studies?

      Our study is the first to report that exon 17 inclusion of Bin1 is regulated by SRSF2. Specifically, the knockdown of Bin1 exon 17 caused severe differentiation defects in C2C12 myoblasts. The involvement of Bin1 exon 17 in myopathy requires further validation using clinical samples.

      Reviewer #2 (Public Review):

      […] Weaknesses: Although unbiased sequencing methods were used, their findings that SRSF2 serves as a transcriptional regulator and functions in alternative splicing events are not novel. The introduction and discussion are not clearly written. The authors did not raise clear scientific questions in the introduction. The last paragraph is only a copy-paste of the abstract. The discussion is mainly a repeat of their results without clear discussion.

      While the role of SRSF2 as a transcriptional regulator involved in alternative splicing events is not novel, the specific SRSF2-regulated alternative splicing events and targeted genes in skeletal muscle have not been reported in other publications. We believe our interpretation of the data and comparison with related published studies are well presented in the Discussion section.

    1. Author response:

      Answers to Reviewer #1 (Public Review):

      (1) Tonic and phasic components in Figure 1 are not clear.

      We will reformulate Figure 1A to show how the tonic and phasic components were measured. As this point was also raised by Reviewer #2 (Comment 3), we will explicitly clarify this in the Methods section. We will modify the color scheme to improve clarity.

      (2) Labeling of traces in Figure 4.

      We will add labels to traces informing which sensory pathways were stimulated to produce each response.

      (3) Optic tectum instead of optical tectum.

      We apologize for the error. We will replace “optical tectum” with “optic tectum” as also suggested by Reviewer #2.

      Answers to Reviewer #2 (Public Review):

      (1) Complexity of tectum upstream circuitry (Comments 1 and 2).

      Processing of visual information is certainly a major role of the tectum, but it is true that it also receives sensory inputs from other structures including sensory pathways. We will acknowledge this complexity in our revised manuscript along with suggestions for heading titles.

      (2) Figure 1 and associated text. 

      As mentioned in the provisional answer point 1 to Reviewer #1, we will reformulate Figure 1A and clarify how tonic and phasic responses were calculated.

      (3) Figure 3 and associated text.

      We will perform the analysis suggested by the reviewer and move calculations to the Methods section as requested.

      (4) Figure 5C and lines 398-410.

      We will consider omitting Figure 5C or clearly stating its value in the context of the rest of the data and our previous behavioral experiments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors explore mechanisms through which T-regs attenuate acute pain using a heat sensitivity paradigm. Analysis of available transcriptomic data revealed expression of the proenkephalin (Penk) gene in T-regs. The authors explore the contribution of T-reg Penk in the resolution of heat sensitivity.

      Strengths:

      Investigating the potential role of T-reg Penk in the resolution of acute pain is a strength.

      Weaknesses:

      The overall experimental design is superficial and lacks sufficient rigor to draw any meaningful conclusions.

      We hope that the reviewer will reconsider this severe criticism after examining the updated manuscript and results.

      For instance:

      (1) There were no TAM controls. What is the evidence that TAM does not alter heat-sensitive receptors?

      The impact of TMX on heat perception is not the object of this study. Nevertheless, it appears that heat sensitivity in WT controls (blue dots) is slightly diminished after TMX administration (Figure 5A), suggesting that heat-sensitive receptors are moderately altered by TMX per se. This reduction is much more pronounced for LOX mice. Thus, although it is possible that TMX plays a marginal role in heat sensitivity by itself, the results show a much more pronounced effect of TMX in LOX than in WT mice, in favor of a role for Treg Penk in heat sensitivity.

      (2) There are no controls demonstrating that recombination actually occurred. How do the authors know a single dose of TAM is sufficient?

      These results are now presented in Figure S4. A 70% reduction in Penk mRNA is observed in Treg after a single administration of TMX.

      (3) Why was only heat sensitivity assessed? The behavioral tests are inadequate to derive any meaningful conclusions. Further, why wasn't the behavioral data plotted longitudinally?

      The longitudinal data are presented in Figure S5A. New behavioral tests have been performed and the results are now shown in Figure S5E-H. Importantly, heat sensitivity was observed in two independent laboratories with two different tests.

      Reviewer #2 (Public Review):

      Summary:

      The present study addresses the role of enkephalins, which are specifically expressed by regulatory T cells (Treg), in sensory perception in mice. The authors used a combination of transcriptomic databases available online to characterize the molecular signature of Treg. The proenkephalin gene Penk is among the most enriched transcripts, suggesting that Treg plays an analgesic role through the release of endogenous opioids. In addition, in silico analysis suggests that Penk is regulated by the TNFR superfamily; this being experimentally confirmed. Using flow cytometry analysis, the authors then show that Penk is mostly expressed in Treg of the skin and colon, compared to other immune cells. Finally, genetic conditional excision of Penk, selectively in Treg, results in heat hypersensitivity, as assessed by behavior analysis.

      Strengths:

      The manuscript is clear and reveals a previously unappreciated role of enkephalins, as released by immune cells, in sensory perception. The rationale in this manuscript is easy to follow, and conclusions are well supported by data.

      Weaknesses:

      The sensory deficit of Penk cKO appears to be quite limited compared to control littermates.

      Reviewer #3 (Public Review):

      Summary:

      Aubert et al investigated the role of PENK in regulatory T cells. Through the mining of publicly available transcriptome data, the authors confirmed that PENK expression is selectively enriched in regulatory but not conventional T cells. Further data mining suggested that OX40, 4-1BB as well as BATF, can regulate PENK expression in Tregs. The authors generated fate-mapping mice to confirm selective PENK expression in Tregs and activated effector T cells in the colon and spleen. Interestingly, transgenic mice with conditional deletion of PENK in Tregs resulted in hypersensitivity to heat, which the authors attributed to heat hyperalgesia.

      Strengths:

      The generation of transgenic mice with conditional deletion of PENK in foxp3 and PENK fate-mapping is novel and can potentially yield significant findings. The identification of upstream signals that regulate PENK is interesting but unlikely to be the main reason why PENK is predominantly expressed in Tregs as both BATF and TNFR are expressed in effector T cells.

      Weaknesses:

      There is a lack of direct evidence and detailed analysis of Tregs in the control and transgenic mice to support the authors' hypothesis. PENK was previously reported to be expressed in skin Tregs and play a significant role in regulating skin homeostasis: this should be considered as an alternative mechanism that may explain the changed sensitivity to heat observed in the paper.

      We now provide a detailed analysis of Treg with or without Penk, from their immunosuppressive functions to their colocalization with sensory neurons in the skin, supporting their function as natural analgesics. The alternate hypothesis relative to skin homeostasis is now clearly presented and discussed.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      Most of my comments should be addressable in a revised manuscript but will require additional analysis.

      Major:

      - According to flow cytometry analysis, Penk is expressed mostly in Treg of the skin and colon. What may account for such restricted expression? Where could Treg-released enkephalins act?

      We have now rephrased the paper to emphasize the known role of Batf in tissue Treg differentiation. We believe the Batf dependency of Penk expression is the reason why tissue Treg are more enriched in Penk than Treg from lymphoid organs. This is now clearly discussed.

      We also provide a new figure (Figure S1) showing that Batf and its cofactors AP1 and IRF4 were reported to bind to Penk regulatory regions. Altogether, the role of Batf in tissue Treg differentiation would explain why tissue Treg, such as those of the colon and skin, are particularly enriched in Penk. This is now clearly stated in the revised manuscript.

      As to where Treg-released enkephalins act, we performed immunostainings in the skin and observed that Treg could colocalize with sensory neurons (shown in a new Figure 5, panel D). This observation raises the hypothesis that Treg-released enkephalins could act on sensory neurons locally.

      - Which mechanism can underlie heat hypersensitivity in Penk cKO mice? Which sensory neurons are involved? Are other sensory modalities affected, such as mechanical sensitivity?

      As stated above, we show that Treg can be in close contact with CGRP-producing thermal sensor neurons. These data are shown in Figure 5D. We have also tested many other nociceptive stimuli (innocuous and noxious) and did not detect significant differences. These data are presented in supplementary Figure S5. Whether enkephalins produced by Treg can change the stimulation threshold of various nerve fibers is currently being investigated by electrophysiology.

      - No control is provided to ensure that Penk is selectively excised in Treg cells in cKO mice.

      We have performed additional experiments with fluorescent probes to document Penk mRNA expression in cKO mice. The results on the specific expression of Penk mRNA in various subsets post-TMX are shown in a supplementary figure S4.

      - The authors acknowledge that Penk from Treg was previously studied in an animal model of inflammatory pain. However, which role these endogenous opioids play is unclear, especially since authors discovered that enkephalins are likely continuously released at steady states. This is not enough discussed in the narrative, which surprisingly does not separate the results from the discussion.

      The results and discussion are now separated in two sections.

      Minors:

      - Replace "Fox3 1" with "Fox31" (line 31), "functions 15" with "functions15" (line 43), "BATF 19" with "BATF19" (line 85).

      - Text mentions Figure S4 (line 125), which is most likely S3.

      Reviewer #3 (Recommendations For The Authors):

      Given the most significant finding of this paper is based on the heat-induced pain model, there is surprisingly little analysis of Tregs in this context. The authors analyzed spleen and colon Tregs at steady state, it is unclear whether any of these Tregs are involved in pain sensitivity directly. Skin Tregs or other relevant Tregs to this model should be analyzed in control and Lox mice. This is particularly relevant as PENK expression was previously reported in skin Tregs and plays a significant role in skin homeostasis (Yamazaki et al 2020 PNAS). Does PENK conditional deletion alter Treg frequencies, numbers, and immune suppressive function? Not even spleen or colon Treg were analyzed comparing control and lox mice.

      We now provide evidence showing unaltered immunosuppressive functions of Treg in the absence of Penk (Figure 4) and, more importantly, unaffected proportions of skin Treg in mice lacking Penk in Treg, at the very site of heat stimulation (Figure 5B-C). We also observed unaffected representation of Treg in the spleen and lymph nodes, but we do not feel that these data are necessary to interpret the results.

      Given the role of PENK in skin Tregs, could the observed effect in Figure 4 be due to altered skin homeostasis rather than sensitivity to pain?

      The reviewer is referring to a paper where Penk in skin Treg plays a role in UV-damaged keratinocytes in vivo (Shime et al., 2020, PNAS). To our knowledge, a role for Penk produced by skin Treg in keratinocyte homeostasis at the steady state is currently unknown. Nevertheless, this hypothesis is now clearly stated and discussed in the manuscript.

      The authors stated that only after 7 days post tamoxifen treatment was heat hyperalgesia observed: deletion of PENK in Treg but not Tconv should be confirmed: is deletion only complete after 7 days or is the effect observed due to indirect effects of altered "normal" Treg function?

      We have performed a kinetic analysis to document Penk deletion at D3, D7 and D30 post-TMX. Results show a specific deletion of Penk in Treg at all time points, so we combined all the time points for the representation of the results (Figure S4). As for the indirect effects of “altered” normal function, we now provide the reader with a new figure (Figure 4), showing that Penk-deficient Treg are not impaired in their suppressive function in vitro and in vivo.

      Actual protein/peptide production of enkephalins by Tregs should be confirmed. It is also unclear which peptide(s) can be secreted and presumably responsible for the changes in heat sensitivity.

      This is a very interesting question that we addressed with a MENK ELISA but without success at reproducing the results. An ongoing project will use mass spectrometry to fully characterize the peptides produced by Treg and activated Tconv.

      The analysis of PENK regulation by Tregs is interesting despite being entirely based on data mining. BATF is a pioneering factor expressed by all activated effector T cells. While the connection between BATF and PENK may explain why the authors observed PENK expression chiefly in activated effectors and Tregs, BATF cannot be the reason why PENK is "predominantly" expressed by Tregs. Similarly, 4-1BB and OX40 can be induced on effector T cells. Is PENK under the control of Foxp3? There are lots of publicly available datasets on Foxp3/IL-2 dependent Treg signatures through which this can be addressed.

      We now provide a supplementary figure (Figure S1), showing a compilation of ChIP-seq studies for various transcription factors in various T cell subsets. We provide the reader with a list of all the TFs that have been reported to bind in the regulatory regions of Penk. In agreement with our hypothesis, BATF, FOXP3, IRF4 and several others are present in that list. Further work is needed to decipher the exact contribution of each of those TFs to the regulation of Penk in Treg vs activated Tconv, which is beyond the scope of this report.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This work identified new NMD inhibitors and tested them for cancer treatment, based on the hypothesis that inhibiting NMD could lead to the production of cancer neoantigens from the stabilized mutant mRNAs, thereby enhancing the immune system's ability to recognize and kill cancer cells. Key points of the study include:

      • Development of an RNA-seq based method for NMD analysis using mixed isogenic cells that express WT or mutant transcripts of STAG2 and TP53 with engineered truncation mutations.

      • Application of this method for a drug screen and identified several potential NMD inhibitors.

      • Demonstration that one of the identified compounds, LY3023414, inhibits NMD by targeting the SMG1 protein kinase in the NMD pathway in cultured cells and mouse xenografts.

      • Due to the in vivo toxicity observed for LY3023414, the authors developed 11 new SMG1 inhibitors (KVS0001-KVS0011) based on the structures of the known SMG1 inhibitor SMG1i-11 and the SMG1 protein itself.

      • Among these, KVS0001 stood out for its high potency, excellent bioavailability, and low toxicity in mice. Treatment with KVS0001 caused NMD inhibition and increased presentation of neoantigens on MHC-I molecules, resulting in the clearance of cancer cells in vitro by co-cultured T cells and cancer xenografts in mice by the immune system.

      These findings support the strategy of targeting the NMD pathway for cancer treatment and provide new research tools and potential lead compounds for further exploration.

      Strengths:

      The RNA-seq-based NMD analysis, using isogenic cell lines with specific NMD-inducing mutations, represents a novel approach for the high-throughput identification of potential NMD modulators or genetic regulators. The effectiveness of this method is exemplified by the identification of a new activity of AKT1/mTOR inhibitor LY3023414 in inhibiting NMD.

      The properties of KVS0001 described in the manuscript as a novel SMG1 inhibitor suggest its potential as a lead compound for further testing the NMD-targeting strategies in cancer treatment. Additionally, this compound may serve as a useful research tool.

      The results of the in vitro cell killing assay and in vivo xenograft experiments in both immuno-proficient and immune-deficient mice indicate that inhibiting NMD could be a viable therapeutic strategy for certain cancers.

      Weaknesses:

      The authors did not address the potential effects of NMD/SMG1 inhibitors on RNA splicing. Given that the transcripts of many RNA-binding proteins are natural targets of NMD, inhibiting NMD could significantly alter splicing patterns. This, in turn, might influence the outcomes of the RNA-seq-based method for NMD analysis and result interpretation.

      This is a very important comment that highlights an important aspect of NMD and potential exciting downstream studies. We did not systematically assess RNA splicing in our work as we are not sure if inhibition of NMD would induce cancer specific splicing that would allow for tumor targeting. It is well established that NMD can impact splicing, including modulating cryptic exon expression, but finding and assessing antigenicity of targetable tumor specific antigens constitutes a study in and of its own. Our own data in figure 4C-F supports this, as a point mutation near a splice site in TP53 strongly induced NMD which was subsequently stopped by KVS0001 treatment. Doing a systematic review of this effect we feel is outside the scope of this manuscript. We’ve incorporated a comment into our discussion highlighting this deficiency, but certainly find the idea of mining RNA-splicing changes an exciting next endeavor.

      While the RNA-seq-based approach offers several advantages for analyzing NMD, the effects of NMD/SMG1 inhibitors observed through this method should be confirmed using established NMD reporters. This step is crucial to rule out the possibility that mutations in STAG2 or TP53 affect NMD in cells, as well as to address potential clonal variations between different engineered cell lines.

      This is possible, but we want to highlight that all hits from the screen were confirmed in a separate cell line with different clones. While this will not rule out effects to NMD due to STAG2 and TP53 knockdown, the final lead compound was also tested on different endogenous transcripts in both indel and normal transcripts controlled by NMD (i.e., ATF4) in multiple species (human and mouse).  Importantly, many of these assays employed the non-mutated transcripts from heterozygous mutant cells to ensure that cis-acting NMD was being measured and to control for any trans-acting splicing or other unanticipated biochemical effects.

      The results from the SMG1/UPF1 knockdown and SMG1i-11 experiments presented in Figure 3 correlate with the effects seen for LY3023414, but they do not conclusively establish SMG1 as the direct target of LY3023414 in NMD inhibition. An epistatic analysis with LY3023414 and SMG1-knockdown is needed.

      This is a great comment, and is supported by the recent push to confirm drug targets by chemical probes or knockout followed by loss of further effect due to the application of the drug in question. We attempted to knock out SMG1 in multiple cell lines used in this study, including RPE1, MCF10A, NCI-H358 and LS180, and were unable to obtain clones that have biallelic out-of-frame indels. We were able to obtain multiple clones with in-frame indels. Based on our results and those in the publicly available database DepMap, we suspect this gene is likely essential, making a simple knockout unfeasible. While this uncertainty is important to keep in mind, we feel it does not detract from the reporting of a novel NMD screen that is mechanistically agnostic and of a novel in vivo active NMD inhibitor.

      Reviewer #2 (Public Review):

      Summary:

      Several publications during the past years provided evidence that NMD protects tumor cells from being recognized by the immune system by suppressing the display of neoantigens, and hence NMD inhibition is emerging as a promising anti-cancer approach. However, the lack of an efficacious and specific small-molecule NMD inhibitor with suitable pharmacological properties is currently a major bottleneck in the development of therapies that rely on NMD inhibition. In this manuscript, the authors describe their screen for identifying NMD inhibitors, which is based on isogenic cell lines that either express wild-type or NMD-sensitive transcript isoforms of p53 and STAG2. Using this setup, they screened a library of 2658 FDA-approved or late-phase clinical trial drugs and had 8 hits. Among them they further characterized LY3023414, showing that it inhibits NMD in cultured cells and in a mouse xenograft model, where it, however, was very toxic. Because LY3023414 was originally developed as a PI3K inhibitor, the authors claim that it inhibits NMD by inhibiting SMG1. While this is most likely true, the authors do not provide experimental evidence for this claim. Instead, they use this statement to switch their attention to another previously developed SMG1 inhibitor (SMG1i-11), of which they design and test several derivatives. Of these derivatives, KVS0001 showed the best pharmacological behavior. It upregulated NMD-sensitive transcripts in cultured cells and the xenograft mouse model and two predicted neoantigens could indeed be detected by mass spectrometry when the respective cells were treated with KVS0001. A bispecific antibody targeting T cells to a specific antigen-HLA complex led to increased IFN-gamma release and killing of cancer cells expressing this antigen-HLA complex when they were treated with KVS0001. 
Finally, the authors show that renal (RENCA) or lung cancer cells (LLC) were significantly inhibited in tumor growth in immunocompetent mice treated with KVS0001. Overall, this establishes KVS0001 as a novel and promising anti-cancer drug that by inhibiting SMG1 (and therewith NMD) increases the neoantigen production in the cancer cells and reveals them to the body's immune system as "foreign".

      Strengths:

      The novelty and significance of this work consists in the development of a novel and - judging from the presented data - very promising NMD inhibiting drug that is suitable for applications in animals. This is an important advance for the field, as previous NMD inhibitors were not specific, lacked efficacy, or were very toxic and hence not suitable for animal application. It will be still a long way with many challenges ahead towards an efficacious NMD inhibitor that is safe for use in humans, but KVS0001 appears to be a molecule that bears promise for follow-up studies. In addition, while the idea of inhibiting NMD to trigger neoantigen production in cancer cells and so reveal them to the immune system has been around for quite some time, this work provides ample and compelling support for the feasibility of this approach, at least for tumors with a high mutational burden.

      Main weaknesses:

      There is a disconnect between the screen and the KVS0001 compound, that they describe and test in the second part of the manuscript since KVS0001 is a derivative of the SMG1 inhibitors developed by Gopalsamy et al. in 2012 and not of the lead compound identified in the screen (LY3023414). Because of high toxicity in the mouse xenograft experiments, the authors did not follow up LY3023414 but instead switched to the published SMG1i-11 drug of Gopalsamy and colleagues, a molecule that is widely used among NMD researchers for NMD inhibition in cultured cells. Therefore, in my view, the description of the screen is obsolete, and the paper could just start with the optimization of the pharmacological properties of SMG1i-11 and the characterization of KVS0001. Even though the screen is based on an elegant setup and was executed successfully, it was ultimately a failure as it didn't reveal a useful lead compound that could be further optimized.

      This is a helpful observation from an outside perspective. From our point of view, we were only alerted to targeting SMG1 by the previously reported off-target effects of LY3023414 on SMG1 and the lack of a plausible explanation for how PIK3CA inhibition would efficiently inhibit NMD. We do feel that the screen is worth including for two reasons. First, it offers an unbiased approach for querying the entire NMD pathway for vulnerabilities useful to target. The library chosen was quite small, so the screen itself could be useful to others with larger libraries to test. Second, it did help identify SMG1 as the ideal target for NMD disruption. While targeting SMG1 is not novel, we felt it highlighted why we chose to develop KVS0001. To address this reviewer’s comment, we’ve included a couple of sentences in the results and discussion strengthening the point that the screen provided an unbiased approach to finding the best target in the pathway to disrupt NMD and elaborating on the transition from LY3023414 and the screen to the development of KVS0001.

      Additional points:

      - Compared to SMG1i-11, KVS0001 seems less potent in inhibiting SMG1 (higher IC50). It would therefore be important to also compare the specificity of both drugs for SMG1 over other kinases at the applied concentrations (1 uM for SMG1i-11, 5 uM for KVS0001). The Kinativ Assay (Fig. S13) was performed with 100 nM KVS0001, which is 50-fold less than the concentration used for functional assays and hence not really meaningful. In addition, more information on the pharmacokinetic properties and toxicology of KVS0001 would allow a better judgment of the potential of this molecule as a future therapeutic agent.

      We agree that the Kinativ assay may have poorly represented the activity of KVS0001 at the bioactive concentration. We have now added 1 uM Kinativ data, the highest concentration we were able to run, to Figure S13.

      - On many figures, the concentrations of the used drugs are missing. Please ensure that for every experiment that includes drugs, the drug concentration is indicated.

      We apologize for this oversight and have added all drug concentrations on the appropriate plots.

      - Do the authors have an explanation for why LY3023414 has a much stronger effect on the p53 than on the STAG2 nonsense allele (Figure 1B, S8), whereas emetine upregulates the STAG2 nonsense alleles more than the p53 nonsense allele (Figure S5)? I find this curious, but the authors do not comment on it.

      This is an interesting observation. The short answer is we’re not sure. The speculative answer is that it is related to the distinctly different mechanisms of action of the two inhibitors (see comments from reviewing editor below).

      - While it is a strength of the study that the NMD inhibitors were validated on many different truncation mutations in different cell lines, it would help readers if a table or graphic illustration was included that gives an overview of all mutant alleles tested in this study (which gene, type of mutation, in which cell type). In the current version, this information is scattered throughout the manuscript.

      This is an excellent suggestion. We’ve included a new table S1 which incorporates the details of each cell line and the genes used in each for this study.

      - Lines 194 and 302: That SMG1i-11 was highly insoluble in the hands of the authors is surprising. It is unclear why they used variant 11j, since variant 11e of this inhibitor is widely used among NMD researchers and readily dissolves in DMSO.

      As this referee notes, SMG1i-11 is soluble in DMSO in our hands as well, which enabled us to use it for our in vitro work. Unfortunately, the concentrations of DMSO required to dissolve the compound to suitable concentrations for in vivo work were too high to safely use in mice with our animal protocols. We also attempted to use ethanol, which did dissolve SMG1i-11 but led to significant toxicity in both the drug and vehicle control arms.

      - Line 296: The authors claim that they were able to show that LY3023414 inhibited the SMG1 kinase, which is not true. To show this, they would have to show, for example, that LY3023414 prevents SMG1-mediated UPF1 phosphorylation, as they did for KVS0001 and SMG1i-11 in Fig. 3F. Unless the authors provide this data, the statement should be deleted or modified.

      We’ve modified this statement as requested by the referee, now saying we suspected SMG1 was the target based on previously published work.

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors):

      Your paper has been assessed by two reviewers with expertise in the NMD field. They both find the identification and characterization of a new potent and selective inhibitor of the SMG1 NMD kinase with in vivo activity to represent a significant advance in the field, and one that could ultimately be of value as the basis for a novel cancer therapy. However, as you will see both reviewers have concerns about whether the SMG1 inhibitor screen you developed belongs in the paper because it was not used to identify the KVS0001 inhibitor, which instead was generated based on a previously published set of SMG1 inhibitors, and because the NMD inhibitor that did emerge from your screen, LY3023414, was not shown to be a direct inhibitor of SMG1 kinase activity. While it is an elegant screen, during the revision of the paper you could consider streamlining the manuscript by emphasizing how the screening assay was used to validate KVS0001, and bolstering the characterization of the new KVS0001 NMD inhibitor by conducting the proposed additional experiments.

      Each of the reviewers raises additional points that should be addressed in a revised version.

      The reviewing editor has two additional points:

      (1) While emetine inhibits NMD, it is not really a direct NMD inhibitor, as implied, but rather a potent protein synthesis elongation inhibitor that acts by binding to the E-site of the 40S ribosomal subunit, and is therefore, like anisomycin, another protein synthesis inhibitor, working indirectly to inhibit NMD. This should be acknowledged in the section where emetine is first used as an "NMD inhibitor".

      This has been included in the indicated section at the referee’s request.  

      (2) To establish that the observed phenotypic effects of KVS0001 are due to on-target inhibition of SMG1, the authors could generate and express an SMG1 point mutant that is resistant to KVS0001 inhibition, which could be based on the SMG1 catalytic domain structure that the authors used originally to design KVS0001. Inhibitor-resistant kinase mutants are the gold standard for demonstrating that the biological consequences of a novel protein kinase inhibitor are due to on-target effects. Admittedly, because SMG1 is such a huge protein, this may be technically challenging and is likely beyond the scope of the present paper.

      We agree with the reviewing editor on all counts: this would be an ideal experiment to run, but it is also beyond the scope of the present paper. As indicated in our discussion above with reviewer 1, SMG1 knockout was not possible in our hands, and we suspect it may be due to the gene being essential. Creating an inhibitor-resistant mutant could overcome this issue and create an ideal model to test the target for KVS0001. Unfortunately, finding such a mutant would likely require significant amounts of trial and error to create a resistant mutant that did not lose SMG1 function. In addition, SMG1 is huge, which creates technical difficulties for such experiments. Due to the anticipated amount of work for such a study, we believe this would be better accomplished in future studies.

      Reviewer #1 (Recommendations For The Authors):

      (1) The authors did not mention a new SMG1 inhibitor and its effects described in Cheruiyot et al, Cancer Res 2019 (PMID: 34215620).

      A comment regarding this discovery and its implications for our work was added to the discussion.

      (2) There is an inconsistency between the manuscript text and methods sections regarding the time of drug treatment (16 hours vs 14 hours) in the HTS screen.

      This has been double checked in our notebook and fixed to reflect 16hrs as the correct incubation time. Thank you for identifying that clerical oversight.

      Reviewer #2 (Recommendations For The Authors):

      (1) Line 61: The references to NMD reviews are very old (refs 20 and 21). I suggest citing more recent, up-to-date reviews instead.

      Two additional references, one from 2016 and another from 2023, have been added to increase support for this statement in the introduction.

      (2) Figure S1: Shouldn't the caption of the right panel (TP32 data) say "clone 221" rather than "clone 22"?

      This has been fixed.

      (3) Figure S18: Please indicate on the y-axis that you are displaying RPKM for p53.

      This has been fixed.

      (4) Figures 4D and S19: Please indicate concentrations used for all drugs.

      This has been fixed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      The authors investigate pleiotropy in the genetic loci previously associated to a range of neuropsychiatric disorders: Alzheimer's disease, amyotrophic lateral sclerosis (ALS), frontotemporal dementia, Parkinson's disease, and schizophrenia. The local statistical fine-mapping and variant colocalisation approaches they use have the potential to uncover not only shared loci but also shared causal variants between these disorders. There is existing literature describing the pleiotropy between ALS and these other disorders but here the authors apply state of the art, local genetic correlation approaches to further refine any relationships. 

      Complex disease and GWAS is not my area of expertise but the authors managed to present their methods and results in a clear, easy to follow manner. Their results statistically support several correlations between the disorders and, for ALS and AD, a shared variant in the vicinity of the lead SNP from the original ALS GWAS. Such findings could have important implications for our understanding of the mechanisms of such disorders and eventually the possibility of managing and treating them. 

      The authors have built a useful pipeline that plugs together all the gold-standard, existing software to perform this analysis and made it openly available which is commendable. However, there is little discussion of what software is available to perform global and local correlation analysis and, if there are multiple tools available, why they consider the ones they selected to be the gold-standard. 

      There is some mention of previous findings of genetic pleiotropy between ALS and these other disorders in the introduction, and discussion of their improved ALS-AD evidence relative to previous work. However, detailed comparisons of their other correlations to what was described before for the same pairs of disorders (if any) is missing. Adding this would strengthen the impact of this paper. 

      Finally, being new to this approach I found the abstract a little confusing. Initially, the shared causal variant between ALS and AD is mentioned but immediately in the following sentence they describe how their study "suggested that disease- implicated variants in these loci often differ between traits". After reading the whole paper I understood that the ALS-AD shared variant was the exception but it may be best to restructure this part of the abstract. Additionally, in the abstract the authors state that different variants "suggests the role of distinct mechanisms across diseases despite shared loci". Is it not possible that different variants in the same regulatory region or protein-coding parts of a gene could be having the same effect and mechanism? Or does the methodology to establish that different variants are involved automatically mean that the variants are too distant for this to be possible? 

      We thank reviewer one for their considered review of this manuscript and for highlighting points that would benefit from further exploration. Itemised responses are provided below.

      (1) The reviewer noted that we did not adequately explain our choice of software for global and local genetic correlation analysis, and why we consider the techniques chosen as gold standard. We agree that the paper would benefit from clarification around this aspect of the study.

      Briefly, we firstly selected LAVA for the local genetic correlation analysis because it offers several advantages above competing software and was developed by a reputable team previously known for developing MAGMA, which is well-established in the statistical genetics field. In the manuscript (page 8), we added the following clarification: “LAVA was the most appropriate local genetic correlation approach for this study for several reasons. First, unlike SUPERGNOVA and rho-HESS, LAVA makes specific accommodations for analysis of binary traits. Second, other tools focus on bivariate correlation between traits whilst LAVA offers this alongside multivariate tests such as multiple regression and partial correlation, enabling rigorous testing of pleiotropic effects. Lastly, LAVA is shown to provide results which are less biased than those from other tools.”

      LDSC was selected for the global genetic correlation analysis because the software is well-established and likely the most widely adopted global genetic correlation tool. Reflecting its prevalence, the software is also compatible with LAVA, which adjusts for sample overlap based on the bivariate intercept estimate returned by LDSC. Since global genetic correlations were not the primary focus of this study, having been tested across several previous investigations (see response 2), we did not prioritise comparison of correlation estimates from LDSC against other available software. In the manuscript (pages 7-8) we now include the following statement: “[LDSC] was also applied to derive ‘global’ (i.e., genome-wide) genetic correlation estimates between trait pairs and estimate sample overlap from the bivariate intercept. The latter of these outputs was taken forward as an input for the local genetic correlation analysis using LAVA (see 2.2.2.2). Since global genetic correlation analysis across the traits studied here is not novel and associations reported in past studies are congruent across different tools, the compatibility between LDSC and LAVA motivated our use of LDSC for this analysis”.

      (2) The second comment was that the paper would be strengthened by contextualising our study with detail around what is previously known about associations between the studied traits. Accordingly, we have added clarifying text at the end of the introduction, stating: “although previous studies have performed global genetic correlation analyses between various combinations of these traits {references}, this is the first to compare them at a genome-wide scale using a local genetic correlation approach”. In the discussion, we link back to these studies, stating that “Through genetic correlation analysis, we replicated genome-wide correlations previously described between the studied traits {references}”.

      (3) The reviewer highlighted that the abstract as originally written may mislead or confuse the reader and we agree that clarity could be improved with some restructuring. This has now been revised and should read more logically.

      (4) They also enquired about our reasons for suggesting that the implication of distinct variants for each trait from a colocalisation analysis suggests a distinct causal mechanism. We thank them for this question as it encouraged us to reconsider how best to present the results of this analysis. To answer their question:

      It is certainly true that nearby but distinct variants can confer the same effect. In a scenario where multiple distinct variants result in the same effect and thus increase susceptibility towards two or more related phenotypes, you would expect to find evidence of association to each relevant variant in GWAS across these related traits (even if the magnitude of the associations differ). Where biological mechanisms are shared, post-GWAS finemapping analysis would be expected to yield credible sets overlapping across the traits, and likewise, colocalisation analysis should converge on a set of credible SNPs that are candidates for the shared effect. Where multiple distinct variants confer the same effect, you would expect to see separate fine-mapping credible sets for these distinct variants that colocalise pairwise between the jointly-affected traits. Generally, therefore, evidence supporting the two distinct variants hypothesis would suggest the role of two distinct mechanisms except when certain credible sets identified through fine-mapping converge on a colocalised effect.

      There is a further caveat which we also explored in response to Reviewer two: if a region includes long-spanning LD (and hence a larger number of variants are considered in the analysis), then the colocalisation analysis is more likely to favour the two distinct variants hypothesis since the probability of the variants implicated in both traits being shared decreases. It is likely that support for the two independent variants hypothesis is correct in most of the comparisons from this study that favour this conclusion. This is because, generally, the fine-mapping credible sets do not overlap across trait pairs (Figure S4) and consequently the colocalisation analysis does not find any support for the shared variant hypothesis. An exception is the analysis of PD and schizophrenia at the MAPT locus on chromosome 17. We have accordingly added the following clarification to the manuscript (page 18): “However, the colocalisation analysis will increasingly favour the two independent variants hypothesis as the number of analysed variants increases. Hence, the wide-spanning LD of this region may have obstructed identification of variants and mechanisms shared between the traits.”

      Reviewer #2 (Public Review): 

      Summary: 

      Spargo and colleagues present an analysis of the shared genetic architectures of schizophrenia and several late-onset neurological disorders. In contrast to many polygenic traits for which global genetic correlation estimates are substantial, global genetic correlation estimates for neurological conditions are relatively small, likely for several reasons. One is that assortative mating, which will spuriously inflate genetic correlation estimates, is likely to be less salient for late-onset conditions. Another, which the authors explore in the current manuscript, is that some loci affecting two or more conditions (i.e., pleiotropic loci) may have effects in opposite directions, or shared loci are sparse, such that the global genetic correlation signal washes out. 

      The authors apply a local genetic correlation approach that assesses the presence and direction of pleiotropy in much smaller spatial windows across the genome. Then, within regions evidencing local genetic correlations for a given trait pair, they apply fine-mapping and colocalization methods to attempt to differentiate between two scenarios: that the two traits share the same causal variant in the region or that distinct loci within the region influence the traits. Interestingly, the authors only discover one instance of the former: an SNP in the HLA region appearing to confer risk for both AD and ALS. This is in contrast to six regions with distinct causal loci, and twenty regions with no clear shared loci. 

      Finally, the authors have published their analysis pipeline such that other researchers might easily apply the same techniques to other collections of traits. 

      Strengths: 

      - All such analysis pipelines involve many decision points where there is often no clear correct option. Nonetheless, the authors clearly present their reasoning behind each such decision.

      - The authors have published their analytic pipeline such that future researchers might easily replicate and extend their findings. 

      Weaknesses:

      - The majority of regions display no clear candidate causal variants for the traits, whether shared or distinct. Further, despite the potential of local genetic correlation analysis to identify regions with effects in opposing directions, all of the regions in which causal variants were identified for both traits evidenced positive correlations. The reasons for this aren't clear and the authors would do well to explore this in greater detail. 

      - The authors very briefly discuss how their findings differ from previous analyses because of their strict inclusion for "high-quality" variants. This might be the case, but the authors do not attempt to demonstrate this via simulation or otherwise, making it difficult to evaluate their explanation. 

      We thank Reviewer two for their appraisal of this manuscript and kind comments regarding its strengths. We will now aim to address the identified weaknesses.

      (1) The reviewer comments that we did not adequately investigate why loci with causal variants identified in both traits all had positive local genetic correlations. We agree that it would be helpful to better understand the underlying reasons. To address this issue, we have added a new supplementary figure to compare the positive and negative local genetic correlation results (see Figure S2). In the main text we add the following clarification: “Although both positive and negative local genetic correlations passed the FDR-adjusted significance threshold, we observed only positive local genetic correlations in loci where fine-mapping credible sets were identified for both traits in the pair. This reflects that the correlation coefficients and variant associations from the analysed GWAS studies were generally stronger in the positively correlated loci (see Figure S2).”

      (2) The reviewer rightly suggests that the manuscript would benefit from an improved explanation of the somewhat inconsistent results for the colocalisation analysis of ALS and AD at the locus around the rs9275477 SNP from this work and a previous study.  We have now further investigated this and believe that the discrepancy results partly from an inherent empirical characteristic of the colocalisation analysis. We have explained this in the manuscript (page 22) as follows: “The previous study analysed a 200Kb window of over 2,000 SNPs around the lead genome-wide significant SNP from the ALS GWAS, rs9275477, and found ~0.50 posterior probability for each of the shared and two independent variant(s) hypotheses. The current analysis used 475 SNPs occurring within a semi-independent LD block of ~50kb in this locus. Since the posterior probability of the two independent variants hypothesis (H3) increases exponentially with the number of variants in the region whilst the shared variant hypothesis (H4) scales linearly, it is expected that our analysis would give stronger support for the latter. Given that the previous study defined regions for analysis based on an arbitrary window of ±100kb around each lead genome-wide significant SNP from the ALS GWAS and we defined each analysis region based on patterns of LD in European ancestry populations, it is reasonable to favour the current finding.”
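
      The dependence on region size described above can be illustrated with a small sketch. It assumes the standard single-causal-variant colocalisation priors (per-variant prior p1 and p2 for association with one trait only, and p12 for a shared variant), with the commonly used default values 1e-4, 1e-4 and 1e-5; the response does not state which priors were used, so these values, the function names and the SNP count of 2000 (standing in for "over 2,000") are illustrative assumptions. Under this formulation the number of H3 configurations grows as n(n-1) while the number of H4 configurations grows as n, which is the faster-than-linear growth the response refers to. Only prior weights are shown; the posterior also depends on the association evidence at each variant.

```python
def coloc_prior_weights(n_snps, p1=1e-4, p2=1e-4, p12=1e-5):
    """Unnormalised prior weight of each hypothesis for a region of n_snps variants.

    Assumes one causal variant per trait and equal per-variant priors
    (illustrative defaults, not values taken from the manuscript).
    """
    return {
        "H1": n_snps * p1,                      # trait 1 only: n configurations
        "H2": n_snps * p2,                      # trait 2 only: n configurations
        "H3": n_snps * (n_snps - 1) * p1 * p2,  # two distinct variants: n(n-1) pairs
        "H4": n_snps * p12,                     # one shared variant: n configurations
    }


def prior_h3_to_h4_ratio(n_snps, p1=1e-4, p2=1e-4, p12=1e-5):
    """Prior odds of 'two distinct variants' (H3) against 'shared variant' (H4)."""
    weights = coloc_prior_weights(n_snps, p1, p2, p12)
    return weights["H3"] / weights["H4"]


if __name__ == "__main__":
    # 475 SNPs: the LD-block analysis; 2000 SNPs: stand-in for the wider window.
    for n in (475, 2000):
        print(n, prior_h3_to_h4_ratio(n))
```

      With these assumed priors, the prior odds of H3 against H4 are about 0.47 for 475 SNPs and about 2.0 for 2,000 SNPs, so a wider window starts from a position that favours the two independent variants hypothesis before any data are considered.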

    1. Author response:

      The following is the authors’ response to the original reviews.

      Responses to recommendations

      Reviewer #1 (Recommendations For The Authors):

      Describe more precisely how gene expression graphs are built (tissues, reads counts). For example, how were read counts normalized? Were they from DESeq2 data, which only works by comparing two samples? If so, all samples should be independently compared to a reference and the normalized expression value of the reference will change from sample to sample... thus introducing a pure technical artifact.

      We have added additional information about the normalisation method to the Material and Methods section (lines 597-598: “Lastly, expression levels shown in figures 2-5 are normalised gene counts produced by DESeq2.”) and to the figure legends (lines 247, 286, 372, 404: “Gene expression data was generated from whole fish. Expression levels were derived from DESeq2 normalised gene counts.”) to address this recommendation. 

      DESeq2 provides a reference-independent normalisation through a median-of-ratios method (a good explanation can be found here: https://hbctraining.github.io/DGE_workshop/lessons/02_DGE_count_normalization.html). The normalised expression values are independent of any reference and therefore will not change from sample to sample as suggested in this comment. In contrast, pairwise comparisons are made when testing for significantly differentially expressed genes between two treatments using a Wald test, which is done against a reference and generates log2 fold change information and p-values; however, this is different from the normalisation described above.
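
      The median-of-ratios idea can be sketched in a few lines. This is a simplified illustration written for this response, not DESeq2's own code: the function names and the toy counts are hypothetical, and genes with a zero count in any sample are simply skipped when estimating size factors. The point it shows is that the pseudo-reference is the per-gene geometric mean across all samples, so no individual sample acts as the reference.

```python
import math
from statistics import median


def size_factors(counts):
    """Estimate one size factor per sample by the median-of-ratios method.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    Genes with a zero count in any sample are excluded from the estimate.
    Raises statistics.StatisticsError if no gene is usable.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        if any(c <= 0 for c in row):
            continue
        # Pseudo-reference for this gene: geometric mean across all samples.
        log_geo_mean = sum(math.log(c) for c in row) / len(row)
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # Size factor of a sample: median ratio to the pseudo-reference.
    return [math.exp(median(lr)) for lr in log_ratios]


def normalise(counts):
    """Divide each sample's raw counts by that sample's size factor."""
    factors = size_factors(counts)
    return {g: [c / f for c, f in zip(row, factors)] for g, row in counts.items()}


# Toy data (hypothetical): sample 2 was sequenced at twice the depth of sample 1.
example_counts = {"geneA": [10, 20], "geneB": [100, 200], "geneC": [50, 100]}

if __name__ == "__main__":
    print(size_factors(example_counts))
    print(normalise(example_counts))
```

      In this toy input the second sample has exactly twice the depth of the first, so after normalisation both samples carry identical values for every gene, without either sample having been designated as a reference.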

      Provide bioinformatics workflows and, if possible, the set of parameters used, the computing resources, etc. Were some assembly finishing steps carried out (by long-range PCR?) and experimental validations (especially for allele-specific transcripts, by conventional RT-PCR based on diagnostic mutations)?

      We have added additional information on the bioinformatics workflows where required, including parameters used (lines 530, 536, 549-551, and 574-583). No finishing steps other than HiC scaffolding were performed. No allele-specific analysis was done as part of this manuscript.

      To further improve transparency, we have also uploaded all the scripts used for this study to https://github.com/R-Huerlimann/Malabar_grouper_genome and the gene models and functional annotation to https://figshare.com/projects/Malabar_grouper_Epinephelus_malabaricus_genome_annotation/199909. This information has been added to the manuscript in lines 600-601 and 609-611.

      Reviewer #3 (Recommendations For The Authors):

      General author response:

      All the recommendations of this reviewer are very relevant and would certainly provide a lot of information, but they would constitute a full project in themselves, as they would require establishing this grouper species as an experimental model in our lab. Currently we only have access to the larval and juvenile stages via a collaboration with the Okinawa Prefectural Sea Farming Center, which is an hour's drive from our lab, and this access is limited to the grouper spawning season. To do everything that is suggested, we would need regular and easy access to the fish. This would require establishing this model in our marine station, which is not possible due to space and time constraints. These groupers grow to a very large size (1-2 m in length, and up to 150 kg in weight) and only mature into males after > 6 years.

      First and foremost, I would advise the authors to extend their TH and cortisol levels measurements to the entire developmental time considered in their analysis.

      For the reasons stated above we could not perform these experiments. We must emphasize that the data regarding TH are available for a closely related species (e.g., Epinephelus coioides, de Jesus et al. 1998) and there is no reason to think that the situation will be drastically different in E. malabaricus. In addition, given that we have now studied several coral reef fish species in the same context (clownfish, surgeonfish, damselfish, gobies) we observed that the transcriptomic data are more robust, more sensitive, and more precise than hormone measurements. 

      Consider carrying out in situ hybridisation of TSH with putative CRH receptors to determine if thyrotrophin could be competent to respond to HPA axis signals.

      We agree that studying the interplay between corticoids and thyroid hormones at the neuroendocrine level would be desirable, and we fully agree with the experiment suggested by the reviewer, but this is impossible in our current situation. We are not working with an established animal model like zebrafish or Xenopus, but with a large, long-lived marine fish that reproduces in spawning aggregations and whose husbandry is notoriously difficult.

      Consider conducting cortisol treatment experiments to functionally determine if indeed cortisol is involved in grouper metamorphosis.

      We tried to do TH and cortisol treatments specifically on the early larval stages corresponding to the early TH peak to see how this would impact the development of the fin spines, but our trials were unsuccessful. The larvae at that stage are extremely fragile and even putting them into small volumes of treatment drugs induced massive mortalities. Again, this would mean establishing this grouper species as a model organism and would require a massive effort to improve larval rearing as discussed above. We feel that our data stands on its own in the meantime and adds valuable information to the existing literature by studying a rarely investigated species.

      Responses to comments

      Reviewer #1 (Public Review):

      Weaknesses:

      The manuscript needs proper editing and is not complete. Some wordings lack precision and make it difficult to follow (e.g. line 98 "we assembled a chromosome-scale genome of ..." should read instead "we assembled a chromosome-scale genome sequence of ..."). Also, panel Figure 2E is missing.

      We made the suggested change of adding “sequence” in lines 32 and 121. Concerning additional changes, we have carefully edited our manuscript and looked for any incomplete sections. Unfortunately, it is difficult to see what other issues are being raised here without any further information. 

      As for panel E of figure 2, it is not missing. The panel is located to the right, just below “Target Cells”.

      The shortcomings of the manuscript are not limited to the writing style, and important technical and technological information is missing or not clear enough, thereby preventing a proper evaluation of the resolution of the genomic resources provided:

      Several RNASeq libraries from different tissues have been built to help annotate the genome and identify transcribed regions. This is fine. But all along the manuscript, gene expression changes are summarized into a single panel where it is not clear at all which tissue this comes from (whole embryo or a specific tissue?), or whether it is a cumulative expression level computed across several tissues (and how it was computed) etc. This is essential information needed for data interpretation.

      No fertilised eggs or embryos have been sequenced. The individual tissues derived from juvenile fish were used for the genome annotation only, using ISOseq. The whole larval fish were used for the developmental analysis using RNAseq, as well as the genome annotation. We have added additional information in the figures and text that the results shown are from whole larvae, and added more detail to the material and methods section about which type of sample was analysed in which way.

      Specifically, we have added “Lastly, expression levels shown in figures 2-5 are normalised gene counts produced by DESeq2.” to lines 597-598 in the Material and Methods section, “Gene expression data was generated from whole larvae.” to line 191, and “Gene expression data was generated from whole fish. Expression levels were derived from DESeq2 normalised gene counts.” to the figure legends (lines 247, 286, 372, and 404). Additionally, we have added clarifications in lines 489, 497, 530, and 536. 

      The bioinformatic processing, especially of the assembly and annotation, is very poorly described. This is also a sensitive topic, as illustrated by the numerous "assemblathon" and "annotathon" initiatives to evaluate tools and workflows. Importantly, providing configuration files and in-depth description of workflows and parameter settings is highly recommended. This can be made available through data store services, and documents even benefit from DOIs. This provides others with more information to evaluate the resolution of this work. No doubt that it is well done, but especially in the field of genome assembly and annotation, high resolution is VERY cost and time-intensive. Not surprisingly, most projects are conditioned by trade-offs between cost, time, and labor. The authors should provide others with the information needed to evaluate this.

      We have added additional information on parameters used in the genome assembly, annotation and transcriptome analysis in lines 549-551, 577, 579, 580, and 582. Additionally, we have uploaded all scripts to github as outlined in the Code and Data Availability section (lines 599-614).

      The genome assembly did not use a specific workflow (e.g., nextflow), but was done with a simple command and standard parameters in IPA. Scaffolding was carried out by Phase Genomics using their standardised proprietary workflow, of which a detailed description provided by Phase Genomics can be found in the supplementary material.

      Quantifications of T3 and T4 levels look fairly low and not so convincing. The work would clearly benefit from a discussion about why the signal is so low and what are the current technological limitations of these quantifications.

      This would really help (general) readers.

      The T3/T4 levels are consistent with other published work in fish. In the present manuscript for grouper we have a peak level of 1.2 ng/g (1,200 pg/g) of T4 and 0.06 ng/g (60 pg/g) of T3. This is a higher level of T4 and comparable level of T3 to what was found in convict tang (Holzer et al. 2017; Figure 2) with 30 pg/g of T4 and 100 pg/g of T3. Of course, there are also examples with higher levels, such as clownfish (Roux et al. 2023; Figure 1), with 10 ng/g (10,000 pg/g) of T4 and 2 ng/g (2,000 pg/g) of T3.
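      The unit conversions and the cross-species comparison above can be checked with a short sketch. The values are those quoted in this response; the dictionary layout and the function name `ng_to_pg` are illustrative assumptions and are not part of the manuscript's analysis.

```python
# Peak whole-body hormone levels quoted in the response, stored in pg/g.
# Values reported in ng/g in the text are converted here.
PG_PER_NG = 1000

def ng_to_pg(ng_per_g):
    """Convert a concentration from ng/g to pg/g."""
    return ng_per_g * PG_PER_NG

levels_pg_per_g = {
    "grouper (this study)": {"T4": ng_to_pg(1.2), "T3": ng_to_pg(0.06)},
    "convict tang (Holzer et al. 2017)": {"T4": 30, "T3": 100},
    "clownfish (Roux et al. 2023)": {"T4": ng_to_pg(10), "T3": ng_to_pg(2)},
}

for species, hormones in levels_pg_per_g.items():
    print(f"{species}: T4 = {hormones['T4']:.0f} pg/g, T3 = {hormones['T3']:.0f} pg/g")
```

      On these figures the grouper T4 peak sits between the convict tang and clownfish values, while its T3 peak is of the same order as the convict tang value.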

      The differences could be due to differences in fish tissue structure and therefore in hormone extraction efficiency, to different hormone measurement protocols, to different fish physiology, or to different fish size (e.g., weighing tiny grouper larvae is difficult and less precise than weighing convict tang). What is important is not the absolute level but the relative level, which shows the change across larval stages of one species measured with identical extraction and measurement protocols. Our data are therefore internally consistent and coherent with the grouper literature.

      Holzer, Guillaume, et al. "Fish larval recruitment to reefs is a thyroid hormone-mediated metamorphosis sensitive to the pesticide chlorpyrifos." eLife 6 (2017): e27595.

      Roux, Natacha, et al. "The multi-level regulation of clownfish metamorphosis by thyroid hormones." Cell Reports 42.7 (2023).

      Differential analysis highlights up to ~ 15,000 differentially expressed genes (DEG), out of a predicted 26k genes. This corresponds to more than half of all genes. ANOVA-based differential analysis relies on the simple fact that only a minority of genes are DEG. Having >50% DEG is well beyond the validity of the method. This should be addressed, or at least discussed.

      The large number of differentially expressed genes reflects the fact that this is a larval developmental transcriptome, spanning one-day-old larvae to fully metamorphosed juveniles at around day 60.

      While DESeq2 does assume that most genes are not differentially expressed, this assumption affects normalisation, not hypothesis testing (Wald test, LRT or ANOVA), and normalisation in DESeq2 is fairly robust to violations of it. According to the author of DESeq2, Michael Love, DESeq2 uses the median ratio for normalisation, and as long as the numbers of up- and down-regulated genes are relatively even, DESeq2 will be able to handle the data. As part of our general quality control for this project we consulted the MA plots, which do not show any overrepresented up or down expression patterns. Additionally, see Michael Love's comment on comparing different tissues, which is also applicable here when comparing vastly different larval stages (https://support.bioconductor.org/p/63630/):

      “For experiments where all genes increase in expression across conditions, the median ratio method will not be able to capture this difference, but this is typically not the case for a tissue comparison, as there are many "housekeeping" genes with relatively similar expression pattern across tissues.”
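      The robustness argument can be illustrated with a minimal sketch of the median-of-ratios idea. This is a simplified pure-Python illustration, not DESeq2's implementation; the function name `size_factors` and the toy counts are hypothetical. It shows that the sequencing-depth ratio is recovered even when 80% of genes are differentially expressed, provided up- and down-regulation are balanced, and that it is not recovered when every gene shifts in the same direction.

```python
import math
import statistics

def size_factors(counts):
    """Median-of-ratios size factors (simplified sketch).

    counts: list of genes, each a list of raw counts per sample.
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        if min(gene) <= 0:
            continue
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(statistics.median(r)) for r in log_ratios]

# Toy data (hypothetical): sample B is sequenced twice as deep as sample A.
base = 100
balanced = (
    [[base, 2 * base * 4]] * 4    # 40% of genes: 4x up in B
    + [[base * 4, 2 * base]] * 4  # 40% of genes: 4x down in B
    + [[base, 2 * base]] * 2      # 20% of genes: unchanged
)
f = size_factors(balanced)
balanced_ratio = f[1] / f[0]      # estimated depth ratio B/A

# Every gene 4x up in B: the biological shift is absorbed into the size factor.
all_up = [[base, 2 * base * 4]] * 10
f = size_factors(all_up)
all_up_ratio = f[1] / f[0]

print(round(balanced_ratio, 3), round(all_up_ratio, 3))
```

      In the balanced case the estimated depth ratio equals the true value of 2; in the all-up case it is inflated to 8, which is the failure mode described in the quotation above.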

      Reviewer #3 (Public Review):

      Weaknesses:

      However, the authors make substantial considerations that are not proven by experimental or functional data. In fact, this is a descriptive study that does not provide any functional evidence to support the claims made.

      We agree with the reviewer that our paper lacks functional experiments but despite that, the transcriptomic data clearly show the activation of TH and corticoid pathways during two distinct periods: an early activation between D1 and D10, and a second one between D32 and juvenile stage. These data are interesting as they call for further examination of 1) the existence of an early larval developmental step also involving TH and corticosteroids and 2) the possible interaction of corticoids and TH during metamorphosis. This is a question that is certainly not settled yet in teleost fishes and which is of great interest.

      Point 1) is of particular interest and importance, since this early activation (unique, to our knowledge, among teleost fishes studied so far) raises many new questions and will certainly be scrutinised by other groups in the years to come, ensuring a good citation impact for this study. We hope that the reviewer, while disagreeing with some of our statements, will recognise that our study will be stimulating at that level and that this is what scientific studies should do.

      We acknowledge the descriptive nature of the data and the lack of functional experiments in the Discussion in lines 443 to 445: “This may suggest that in some aspect, cortisol synthesis could work in concert with TH, as has been shown in several different contexts in amphibians, but functional experiments need to be conducted to confirm this hypothesis.” As stated above, performing such functional experiments would require establishing the grouper as an experimental model in our husbandry, which currently is not possible due to the large size of the adult fish.

      The consideration that cortisol is involved in metamorphosis in teleosts has never been shown, and the only example cited by the authors (REF 20) clearly states that cortisol alone does not induce flatfish metamorphosis. In that work, the authors clearly state that in vivo cortisol treatment had no synergistic effect with TH in inducing metamorphosis. Moreover, in Senegalensis, the sole pre-otic CRH neuron number decreases during metamorphosis, further arguing that, at least in flatfish, cortisol is not involved in flatfish metamorphosis (PMID: 25575457).  

      We will do our best to improve the clarity of the revised manuscript to avoid any misunderstanding about our claims. However, we would like to point out the semantic shift in the reviewer's first sentence: “being involved” is not the same as “cortisol alone does not induce”. In ref 20 the authors explicitly wrote that “Cortisol further enhanced the effects of both T4 and T3, but was ineffective in the absence of thyroid hormones” and in our view this indeed corresponds to “being involved in metamorphosis”.

      We are not claiming that cortisol alone is involved in metamorphosis, as the reviewer suggests, but simply that there is a possible involvement of cortisol together with TH in metamorphosis. We stand by this claim, as we observed an activation of corticoid pathway genes around D32, which is consistent with such an involvement. We do agree that functional experiments will be needed to properly demonstrate the involvement of corticoids in grouper metamorphosis, but this was not possible in the current study, as it would require establishing a full grouper life cycle in lab conditions, which is beyond the scope of this manuscript.

      We also mentioned in the discussion that the role of corticoids in fish larval development is still debated, and we agree that this remains a contentious issue. We have clarified the Discussion on this point (lines 375-376, lines 439-464).

      We wrote that “There is contrasting evidence of communication between these two pathways during teleost fish larval development with some data suggesting a synergic and other an antagonistic relationship. In terms of synergy, an increase in cortisol level concomitantly with an increase in TH levels has been observed in flatfish [26], golden sea bream [64] and silver sea bream [65]. Cortisol was also shown to enhance in vitro the action of TH on fin ray resorption (phenomenon occurring during flatfish metamorphosis) in flounder[27]. It has also been shown that cortisol regulates local T3 bioavailability in the juvenile sole via regulation of deiodinase 2 in an organ-specific manner [66]. On the antagonistic side, it has been shown that experimentally induced hyperthyroidism in common carp decreases cortisol levels[67], whereas cortisol exposure decreases TH levels in European eel [68]. Given this scattered evidence, the existence of a crosstalk active during teleost larval development and metamorphosis has never been formally demonstrated. The results we obtained in grouper are clearly indicating that HPI axis is activated during both early development and metamorphosis and that cortisol synthesis is activated during early development. This may suggest that in some aspect, cortisol synthesis could work in concert with TH, as has been shown in several different contexts in amphibians [25], but functional experiments need to be conducted to confirm this hypothesis.” In the revised manuscript, we have also added the interesting case of the Senegal sole mentioned by the reviewer.

      In the last revision, we had also added that our results “brought a first insight into the potential role of corticoids in the metamorphosis of E. malabaricus and call for functional experiments directly testing a possible synergy” meaning that we clearly acknowledge that we are only revealing a hypothesis that remains to be tested. We later follow up with a discussion about the most novel observation and focus of our study, the increase in THs and cortisol during early development, which was unexpected and very intriguing. Again, these results suggest that there might be a link between the two, as has been shown in amphibians. This is typically the kind of results that should encourage more investigations into other fish species. Indeed, this has been pointed out by other authors and in particular by Bob Denver (probably the foremost expert on this topic) in Crespi and Denver 2012: “Elevation in HPA/I axis activity has been described prior to Metamorphosis in amphibians and fish, birth in mammals (reviewed in Crespi & Denver 2005a; Wada 2008)”. B. Denver also adds that: “Experiments in which GCs were elevated prior to metamorphosis or prior to hatching or birth (e.g. Weiss, Johnston & Moore 2007) or inhibited by treatments with GC synthesis blockers (e.g. metyrapone) or receptor antagonists (e.g. RU486, Glennemeir & Denver 2002) demonstrate that GCs play a causal role in precipitating these life-history transitions (also reviewed in Crespi & Denver 2005a; Wada 2008).” We believe the reviewer will be convinced by these elements coming from a colleague unanimously respected in the field. 

      Furthermore, the authors need to recognise that the transcriptomic analysis is whole-body and that HPA axis genes are upregulated, which does not mean they are involved in regulating the HPT axis. The authors do not show that in thyrotrophs, any CRH receptor is expressed or in any other HPT axis-relevant cells and that changes in these genes correlate with changes in TSH expression. An in-situ hybridisation experiment showing co-expression on thyrotrophs of HPA genes and TSH could be a good start. However, the best scenario would be conducting cortisol treatment experiments to see if this hormone affects grouper metamorphosis.

      We agree that functional experiments are needed to validate our hypothesis. As the early peaks of expression observed for many genes were very intriguing to us, we did carry out thyroid hormone and goitrogen treatments on young grouper larvae to test their effect on the morphological changes. Unfortunately, such experiments, already tricky on metamorphosing larvae, are even riskier on such tiny individuals just after hatching, and we encountered high mortality rates. We must add that because we cannot establish a full grouper life cycle under lab conditions, we carried out these experiments in a commercial husbandry system in Japan, which, while excellent, limits the scope of possible experiments. We were thus not able to provide functional validation of our hypothesis. Such experiments would be a full project in themselves, requiring a rearing system that accommodates both larval survival and the economic constraints of drug treatments. We were further limited by the spawning times of the grouper in the operational aquaculture farm, which are restricted to a short period each year. So although we strongly agree with the necessity of conducting such experiments, we think that they are outside the scope of the present paper and something future research can explore.

      High TSH and Tg levels usually parallel whole-body TH levels during teleost metamorphosis. However, in this study, high Tg expression levels are only achieved at the juvenile stage, whereas high TSH is achieved at D32, and at the juvenile stage, they are already at their lowest levels.

      This is exactly our point. We observe two peaks in TSH expression, one at D3 and one at D32. The peak at D3 coincides with high thyroid hormone levels on the same day, and while we have not measured TH at D32, existing literature shows that there is a peak in TH during that time (e.g., de Jesus et al., 1998). Similarly, there is a small peak of Tg at D3. Our manuscript focused more on the upregulation of these genes at D3, which has not been reported before in the literature and raised the question of the role of TH so early in the larval development, outside of the metamorphosis period. 

      Regarding the respective levels of TSH and Tg, we would first like to add that their order of appearance before metamorphosis (TSH at D32, Tg after) is consistent with what we would expect. We agree, however, that the strong increase of Tg and TPO expression is later than expected. Therefore, we have added the following sentence in lines 212 to 216: “The respective order of appearance of TSH and Tg (TSH at D32, Tg after) is consistent with what we would expect but a bit later than expected given the morphological transformation. It would be interesting to revisit this in a future series of experiments, with tighter temporal sampling to study how gene expression and morphological transformation aligned.”

      It is very difficult to conclude anything with the TH and cortisol levels measurements. The authors only measured up until D10, whereas they argue that metamorphosis occurs at D32. In this way, these measurements could be more helpful if they focus on the correct developmental time. The data is irrelevant to their hypothesis.

      We respectfully disagree with the reviewer, considering that 1) TH levels have already been investigated in groupers coinciding with pigmentation changes and fin rays resorption (Figure 4 in de Jesus et al, 1998), 2) there is also evidence in numerous fish species that TH level increase is concomitant with increase of TH related genes, and 3) we observed in our data an increase in the expression of TH related genes as well as pigmentation changes and fin rays resorption. Based on our experience in fish metamorphosis and the literature we can say confidently that those observations indicate that metamorphosis is occurring between D32 and the juvenile stage. This clearly shows that our inference is correct. Additionally, we would like to reemphasize that from our experience in several fish species transcriptomic data are more robust and precise than hormone measurements.

      However, as we were surprised by the activation of TH and corticoid pathway genes very early in larval development (at D3), which is clearly outside of the metamorphosis period, we decided to measure TH and cortisol levels during this period to determine whether this surprising early activation indeed corresponded to an increase in both TH and cortisol. As such an observation has never been made in other teleost species (to our knowledge), and as we were wondering whether gene activation was accompanied by a hormonal increase, the measurements we made for TH and cortisol between D1 and D10 are relevant. In order to clarify our message further, we have changed some of the mentions of “metamorphosis” to “larval development” throughout the manuscript and added other improvements to avoid any confusion between the two periods we are studying: early larval development (between D1 and D10) and metamorphosis (between D32 and juvenile stage).

      Moreover, as stated in the previous review, a classical sign of teleost metamorphosis is the upregulation of TSHb and Tg, which does not occur at D32; therefore, it is very hard for me to accept that this is the metamorphic stage. With the lack of TH measurements, I cannot agree with the authors. I think this has to be toned down and made clear in the manuscript that D32 might be a putative metamorphic climax but that several aspects of biology work against it. Moreover, in D10, the authors show the highest cortisol level and lowest T4 and T3 levels. These observations are irreconcilable with cortisol enhancing or participating in TH-driven metamorphosis.

      We thank the reviewer for this comment, but we think that there might be a misunderstanding here. 

      (1) We clearly observed an increase of TSHb (between D18 and juvenile stage) and an increase of tg from D32, which coincide with the activation of other genes involved in the TH pathway (dio2, dio3, and also a strong increase of TRb). All this, put in the context of what we know from previous grouper studies, clearly supports our conclusion that TH-regulated metamorphosis starts at around D32 in grouper. We also observed morphological changes such as fin ray resorption and pigmentation changes between D32 and juvenile stage. Such morphological changes have already been associated with metamorphosis in groupers (De Jesus et al 1998), as they occur during the TH level increase and are under the control of TH in grouper (De Jesus et al 1998). Based on this study, and on studies in many other teleost species showing that the increase of TH levels is always associated with an activation of TH pathway genes and with morphological and pigmentation changes, we concluded that metamorphosis of E. malabaricus occurs between D32 and juvenile stage. We have improved the clarity of the manuscript in several places to make sure that our conclusion is based on our transcriptomic and morphological data plus the available literature.

      (2) We clearly observed another activation of TH-related genes earlier in development (between D1 and D10), with a surge of trhrs, tg and tpo at D3. As this activation was very unexpected for us, we decided to focus the analysis of TH levels between D1 and D10 and, very interestingly, we observed a high level of T4 at D3, indicating that THs are instrumental very early in the larval development of the malabar grouper, which has never been shown before. We declared in lines 224-225 that our “data reinforce the existence of two distinct periods of TH signalling activity, one early on at D3 and one late corresponding to classic metamorphosis at D32”. We agree that we could have been clearer and explained that this early activation was very intriguing for us and that we wanted to investigate hormonal levels around that period. However, we never claimed anywhere in the manuscript that this early developmental period corresponds to metamorphosis. Something else is occurring and both TH and cortisol seem to be involved, but further experiments need to be conducted to understand their role and their possible interaction. We have added corresponding statements in the abstract (lines 39-43) and discussion (lines 447 to 449).

      (3) Finally, regarding the comment about cortisol enhancing or participating in TH-driven metamorphosis, our data clearly showed an activation of the corticoid pathway genes around metamorphosis (between D32 and juvenile stage), suggesting a potential implication of corticoids in metamorphosis, but we agree with the reviewer that further experiments are needed to test that. We never claimed that cortisol was enhancing or participating in metamorphosis; on the contrary, we are “suggesting a possible interaction between TH and corticoid pathway during metamorphosis”. We also say that our “results brought a first insight into the potential role of corticoids in the metamorphosis of E. malabaricus and call for functional experiments directly testing a possible synergy.” Nonetheless, we agree that some parts of our manuscript can be confusing with regard to cortisol synthesis during metamorphosis, as we did not measure cortisol levels between D32 and juvenile stage. We have therefore made changes throughout the Introduction and Discussion to make this clearer.

      Given this, the authors should quantify whole-body TH levels throughout the entire developmental window considered to determine where the peak is observed and how it correlates with the other hormonal genes/systems in the analysis.

      We did not measure TH levels at later stages, as they have already been measured during Epinephelus coioides metamorphosis, and the morphological changes observed in that species around the TH peak correspond to what we observed in Epinephelus malabaricus around the peak of expression of TH pathway genes (see De Jesus et al., 1998 General and Comparative Endocrinology, 112:10-16). The main focus of this manuscript is the novel observation of an early activation period at D3, for which we needed TH levels to determine whether they were involved in another early developmental process (not related to metamorphosis). Our hypothesis is that this early activation might be related to the growth of fin rays necessary to enhance floatability during the oceanic larval dispersal. As we may have arrived at the explanation of this hypothesis too rapidly without setting up the context well enough, we have made changes to the introduction and discussion.

      Even though this is a solid technical paper and the data obtained is excellent, the conclusions drawn by the authors are not supported by their data, and at least hormonal levels should be present in parallel to the transcriptomic data. Furthermore, toning down some affirmations or even considering the different hypotheses available that are different from the ones suggested would be very positive.

      We thank the reviewer for acknowledging the solidity of our methods and the quality of the results. We agree that there were several parts where our message was unclear. We have addressed these points in the revised version of the manuscript to make sure there is no more confusion between the two distinct periods we studied in this paper (early larval development and metamorphosis). We also made sure that our claims about TH/corticoid interaction during both periods remain hypothetical, as we cannot yet, despite trials, support them with functional experiments.

    1. Author response:

      eLife assessment

      This study offers a useful treatment of how the population of excitatory and inhibitory neurons integrates principles of energy efficiency in their coding strategies. The analysis provides a comprehensive characterisation of the model, highlighting the structured connectivity between excitatory and inhibitory neurons. However, the manuscript provides an incomplete motivation for parameter choices. Furthermore, the work is insufficiently contextualized within the literature, and some of the findings appear overlapping and incremental given previous work.

      We thank the Reviewers and the Reviewing Editor for taking the time to provide extremely valuable suggestions and comments, which will help us to substantially improve our paper. In what follows we summarize our current plan to improve the paper in response to their suggestions.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: Koren et al. derive and analyse a spiking network model optimised to represent external signals using the minimum number of spikes. Unlike most prior work using a similar setup, the network includes separate populations of excitatory and inhibitory neurons. The authors show that the optimised connectivity has a like-to-like structure, leading to the experimentally observed phenomenon of feature competition. They also characterise the impact of various (hyper)parameters, such as adaptation timescale, ratio of excitatory to inhibitory cells, regularisation strength, and background current. These results add useful biological realism to a particular model of efficient coding. However, not all claims seem fully supported by the evidence. Specifically, several biological features, such as the ratio of excitatory to inhibitory neurons, which the authors claim to explain through efficient coding, might be contingent on arbitrary modelling choices. In addition, earlier work has already established the importance of structured connectivity for feature competition. A clearer presentation of modelling choices, limitations, and prior work could improve the manuscript.

      Thanks for these insights and for this summary of our work.

      Major comments:

      (1) Much is made of the 4:1 ratio between excitatory and inhibitory neurons, which the authors claim to explain through efficient coding. I see two issues with this conclusion: (i) The 4:1 ratio is specific to rodents; humans have an approximate 2:1 ratio (see Fang & Xia et al., Science 2022 and references therein); (ii) the optimal ratio in the model depends on a seemingly arbitrary choice of hyperparameters, particularly the weighting of encoding error versus metabolic cost. This second concern applies to several other results, including the strength of inhibitory versus excitatory synapses. While the model can, therefore, be made consistent with biological data, this requires auxiliary assumptions.

      We will better describe the ratio of numbers of E and I neurons found in real data, as suggested. The first submission already contained an analysis of how this ratio of neuron numbers depends on the weighting of the loss of E and I neurons and on the relative weighting of the encoding error vs the metabolic cost in the loss function (see Fig 6E). We will make sure that these results are suitably expanded and better emphasized in revision. We will also include a new analysis of how the optimal values of other parameters (namely: noise intensity, metabolic constant, ratio of mean I-I to E-I connectivity, time constants of single E and I neurons) depend on the relative weighting of encoding error vs metabolic cost in the loss function.

      (2) A growing body of evidence supports the importance of structured E-I and I-E connectivity for feature selectivity and response to perturbations. For example, this is a major conclusion from the Oldenburg paper (reference 62 in the manuscript), which includes extensive modelling work. Similar conclusions can be found in work from Znamenskiy and colleagues (experiments and spiking network model; bioRxiv 2018, Neuron 2023 (ref. 82)), Sadeh & Clopath (rate network; eLife, 2020), and Mackwood et al. (rate network with plasticity; eLife, 2021). The current manuscript adds to this evidence by showing that (a particular implementation of) efficient coding in spiking networks leads to structured connectivity. The fact that this structured connectivity then explains perturbation responses is, in the light of earlier findings, not new.

      We agree that the main contribution of our manuscript in this respect is to show how efficient coding in spiking networks can lead to structured connectivity similar to that proposed in the above papers. We apologize if this was not clear enough in the previous version. We will make it clearer in revision. We nevertheless think it useful to report the effects of perturbations within this network, because the structure derived in our network is not identical to those studied in the above papers, and because these results give information about how lateral inhibition works in this network. Thus, we will keep presenting it in the revised version, although we will de-emphasize and simplify its presentation to give more emphasis to the novelty of the derivation of this connectivity rule from the principles of efficient coding.

      (3) The model's limitations are hard to discern, being relegated to the manuscript's last and rather equivocal paragraph. For instance, the lack of recurrent excitation, crucial in neural dynamics and computation, likely influences the results: neuronal time constants must be as large as the target readout (Figure 4), presumably because the network cannot integrate the signal without recurrent excitation. However, this and other results are not presented in tandem with relevant caveats.

      We will improve the Limitations paragraph in Discussion, and also anticipate caveats in tandem with results when needed, as suggested.

      (4) On repeated occasions, results from the model are referred to as predictions claimed to match the data. A prediction is a statement about what will happen in the future - but most of the "predictions" from the model are actually findings that broadly match earlier experimental results, making them "postdictions".

      This distinction is important: compared to postdictions, predictions are a much stronger test because they are falsifiable. This is especially relevant given (my impression) that key parameters of the model were tweaked to match the data.

      We will better distinguish between pre- and post-dictions in revision.

      Reviewer #2 (Public Review):

      Summary: In this work, the authors present a biologically plausible, efficient E-I spiking network model and study various aspects of the model and its relation to experimental observations. This includes a derivation of the network into two (E-I) populations, the study of single-neuron perturbations and lateral-inhibition, the study of the effects of adaptation and metabolic cost, and considerations of optimal parameters. From this, they conclude that their work puts forth a plausible implementation of efficient coding that matches several experimental findings, including feature-specific inhibition, tight instantaneous balance, a 4 to 1 ratio of excitatory to inhibitory neurons, and a 3 to 1 ratio of I-I to E-I connectivity strength. It thus argues that some of these observations may come as a direct consequence of efficient coding.

      Strengths:

      While many network implementations of efficient coding have been developed, such normative models are often abstract and lacking sufficient detail to compare directly to experiments. The intention of this work to produce a more plausible and efficient spiking model and compare it with experimental data is important and necessary in order to test these models.

      In rigorously deriving the model with real physical units, this work maps efficient spiking networks onto other more classical biophysical spiking neuron models. It also attempts to compare the model to recent single-neuron perturbation experiments, as well as some long-standing puzzles about neural circuits, such as the presence of separate excitatory and inhibitory neurons, the ratio of excitatory to inhibitory neurons, and E/I balance. One of the primary goals of this paper, to determine if these are merely biological constraints or come from some normative efficient coding objective, is also important.

      Though several of the observations have been reported and studied before (see below), this work arguably studies them in more depth, which could be useful for comparing more directly to experiments.

      Thanks for these insights and for the kind words of appreciation of the strengths of our work.

      Weaknesses:

      Though the text of the paper may suggest otherwise, many of the modeling choices and observations found in the paper have been introduced in previous work on efficient spiking models, thereby making this work somewhat repetitive and incremental at times. This includes the derivation of the network into separate excitatory and inhibitory populations, discussion of physical units, comparison of voltage versus spike-timing correlations, and instantaneous E/I balance, all of which can be found in one of the first efficient spiking network papers (Boerlin et al. 2013), as well as in subsequent papers. Metabolic cost and slow adaptation currents were also presented in a previous study (Gutierrez & Deneve 2019). Though it is perfectly fine and reasonable to build upon these previous studies, the language of the text gives them insufficient credit.

      We will improve the text to make sure that credit to previous studies is more precisely and more clearly given.

      Furthermore, the paper makes several claims of optimality that are not convincing enough, as they are verified only by limited sweeps of a single parameter at a time, are unintuitive, and may conflict with previous findings on efficient spiking networks. This includes the following. Coding error (RMSE) has a minimum at intermediate metabolic cost (Figure 5B), despite the fact that intuitively, zero metabolic cost would indicate that the network is solely minimizing coding error and that previous work has suggested that additional costs bias the output. Coding error also appears to have a minimum at intermediate values of the ratio of E to I neurons (effectively the number of I neurons) and the number of encoded variables (Figures 6D, 7B). These both have to do with the redundancy in the network (number of neurons for each encoded variable), and previous work suggests that networks can code for arbitrary numbers of variables provided the redundancy is high enough (e.g., Calaim et al. 2022). Lastly, the performance of the E-I variant of the network is shown to be better than that of a single cell type (1CT: Figure 7C, D). Given that the E-I network performs a computation similar to that of the 1CT model but with more neurons (i.e., instead of an E neuron directly providing lateral inhibition to its neighbor, it goes through an interneuron), this is unintuitive and again not supported by previous work. These may be valid emergent properties of the E-I spiking network derived here, but their presentation and description are not sufficient to determine this.

      We are addressing this issue in two ways. First, we will present joint sweeps over pairs of parameters whose combined variation is expected to influence optimality in a way that cannot be understood by varying one parameter at a time. Namely, we plan to jointly vary the noise intensity and the metabolic constant, as well as the ratio of E to I neuron numbers and the ratio of mean I-I to E-I connectivity. Second, we will identify a realistic range of variation for each individual parameter, perform a Monte Carlo search for the optimal point within this range, and compare the results with the understanding gained from varying one or two parameters at a time. We will also add the suggested citation to Calaim et al. 2022 in regard to the points discussed above.
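
      As an illustration only, the two strategies described above (a joint sweep over a pair of parameters, followed by a Monte Carlo search within a bounded range) could be organized as in the sketch below. This is not the simulation code of the study: `toy_loss` is a hypothetical stand-in for the simulated network loss, and the parameter names `noise` and `beta`, the ranges, and the sample count are all assumptions chosen for the example.

```python
import itertools
import random


def toy_loss(noise, beta):
    """Hypothetical stand-in for a simulated network loss (not the authors' model).

    The cross term couples the two parameters, so the best value of one
    depends on the value of the other; the minimum is at (2.0, 1.0).
    """
    dn, db = noise - 2.0, beta - 1.0
    return dn * dn + db * db + 0.5 * dn * db


def joint_sweep(loss, xs, ys):
    """Evaluate the loss on every (x, y) grid pair; return the best pair and its loss."""
    best = min(itertools.product(xs, ys), key=lambda p: loss(*p))
    return best, loss(*best)


def monte_carlo_search(loss, x_range, y_range, n_samples, seed=0):
    """Draw uniform random points in the box and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best, best_loss = None, float("inf")
    for _ in range(n_samples):
        point = (rng.uniform(*x_range), rng.uniform(*y_range))
        value = loss(*point)
        if value < best_loss:
            best, best_loss = point, value
    return best, best_loss


if __name__ == "__main__":
    xs = [0.5 * i for i in range(9)]  # grid over 0.0 ... 4.0
    ys = [0.5 * i for i in range(5)]  # grid over 0.0 ... 2.0
    print(joint_sweep(toy_loss, xs, ys))
    print(monte_carlo_search(toy_loss, (0.0, 4.0), (0.0, 2.0), n_samples=2000))
```

      Comparing the grid optimum with the random-search optimum is a simple consistency check: if the two disagree strongly, the grid is probably too coarse or the sampled range too narrow.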

      We will improve the comparison between the Excitatory-Inhibitory and the 1-Cell-Type model (see reply to the suggestions of Referee 3 for more details).

      Alternatively, the methodology of the model suggests that ad hoc modeling choices may be playing a role. For example, an arbitrary weighting of coding error and metabolic cost of 0.7 to 0.3, respectively, is chosen without mention of how this affects the results. Furthermore, the scaling of synaptic weights appears to be controlled separately for each connection type in the network (Table 1), despite the fact that some of these quantities are likely linked in the optimal network derivation. Finally, the optimal threshold and metabolic constants are an order of magnitude larger than the synaptic weights (Table 1). All of these considerations suggest one of the following two possibilities. One, the model has a substantial number of unconstrained parameters to tune, in which case more parameter sweeps would be necessary to definitively make claims of optimality. Or two, parameters are being decoupled from those constrained by the optimal derivation, and the optima simply correspond to the values that should come out of the derivation.

      In the previously submitted manuscript we presented both the encoding error and the metabolic cost separately as a function of the parameters, so that readers could get an understanding of how stable optimal parameters would be to the change of the relative weighting of encoding error and metabolic cost. We will improve this work by adding the suggested calculations to provide quantitative measures of the dependence of the optimal network parameters and configurations on this relative weighting.
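
      One simple way to quantify this dependence, sketched here under stated assumptions, is to recompute the optimal parameter value for several relative weightings of encoding error and metabolic cost. The curves used below (error falling as 1/p, cost rising as p) are hypothetical stand-ins rather than results from the model, and the weight `g` and the function names are illustrative.

```python
def weighted_loss(error, cost, g):
    """Combine encoding error and metabolic cost with relative weight g in [0, 1]."""
    return g * error + (1.0 - g) * cost


def optimal_parameter(params, errors, costs, g):
    """Return the parameter value that minimizes the weighted loss for weight g."""
    losses = [weighted_loss(e, c, g) for e, c in zip(errors, costs)]
    best_index = min(range(len(params)), key=losses.__getitem__)
    return params[best_index]


# Hypothetical curves: error decreases and cost increases with the parameter p.
params = [0.1 * k for k in range(1, 51)]  # 0.1 ... 5.0
errors = [1.0 / p for p in params]
costs = [p for p in params]

if __name__ == "__main__":
    for g in (0.3, 0.5, 0.7):
        print(g, optimal_parameter(params, errors, costs, g))
```

      If the optimum moves only slightly across a plausible range of weightings, the conclusions are robust to the choice of 0.7 to 0.3; a large shift would indicate that the weighting itself needs justification.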

      Reviewer #3 (Public Review):

      Summary: In their paper the authors tackle three things at once in a theoretical model: how can spiking neural networks perform efficient coding, how can such networks limit the energy use at the same time, and how can this be done in a more biologically realistic way than previous work?

      They start by working from a long-running theory on how networks operating in a precisely balanced state can perform efficient coding. First, they assume split networks of excitatory (E) and inhibitory (I) neurons. The E neurons have the task to represent some lower dimensional input signal, and the I neurons have the task to represent the signal represented by the E neurons. Additionally, the E and I populations should minimize an energy cost represented by the sum of all spikes. All this results in two loss functions for the E and I populations, and the networks are then derived by assuming E and I neurons should only spike if this improves their respective loss. This results in networks of spiking neurons that live in a balanced state, and can accurately represent the network inputs.

      They then investigate in-depth different aspects of the resulting networks, such as responses to perturbations, the effect of following Dale's law, spiking statistics, the excitation (E)/inhibition (I) balance, optimal E/I cell ratios, and others. Overall, they expand on previous work by taking a more biological angle on the theory and showing the networks can operate in a biologically realistic regime.

      Strengths:

      (1) The authors take a much more biological angle on the efficient spiking networks theory than previous work, which is an essential contribution to the field.

      (2) They make a very extensive investigation of many aspects of the network in this context, and do so thoroughly.

      (3) They put sensible constraints on their networks, while still maintaining the good properties these networks should have.

      Thanks for this summary and for these kind words of appreciation of the strengths of our work.

      Weaknesses:

      (1) The paper has somewhat overstated the significance of their theoretical contributions, and should make much clearer what aspects of the derivations are novel. Large parts were done in very similar ways in previous papers. Specifically: the split into E and I neurons was also done in Boerlin et al (2008) and in Barrett et al (2016). Defining the networks in terms of realistic units was already done by Boerlin et al (2008). It would also be worth it to discuss Barrett et al (2016) specifically more, as there they also use split E/I networks and perform biologically relevant experiments.

      We will improve the text to make sure that credit to previous studies is more precisely and more clearly given.

      (2) It is not clear from an optimization perspective why the split into E and I neurons and following Dale's law would be beneficial. While the constraints of Dale's law are sensible (splitting the population in E and I neurons, and removing any non-Dalian connection), they are imposed from biology and not from any coding principles. A discussion of how this could be done would be much appreciated, and in the main text, this should be made clear.

      We indeed removed non-Dalian connections because having only connections respecting Dale’s law is a major constraint for biological plausibility. Our logic was to consider efficient coding within the space of networks that satisfy this (and other) biological plausibility constraints. We did not intend to claim that removing the non-Dalian connections was the result of an analytical optimization. However, to get better insights into how Dale’s Law constrains or influences the design of efficient networks, we added a comparison of the coding properties of networks that either do or do not satisfy Dale’s law. We apologize if this was not sufficiently clear in the previous version and we will clarify this in revision. 

      (3) Related to the previous point, the claim that the network with split E and I neurons has a lower average loss than a 1 cell-type (1-CT) network seems incorrect to me. Only the E population coding error should be compared to the 1-CT network loss, or the sum of the E and I populations (not their average). In my author recommendations, I go more in-depth on this point.

      We will perform the suggested detailed comparisons between the network loss in the 1CT-model and E-I model and then revise or refine conclusions if and as needed, according to the results we will obtain.

      (4) While the paper is supposed to bring the balanced spiking networks they consider in a more experimentally relevant context, for experimental audiences I don't think it is easy to follow how the model works, and I recommend reworking both the main text and methods to improve on that aspect.

      We will try to make the presentation of the model more accessible to a non-computational audience.

      Assessment and context: Overall, although much of the underlying theory is not necessarily new, the work provides an important addition to the field. The authors succeeded well in their goal of making the networks more biologically realistic, and incorporating aspects of energy efficiency. For computational neuroscientists, this paper is a good example of how to build models that link well to experimental knowledge and constraints, while still being computationally and mathematically tractable. For experimental readers, the model provides a clearer link between efficient coding spiking networks to known experimental constraints and provides a few predictions.

      Thanks for these kind words. We will make sure that these points emerge more clearly and in a more accessible way from the revised paper.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The endocannabinoid system (ECS) components are dysregulated within the lesion microenvironment and systemic circulation of endometriosis patients. Using endometriosis mouse models and genetic loss of function approaches, Lingegowda et al. report that canonical ECS receptors, CNR1 and CNR2, are required for disease initiation, progression, and T-cell dysfunction.

      Strengths:

      The study uses genetic approaches to establish in vivo causal relationships between dysregulated ECS and endometriosis pathogenesis. The experimental design incorporates both bulk and single-cell RNAseq approaches, as well as imaging mass spectrometry to characterize the mouse lesions. The identification of immune-related and T-cell-specific changes in the lesion microenvironment of CNR1 and CNR2 knockout (KO) mice represents a significant advance.

      Weaknesses:

      Although the mouse phenotypic analyses involve a detailed molecular characterization of the lesion microenvironment using genomic approaches, detailed measurements of lesion size/burden and histopathology would provide a better understanding of how CNR1 or CNR2 loss contributes to endometriosis initiation and progression. The cell or tissue-specific effects of the CNR1 and CNR2 are not incorporated into the experimental design of the studies. Although this aspect of the approach is recognized as a major limitation, global CNR1 and CNR2 KO may affect normal female reproductive tract function, ovarian steroid hormone levels, decidualization response, or lead to preexisting alterations in host or donor tissues, which could affect lesion establishment and development in the surgically induced, syngeneic mouse model of endometriosis.

      We appreciate the reviewer's thoughtful and constructive feedback. We agree that additional measurements of lesion size/burden and histopathology would provide valuable insights into the specific contributions of CNR1 and CNR2 to endometriosis progression. However, the focus of this study was on assessing alterations in the complex immune microenvironment in the absence of CNR1 and CNR2, given their close relation to the regulation of immune cell populations. We plan to incorporate these measurements in future studies to further strengthen the understanding of the disease pathogenesis. Regarding the potential effects of global knockout, the reviewer raises a valid concern. To address this, we will explore cell- and/or tissue-specific knockout models in future experiments to better isolate the direct effects of CNR1 and CNR2 on the disease process, while minimizing potential confounding factors from systemic alterations.

      Reviewer #2 (Public Review):

      Summary:

      The endocannabinoid system (ECS) regulates many critical functions, including reproductive function. Recent evidence indicates that dysregulated ECS contributes to endometriosis pathophysiology and the microenvironment. Therefore, the authors further examined the dysregulated ECS and its mechanisms in endometriosis lesion establishment and progression using two different endometrial sources of mouse models of endometriosis with CNR1 and CNR2 knockout mice. The authors presented differential gene expression and altered pathways, especially those related to the adaptive immune response, in CNR1 and CNR2 ko lesions. Interestingly, the T-cell population was dramatically reduced in the peritoneal cavity lacking CNR2, and a loss of proliferative activity of CD4+ T helper cells was observed. Imaging mass cytometry analysis provided spatial profiling of cell populations and potential relationships among immune cells and other cell types. This study provided fundamental knowledge of the endocannabinoid system in endometriosis pathophysiology.

      Strengths:

      Dysregulated ECS and its mechanisms in endometriosis pathogenesis were assessed using two different endometrial sources of mouse models of endometriosis with CNR1 and CNR2 knockout mice. Not only endometriotic lesions, but also peritoneal exudate (and splenic) cells were analyzed to understand the specific local disease environment under the dysregulated ECS.

      The transcriptional profiles and pathways, immune cell profiles, and spatial profiles of cell populations support altered immune cell populations and their disrupted functions in endometriosis pathogenesis via dysregulation of the ECS.

      In line 386: Role of CNR2 in T cells. The finding that CD3+ T cells are nearly absent in the peritoneal cavity of CNR2 ko mice is intriguing.

      The interpretation of the results is well-described in the Discussion.

      Weaknesses:

      The study was terminated and characterized 7 days after EM induction surgery without the details for selecting the time point to perform the experiments.

      The authors also mentioned that altered eutopic endometrium contributes to the establishment and progression of endometriosis. This reviewer agrees with lines 324-325. If so, DEGs are likely identified between eutopic endometrium (with/without endometriosis lesion induction) and ectopic lesions. It would be nice to see the data (even though using publicly available data sets).

      Figure 7 CDEF. The results of the statistical analyses and analyzed sample numbers should be added. Lines 444-450 cannot be reviewed without them.

      This reviewer agrees with lines 498-500. In contrast, retrograded menstrual debris is not decidualized. The section could be modified to avoid misunderstanding.

      We would like to thank the reviewer for insightful comments, suggestions and acknowledging the importance of the work presented in this manuscript.

      Regarding the 7-day time point, we provided a rationale in lines 479-481 but agree that it is not sufficient; we have therefore added details on the selection of the 7-day time point to the Methods section (Mouse model of EM). We have also noted the suggestion to compare differentially expressed genes in the eutopic endometrium vs ectopic lesions. Since there are publications comparing the eutopic vs ectopic gene expression patterns (PMIDs: 33868805 and 18818281), including a study exploring the ECS genes in the endometrium throughout different menstrual cycles (PMID: 35672435), we believe additional analysis using the same dataset may not yield new information. However, we see the value in the reviewer’s comment, and in future studies with tissue- or cell-specific CNR1 and CNR2 knockout models we will examine gene expression patterns in uterine vs endometriosis-like lesions to understand the functional relevance of the ECS in endometriosis initiation.

      Since the IMC study was exploratory for proof of concept, we did not have enough biological replicates for meaningful statistical validation (n = 2-3). We have clarified this information in the methods, results, and figure legends for appropriately representing the limitations of the current setup.
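
      To illustrate why n = 2-3 replicates cannot support meaningful statistical validation, the sketch below computes the smallest two-sided p-value that an exact rank-sum (Mann-Whitney U) test can return for given group sizes when there are no ties. The choice of this test is an assumption made purely for illustration, since the text above does not specify which test would have been applied to the IMC data, and the function name is hypothetical.

```python
from math import comb


def min_two_sided_p(n1, n2):
    """Smallest two-sided p-value an exact rank-sum test can reach (no ties).

    Under the null hypothesis every assignment of ranks to the two groups is
    equally likely, so the most extreme arrangement has probability
    1 / C(n1 + n2, n1); the two-sided p-value doubles it (capped at 1).
    """
    return min(1.0, 2.0 / comb(n1 + n2, n1))


if __name__ == "__main__":
    print(min_two_sided_p(3, 3))  # 0.1
    print(min_two_sided_p(2, 3))  # 0.2
```

      With three samples per group, even perfectly separated measurements cannot reach p < 0.05 under this test, which is consistent with treating the IMC results as exploratory.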

      Finally, we appreciate the feedback on the section discussing retrograded menstrual debris. Even though the menstrual debris may not be decidualized, some endometriotic lesions have the ability to decidualize based on their response to estrogen and progesterone in a cycling manner (PMID: 26450609), similar to the endometrium in the uterine cavity. We have clarified this in the revised MS.

      Recommendations for the Authors:

      Reviewer #1 (Recommendations For The Authors):

      The mechanism of how alterations in ECS contribute to the observed cellular and molecular changes is unclear. Connecting CNR1 or CNR2 function to a specific cell type or cellular process would provide a more detailed understanding of how dysregulated ECS contributes to endometriosis pathogenesis.

      We agree that integrating the functions of CNR1 or CNR2 to specific cell types or cellular processes would strengthen the mechanistic insights presented in our study. This would help elucidate specific pathways by which dysregulated ECS leads to the alterations in immune cell populations, gene expression profiles, and other key aspects of endometriosis development and progression. This is a rapidly evolving field and at this stage, we do not have published information to reflect on this aspect in the revised manuscript.

      (1) As mentioned in the text, the ECS components being studied are widely expressed and may affect multiple aspects of endometriosis pathogenesis and symptomatology. However, the cell or tissue-specific effects of the CNR1 and CNR2 are not incorporated into the experimental design of the studies. Although these limitations are mentioned in the discussion, it is important to know if global CNR1 and CNR2 KO affect normal female reproductive tract function, ovarian steroid hormone levels, decidualization response, or if preexisting alterations in host or donor tissues affect lesion development in the surgically induced, syngeneic mouse model of endometriosis. This would also be the case in studies on immune system dysfunction or lesion microenvironment, as it is possible preexisting immune system dysfunction following CNR1 or CNR2 loss could alter the disease trajectory and lead to a misinterpretation of the findings. Some of these potential confounders could be addressed using crossover approaches in Figure 1A experimental design, but the donor tissues are reported to be matched to the recipients based on genotype.

      The reviewer raises an excellent point that the widespread expression of the ECS components studied in our manuscript may affect multiple aspects of endometriosis pathogenesis and symptomatology. Indeed, the cell- or tissue-specific effects of CNR1 and CNR2 knockout are not fully incorporated into our experimental design, which could lead to potential confounding factors that may affect the interpretation of some of our findings. However, as outlined in our previous comments, in future studies we will incorporate tissue/cell-specific knockouts, as well as the crossover approaches, to elucidate whether the loss of CNR1 and CNR2 function is lesion driven. We agree that it is important to understand the impact of global CNR1 and CNR2 knockout on normal female reproductive tract function, ovarian steroid hormone levels, decidualization response, and other potential preexisting alterations in the host or donor tissues that could influence lesion development in the syngeneic mouse model of endometriosis. As outlined in the MS (lines 59-62), there are studies highlighting pregnancy-specific impacts, including effects on implantation and impaired primary decidual zone formation. We did not find any baseline alterations in the systemic immune profiles between the CNR1 and CNR2 knockout mice and the WT mice without EM induction. However, the uterine environment has not been assessed to understand the baseline immune profile between the knockout mice and WT mice. We agree with the reviewer that preexisting immune system dysfunction following CNR1 or CNR2 loss could alter the disease trajectory related to immune system dysfunction or lesion microenvironment. We have highlighted this in the limitations section.

      (2) The phenotypic characterization of the endometriosis mouse model with or without CNR1 or CNR2 KO is very limited. To better understand how the observed cellular and molecular alterations correlate with endometriosis pathogenesis and severity in CNR1 and CNR2 KO mice, a detailed characterization of lesion size differences and histopathology should be made. Importantly, the histopathological characterization of the lesions would complement the imaging mass spectrometry findings.

      We agree that more detailed characterization of the endometriosis lesions in our CNR1 and CNR2 knockout mouse models is required. As evident from several of our previous publications, we have focused on detailed histopathological characterization of endometriotic lesions in our syngeneic mouse model of endometriosis, including a multiple time course study (Symons et al, 2020, FASEB). In the present investigation, we focused on cataloging spatial and transcriptomic changes, as we do not currently have any information on the global influence of CNR1 and CNR2 knockout on the endometriosis lesion microenvironment. Because we prioritized this aspect, we were not able to provide a detailed histological assessment of lesions. However, the IMC analysis provides a detailed, spatially resolved profile of the cellular composition and interactions within the endometriotic lesions, which we believe offers valuable insights into the mechanisms by which the dysregulated ECS may contribute to endometriosis pathogenesis. This quantitative, high-dimensional approach complements the transcriptional profiling and other analyses we have performed.

      (3) Given the effect sizes and variance observed with the ECS ligand measurements, an N = 4-5 biological samples for mouse phenotypic studies seems too low.

      The reviewer raises a valid point about the low sample size. As elaborated earlier, this was a proof-of-principle study to capture biologically significant alterations within the lesion and surrounding peritoneal microenvironment in the absence of the CNR1 and CNR2 receptors. This information is crucial for establishing the potential mechanisms by which the dysregulated ECS may contribute to the pathogenesis of endometriosis. Now that we have established the framework and baseline understanding of immune-inflammatory alterations, we will refine our future experimental approaches and include more samples if it becomes necessary.

      Reviewer #2 (Recommendations For The Authors):

      It is hard to read the labeling of figures. Please increase the font size of each figure.

      We have increased the font size of the labels where necessary to improve the readability.

      Supplementary Data 1, Table 1 seems like Supplementary Table 1. Please use the same labeling of the Supplementary tables and figures to avoid confusion.

      We have updated the labeling accordingly and ensured that all supplementary tables and figures are consistently labeled.

      This reviewer suggests depositing RNA-seq and IMC data to NCBI etc. and listing the accession number in the MS.

      Thank you for your recommendation to deposit the RNA-seq and imaging mass cytometry (IMC) data from our study in public repositories such as NCBI. We appreciate your suggestion, as data sharing is an important aspect of scientific transparency and reproducibility. Bulk mRNA sequencing data has been attached as a supplementary file and IMC data has been deposited on Mendeley Data (DOI: 10.17632/2ptns5yhzh.1).

      Please clarify L363.

      We have clarified this in the revised MS. The revised text now reads: “However, we did not find the same differences (T cell-related genes) in the UnD lesions of CNR2 k/o mice. Moreover, UnD lesions of CNR2 k/o mice showed significantly low number of DEGs (11 compared to 65 in the DD lesions from CNR2 k/o mice) suggesting a decidualization dependent response (Supplementary Data 3).”

      Figure 7B: It is hard to see/understand the results in L438-440. It might be helpful if % is added to the figure.

      We have added more tick marks to the y-axis of Figure 7B to make it easier for the reader to interpret the percentages of the different cell types.

      Figure 7 legend: 2nd D should be G.

      We have revised the legend accordingly.

      Supplementary Figure 6: It seems immune cells are clustered in CN1, which is different from Figure 7. To easily understand Suppl Fig 6AB, please add some details in the legend.

      We have revised the legend as suggested.

      The revised legend now reads: “A, B Representative image of 8 distinct cell types from CN analysis of DD and UnD lesions from WT, CNR1 k/o, and CNR2 k/o mice, respectively. C Heatmap representation of CN analysis shows distinct clustering patterns observed in the UnD lesions among the different genotypes. The clustering reveals distinct spatial patterns of immune cell populations within the UnD lesions, which appear to differ from the observations in Figure 7G. This suggests potential spatial heterogeneity in the immune landscape of EM like lesions under conditions of decidualization.”

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study reports on a previously unrecognized function of ATG6 in plant immunity. The work is valuable because it proposes a direct interaction between ATG6 and a well-studied salicylic acid receptor protein, NPR1, which may interest researchers investigating plant immunity regulation. While the data presented are compelling, more information regarding the specificity of ATG6's role would improve the overall impact of the study, especially with an eye towards consistency with prior work.

      We also genuinely thank the editor and reviewers for the constructive and helpful suggestions and comments. These comments have greatly improved the quality and thoroughness of our manuscript. We have carefully studied these comments and have made the appropriate changes as far as possible. Additionally, some minor errors were also corrected during the revision process. New text is shown in blue in the revised manuscript. Our responses to the reviewer's comments are provided below each respective comment.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors showed that autophagy-related genes are involved in plant immunity by regulating the protein level of the salicylic acid receptor, NPR1.

      Strengths:

      The experiments are carefully designed and the data is convincing. The authors did a good job of understanding the relationship between ATG6 and NPR1.

      Thank you very much for recognizing our research.

      Weaknesses:

      - The authors can do a few additional experiments to test the role of ATG6 in plant immunity.

      I recommend that the authors test the interaction between ATGs and other NPR1 homologs (such as NPR2).

      Thank you for your valuable feedback. The Arabidopsis NPR family comprises six members: NPR1, NPR2, NPR3, NPR4, NPR5/BLADE-ON-PETIOLE 1 (BOP1), and NPR6/BOP2. NPR3/4 function in tandem as negative regulators to modulate SA signaling and plant immune responses (Ding et al., 2018). Similar to NPR1, NPR2 acts as a positive regulator of SA signaling (Castello et al., 2018). NPR5/BOP1 and NPR6/BOP2 primarily participate in the regulation of plant growth and development (McKim et al., 2008). This study specifically investigates the correlation between ATG6 and NPRs in plant resistance to pathogenic bacteria. Consequently, we experimentally confirmed the interaction between ATG6 and NPR1, NPR3, and NPR4 (Fig. 1 and Fig. S1 in the revised manuscript). It would be intriguing to further explore the interactions between ATG6 and other NPRs in the context of regulating plant growth and development in future research.

      -The concentration of SA used in the experiment (0.5-1 mM) seems pretty high. Does a lower concentration of SA induce ATG6 accumulation in the nucleus?

      Thank you for pointing this out. The NPR1 protein is known to be unstable and prone to degradation through the 26S proteasome pathway (Spoel et al., 2009; Saleh et al., 2015). Consequently, to investigate the function of NPR1, many research groups employ higher concentrations of SA (e.g., 0.5 mM, 1 mM, or even 5 mM) (Spoel et al., 2009; Fu et al., 2012; Lee et al., 2015; Saleh et al., 2015; Skelly et al., 2019; Zavaliev et al., 2020; Chen et al., 2021a). In our study, we observed an interaction between ATG6 and NPR1. To enhance the detection of the NPR1 protein, we standardized the SA concentration used in our experiments (Arabidopsis was treated with 0.5 mM SA; tobacco was treated with 1 mM SA). We then analyzed the nuclear accumulation of ATG6 or NPR1 at these relatively high SA concentrations, which are consistent with concentrations used in previous studies (Spoel et al., 2009; Lee et al., 2015; Saleh et al., 2015; Skelly et al., 2019; Zavaliev et al., 2020; Chen et al., 2021a).

      -Does the silencing of ATG6 affect the cell death (or HR) triggered by AvrRPS4?

      Thank you for pointing this out. In this study, we examined changes in Pst DC3000/avrRps4-induced cell death in Col, amiRNAATG6 # 1, amiRNAATG6 # 2, npr1, NPR1-GFP, ATG6-mCherry and ATG6-mCherry × NPR1-GFP plants. The results of trypan blue staining showed that Pst DC3000/avrRps4-induced cell death in npr1, amiRNAATG6 # 1 and amiRNAATG6 # 2 was significantly higher compared to Col (Fig. S15 in the revised manuscript). Conversely, Pst DC3000/avrRps4-induced cell death in ATG6-mCherry, NPR1-GFP and ATG6-mCherry × NPR1-GFP was significantly lower compared to Col. Notably, Pst DC3000/avrRps4-induced cell death in ATG6-mCherry × NPR1-GFP was significantly lower compared to ATG6-mCherry and NPR1-GFP (Fig. S15 in the revised manuscript). These results suggest that ATG6 and NPR1 cooperatively inhibit Pst DC3000/avrRps4-induced cell death. The relevant description can be found in lines 394-404 of the revised manuscript.

      -SA and NPR1 are also required for immunity and are activated by other NLRs (such as RPS2 and RPM1). Is ATG6 also involved in immunity activated by these NLRs?

      Thank you for your valuable comments. The most notable event in the NLR-mediated ETI immune response is the induction of hypersensitive response-programmed cell death (HR-PCD) (Jones and Dangl, 2006; Yuan et al., 2021). SA plays a dual role in the ETI response. On one hand, the accumulation of SA during the R gene-mediated ETI defense response is directly linked to the onset of HR-PCD (Nawrath and Metraux, 1999). SA and NPR1 can enhance the ETI response by regulating the expression of downstream target genes (Falk et al., 1999; Feys et al., 2001; Ding et al., 2018; Liu et al., 2020). On the other hand, the activation of SA signaling can have a negative regulatory effect on HR-PCD during the ETI response. High levels of SA have been shown to significantly inhibit HR-PCD triggered by the avrRpt2 effector (Rate and Greenberg, 2001; Devadas and Raina, 2002; Jurkowski et al., 2004). Rate et al. discovered that the inhibition of HR-PCD by SA relies on NPR1 (Rate and Greenberg, 2001).

      Arabidopsis AtATG6 or its homologs in other species (such as NbBECLIN1, TaATG6s, etc.) have been identified as positive regulators in plant immunity, playing a crucial role in inhibiting cell death and preventing invasion by pathogenic microorganisms (Liu et al., 2005; Patel and Dinesh-Kumar, 2008; Yue et al., 2015). Patel et al. demonstrated that, akin to autophagy-deficient mutants previously documented, AtATG6 antisense (AtATG6-AS) plants treated with Pst DC3000/avrRpm1 exhibited diffuse cell death, indicating the necessity of ATG6 in restricting cell death (Patel and Dinesh-Kumar, 2008). In tobacco, deficiencies in BECLIN 1 result in the onset of diffuse HR-PCD, underscoring the essential role of BECLIN 1 in limiting HR-PCD (Liu et al., 2005). Despite the genetic evidence supporting the critical function of ATG6 in plant immunity, the precise molecular mechanisms through which ATG6 impedes the invasion of pathogenic microorganisms remain elusive.

      In our study, we uncovered that ATG6 interacts with NPR1 to hinder pathogen invasion and inhibit the initiation of cell death. In animals, members of the NLR family have been observed to interact with the autophagy-related protein LC3 to inhibit the survival of pathogens (Zhang et al., 2019). Similar mechanisms may exist in plants. However, whether NLRs directly activate ATG6 through interaction, and how the NPR1-ATG6 interaction relates to NLR-mediated plant immunity, remain to be investigated.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Zhang et al. explores the effect of autophagy regulator ATG6 on NPR1-mediated immunity. The authors propose that ATG6 directly interacts with NPR1 in the nucleus to increase its stability and promote NPR1-dependent immune gene expression and pathogen resistance. This novel role of ATG6 is proposed to be independent of its role in autophagy in the cytoplasm. The authors demonstrate through biochemical analysis that ATG6 interacts with NPR1 in yeast and very weakly in vitro. They further demonstrate using overexpression transgenic plants that in the presence of ATG6-mcherry the stability of NPR1-GFP and its nuclear pool is increased.

      However, the overall conclusions of the study are not well supported experimentally. The significance of the findings is low because of their mostly correlational nature, and lack of consistency with earlier reports on the same protein.

      Thank you for your valuable and constructive suggestions. In this article, we unveil a novel relationship in which ATG6 positively regulates NPR1 in plant immunity (Fig. 8 in the revised manuscript). ATG6 interacts with NPR1 to synergistically enhance plant resistance by regulating NPR1 protein levels, stability, nuclear accumulation, and formation of SINCs-like condensates. This may be of interest to researchers studying the regulation of plant immunity. While there may be minor flaws in our current study, the significance of these findings cannot be overstated, as they have the potential to redirect scientific attention towards uncovering novel functions for autophagy genes.

      Based on the integrity and quality of the data as well as the depth of analysis, it is not yet clear if ATG6 is a specific regulator of NPR1 or if it is affecting NPR1's stability indirectly, through inducing an elevation of SA levels in plants. As such, the current study demonstrates a correlation between overexpression of ATG6, SA accumulation, and NPR1 stability, however, whether and how these components work together is not yet demonstrated.

      Thank you for your valuable feedback. As the reviewer notes, our current data may have some limitations; scientific research is an ongoing process, and we are confident that future studies will address them. At a minimum, the present results report a previously undiscovered function of ATG6 in plant immunity. We propose a direct interaction between ATG6 and a well-studied salicylic acid receptor protein, NPR1. We unveil a novel relationship in which ATG6 positively regulates NPR1 in plant immunity (Fig. 8 in the revised manuscript). ATG6 interacts with NPR1 to synergistically enhance plant resistance by regulating NPR1 protein levels, stability, nuclear accumulation, and formation of SINCs-like condensates. This may be of interest to researchers studying the regulation of plant immunity.

      Based on the provided biochemical data, it is not yet clear if the ATG6 functions specifically through NPR1 or through its paralogs NPR3 and NPR4, which are negative regulators of immunity. It is quite possible that interaction with NPR1 (or any NPR) is not the major regulatory step in the activity of ATG6 in plant immunity. The effect of ATG6 on NPR1 could well be indirect, through a change in the SA level and redox environment of the cell during the immune response. Both SA level and redox state of the cell were reported to induce accumulation of NPR1 in the nucleus and increase in stability.

      Thank you for your valuable feedback. In this study, we validated the interaction between ATG6 and NPR1 through various approaches and identified the key regions mediating their interaction. Our findings indicate that ATG6 interacts with NPR1 to synergistically enhance plant resistance by regulating NPR1 protein levels, stability, nuclear accumulation, and the formation of SINC-like condensates. These results clearly demonstrate the involvement of ATG6 in the regulation of NPR1. Furthermore, we also found that ATG6 interacts with NPR3/4 (Fig. S1 in the revised manuscript). This is particularly relevant given that NPR3 and NPR4 have been shown to act as adaptors for the ubiquitin E3 ligase Cullin 3 (CUL3) to regulate the degradation of NPR1. Therefore, whether ATG6 regulates NPR1 through its interactions with NPR3/4 is an intriguing question worth exploring in future studies. We appreciate the reviewer's concerns and are committed to addressing them in our future research to further elucidate the complex regulatory mechanisms involving ATG6, NPR1, and other key players in plant immunity.

      Another major issue is the poor quality of the subcellular analyses. In contradiction to previous studies, ATG6 in this study is not localized to autophagosome puncta, which suggests that the soluble localization pattern presented here does not reflect the true localization of ATG6. Even if the authors propose a novel, non-canonical nuclear localization for ATG6, they still should have detected the canonical autophagy-like localization of this protein.

      Thank you for your valuable feedback. We conducted predictions at NLS Mapper (https://nls-mapper.iab.keio.ac.jp/cgi-bin/NLS_Mapper_form.cgi) and identified two bipartite NLSs in ATG6, with the sequences "MRKEEIPDKSRTIPIDPNLPKWVCQNCHHS" and "DPNLPKWVCQNCHHS LTIVGVDSYAGKFFNDP". To further elucidate the nuclear localization of ATG6, we introduced Agrobacterium tumefaciens carrying ATG6-GFP into nls-mCherry tobacco leaves through transient transformation. Subsequently, we observed the localization of ATG6-GFP, along with the canonical autophagy-like patterns. Our findings revealed fluorescence signals of ATG6-GFP in both the cytoplasm and nuclei (Figure 2b). The nuclear-localized ATG6-GFP overlapped with the nuclear marker nls-mCherry (indicated by white arrows). Additionally, we observed punctate patterns indicative of canonical autophagy-like localization of ATG6-GFP fluorescence signals (indicated by red circles). Based on these results, we are more confident about the authenticity of ATG6's nuclear localization. The revised manuscript includes clearer images to support our observations.

      Recommendations for the Authors:

      Reviewer #2 (Recommendations For The Authors):

      The duration and concentration of SA treatments are quite variable between experiments which makes comparisons difficult.

      Thank you for pointing this out. The NPR1 protein is known to be unstable and prone to degradation through the 26S proteasome pathway (Spoel et al., 2009; Saleh et al., 2015). Consequently, to investigate the function of NPR1, many scientists and research groups typically employ higher concentrations of SA (e.g., 0.5 mM, 1 mM, or even 5 mM) to elucidate its role (Spoel et al., 2009; Fu et al., 2012; Lee et al., 2015; Saleh et al., 2015; Skelly et al., 2019; Zavaliev et al., 2020; Chen et al., 2021a). In our study, we observed an interaction between ATG6 and NPR1. To enhance the detection of the NPR1 protein, we standardized the SA concentration used in our experiments. In this study, for the treatment of Arabidopsis, we followed the protocols outlined in Saleh et al. and Spoel et al., utilizing 0.5 mM SA (Spoel et al., 2009; Saleh et al., 2015). For tobacco treatment, we adopted the methodology described in the study by Zavaliev et al., administering 1 mM SA (Zavaliev et al., 2020).

      The methods section does not explain some of the essential experimental conditions and reagents used in the study.

      Thank you for pointing this out. Due to word limitations we have placed the detailed experimental methods and reagents in Supplemental Data 1. In Supplemental Data 1, we provide a comprehensive overview of the experimental flow and conditions employed in our study.

      Lines 62-63: the C-terminal domain of all NPRs has a name (already defined as SA-binding domain (SBD)). Also, it would be worth referring to the structure of NPR1 (Kumar et al 2022, Nat) as the source of information about its domains.

      Thank you for pointing this out, we have changed this description in the revised manuscript (lines 62-63).

      Lines 66-69: NPR1 doesn't form monomers. A recent study showed that the basic functional unit of NPR1 is a dimer (Kumar et al 2022, Nat).

      Thank you for pointing this out. In the revised manuscript (line 67), "monomers" has been changed to "dimer".

      Lines 89-95 and elsewhere: the term "invasion" has a very specific meaning and it doesn't necessarily refer to disease. A pathogen can invade the plant but cause no disease (e.g. ETI). Most plant genetic immune mechanisms act after pathogen invasion, not before it. Those cited works reported the disease resistance, not the invasion resistance.

      Thank you for pointing this out. We've changed the incorrect description in the revised manuscript (line 91).

      Lines 113-119: the truncation at the aa328 includes half of the ANK domain (repeats 1 and 2), not just BTB. The C-terminal truncation variant contains the other half (repeats 3 and 4) of the ANK domain, not the entire ANK domain. It also contains the SBD, not just the NLS. So, this kind of analysis cannot determine the role of ANK domain in the interaction, nor it can conclusively determine if the interaction is through SBD. The interaction should be tested with the SBD domain only in order to make this conclusion.

      Thank you for pointing this out, we have removed the inappropriate description and made the appropriate changes in the revised manuscript (lines 114 and 115).

      In Figure S1, the equally strong interaction of atg6 is found for NPR3/NPR4. Does that mean that atg6 functions also through these other NPRs? What's the significance of these data compared to NPR1-ATG6 interaction? This is especially important, because both NPR3 and NPR4 are predominantly nuclear proteins, and they are unlikely to significantly overlap with autophagy components in the cytoplasm.

      NPR1 and its paralogues NPR3/NPR4 frequently interact with other proteins to regulate plant immune responses (Backer et al., 2019; Chen et al., 2019). To identify ATGs that interact with NPRs, we performed yeast two-hybrid (Y2H) screens using NPRs as bait. Interestingly, ATG6 interacted with each of NPR1, NPR3 and NPR4, and different concentrations of SA treatment did not significantly affect these interactions (Fig. S1a). NPR1 is an important positive regulator of the plant immune response (Chen et al., 2021b). In Arabidopsis and N. benthamiana, ATG6 or its homologues were reported to act as positive regulators that enhance plant disease resistance to P. syringae pv. tomato (Pst) DC3000 and Pst DC3000/avrRpm1 bacteria (Patel and Dinesh-Kumar, 2008) and to tobacco mosaic virus (TMV) (Liu et al., 2005). Therefore, in this study we focused on investigating the biological significance of the interaction between ATG6 and NPR1. Whether the interaction between ATG6 and NPR3/4 also has an effect on plant immunity is a question that remains to be explored in future studies.

      In Figure 1c and elsewhere: why not use the anti-mCherry antibody to detect atg6-mcherry? Are we seeing the correct protein band of atg6-mcherry? Also, it is not clear what antibodies they used throughout the study: the sources and specificities of antibodies are not provided.

      Thank you for pointing this out. We initially synthesized the ATG6 antibody (anti-ATG6, 1:200, peptide, C-KEKKKIEEEERK, Abmart) in order to detect the endogenous ATG6 protein, and we also tested the specificity and potency of the ATG6 antibody (results are shown in Fig. S17). Additionally, in order to determine the location of the ATG6-mCherry bands, we detected ATG6-mCherry in ATG6-mCherry Arabidopsis using the ATG6 antibody, with Col as a control (results are shown in Fig. S4). These results show that our synthesized ATG6 antibody can effectively and clearly detect both ATG6 and ATG6-mCherry. Therefore, in this study, we used the ATG6 antibody to analyze both ATG6-mCherry and endogenous ATG6. Detailed antibody information is presented in Supplementary Data 1, table S4.

      In Figures 1d, 2a, and 2b, the subcellular localization pattern of atg6 contradicts what was published before (Fujiki et al 2007, Plant Phys; Liu et al 2018, FPlS; Xu et al 2017, Autophagy; Li et al 2018, Nat. Comm.). As an autophagy protein, atg6 was shown to localize to cytoplasmic puncta (autophagosomes), like atg8. No nuclear localization was found in those studies. The lack of puncta and the strong nuclear accumulation are signs that the localization of atg6 reported here has to be interpreted with caution. With the data provided, I am not convinced yet that we are looking at the correct ATG6 subcellular localization. Even if the authors propose a novel, non-canonical localization for atg6, they still should have detected the canonical autophagy-like localization of this protein.

      Thank you for your valuable feedback. To further elucidate the nuclear localization of ATG6, we introduced Agrobacterium tumefaciens carrying ATG6-GFP into nls-mCherry tobacco leaves through transient transformation. Subsequently, we observed the localization of ATG6-GFP, along with the canonical autophagy-like patterns. Our findings revealed fluorescence signals of ATG6-GFP in both the cytoplasm and nuclei (Figure 2b). The nuclear-localized ATG6-GFP overlapped with the nuclear marker nls-mCherry (indicated by white arrows). Additionally, we observed punctate patterns indicative of canonical autophagy-like localization of ATG6-GFP fluorescence signals (indicated by red circles). Based on these results, we are more confident about the authenticity of ATG6's nuclear localization. The revised manuscript includes clearer images to support our observations.

      It would make more sense to include the BiFC data (fig. S2) in the main figure, instead of the co-localization (fig. 1d) which cannot serve as evidence for interaction.

      Thank you for the feedback. We accept your suggestion. In Fig.1, we have replaced the co-localization image with a BiFC (Bimolecular Fluorescence Complementation) image to better illustrate the interaction.

      In Figure S2, the bifc signals have to be quantified to qualify as evidence for interaction. also, a subcellular marker has to be used (e.g. nuclear mcherry). From the current poor-quality images, one cannot determine where in the cell the presumed interaction takes place, nucleus or cytoplasm, or both. Also, no puncta are seen in these images.

      Thank you for pointing this out. Despite the lack of clarity in the images we provided, our BiFC results unequivocally demonstrate the interaction between ATG6 and NPR1 in both the cytoplasm and nucleus. Notably, as the reviewer pointed out, punctate signals were not observed in our images. This lack of punctate signals is consistent with previous studies (Figure 2) that have also shown BiFC results between autophagy-associated proteins ATG8s and their interacting partners. For instance, Fig 1G (Marshall et al. 2019, Cell), Fig 2F (Marshall et al. 2019, Cell), Fig 4B (Macharia et al. 2019, BMC Plant Biology), and Fig 3 (Zhou et al. 2018, Autophagy) all did not exhibit punctate signals, aligning closely with our findings.

      In Figure S3a, the nuclear localization is shown for stomata. It is known that stomata are especially strong expressors of the transgenes, and localization there could be an artefact of overaccumulation of the fusion protein. Also, why do they present the localization of atg6-gfp, if the analysis and the cross were made with atg6-mcherry?

      Thank you for pointing this out. In our previous experiments, we observed the localization of ATG6 in the nucleus of Arabidopsis thaliana plants overexpressing ATG6-GFP (Fig. S3a). To clearly visualize the location of the nucleus, we used the DNA-binding dye DAPI, which readily stained the nuclei of the stomatal guard cells. This allowed us to easily identify the nuclear regions for our observations. Additionally, in Fig. 2a and Fig. S3b, we detected the fluorescence signal of ATG6-mCherry within the nucleus, further confirming the nuclear localization of ATG6. Moreover, we separated the nuclear and cytoplasmic fractions. Under SA treatment, ATG6-mCherry and ATG6-GFP were detected in both the cytoplasmic and nuclear fractions in N. benthamiana (Fig. 2c and d). Similarly, ATG6 was also detected in the nuclear fraction of UBQ10::ATG6-GFP and UBQ10::ATG6-mCherry overexpressing plants (Fig. 2e and f).

      In Figure S3b, the images are low resolution and of poor quality. Why atg6-mcherry is expressed in a single cell if these are transgenic plants? The nuclear co-localization with npr1-gfp has to be shown more clearly with high res. images and also be quantified, because the expression of atg6-mcherry is not as uniform as npr1-gfp.

      Thank you for pointing this out. Contrary to the reviewer's assertion, the ATG6-mCherry fluorescence signal depicted in Figure S3b was not exclusive to a single cell. In fact, this fluorescence was also evident in other cells, albeit with relatively weaker intensity. This disparity in fluorescence intensity may be attributed to the irregularities in leaf structure at the time of image capture using the microscope. To bolster our conclusion, we further examined the fluorescence signals in the cells of the root elongation zone in ATG6-mCherry x NPR1-GFP, as depicted in the figure below. Our observations revealed that the fluorescence signals of ATG6-mCherry exhibited uniform distribution, with detection in both the cytoplasm and nucleus. We have replaced the original unclear image with a high-quality image.

      Lines 138-143: In fig. S3d, it would make more sense to show the WB on the hybrid npr1-gfp/atg6-mcherry plants with both anti-gfp and anti-mcherry antibodies to detect the free mcherry/gfp. Since the analysis of the level of free FP is done, then why didn't they test the free mcherry levels in Figure S4a? This would be more important than testing the free GFP in ATG6-GFP plants, because the imaging of atg6-mcherry was done in the hybrid plants (fig. S3b).

      Thank you for pointing this out. We initially synthesized the ATG6 antibody (anti-ATG6, 1:200, peptide, C-KEKKKIEEEERK, Abmart) in order to detect the endogenous ATG6 protein, and we also tested the specificity and potency of the ATG6 antibody (results are shown in Fig. S17). Additionally, in order to determine the location of the ATG6-mCherry bands, we detected ATG6-mCherry in ATG6-mCherry Arabidopsis using the ATG6 antibody, with Col as a control (results are shown in Fig. S4). These results show that our synthesized ATG6 antibody can effectively and clearly detect both ATG6 and ATG6-mCherry. Therefore, in this study, we used the ATG6 antibody to analyze both ATG6-mCherry and endogenous ATG6. Detailed antibody information is presented in Supplementary Data 1, table S4. In previous experiments, we procured an mCherry antibody (mCherry-Tag Monoclonal Antibody (6B3), BD-PM2113, China) to immunolabel ATG6-mCherry. However, we encountered problems with the potency of this mCherry antibody. Given our budget constraints and the availability of our self-synthesized ATG6 antibody, we chose not to purchase another antibody from a different company for the Western blot experiments.

      In Figure 2c, there's no atg6-mcherry detected at time 0, in either cytoplasm or nucleus, yet the microscope images in panel a show strong accumulation in both compartments.

      Thank you for pointing this out. Previous studies have shown that ATG6 can also be degraded via the 26S proteasome pathway (Qi et al., 2017). We speculate that this phenomenon might be attributed to the rapid turnover of ATG6 at time 0.

      Lines 156-160: this statement is unsupported by the data. In fig. S5, the bands for native atg6 in the nuclear fraction are extremely weak, and they do not show the reverse pattern of change along the time points compared to the cytoplasmic fraction, which would indicate that the nuclear fraction is complementary to the cytoplasmic pool of the protein. The result more likely suggests that the majority of the ATG6 is in the cytoplasm, and that the weak bands detected in the nucleus are either background signal, or a contamination from the cytoplasmic pool. At this low protein level or poor immuno-detection the background signal is inevitable due to overexposure. Even though the actin marker is not detected in the nuclear fraction, it doesn't necessarily mean that there's no contamination from the cytoplasm in the nuclear fraction. The actin is just too abundant and can be detected at lower exposure.

      Thank you for pointing this out. In Fig. S5, we detected the subcellular localization of endogenous ATG6, although the image quality was somewhat low. Nevertheless, the cytosolic and nuclear localization of ATG6 could be clearly observed. In addition, we verified the cytosolic and nuclear localization of ATG6 in Arabidopsis using confocal fluorescence microscopy and nuclear-cytoplasmic fractionation experiments, with actin and H3 used as the cytoplasmic and nuclear internal references, respectively (Fig. 2e and f). Furthermore, we observed the cytosolic and nuclear localization of ATG6 when we expressed ATG6-GFP or ATG6-mCherry in tobacco leaves through transient transformation experiments (Fig. 2a-d). These results are consistent with the prediction of the subcellular location of ATG6 in the Arabidopsis subcellular database (https://suba.live/) (Fig. S3c). The reviewer's feedback has been valuable in helping us present these findings more clearly. We acknowledge the limitations in the image quality for the endogenous ATG6 localization, but we believe the combination of multiple experimental approaches, including the use of fluorescent protein fusions, provides robust evidence for the cytosolic and nuclear localization of ATG6 in plant cells. Moving forward, we will continue to investigate the significance of ATG6's subcellular distribution and its potential dual roles in both the nucleus and the cytosol, particularly in the context of its interaction with the key immune regulator NPR1. We appreciate the reviewer's constructive comments, as they will help us strengthen the presentation and interpretation of our findings.

      In Figure 3a the images are of too low resolution to see the co-localization. The focal planes of the top and bottom panels are quite different: the top is focused on stomata, the bottom - on pavement cells. So, the number of the NPR1-GFP nuclei between these two focal planes is dramatically different. Also, it looks like the atg6-mcherry in these plants are predominantly in the cytoplasm, not the nucleus as the authors claim. A higher resolution and higher quality of images are required to determine this.

      Thank you for pointing this out. To ensure the clarity and accuracy of our confocal images, we have supplied a clearer image as supplementary evidence. The bright-field images distinctly show that both sets of images are in the same plane of focus. Furthermore, in the figure (third image in the fourth column), the nuclear localization of ATG6-mCherry is clearly visible, and ATG6-mCherry is co-localized with NPR1-GFP in the nucleus, as indicated by the white arrow.

      In Figure 3b, it is not indicated what exactly was measured and in what condition, mock or SA. If these are numbers of nuclei, then it should be indicated what size of the area was sampled, not just "section", and both mock and SA should be included in the measurements. Also, how many independent images have been sampled? what does the error bar represent? What does "normal" mean? Shouldn't this be a mock treatment?

      Thank you for pointing this out. The term "Normal" in this context refers to mock treatment, and we have revised the description for clarity. In Figure 3b, the graph shows the number of nuclei with NPR1-GFP signal in ATG6-mCherry × NPR1-GFP and NPR1-GFP Arabidopsis plants following SA treatment. Statistical data were obtained from three independent experiments, each comprising five individual images, resulting in a total of 15 images analyzed for this comparison. Detailed descriptions were also added to the revised manuscript (Lines 568-570, 800-804).

      Lines 167-168: the proposed increase of NPR1-GFP in the nucleus could be simply due to a higher accumulation of SA in the hybrid plants, not because of the direct interaction of atg6.

      Thank you for pointing this out. Our results confirmed that ATG6 overexpression significantly increased nuclear accumulation of NPR1 (Fig. 3). Notably, the ratio (nucleus NPR1/total NPR1) in ATG6-mCherry × NPR1-GFP was not significantly different from that in NPR1-GFP, and a similar phenomenon was observed in N. benthamiana (Fig. 3c-f). These results suggest that the increased nuclear accumulation of NPR1 by ATG6 might result from higher levels of more stable NPR1, rather than from enhanced nuclear translocation of NPR1 facilitated by ATG6. Furthermore, we found that under SA treatment, the protein levels of NPR1 were significantly higher in the ATG6-mCherry × NPR1-GFP line compared to the NPR1-GFP line (Fig. 5a). Notably, even in the absence of differences in SA levels between the two lines, we observed that ATG6 could delay the degradation of NPR1 under normal conditions (Fig. 6). These findings suggest that ATG6 employs both SA-dependent and SA-independent mechanisms to maintain the stability of the key immune regulator NPR1. In summary, we suggest that the increased nuclear accumulation of NPR1 is a dual effect of SA and ATG6.

      Lines 202-204: "Increased nuclear accumulation" implies increased translocation. However, they found that the ratio of NPR1-GFP does not change (Figure 3), so the reason for higher nuclear accumulation is not translocation, but abundance.

      Thank you for pointing this out. Our results confirmed that ATG6 overexpression significantly increased nuclear accumulation of NPR1 (Fig. 3). ATG6 also increases NPR1 protein levels and improves NPR1 stability (Fig. 5 and 6). Therefore, we consider that the increased nuclear accumulation of NPR1 in ATG6-mCherry × NPR1-GFP plants might result from higher levels of more stable NPR1 rather than from enhanced nuclear translocation of NPR1 facilitated by ATG6. To verify this possibility, we determined the ratio of nuclear NPR1-GFP to total NPR1-GFP. Notably, the ratio (nucleus NPR1/total NPR1) in ATG6-mCherry × NPR1-GFP was not significantly different from that in NPR1-GFP, and a similar phenomenon was observed in N. benthamiana (Fig. 3c-f). We then analyzed whether ATG6 affects NPR1 protein levels and protein stability. Our results show that ATG6 increases NPR1 protein levels under SA treatment and maintains the protein stability of NPR1 (Fig. 5 and 6). Together, these results suggest that the increased nuclear accumulation of NPR1 by ATG6 results from higher levels of more stable NPR1. The corresponding description is shown in the revised manuscript (lines 338-352).
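
      The distinction drawn above between abundance and translocation can be illustrated with a minimal sketch. The function name and all signal values below are hypothetical and are not data from the study; the sketch only assumes that nuclear and cytoplasmic NPR1 signals are quantified in the same arbitrary units. If both pools scale up together, the nuclear amount rises while the nucleus/total ratio stays constant.

```python
def nuclear_fraction(nuclear_signal: float, cytoplasmic_signal: float) -> float:
    """Return the nuclear signal as a fraction of total (nuclear + cytoplasmic)."""
    total = nuclear_signal + cytoplasmic_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return nuclear_signal / total


# Hypothetical band intensities in arbitrary units (illustration only,
# not measurements from the manuscript).
npr1_only_nuclear, npr1_only_cyto = 40.0, 60.0
with_atg6_nuclear, with_atg6_cyto = 80.0, 120.0

ratio_npr1_only = nuclear_fraction(npr1_only_nuclear, npr1_only_cyto)
ratio_with_atg6 = nuclear_fraction(with_atg6_nuclear, with_atg6_cyto)

# Nuclear abundance doubles, yet the nucleus/total ratio is unchanged,
# which is the pattern expected for higher abundance without extra translocation.
nuclear_fold_change = with_atg6_nuclear / npr1_only_nuclear
print(ratio_npr1_only, ratio_with_atg6, nuclear_fold_change)
```

      Under enhanced translocation, by contrast, the ratio itself would be expected to increase, which is why an unchanged ratio points toward abundance and stability.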

      Lines 204-205: the co-localization in Figure 1d cannot be interpreted as interaction.

      Thank you for the feedback. We have replaced the co-localization image with a BiFC (Bimolecular Fluorescence Complementation) image to better illustrate the interaction in Fig 1d.

      What age of plants were used for the analysis in Figures 4 and S7? The age of the plant might significantly affect the free SA levels under control conditions.

      Thank you for the feedback. In Figures 4 and S7, 3-week-old plants were used to determine salicylic acid (SA) levels and the expression of target genes. The legends of Figures 4 and S7 provide detailed descriptions (lines 818-819).

      In Figure 5a they treat with SA, but the analysis in Figure S10 is done with the pathogen, so how can these data be correlated?

      Thank you for pointing this out. Previous studies have demonstrated that pathogen infection rapidly increases the salicylic acid (SA) content in plants, and the elevated SA then activates plant immune responses. Therefore, both pathogen treatment and direct SA treatment can activate SA-dependent plant immune responses. The NPR1 protein is known for its instability. In Figure 5a, we used a 0.5 mM SA treatment to assess the changes in NPR1 protein levels, as the impact of SA treatment is more immediate and pronounced.

      Lines 241-242: In Figure 5b, it is not clear why there's no detection of NPR1-GFP and atg6-mcherry at time 0?? The levels of proteins in the transient assay are sufficiently high for detection by WB.

      Thank you for pointing this out. The NPR1 protein is known to be unstable and prone to degradation through the 26S proteasome pathway (Spoel et al., 2009; Saleh et al., 2015). In addition, previous studies have shown that ATG6 can also be degraded via the 26S proteasome pathway (Qi et al., 2017). We speculate that this phenomenon might be attributed to the rapid turnover of NPR1 and ATG6 at time 0.

      In Figures 5c-d, the quality of these images is very poor, and they do not clearly show the signs. What structure was exactly measured in these images? There are so many fluorescent bodies there, that it is not clear what are we looking at. Also, it is not clear why they did not show the mcherry channel? It would be important to see if the bodies in SA-treated plants show co-localization with atg6-mcherry autophagosomes (if these exist at all).

      Thank you for pointing this out. Interestingly, similar to previous reports (Zavaliev et al., 2020), SA promoted the translocation of NPR1 into the nucleus, but a significant amount of NPR1 was still present in the cytoplasm (Fig. 3c and e). Previous studies have shown that SA increases NPR1 protein levels and facilitates the formation of SINCs in the cytoplasm, which are known to promote cell survival (Zavaliev et al., 2020). We therefore examined the fluorescence signal of SINCs-like condensates in the cytoplasm of tobacco leaves. After 1 mM SA treatment, more SINCs-like condensate fluorescence was observed in N. benthamiana co-transformed with ATG6-mCherry + NPR1-GFP than in plants co-transformed with mCherry + NPR1-GFP (Fig. 5c-d and Supplemental movies 1-2). Supplemental movies 1-2 show this more clearly. Additionally, we observed that the SINCs-like condensate signal partially co-localized with some ATG6-mCherry autophagosome fluorescence signals.

      Lines 245-247: so, is it atg6 or SA that increases the NPR1 levels? If this is due to SA, then the whole study doesn't have novelty, because we already know from previous works that SA increases the stability of npr1.

      Thank you for pointing this out. Indeed, previous studies have shown that salicylic acid (SA) increases NPR1 levels and protein stability (Spoel et al., 2009; Saleh et al., 2015). In our experiments, we found that under SA treatment, the protein levels of NPR1 were significantly higher in the ATG6-mCherry × NPR1-GFP line than in the NPR1-GFP line (Fig. 5a). Additionally, free SA levels were significantly elevated in the ATG6-mCherry × NPR1-GFP line under pathogen challenge (Pst DC3000/avrRps4), but not under normal conditions (Fig. 4a). Furthermore, even in the absence of differences in SA levels between the two lines, we observed that ATG6 could delay the degradation of NPR1 under normal conditions (Fig. 6). This is one of our new findings. Together, these results suggest that ATG6 employs both SA-dependent and SA-independent mechanisms to maintain the stability of the key immune regulator NPR1.

      Lines 313-316: npr1 and atg6 can function independently from each other, so the term "jointly" is misleading. Based on the overall data provided in this manuscript it cannot be concluded that the two proteins work in one complex to control plant immunity.

      Thank you for pointing this out. In the revised manuscript, "jointly" has been changed to "cooperatively".

      Lines 369-374: this speculation is beyond the main hypothesis claiming that atg6 functions through npr1. If atg6 can activate the transcription alone, then what is the significance of its activation of npr1? How can one distinguish between the two?

      Thank you for pointing this out. Transcription activation by transcription factors typically requires at least two conserved structural domains: a transcription activation domain and a DNA-binding domain. ATG6 does not possess these two domains typical of canonical transcription factors, so it is unlikely that ATG6 could directly activate transcription on its own. Previous studies have shown that acidic activation domains (AADs) in transcriptional activators (such as Gal4, Gcn4 and VP16) play important roles in activating downstream target genes. Acidic amino acids and hydrophobic residues are the key structural elements of AADs (Pennica et al., 1984; Cress and Triezenberg, 1991; Van Hoy et al., 1993). Chen et al. found that EDS1 contains two AADs and confirmed that EDS1 is a transcriptional activator with AADs (Chen et al., 2021a). Here, we obtained similar results: ATG6 overexpression significantly enhanced the expression of PR1 and PR5 (Fig. 4b-c and S9), and an AAD containing acidic and hydrophobic amino acids is also found in ATG6 (148-295 AA) (Fig. S14). We speculate that ATG6 might act as a transcriptional coactivator to activate PRs expression synergistically with NPR1.

      Lines 389-400: the cell death due to AvrRPS4 in Col-0 ecotype is extremely weak as there's no complete receptor complex for this effector. So, one has to use a very high dose to induce cell death in Col-0, certainly higher than the one used for bacterial growth. The authors used the same dose in both assays, so it is likely that what we see as "cell death" is not an effector-triggered response, but rather symptom-associated for the virulent pathogen.

      Thank you for pointing this out. Indeed, as the reviewer pointed out, most cell death assays use higher concentrations of Pst DC3000/avrRps4 or Pst DC3000/avrRpt2, but they typically treat Arabidopsis for a relatively short period, usually less than 1 day (Hofius et al., 2009; Zavaliev et al., 2020). In this study, although we injected a relatively low concentration of Pst DC3000/avrRps4 (0.001), we assessed cell death after a relatively long period of Pst DC3000/avrRps4 infection (3 days). Pst DC3000/avrRps4 multiplies significantly in infected plants, and we therefore assumed that the pathogen population after 3 days of incubation would be sufficient to induce intense cell death. Consequently, we chose this concentration of Pst DC3000/avrRps4 for the experiment.

      Lines 407-416: why do you expect "delay of degradation" with autophagy inhibitor? Shouldn't it be the opposite? In Figure S14, if we compare the bands between 120min and 120min+ConA+WM, the effect of autophagy inhibitors is actually quite strong (0.47 vs 0.22), with about 50% more degradation of NPR1 in their presence. So, the conclusion that the degradation of NPR1 is autophagy-independent is wrong according to this result.

      Thank you for pointing this out. We have revised the inaccurate description, as outlined in the revised manuscript (lines 413-425).

      References

      Backer R, Naidoo S, van den Berg N. 2019. The NONEXPRESSOR OF PATHOGENESIS-RELATED GENES 1 (NPR1) and Related Family: Mechanistic Insights in Plant Disease Resistance. Front Plant Sci 10, 102.

      Castello MJ, Medina-Puche L, Lamilla J, et al. 2018. NPR1 paralogs of Arabidopsis and their role in salicylic acid perception. PLoS One 13, e0209835.

      Chen H, Li M, Qi G, et al. 2021a. Two interacting transcriptional coactivators cooperatively control plant immune responses. Sci Adv 7, eabl7173.

      Chen J, Mohan R, Zhang Y, et al. 2019. NPR1 Promotes Its Own and Target Gene Expression in Plant Defense by Recruiting CDK8. Plant Physiol 181, 289-304.

      Chen J, Zhang J, Kong M, et al. 2021b. More stories to tell: NONEXPRESSOR OF PATHOGENESIS-RELATED GENES1, a salicylic acid receptor. Plant Cell Environ.

      Cress WD, Triezenberg SJ. 1991. Critical structural elements of the VP16 transcriptional activation domain. Science 251, 87-90.

      Devadas SK, Raina R. 2002. Preexisting systemic acquired resistance suppresses hypersensitive response-associated cell death in Arabidopsis hrl1 mutant. Plant Physiol 128, 1234-1244.

      Ding Y, Sun T, Ao K, et al. 2018. Opposite Roles of Salicylic Acid Receptors NPR1 and NPR3/NPR4 in Transcriptional Regulation of Plant Immunity. Cell 173, 1454-1467 e1415.

      Falk A, Feys BJ, Frost LN, et al. 1999. EDS1, an essential component of R gene-mediated disease resistance in Arabidopsis has homology to eukaryotic lipases. Proc Natl Acad Sci U S A 96, 3292-3297.

      Feys BJ, Moisan LJ, Newman MA, et al. 2001. Direct interaction between the Arabidopsis disease resistance signaling proteins, EDS1 and PAD4. EMBO J 20, 5400-5411.

      Fu ZQ, Yan S, Saleh A, et al. 2012. NPR3 and NPR4 are receptors for the immune signal salicylic acid in plants. Nature 486, 228-232.

      Hofius D, Schultz-Larsen T, Joensen J, et al. 2009. Autophagic components contribute to hypersensitive cell death in Arabidopsis. Cell 137, 773-783.

      Jones JD, Dangl JL. 2006. The plant immune system. Nature 444, 323-329.

      Jurkowski GI, Smith RK, Jr., Yu IC, et al. 2004. Arabidopsis DND2, a second cyclic nucleotide-gated ion channel gene for which mutation causes the "defense, no death" phenotype. Mol Plant Microbe Interact 17, 511-520.

      Lee HJ, Park YJ, Seo PJ, et al. 2015. Systemic Immunity Requires SnRK2.8-Mediated Nuclear Import of NPR1 in Arabidopsis. Plant Cell 27, 3425-3438.

      Liu Y, Schiff M, Czymmek K, et al. 2005. Autophagy regulates programmed cell death during the plant innate immune response. Cell 121, 567-577.

      Liu Y, Sun T, Sun Y, et al. 2020. Diverse Roles of the Salicylic Acid Receptors NPR1 and NPR3/NPR4 in Plant Immunity. Plant Cell 32, 4002-4016.

      McKim SM, Stenvik GE, Butenko MA, et al. 2008. The BLADE-ON-PETIOLE genes are essential for abscission zone formation in Arabidopsis. Development 135, 1537-1546.

      Nawrath C, Metraux JP. 1999. Salicylic acid induction-deficient mutants of Arabidopsis express PR-2 and PR-5 and accumulate high levels of camalexin after pathogen inoculation. Plant Cell 11, 1393-1404.

      Patel S, Dinesh-Kumar SP. 2008. Arabidopsis ATG6 is required to limit the pathogen-associated cell death response. Autophagy 4, 20-27.

      Pennica D, Goeddel DV, Hayflick JS, et al. 1984. The amino acid sequence of murine p53 determined from a c-DNA clone. Virology 134, 477-482.

      Qi H, Xia FN, Xie LJ, et al. 2017. TRAF Family Proteins Regulate Autophagy Dynamics by Modulating AUTOPHAGY PROTEIN6 Stability in Arabidopsis. Plant Cell 29, 890-911.

      Rate DN, Greenberg JT. 2001. The Arabidopsis aberrant growth and death2 mutant shows resistance to Pseudomonas syringae and reveals a role for NPR1 in suppressing hypersensitive cell death. Plant J 27, 203-211.

      Saleh A, Withers J, Mohan R, et al. 2015. Posttranslational Modifications of the Master Transcriptional Regulator NPR1 Enable Dynamic but Tight Control of Plant Immune Responses. Cell Host Microbe 18, 169-182.

      Skelly MJ, Furniss JJ, Grey H, et al. 2019. Dynamic ubiquitination determines transcriptional activity of the plant immune coactivator NPR1. Elife 8.

      Spoel SH, Mou Z, Tada Y, et al. 2009. Proteasome-mediated turnover of the transcription coactivator NPR1 plays dual roles in regulating plant immunity. Cell 137, 860-872.

      Van Hoy M, Leuther KK, Kodadek T, et al. 1993. The acidic activation domains of the GCN4 and GAL4 proteins are not alpha helical but form beta sheets. Cell 72, 587-594.

      Yuan M, Ngou BPM, Ding P, et al. 2021. PTI-ETI crosstalk: an integrative view of plant immunity. Curr Opin Plant Biol 62, 102030.

      Yue J, Sun H, Zhang W, et al. 2015. Wheat homologs of yeast ATG6 function in autophagy and are implicated in powdery mildew immunity. BMC Plant Biol 15, 95.

      Zavaliev R, Mohan R, Chen T, et al. 2020. Formation of NPR1 Condensates Promotes Cell Survival during the Plant Immune Response. Cell 182, 1093-1108 e1018.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors): 

      The authors should possibly discuss more the other cases when LTPs of the same type of ORP9 and ORP10 have been found to dimerise. They should definitely cite and discuss the evidence reported in February this year in CMLS (see https://link.springer.com/article/10.1007/s00018-023-04728-5). In this paper, authors reported very similar findings as those the authors have in Figures 3, 4, S6, S7, and S8. Specifically, in this CMLS paper the authors find that ORP9 and ORP10 (not ORP11) interact through a central helical region and that ORP9 localises ORP10 to the ER-Golgi MCSs by providing ORP10 with a binding site for VAPs, where the heterodimer mediates the exchange of PtdIns(4)P for PtdSer. 

      We thank the reviewer for their recommendations. We had overlooked the paper mentioned; it is now cited in the revised manuscript. Various other papers reporting LTP dimerization are already cited in our manuscript: ORP9-ORP10 dimerization (Kawasaki et al. 2022), ORP9-ORP11 dimerization (Zhou et al. 2010), and ORP9-ORP10/11 dimerization (Tan and Finkel 2022). The revised manuscript now also discusses the dimerization of CERT and OSBP, citing Gehin et al. 2023, Ridgway et al. 1992 and de la Mora et al. 2021.

      Reviewer #2 (Recommendations For The Authors): 

      Model and Discussion: 

      Give an idea about the aspect of SMS1 function that is being affected. Even if no further experiments were carried out, the authors could discuss possibilities. One might speculate what the PS is being used for. For example, is it a co-factor for integral membrane proteins, such as flippases? Is it a co-factor for peripheral membrane proteins, such as yet more LTPs? The model could include the work of Peretti et al (2008), which linked Nir2 activity exchanging PI:PA (Yadav et al, 2015) to the eventual function of CERT. Could the PS have a role in removing/reducing DAG produced by CERT? 

      We thank the reviewer for their recommendations. The same recommendations were also raised in the public review, where we believe we answered them sufficiently.

      Other, Minor: 

      Make clear that there is no sterol readout (Fig 1C) 

      We would like to point out that Figure 1C has a sterol readout as CE refers to cholesterol esters.

      "PH domains of ORP9 and ORP11 localized only partially to the Golgi, unlike the PH domains of OSBP and CERT" (line 154). Say here where the non-Golgi ORP9 and ORP11 PH domain pool is - presumably in the cytoplasm.

      We thank the reviewer for their suggestion and have rephrased the sentence accordingly.

      Fig 7H-J: histograms not lines as these are separate unlinked categories

      We thank the reviewer for their suggestion. However, we think the original figure represents our findings in the best possible way. Our analysis of individual lipid species is also included in Supplementary Figure 10.

      Reviewer #3 (Recommendations For The Authors): 

      (1) At the end of the intro, in summarizing their findings, the authors state (p3. lines 48-49) "These findings highlight how phospholipid and sphingolipid gradients along the secretory pathway are linked at ER-Golgi membrane contact sites." This should instead read "These findings highlight THAT phospholipid and sphingolipid gradients along the secretory pathway are linked at ER-Golgi membrane contact sites." 

      We thank the reviewer for their suggestion and have changed the sentence accordingly.

      (2) As noted in the public section, to show that ORP9/11 do indeed exchange lipids, an in vitro experiment demonstrating that ORP11 can transfer PI4P is essential. Ideally, it would be best to examine PS AND PI4P transfer by ORP9 AND 11 separately AND then by the ORP9/11 heterodimer. This could lend insights as to the function of the heterodimer. The He et al et Yu paper should provide guidelines for this. Why have the heterodimers? 

      We believe we addressed this point by showing the lipid transfer ability of the ORP9-ORP11 dimer. These findings are now part of the revised manuscript.

      (3) It would be interesting to discuss the roles of ORP9/ORP11 versus ORP9/ORP10... they seem so analogous, although this is at the discretion of the authors. 

      We thank the reviewer for their suggestion. Since the difference between ORP9-ORP10 and ORP9-ORP11 dimers was also raised by other reviewers, we decided to include this discussion in the manuscript. A section based on our answer to Reviewer #2 in the Public Review is now part of the Discussion.

      (4) The authors used a melanoma cell line in their screens (p3, line 59). Could they explain why they used this cell line versus others? 

      We chose MelJuSo cells for several reasons. Chiefly, MelJuSo cells are diploid, which makes generating knockouts in a screening setup easier than in polyploid cancer cell lines (e.g. HeLa). Furthermore, our CRISPR/Cas9 screening protocols are optimized for this cell line.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This work provides significant insight into freshwater cable bacteria (CB) and is an important contribution to the emerging CB literature. In this manuscript, Yang et al. describe current-voltage measurements on CB collected from two freshwater sources in Southern California. The studies use electrostatic and conductive atomic force microscopies, as well as four-probe measurements. These measurements are consistent with back-of-the-envelope calculations on conductivities needed to sustain CB function. The data shows that freshwater CB have a similar structure and function to the more studied marine cable bacteria.

      Strengths:

      Excellent measurements on a new class of cable bacteria.

      Weaknesses:

      The paper would benefit from additional analysis of the data.

      Reviewer #1 (Recommendations for The Authors):

      This work provides significant insight into freshwater cable bacteria (CB) and is an important contribution to the emerging CB literature. In this manuscript, Yang et al. describe current-voltage measurements on CB collected from two freshwater sources in Southern California. The studies use electrostatic and conductive atomic force microscopies, as well as four-probe measurements. These measurements are consistent with back-of-the-envelope calculations on conductivities needed to sustain CB function. The data shows that freshwater CB have a similar structure and function to the more studied marine cable bacteria. Minor comments follow.

      We are grateful to the reviewer for the encouraging feedback and for appreciating the central message of the preprint. Below we address the reviewer’s constructive comments.

      Additional information could be provided regarding the degraded cells where an 'empty cage' remains, as well as the polyphosphate granules, which were previously observed in marine CB (refs. 11 and 18). 

      We have edited the manuscript to note that the appearance of empty cages and polyphosphate granules in freshwater cable bacteria is indeed consistent with these features as previously reported in marine CB. The size of polyphosphate granules in freshwater CB is comparable to or slightly smaller than that in marine CB (Sulu-Gambari et al., 2015). In the case of empty cages, these cells were previously described as ‘ghost filaments’ which had lost all cell membrane and cytoplasmic material (Cornelissen et al., 2018).

      Manuscript edits: a sentence regarding polyphosphate granules has been added to the manuscript (lines 307-308). “The size of polyphosphate granules in freshwater CB (70 nm – 400 nm) is comparable or slightly smaller than in marine CB (35)”.

      A sentence regarding the empty cages has been added into the manuscript (lines 303-305). “These empty cages were previously described as ‘ghost filaments’ which had lost all cell membrane and cytoplasm material (20).”

      The authors also state that the 'phase difference between the elevated ridges and interridge regions is proportional to the tip voltage squared,' and refer to Fig. 4D. This figure has only three data points with large error bars. The authors may wish to explain this finding and justify their analysis in greater detail.

      We thank the reviewer for pointing out that we presented this result but did not adequately describe its origin or significance. In general, the probe phase response in electrostatic force microscopy (EFM) can originate not only from the electrostatic interaction with the sample (i.e. the electrical properties of interest) but also from shorter-range van der Waals (vdW) forces (which are more reflective of probe-sample distance, i.e. topography). To ensure that EFM is reporting electrical interactions, we performed these measurements using a two-pass technique, with the second pass retracing the topography measured during the first pass, but at a fixed height above the surface where the interactions are long range (electrostatic) rather than short range (vdW) or resulting from topography cross-talk. The purpose of the voltage change measurement (Fig. 4D) is simply to assess whether this procedure is successful, since electrostatic forces are proportional to the square of the voltage at a fixed height ($F = \frac{1}{2}\,\frac{\partial C}{\partial z}\,V^{2}$). While the error bar of that measurement is large, due to the intrinsic noise in the dynamic (high-frequency) EFM phase response measurement, we note that the purpose of this measurement is simply to verify that the signal arises from the electrical interaction with the sample, before proceeding to actual conductance measurements (Figs. 5-8).
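      As a rough illustration of the check described above, the sketch below fits a one-parameter model, phase = a·V², to a small set of points. It assumes, as in the usual small-force-gradient approximation, that the EFM phase shift is proportional to the gradient of the electrostatic force and therefore inherits its V² dependence. The voltages, the coefficient and the function name are hypothetical placeholders, not values or code from the manuscript; with real data the residuals of this fit would indicate how well the electrostatic model holds.

```python
def fit_quadratic_through_origin(voltages, phases):
    """Least-squares coefficient a for the one-parameter model phase = a * V**2."""
    x = [v * v for v in voltages]
    numerator = sum(xi * yi for xi, yi in zip(x, phases))
    denominator = sum(xi * xi for xi in x)
    return numerator / denominator


# Hypothetical, noise-free illustration values (NOT data from the manuscript).
A_TRUE = -0.5                      # degrees per volt squared, arbitrary choice
tip_voltages = [1.0, 2.0, 3.0]     # three tip voltages, in volts
phase_shifts = [A_TRUE * v ** 2 for v in tip_voltages]

a_fit = fit_quadratic_through_origin(tip_voltages, phase_shifts)

# Residuals near zero mean the points are consistent with a pure V**2 law.
residuals = [p - a_fit * v ** 2 for v, p in zip(tip_voltages, phase_shifts)]
print(a_fit, residuals)
```

      A through-origin fit is used here because the electrostatic model predicts no phase contrast at zero tip voltage; a fit with a free offset would be a reasonable alternative if a contact-potential difference were expected.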

      Manuscript edits: we previously simply cited a reference where the reader can delve deeper into the origin of the square voltage signal. To put this into better context, we now include additional information (lines 461-475), noting the origin and purpose of the result as described above.

      It is interesting that the freshwater CB appear to be more resilient to air compared to marine CB (or at least some freshwater filaments, as the authors note that the level of resilience is filament-dependent). The authors indicate that salt affects oxygen solubility and there is a larger oxygen content in freshwater. Do the authors have thoughts on whether or not the differences between marine and freshwater CB could fit, or not fit, with the hypothesis that conductivity in air is lowered due to oxidation of the Ni/S species (ref. 25 in manuscript)? Could the freshwater CB have greater protection against oxidation?

      We thank the reviewer for highlighting this point. Indeed, our manuscript mentions the current hypothesis that the conductivity of cable bacteria may be diminished upon oxidation of the Ni/S groups (lines 101-105 and 498-504). It remains unclear how this idea may lead to variability between marine and freshwater cables. Interestingly, however, a recent comparative bioRxiv preprint (Digel et al., 2023) noted significant differences in the morphology, number, and cross-sectional area of nanofibers between a freshwater and a marine CB strain. These differences may lead to a different resilience against oxidative degradation upon exposure to air. Specifically, even though the marine CB strain was characterized by a larger cross-sectional area per nanofiber, it had significantly fewer nanofibers, leading to a 40% smaller total area than its freshwater counterpart. We have edited the manuscript to highlight these possible differences (at least in size) between freshwater and marine cables.

      Manuscript edits (lines 506 – 514) “For example, a recent comparative study (21) hints at significant differences in the morphology, number, and size of nanofibers when comparing a marine CB strain to a freshwater CB strain. Specifically, while the marine CB was characterized by a 50% larger cross-sectional area per nanofiber, the total nanofibers’ area was 40% smaller than the freshwater strain due to a smaller number of nanofibers per CB filament. Given the proposed central role of nanofibers in mediating electron transport along CB, it is possible that such differences may also lead to different degrees of tolerance against oxidative degradation upon exposure to air.”
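      The two percentages quoted above jointly fix the relative number of nanofibers, since total area equals fiber count times area per fiber. The short sketch below works through that arithmetic; the variable names are illustrative, and the only inputs are the two ratios stated in the quoted passage.

```python
# Ratios taken from the quoted passage (marine strain relative to freshwater strain).
AREA_PER_FIBER_RATIO = 1.5   # marine fibers have a 50% larger cross-section each
TOTAL_AREA_RATIO = 0.6       # marine total nanofiber area is 40% smaller

# total_area = fiber_count * area_per_fiber, so the count ratio is the quotient.
fiber_count_ratio = TOTAL_AREA_RATIO / AREA_PER_FIBER_RATIO

print(f"marine / freshwater fiber count ratio: {fiber_count_ratio:.2f}")
```

      Under these two figures alone, the marine strain would carry well under half as many nanofibers per filament as the freshwater strain.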

      Figure 6D shows current-voltage measurements from three representative cables; there is a large variation, most notably between Cable 1 and Cables 2 and 3. Is this variation typical for different cables? Can the authors comment on the range of values observed and how many cables fit into different ranges? Any thoughts on the reasons behind the range?

      Figure 6 B and C (red and blue) are representative of most of the cable conductance values measured using the point IV CAFM technique, with the Figure 6 A (green) IV curve being an example of the upper limit, which was less frequently observed. In total we measured ten cables using the point IV CAFM technique. These variations may stem from actual differences in the conductivity of separate CB filaments, the environment of the measurement, or limitations of the conductive AFM measurement technique. These limitations include a large contact resistance due to the interaction of the small probe with the sample, which may lead to large variability depending on the contact point. For this reason, we rely on 4-probe measurements (Fig. 8), rather than conductive AFM, for quantitative conductivity analyses. It is important to note, however, that the conductive AFM measurements (Fig. 6 and Fig. 7) provide complementary information, including the demonstration of both transverse and longitudinal transport (lines 389-393) in Fig. 6 and the visualization of the current-carrying nanofibers in Fig. 7.

      Manuscript edits: we have edited the manuscript (lines 413-418) to make it clear that the quantitative estimate of conductivity was made only using 4-probe measurements due to the limitations of CAFM or two-probe techniques.

      Can the authors comment on how the number of fibers per CB in their samples compares with the number of fibers in marine CB? Marine CB are known to have pinwheel junctions where the fibers come together before branching out again. This pinwheel design could play a role in the function of the CB or in its survival (see Adv. Biosys. 2020, 4, 2000006). Were pinwheel structures observed in freshwater CB? If so, how do they compare?

      In previous studies, estimates of the number of fibers in marine CB varied significantly, from 15 or 17 (Pfeffer et al., 2012) to 58 – 61 (Cornelissen et al., 2018). In our freshwater CB, we estimated the number of fibers at ~35 per CB (line 423), which is comparable to the count of 34 per freshwater CB recently reported by Digel et al., bioRxiv 2023. We cannot specifically comment on the pinwheel structure because we did not perform the transverse thin-section TEM imaging necessary to observe the cell-cell junctions in this particular study.

      On lines 95-96, the authors discuss the fact that marine cable bacteria have a wide variance in their measured conductivities. While one may ask if the larger marine conductivities (near 80 S/cm) are representative, a conductivity of 0.1 S/cm is 2 orders of magnitude lower than this value, which the field generally refers to as a high conductivity. The authors should mention whether or not any of their specimens display the high conductivities seen in select marine cable bacteria specimens.

      It is indeed important to note that the ~80 S/cm figure refers to an upper end previously observed (ref. 22) for marine CB conductivity. In our manuscript (lines 525-526), we highlight that the previously observed range (including in that same study) is $10^{-2}$ to $10^{1}$ S/cm, and we were careful to qualify the previously reported upper end with ‘reaching as high as’ (line 97). Note that this places our measurement of 0.1 S/cm within the previously reported range. We have not observed freshwater CB conductivity near the upper end of the previously reported range, and we generally propose that these types of measurements are better analyzed in the context of biological function rather than ‘high vs. low’. Toward that end, the manuscript (lines 527-537) makes the argument that the $10^{-1}$ S/cm figure may be sufficient to support the electrical currents mediated by CB in sediments. We have edited the manuscript to highlight that we did not observe single CB nanofiber conductivity near the upper limit previously observed in marine CB (lines 522-525).

      Reviewer #2 (Public Review):

      Summary:

      In this work, Mohamed Y. El-Naggar and co-workers present a detailed electronic characterization of cable bacteria from Southern California freshwater sediments. The cable bacteria could be reliably enriched in laboratory incubations, and subsequent TEM characterization and 16S rRNA gene phylogeny demonstrated their belonging to the genus Candidatus Electronema. Atomic force microscopy and two-point probe resistance measurements were then used to map out the characteristics of the conductive nature, followed by microelectrode four-probe measurements to quantify the conductivity.

      Interestingly, the authors observe that some freshwater cable bacteria filaments displayed a higher degree of robustness upon oxygen exposure than what was previously reported for marine cable bacteria. Finally, a single nanofiber conductivity on the order of 0.1 S/cm is calculated, which matches the expected electron current densities linking electrogenic sulphur oxidation to oxygen reduction in sediment. This is consistent with hopping transport.

      Strengths and weaknesses:

      A comprehensive study is applied to characterize the conductive properties of the sampled freshwater cable bacteria. Electrostatic force microscopy and conductive atomic force microscopy provide direct evidence of the location of conductive structures. Four-probe microelectrode devices are used to quantify the filament resistance, which presents a significant advantage over commonly used two-probe measurements that include contributions from contact resistances. While the methodology is convincing, I find that some of the conclusions seem to be drawn on very limited sample sizes, which display widely different behavior. In particular:

      The authors observe that the conductivity of freshwater filaments may be less sensitive to oxygen exposure than previously observed for marine filaments. This is indeed the case for an interdigitated array microelectrode experiment (presented in Figure 5) and for a conductive atomic force microscopy experiment (described in line 391), but the opposite is observed in another experiment (Figure S1). It is therefore difficult to assess the validity of the conclusion until sufficient experimental replications are presented.

      We indeed acknowledge, both in the abstract (lines 23-26) and in section 2.2 (lines 374-377), the variable and filament-dependent nature of the response to air exposure. Our discussion (lines 498-506) considers the possible reasons for this variability:

      ‘While these observations showed a high degree of variability and therefore require a more detailed investigation, it is interesting to consider the possibility that the oxidative decline (or other damaging processes), thought to be a consequence of oxidation of Ni cofactors involved in electron transport (25), may not affect all sections of the cm long CB filaments simultaneously; under these conditions, IDA measurements, which probe multiple micrometer-scale electrode-crossing CB regions (e.g. 372 crossings in Figure 5 inset) may offer an advantage over techniques addressing entire CBs or specific CB regions. It is also interesting to consider an alternative possibility that the conductive properties of freshwater CB maybe intrinsically more oxygen-resistant than marine CB’.

      To summarize, the manuscript points to the likelihood that the IDA technique used here may offer an advantage for detecting currents under damaging conditions, since it interrogates multiple sections simultaneously. Furthermore, in a recent preprint from Digel et al. (2023), the conductivity of the only freshwater strain investigated was among the highest when compared with the marine CB strains in that study. Greater resistance of freshwater CB to oxygen is therefore one possibility to be investigated on the basis of these observations, and we present it as such in the discussion.

      The calculation of a single nanofiber conductivity is based on experiment and calculation with significant uncertainty. E.g. for the number of nanofibers in a single filament that varies depending on the filament size (Frontiers in microbiology, 2018, 9: 3044.), and the measured CB resistance, which does not scale well with inner probe separation (Figure 5). A more rigorous consideration of these uncertainties is required.

      The reviewer raises an important point. For these calculations, we made sure to determine the representative number of fibers per cable and the thickness of the nanofibers (~50 nm) from our own samples. We indeed assessed the possible variability across our different cable filaments and found that the fiber numbers varied from 30 to 44 (with 35 used as a representative figure in the paper). For the scaling of resistance with inner probe separation, our 4P results estimated that the CB resistances are 47 MΩ and 240 MΩ for the 20 µm and 200 µm lengths, respectively, rather than the tenfold difference expected if the cable had a uniform conductivity along the entire filament. This result suggests nonuniform conductivity in different sections of the CB filament. Since accounting for nonuniform conduction (and variability in fiber morphology/density) is clearly difficult, we were careful to limit our conclusion to an order-of-magnitude estimate (e.g. lines 522-525). Our estimate falls within the previously reported range of cable bacteria conductivity (10⁻² to 10¹ S/cm). We have further edited the manuscript to note that our reported single nanofiber conductivity cannot be constrained further than the order of 0.1 S/cm, owing to the uncertainty in our estimates of nanofiber diameter and number of fibers per cable as well as the possibility of nonuniform conductivity along the CB length (lines 522-525).
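
      The order-of-magnitude argument above can be checked with a short calculation. The sketch below is only illustrative and is not the calculation used in the manuscript: it assumes that the cable behaves as identical nanofibers conducting in parallel, that each fiber has a circular cross-section with a 50 nm diameter, and that conductivity is uniform over the probed length. The function name and its parameters are hypothetical; the resistances, lengths and fiber counts are the values quoted in the response.

```python
import math


def nanofiber_conductivity_s_per_cm(resistance_ohm, length_m,
                                    n_fibers=35, fiber_diameter_m=50e-9):
    """Estimate single-nanofiber conductivity in S/cm.

    Assumptions (illustrative only): n_fibers identical fibers in parallel,
    circular cross-section, uniform conductivity along length_m.
    """
    area_one_fiber = math.pi * (fiber_diameter_m / 2.0) ** 2  # m^2
    total_area = n_fibers * area_one_fiber                    # m^2
    sigma_s_per_m = length_m / (resistance_ohm * total_area)  # S/m
    return sigma_s_per_m / 100.0                              # S/m -> S/cm


# 4P resistances quoted above: 47 MOhm over 20 um, 240 MOhm over 200 um.
measurements = [(20e-6, 47e6), (200e-6, 240e6)]

for length_m, resistance_ohm in measurements:
    for n_fibers in (30, 35, 44):  # observed range and representative value
        sigma = nanofiber_conductivity_s_per_cm(resistance_ohm, length_m, n_fibers)
        print(f"L = {length_m * 1e6:.0f} um, N = {n_fibers}: {sigma:.3f} S/cm")
```

      Under these assumptions every combination of length and fiber count gives a value of a few hundredths to roughly one tenth of a S/cm, which is why the estimate is stated only to the order of 0.1 S/cm.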

      Reviewer #2 (Recommendations for The Authors):

      Figure 4A: Please add scale- and color bar.

      Done - new Fig. 4 included with color bars for topography and phase. The inset of Fig. 4A contains a 200 nm scale bar (and that scale is now mentioned in the figure caption).

      Figure 5: A time series graph might be more instructive.

      Done - we indeed appreciate this suggestion and find that it improved the clarity of Figure 5. An inset has been included in Figure 5 plotting the resistance R change over time under different conditions. This inset demonstrates that the resistance of the cable on the IDA was slowly decreasing in the N2/H2 anaerobic chamber, only to start increasing upon exposure to ambient air.

      After putting the cable back into the chamber, the resistance again decreased over time.

    1. Author Response:

      We thank the reviewers for their insightful feedback. In our revised version of the manuscript, we will address all points raised.

      Regarding the preprocessing (Reviewer 1), we agree that the StandardRat pipeline is optimal for newly acquired datasets. However, since this study involves reanalyzing an already published dataset (Ionescu et al., JNM, 2023), which was preprocessed, analyzed, and published before the StandardRat paper, we aimed to maintain the same preprocessing. This approach allows for consistent interpretation of the readout regarding functional and molecular connectivity in the context of our previously published findings. Nonetheless, we agree that providing full access to the data will enable other researchers to reproduce our results using the StandardRat preprocessing pipeline and perform additional analyses on this rich dataset. Therefore, we will provide full access to the data via an open repository, as the reviewer suggested.

      Regarding anesthesia, we acknowledge that this is a limitation of our study, as more recent studies have indicated superior protocols. However, we and others have shown that, while not ideal, isoflurane at the used dose maintains stable physiology and does not cause burst suppression in rats. We will amend our discussion to reflect these points.

      Regarding the other points, we will amend the manuscript to provide more detail on the experimental design, including the tracer application as suggested by Reviewer 2, and clarify parts of the analysis that are unclear in the current version. Additionally, we agree with Reviewer 2 that our current terminology may cause confusion, and we will amend it accordingly. We will also address the other points raised by the reviewers, such as the reduced sample size for the pharmacological cohort, as limitations in our discussion.

      Thank you for your understanding and the opportunity to improve our manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Editors’ recommendations for the authors

      The reviewers recommend the following: 

      (a) Digging deeper into the discussion of the density-dependent dispersal. 

      (b) Clarifying the microfluidic setup.  

      (c) Clarifying the description and interpretation of the transcriptomic evidence. 

      (d) Toning down carbon cycle connections (some reviewers felt the evidence did not fully support the claims). 

      We would like to thank the editors for their thoughtful evaluation of our manuscript and their clear suggestions. We have revised the manuscript in the light of these comments, as we outline below and address in detail in the point-by-point response to the reviewers’ comments that follows. 

      (a) We have expanded the discussion of density-dependent dispersal and revised Figure 2C to improve clarity. 

      (b) We have also added further information concerning the microfluidic setup in the results section and provide an illustration of the setup in a new figure panel, Figure 1A.

      (c) Addressing the reviewers’ comments on the transcriptomic analysis, we have added more information in the description and interpretation of the results. 

      (d) We have rephrased the text describing the role of degradation-dispersal cycles for carbon cycling to highlight it as the motivation of this study and emphasize the link to literature on foraging, without creating expectations of direct measurements of global carbon cycling.

      Public Reviews:

      Reviewer #1 (Public Review):

      [...]

      Weaknesses: 

      Much of the genetic analysis, as it stands, is quite speculative and descriptive. I found myself confused about many of the genes (e.g., quorum sensing) that pop up enriched during dispersal quite in contrast to my expectations. While the authors do mention some of this in the text as worth following up on, I think the analysis as it stands adds little insight into the behaviors studied. However, I acknowledge that it might have the potential to generate hypotheses and thus aid future studies. Further, I found the connections to the carbon cycle and marine environments in the abstract weak --- the microfluidics setup by the authors is nice, but it provides limited insight into naturalistic environments where the spatial distribution and dimensionality of resources are expected to be qualitatively different. 

      We thank the reviewer for their suggestions to improve our manuscript. We agree that the original manuscript would have benefitted from more detailed interpretation of the observed changes in gene expression. We have revised the manuscript to elaborate on the interpretation of the changes in expression of quorum sensing genes (see response to reviewer 1, comment 3), motility genes (see response to reviewer 1, comment 6), alginate lyase genes (see response to reviewer 1, comment 7 and reviewer 2, comment 2), and ribosomal and transporter genes (see response to reviewer 2, comment 2).

      In general, we think that the gene expression study not only supports the phenotypic observations that we made in the microfluidic device, such as the increased swimming motility when exposed to digested alginate medium, but also adds further insights. Our reasoning for studying the transcriptomes in well-mixed batch cultures was the inability to study gene expression dynamics to support the phenotypic observations about differential motility and chemotaxis in our microfluidics setup. The transcriptomic data clearly show that even in well-mixed environments, growth on digested alginate instead of alginate is sufficient to increase the expression of motility and chemotaxis genes. In addition, the finding that expression of alginate lyases and metabolic genes is increased during growth on digested alginate was revealed through the analysis of transcriptomes, something which would not have been possible in the microfluidic setup. We agree with the reviewer that our analyses implicate further, perhaps unexpected, mechanisms like quorum sensing in the cellular response to breakdown products, and that this represents an interesting avenue for further studies.

      Finally, we also agree with the reviewer that it would be good to be more explicit in the text that our microfluidic system cannot fully capture the complex dynamics of natural environments. Our approach does, however, allow the characterization of cellular behaviors at spatial and temporal scales that are relevant to the interactions of bacteria, and thus provides a better understanding of colonization and dispersal of marine bacteria in a manner that is not possible through in situ experiments. We have edited our manuscript to highlight this and modified our statements regarding carbon cycling to emphasize the role of degradation-dispersal cycles in the remineralization of polysaccharides (see response to reviewer 1, comment 2).

      Reviewer #2 (Public Review):

      [...]

      Weaknesses: 

      The explanation of the microfluidics measurements is somewhat confusing but I think this could be easily remedied. The quantitative interpretation of the dispersal data could also be improved and I'm not clear if the data support the claim made. 

      We thank the reviewer for their comments and helpful suggestions. We have revised the manuscript with these suggestions in mind and believe that the manuscript is improved by a more detailed explanation of the microfluidic setup. We have added more information in the text (detailed in response to reviewer 2, comments 1 and 2) and have added a depiction of the microfluidic setup (Fig. 1A). We have also modified the presentation and discussion of the dispersal data (Fig. 2C), as described in detail below in response to reviewer 2, comment 4, and argue that they clearly show density-dependent dispersal. We believe that this modification of how the results are presented provides a more convincing case for our main conclusion, namely that the presence of degradation products controls bacterial dispersal in a density-dependent manner.  

      Reviewer #3 (Public Review):

      [...]

      Weaknesses: 

      I find this paper very descriptive and speculative. The results of the genetic analyses are quite counterintuitive; therefore, I understand the difficulty of connecting them to the observations coming from experiments in the microfluidic device. However, they could be better placed in the literature of foraging - dispersal cycles, beyond bacteria. In addition, the interpretation of the results is sometimes confusing. 

      We thank the reviewer for their suggestions to improve the manuscript. We have edited the manuscript to interpret the results of this study more clearly, in particular with regard to the fact that breakdown products of alginate cause cell dispersal (see response to reviewer 2, comment 1), gene expression changes of ribosomal proteins and transporters (see response to reviewer 2, comment 2), as well as genes relating to alginate catabolism (see response to reviewer 2, comment 3).

      To provide more context for the interpretation of our results we now also embed our findings in more detail in the previous work on foraging strategies and dispersal tradeoffs.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The authors should clarify in more detail what they mean by density dependence in Figure 2. Usually density dependence refers to a per capita dependence, but here it seems that the per capita rate of dispersal might be roughly independent of density (Figure 2c; if you double the number of cells it doubles the number of cells leaving). Rather it seems the dispersal is such that the density of remaining cells falls below a threshold (~300 cells). 

      We thank the reviewer for raising this important point. To analyze the data more explicitly in terms of per capita dependence and so make the density dependence in the dispersal from the microfluidic chambers more clear, we have modified Figure 2C and edited the text. 

      In the modified Figure 2C, we computed the fraction of dispersed cells for each chamber (i.e., the change in cell number divided by the cell number at the time of the nutrient switch). This quantity directly reveals the per-capita dependence, as mentioned by reviewer 1, and is now represented on the y-axis of Figure 2C instead of the absolute change in cell number.

      These data demonstrate that the fraction of dispersed cells increases with increasing numbers of cells present in the chamber at the time of switching, with more highly populated chambers showing a higher fraction of dispersed cells. These findings indicate that there is a strong density dependence in the dispersal process.

      As pointed out by reviewer 1, another interesting aspect of the data is the transition at low cell number. The fraction of dispersed cells is negative in the case of the chamber with approximately 70 cells, consistent with no dispersal at this low density, and a moderate density increase as a function of continued growth.  
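
      The per-capita quantity described above can be sketched in a few lines. This is a minimal illustration, not the analysis code behind Figure 2C: the function name is hypothetical, the sign convention (positive when cells leave the chamber, negative when the population grows) is an assumption chosen to match the description above, and the cell counts in the example are placeholders rather than measurements from the study.

```python
def dispersed_fraction(cells_at_switch, cells_after):
    """Fraction of cells lost from a chamber after the nutrient switch.

    Assumed sign convention: positive when cells have left the chamber,
    negative when the cell number has increased through continued growth.
    """
    if cells_at_switch <= 0:
        raise ValueError("cells_at_switch must be positive")
    return (cells_at_switch - cells_after) / cells_at_switch


# Hypothetical counts for illustration only (not data from the study):
# (cells at the time of the switch, cells some time after the switch)
example_chambers = [(70, 80), (400, 250), (800, 300)]

for at_switch, after in example_chambers:
    fraction = dispersed_fraction(at_switch, after)
    print(f"{at_switch} cells at switch -> dispersed fraction {fraction:+.2f}")
```

      With this normalization, a constant per-capita dispersal rate would give the same fraction in every chamber, whereas density-dependent dispersal appears as a fraction that rises with the starting cell number.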

      In addition to the new analysis presented in Figure 2C, we have modified the paragraph that discusses this result as follows (line 208):

      “We indeed found that the nutrient switch caused few or no cells to disperse from small cell groups (Fig. 2B), whereas a large fraction of cells from large cell groups dispersed (Fig. 2C). In fact, the fraction of cells that dispersed upon imposition of the nutrient switch showed a strong positive relationship with the number of cells present, meaning that cells in chambers with many cells were more likely to disperse than cells in chambers with fewer cells (Fig. 2C).”

      (2) The authors should tone down their claims about the carbon cycle in the abstract. I do not believe the results as they stand could be used to understand degradation-dispersal cycles in marine environments relevant to the carbon cycle, since these behaviors have been studied in microfluidic environments which in my understanding are quite different. As such, statements such as "degradation-dispersal cycles are an integral part in the global carbon cycle, we know little about how cells alternate between degradation and motility" and "Overall, our findings reveal the cellular mechanisms underlying bacterial degradation-dispersal cycles that drive remineralization in natural environments" are overstated in the abstract. 

      We appreciate the reviewer’s comments regarding the connections of our work with the carbon cycle. We have now rephrased these statements in our manuscript to describe a potential connection between our work and the marine carbon cycle. The colonization of polysaccharide particles by bacteria and their subsequent degradation have been widely acknowledged to play a significant role in controlling the carbon flow in marine ecosystems (Fenchel, 2002; Preheim et al., 2011; Yawata et al., 2014, 2020). We still refer to carbon flow in the revised manuscript, though cautiously, as microbial remineralization of biomass, which is recognized as an important factor in the marine biological carbon pump (e.g., Chisholm, 2000; Jiao et al., 2024). As stated in the previous version of the manuscript, the main motivation of our work was to study the growth behaviors of marine heterotrophic bacteria during polysaccharide degradation, especially to understand when bacteria depart already colonized and degraded particles and find novel patches to grow and degrade, a process that is poorly understood. Therefore, it is conceivable that degradation-dispersal cycles do play a role in the flow of carbon in marine ecosystems. However, we acknowledge that the carbon cycle is influenced by a multitude of biological and chemical processes, and the bacterial degradation-dispersal cycle might not be the sole mechanism at play.

      We also appreciate the reviewer’s comments highlighting that the complexity of natural environments is not fully captured in our microfluidics system. However, our microfluidics setup does allow us to quantify responses and behaviors of microbial groups at high spatial and temporal resolution, especially in the context of environmental fluctuations. Microbes in nature interact at small spatial scales and have to respond to changes in the environment, and the microfluidics setup enables the quantification of these responses. Moreover, dispersal of the bacterium V. cyclitrophicus that we use in our study has been previously observed even during growth on particulate alginate (Alcolombri et al., 2021), but the cues and regulation controlling dispersal behaviors have been unclear. Microfluidic experiments have now allowed us to study this process in a highly quantitative manner, and the results align well with observations from experiments in more natural settings. These quantitative experiments on bacterial strains isolated from marine particles are expected to constrain quantitative models of carbon degradation in the ocean (Nguyen et al., 2022).

      We have now adjusted our statements throughout our manuscript to reflect the knowledge gaps in understanding the triggers of degradation-dispersal cycles and their links with carbon flow in marine ecosystems. In particular, the revised manuscript contains the following statements (line 47 and line 60):

      “Even though many studies indicate that these degradation-dispersal cycles contribute to the carbon flow in marine systems, we know little about how cells alternate between polysaccharide degradation and motility, and which environmental factors trigger this behavioral switch.”

      “Overall, our findings reveal cellular mechanisms that might also underlie bacterial degradation-dispersal cycles, which influence the remineralization of biomass in marine environments.”

      (3) The authors should clarify why they think quorum-sensing genes are increased in expression on digested alginate. The authors currently mention that QS could be used to trigger dispersal, but given the timescales of dispersal in Figure 2 (~half an hour), I find it hard to believe that these genes are expressed and have the suggested effect on those timescales. As such I would have expected the other way round - for QS genes to be expressed highly during alginate growth, so that density could be sensed and responded to. Please clarify. 

      We have now clarified this point in the revised manuscript. While the triggering of dispersal by quorum-sensing genes may indeed appear counterintuitive, and the response is rapid (we see dispersal of cells within 30-40 minutes), both observations are in line with previous studies in another model organism, Vibrio cholerae. The dispersal time is similar to the dispersal time of V. cholerae cells from biofilms, as described by Singh and colleagues (Figure 1E of Ref. Singh et al., 2017). In that case, induction of the quorum sensing dispersal regulator HapR was observed during biofilm dispersal within one hour after the switch of conditions (Fig. 2, middle panel of Ref. Singh et al., 2017). Even though the specific quorum sensing signaling molecules are probably different in our strain (there is no annotated homolog of the hapR gene in V. cyclitrophicus), we observed that the full set of quorum sensing genes was enriched in cells growing on digested alginate (as reported in line 314 and Fig. 4A).

      We have added this information in the manuscript (line 317): 

      “The set of quorum sensing genes was also positively enriched in cells growing on digested alginate (Fig. 4A and S4F, Table S13). This role in dispersal is in agreement with a previous study that showed induction of the quorum sensing master regulator in V. cholerae cells during dispersal from biofilms on a similar time scale as here (less than an hour) [28].”

      Reviewer #2 (Recommendations For The Authors):

      (1) Around line 144 - I don't really understand how you flow alginate through the microfluidic platform. It seems if the particles are transiently going through the microfluidic chamber then the flow rate and hence residence time of the alginate particles will matter a lot by controlling the time the cells have to colonize and excrete enzymes for alginate breakdown. Or perhaps the alginate is not particulate but is instead a large but soluble polymer? I think maybe a schematic of the microfluidic device would help -- there is an implicit assumption that we are familiar with the Dal Co et al device, but I don't recall its details and maybe a graphic added to Figure 1 would help. 

      a. In reviewing the Dal Co paper I see that cells are trapped and the medium flows through channels and the plane where the cells are held. I am still a little confused about the size of the polymeric alginate -- large scale (>1um) particles or very small polymers? 

      We have now provided a detailed description of our microfluidic experimental system. At the start of the experiments, cells are in fact not trapped within the microfluidic device, but grow and can move freely within a chamber designed with dimensions (sub-micron heights) such that growth occurs only as a monolayer. Cells were exposed to nutrients, either alginate or alginate digestion products, both in soluble form (not particles). These compounds were flowed into the device through a main channel, but entered the flow-free growth chambers by diffusion. To make these aspects of our experiments clearer, we have added further information on this in the Materials & Methods section (line 556), in the abstract (line 51), and in the results (line 123).

      To make our microfluidic setup clearer, we have followed this advice and added a schematic as Figure 1A and have added more information on the setup to the main text (line 153):

      “In brief, the microfluidic chips are made of an inert polymer (polydimethylsiloxane) bound to a glass coverslip. The PDMS layer contains flow channels through which the culture medium is pumped continuously. Each channel is connected to several growth chambers that are laterally positioned. The dimensions of these growth chambers (height: 0.85 µm, length: 60 µm, width: 90-120 µm) allow cells to freely move and grow as monolayers. The culture medium, containing either alginate or digested alginate in their soluble form, is constantly pumped through the flow channel and enters the growth chambers primarily through diffusion [15,16,4,17,8]. Therefore, the number of cells and their positioning within microfluidic chambers is determined by the cellular growth rate as well as by cell movement [4]. This setup combined with time-lapse microscopy allowed us to follow the development of cell communities over time.”

      (2) What makes this confusing is the difference between Figure 1C and Figure S2A -- the authors state that the difference in Figure 1C is due to dispersal, but is there flow through the microfluidic device? So what role does that flow through the device have in dispersal? Is the adhesion of the cell groups driven at all by a physical interaction with high molecular weight polymers in the microfluidic devices or is this purely a biological effect? Could this also be explained by different real concentrations of nutrients in the two cases? 

      We realize from this comment that the role of flow of the medium in the microfluidic setup was not clearly addressed in our manuscript. In fact, cells were not exposed to flow, and nutrients were provided to the growth chambers by diffusion. We have added a clearer explanation of this point on line 158:

      “The culture medium, containing either alginate or digested alginate in their soluble form, is constantly pumped through the flow channel and enters the growth chambers primarily through diffusion [15,16,4,17,8]. Therefore, the number of cells and their positioning within microfluidic chambers is determined by the cellular growth rate as well as by cell movement [4].”

      One purely physical effect that we anticipate is that a high viscosity of the medium could immobilize cells. To address this point, we measured the viscosity of both alginate and digested alginate and conclude that the increase in viscosity is not strong enough to immobilize cells. We added a statement in the text (line 170):

      “To test the role of increased viscosity of polymeric alginate in causing the increased aggregation of cells, we measured the viscosity of 0.1% (w/v) alginate or digested alginate dissolved in TR media. For alginate, the viscosity was 1.03±0.01 mPa·s (mean and standard deviation of three technical replicates) whereas the viscosity of digested alginate in TR media was found to be 0.74±0.01 mPa·s. Both these values are relatively close to the viscosity of water at this temperature (0.89 mPa·s [18]) and, while they may affect swimming behavior [19], they are insufficient to physically restrain cell movement [20].”

      as well as a section in the Materials and Methods (line 594):

      “Viscosity of the alginate and digested alginate solution

      We measured the viscosity of alginate solutions using shear rheology measurements. We used a 40 mm cone-plate geometry (4° cone) in a Netzsch Kinexus Pro+ rheometer. 1200 µL of sample was placed on the bottom plate, the gap was set at 150 µm and the sample trimmed. We used a solvent trap to avoid sample evaporation during measurement. The temperature was set to 25 °C using a Peltier element. We measured the dynamic viscosity over a range of shear rates of 0.1–100 s⁻¹. We report the viscosity of each solution as the average viscosity measured over the shear rates 10–100 s⁻¹, where the shear-dependence of the viscosity was low.

      We measured the viscosity of 0.1% (w/v) alginate dissolved in TR media, which was 1.03±0.01 mPa·s (reporting the mean and standard deviation of three technical replicates). The viscosity of 0.1% digested alginate in TR media was found to be 0.74±0.01 mPa·s. This means that the viscosity of alginate in our microfluidic experiments is 36% higher than that of digested alginate, but the viscosities are close to those expected of water (0.89 mPa·s at 25 °C according to Berstad and colleagues [18]).”
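
      The reporting rule described in the methods text above (averaging the measured viscosity over the shear-rate window in which it is nearly constant) and the comparison with water can be sketched as follows. This is an illustrative sketch only, not the authors' analysis script: both function names are hypothetical, and the only numbers used are the mean viscosities and the water reference value quoted above.

```python
def mean_viscosity(shear_rates, viscosities, low=10.0, high=100.0):
    """Average viscosity over shear rates in [low, high] (units: s^-1).

    shear_rates and viscosities are parallel sequences from one sweep.
    """
    selected = [eta for rate, eta in zip(shear_rates, viscosities)
                if low <= rate <= high]
    if not selected:
        raise ValueError("no measurements inside the shear-rate window")
    return sum(selected) / len(selected)


def percent_difference(value, reference):
    """Signed difference of value relative to reference, in percent."""
    return (value / reference - 1.0) * 100.0


# Mean viscosities quoted in the text, in mPa*s.
WATER_25C = 0.89
reported = {"alginate": 1.03, "digested alginate": 0.74}

for name, eta in reported.items():
    diff = percent_difference(eta, WATER_25C)
    print(f"{name}: {eta:.2f} mPa*s ({diff:+.0f}% relative to water)")
```

      Both solutions come out within roughly a fifth of the viscosity of water, which supports the statement that viscosity alone is unlikely to restrain cell movement.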

      While our microfluidic setup allows us to track the position and movement of cells in a spatially structured setting, these observations do not allow us to distinguish directly whether the differences in dispersal are a result of purely physical effects of polymers on cells or of the polymers triggering a biological response in cells that causes them to become sessile. It is known that bacterial appendages like pili interact with polysaccharide residues (Li et al., 2003). Therefore, it is quite plausible that cross-linking by polysaccharides can contribute to growth behaviors on alginate. However, our analysis of gene expression demonstrates that flagellum-driven motility is decreased in the presence of alginate compared to digested alginate, alongside other major changes in gene expression. In addition, our measures of dispersal show that dispersal of cells when exposed to digested alginate is density dependent. Both observations suggest that the patterns in dispersal are governed by decision-making processes by cells resulting in changes in cell motility, rather than being a product of purely physical interactions with the polymer.

      The finding that the viscosities of both alginate and digested alginate are similar to that of water suggests that diffusion of nutrients in the growth chambers should be similar. Therefore, we think that differences in the real concentrations of nutrients are unlikely to contribute to the observed differences in behavior.

      (3) Why is Figure S1 arbitrary units? Does this have to do with the calibration of LC-MS? It would be better, it seems, to know the concentrations in real units of the monomer at least. 

      We agree with the reviewer that it would have been better to have absolute concentrations for these compounds. However, to calibrate the mass spectrometer signals (ion counts) to absolute concentrations for the different alginate compounds, we would need an analytical standard of known concentration. We are not aware of such a standard and thus report only relative concentrations. We agree that the y-axis label of Figure S1 should not contain ‘arbitrary’ units, as it shows a ratio (of measurements in the same arbitrary units). We have edited the labels of Figure S1 accordingly and the figure legend in line 26 of the Supplemental Material (“Relative concentrations…”).

      (4) Line 188 - density-dependent dispersal. The claim here is that "cells in chambers with many cells were more likely to disperse than cells in chambers with less cells." (my emphasis). Looking at the data in Figure 2C it appears that about 40% of the cells disperse irrespective of the density, before the switch to digested alginate. So it would seem that there is not a higher likelihood of dispersal at higher cell densities. For the very highest cell density, it does appear that this fraction is larger, but I'd be concerned about making this claim from what I understand to be a single experiment. To support the claim made should the authors plot Change in Cell number/Starting Cell number on the y-axis of Fig. 2C to show that the fraction is increasing? It would seem some additional data at higher starting cell densities would help support this claim more strongly. 

      We thank the reviewer for this comment, which is in line with a remark made by reviewer 1 in their comment 1. In response to these two comments (and as described above), we have edited Figure 2C and now plot the change in cell number relative to the starting cell number on the y-axis to directly show the density dependence. We observe a positive (approximately linear) relationship between the fraction of dispersed cells and the number of cells present in the chamber at the time of switching. This indicates that there is a density dependence in the dispersal process, with highly populated chambers showing a higher fraction of dispersed cells.

      In addition to the change in Figure 2C, we have modified the paragraph around line 208: “We indeed found that the nutrient switch caused few or no cells to disperse from small cell groups (Fig. 2B), whereas a large fraction of cells from large cell groups dispersed (Fig. 2C). In fact, the fraction of cells that dispersed upon imposition of the nutrient switch showed a strong positive relationship with the number of cells present, meaning that cells in chambers with many cells were more likely to disperse than cells in chambers with fewer cells (Fig. 2C).”

      The highest cell number at the time of the switch that we include is about 800 cells. The maximum number of cells that can fit into a chamber is ca. 1000. Thus, 800 resident cells is close to the maximal density.

      (5) A comment -- I find the result of significant chemotaxis towards alginate but not the monomers of alginate to be quite surprising. The ecological relevance of this (line 219) seems like an important result that is worth expanding on a bit at least in the discussion. For now, my question is whether the authors know of any mechanism by which chemotaxis receptors could respond to alginate but not the monomer. How can a receptor distinguish between the two? 

      We agree that this result is surprising, given that oligomers can be more easily transported into the periplasm, where sensing takes place, and they also provide a more easily accessible nutrient source. Indeed, in the case of the insoluble polymer chitin, it has been shown that chemotaxis towards chitin is mediated by chitin oligomers (Bassler et al., 1991), which was suggested as a general motif to locate polysaccharide nutrient sources (Keegstra et al., 2022). However, a recent study has changed this perspective by showing widespread chemotaxis of marine bacteria towards the glucose-based marine polysaccharide laminarin, but not towards laminarin oligomers or glucose (Clerc et al., 2023). Together with our results on chemotaxis towards alginate (but not significantly towards alginate oligomers), this suggests that chemotaxis towards soluble polysaccharides can be mediated by direct sensing of the polysaccharide molecules.

      As recommended, we expanded the discussion of the ecological relevance and also added more information on possible mechanisms of selective sensing of alginate and its breakdown products (around line 479):

      “Direct chemotaxis towards polysaccharides may facilitate the search for new polysaccharide sources after dispersal. We found that the presence of degradation products not only induces cell dispersal but also increases the expression of chemotaxis genes. Interestingly, we found that V. cyclitrophicus ZF270 cells show chemotaxis towards polymeric alginate but not digested alginate. This contrasts with previous findings for bacterial strains degrading the insoluble marine polysaccharide chitin, where chemotaxis was strongest towards chitin oligomers [53], suggesting that oligomers may act as an environmental cue for polysaccharide nutrient sources [55]. However, recent work has shown that certain marine bacteria are attracted to the marine polysaccharide laminarin, and not laminarin oligomers [56]. Together with our results, this indicates that chemotaxis towards soluble polysaccharides may be mediated by the polysaccharide molecules themselves. The mechanism of this behavior is yet to be identified, but could be mediated by polysaccharide-binding proteins, as have been found in Sphingomonas sp. A1 facilitating chemotaxis towards pectin [57]. Direct polysaccharide sensing adds complexity to chemosensing as polysaccharides cannot freely diffuse into the periplasm, which can lead to a trade-off between chemosensing and uptake [58]. Furthermore, most polysaccharides are not immediately metabolically accessible as they require degradation. But direct polysaccharide sensing can also provide certain benefits compared to using oligomers as sensory cues. First, it could enable bacterial strains to preferentially navigate to polysaccharide nutrient sources that are relatively uncolonized and hence show little degradation activity. Second, strong chemotaxis towards degradation products could hinder a timely dispersal process, as the dispersal then requires cells to travel against a strong attractant gradient formed by the degradation products. Overall, this strategy allows cells to alternate between degradation and dispersal to acquire carbon and energy in a heterogeneous world with nutrient hotspots [44,59–61].”

      (6) Comment on lines 287-8 -- that the "positive enrichment of the gene set containing bacterial motility proteins matched the increase in motile cells that we observe in Fig 3E." I'm confused about what is meant by the word "matched" here. Is the implication that there is some quantitative correspondence between increased motility in Figure 3 and the change in expression in Figure 4? Or is the statement a qualitative one -- that motility genes are upregulated in the presence of digested alginate? Table S12 didn't help me answer this question. 

      We thank the reviewer for their helpful comment. Our original statement was a qualitative one - observing that gene expression enrichment in genes associated with bacterial motility aligned with our expectations based on the previous observation of an increase in motile cells. We have now changed the wording to highlight the qualitative nature of this statement (line 315):

      “The positive enrichment of the gene set containing bacterial motility proteins aligned with our expectations based on the increase in motile cells that we observed in Figure 3E (Fig. 4A, Table S12).”

      (7) Line 326 - what is the explanation for the production of public enzymes in the presence of digest? How does this square with the previous narrative about cells growing on alginate digest expressing motility genes and chemotaxing towards alginate? It seems like the story is a bit tenuous here in the sense that digested alginates stimulate both motility - which is hypothesized to drive the discovery of new alginate particles - and lyase enzymes which are used to degrade alginate. So do the high motility cells that are chemotaxing towards alginate also express lyases en route? I'm of the opinion that constructing narratives like these in the absence of a more quantitative understanding of the colonization and degradation dynamics of alginate particles presents a major challenge and may be asking more of the data than the data can provide. 

      a. I noted that this is addressed later, around line 393 in the Discussion section.

      Indeed, the notion that the presence of breakdown products triggers motility and also increases the expression of alginate lyases and other metabolic genes for alginate catabolism seems counterintuitive. We have now expanded our discussion of these results to contextualize these findings (around line 443):

      "One reason for this observation may be that cells primarily rely on intracellular monosaccharide levels to trigger the upregulation of genes associated with polysaccharide degradation and catabolism, as has previously been observed for E. coli across various carbon sources [50,51]. In fact, the majority of carbon sources are sensed by prokaryotes through one‑component sensors inside the cell [50]. In the one‑component internal sensing scheme, the enzymes and transporters for the use of various carbon sources are expressed at basal levels, which leads to an increase in pathway intermediates upon nutrient availability. The pathway intermediates are sensed by an internal sensor, usually a transcription factor, and lead to the upregulation of transporter and enzyme expression [50,51]. This results in a positive feedback loop, which enables small changes in substrate abundance to trigger large transcriptional responses [50,52]. Thus, the presence of alginate breakdown products is likely to result in increased expression of all components of the alginate degradation pathway, including the expression of degrading enzymes. As the gene expression analysis was performed on well-mixed cultures in culture medium containing alginate breakdown products, we therefore expect a strong stimulation of alginate catabolism. In a natural scenario, where cells disperse from a polysaccharide hotspot before its exhaustion, the expression of alginate catabolism genes is likely to decrease again once the local concentration of breakdown products decreases. However, continued production of alginate lyases could also provide an advantage when encountering a new alginate source and continued production of alginate lyases may thus help cells to prepare for likely future environments. Further investigations of bacterial enzyme secretion in changing nutrient environments and at relevant spatial scales are required to improve our understanding of the regulation of enzyme secretion along nutrient gradients."

      (8) I like Figure 6, and I think this hypothesis is a good result from this paper, but I think it would be important to emphasize this as a proposal that needs further quantitative analysis to be supported. 

      We have now edited the manuscript to make this point clearer. While both degradation and dispersal are well-appreciated parts of microbial ecology, the transitions between them and the underlying mechanisms are unclear. We have edited the discussion to improve the clarity (line 419):

      “This cycle of biomass degradation and dispersal has long been discussed in the context of foraging e.g., [44,45,13,46,47], but the cellular mechanisms that drive the cell dispersal remain unclear.”

      Also, we have updated Figure 6 to indicate more clearly which new findings this work proposes (now bold font) and which previous findings, made in different bacterial taxa and on different carbon sources, align with our work (now light font). We edited the figure legend accordingly (line 503):

      "By integrating our results with previous studies on cooperative growth on the same system, as well as results on dispersal cycles in other systems, we highlight where the specific results of this work add to this framework (bold font)."

      Minor comments 

      (1) Is there any growth on the enzyme used for alginate digestion? E.g. is the enzyme used to digest the alginate at sufficiently high concentrations that cells could utilize it for a carbon/nitrogen source? 

      We thank the reviewer for raising this point. We added the following paragraph as Supplemental Text to address it (line 179):

      “Protein amount of the alginate lyases added to create digested alginate

      Based on the following calculation, we conclude that the amount of protein added to the growth medium by the addition of alginate lyases is so small that we consider it negligible. In our experiment we used 1 unit/ml of alginate lyases in a 4.5 ml solution to digest the alginate. As the commercially purchased alginate lyases are 10,000 units/g, our 4.5 ml solution contains 0.45 mg of alginate lyase protein. The digested alginate solution was diluted 45x when added to culture medium. This means that we added 0.18 µg alginate lyase protein to 1 ml of culture medium.

      As a comparison, for 1 ml of alginate medium, 1,000 µg of alginate is added, and for 1 ml of Lysogeny broth (LB) culture medium, 3,500 µg of LB are added. Thus, the amount of alginate lyase protein that we added is ca. 5,000–20,000 times smaller than the amount of alginate or LB that one would add to support cell growth. Therefore, we expect the growth that the digestion of the added alginate lyases would allow to be negligible.”
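
      Two steps of the comparison above can be retraced with a short sketch. It recomputes the lyase mass in the digest from the stated enzyme units, and then takes the per-millilitre figure of 0.18 µg as stated in the text rather than re-deriving it. The variable names are assumptions made for illustration.

```python
# Values quoted in the supplemental text.
UNITS_PER_ML = 1.0           # alginate lyase units per ml of digest
DIGEST_VOLUME_ML = 4.5       # volume of the digestion reaction
UNITS_PER_GRAM = 10_000.0    # specific activity of the purchased enzyme

# Mass of enzyme protein in the digestion reaction, in mg.
lyase_units = UNITS_PER_ML * DIGEST_VOLUME_ML
lyase_mg_in_digest = lyase_units / UNITS_PER_GRAM * 1000.0

# Per-ml amount in the final culture medium, taken as stated in the text.
LYASE_UG_PER_ML_MEDIUM = 0.18

# Typical carbon-source amounts per ml of medium, as quoted in the text.
ALGINATE_UG_PER_ML = 1000.0
LB_UG_PER_ML = 3500.0

ratio_alginate = ALGINATE_UG_PER_ML / LYASE_UG_PER_ML_MEDIUM
ratio_lb = LB_UG_PER_ML / LYASE_UG_PER_ML_MEDIUM

print(f"lyase protein in digest: {lyase_mg_in_digest:.2f} mg")
print(f"alginate / lyase: {ratio_alginate:.0f}-fold")
print(f"LB / lyase:       {ratio_lb:.0f}-fold")
```

      With the stated inputs, the two ratios fall within the quoted range of ca. 5,000–20,000.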

      (2) The lines in Figure 2B are very hard to see. 

      We have addressed this comment by using thicker lines in Figure 2B.

      (3) The black background and images in Figure 3A and B are hard to see as well. 

      We have now replaced Figure 3A and B, now using a white background.

      (4) Typo at the beginning of line 251? 

      Unfortunately we failed to find the typo referred to. We are happy to address it if it still exists in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors):

      (1) I think there is not enough experimental evidence to conclude that the underlying cause of increased motility is the accumulation of digested alginate products. To conclusively show that this is the cause and not just some signal linked to cell density, perhaps the experiment should be repeated with a different carbon source. 

      We thank the reviewer for their comment, which made us realize that we did not make the nature of the dispersal cue clear. The gene expression data were obtained from batch cultures and measured at approximately the same bacterial densities, which indeed shows that digested alginate is a sufficient signal for an increase in motility gene expression. This agrees very well with our observation that cells growing on digested alginate in microfluidic chambers have an increased fraction of motile cells in comparison with cells exposed to alginate (Fig 3E). However, we did not mean to suggest that the observed dispersal by bacterial motility is not influenced by cell density. In fact, we see that dispersal (and hence the increase in cell motility) in microfluidic chambers that are switched from polymeric to digested alginate depends on the bacterial density in the chamber, with higher bacterial densities showing increased dispersal. This shows that the presence of alginate oligomers does trigger dispersal through motility, but this signal affects bacterial groups in a cell-density-dependent manner.

      Similar observations have been made in Caulobacter crescentus, which was found to form cell groups on the polymer xylan while cells disperse when the corresponding monomer xylose becomes available (D’Souza et al., 2021). We reference the additional work in lines 179 and 230. Taken together, these observations indicate a more general phenomenon in dispersal from polysaccharide substrates.

      (2) About the expression data: 

      • Ribosomal proteins and ABC transporters are enriched in cells grown on digested alginate and the authors discuss that this explains the difference in max growth rate between alginate and digested alginate. However, in Figure S2E the authors report no statistical difference between growth rates. 

      We have now edited the manuscript to clarify this point. We found that cells grown on degradation products reached their maximal growth rate around 7.5 hours earlier (Fig. S2D) and showed increased expression of ribosomal biosynthesis and ABC transporters in late-exponential phase (Fig. 4A). We consider this shorter lag time as a sign of a different growth state and therefore a possible reason for the difference in ribosomal protein expression.

      As the reviewer correctly points out, the maximum growth rates that were computed from the two growth curves were not significantly different (Fig. S2E). However, for our gene expression analysis, we harvested the transcriptome of cells that reached OD 0.39-0.41 (mid- to late-exponential phase). At this time point, the cell cultures may have differed in their momentary growth rate.

      We edited the manuscript to make this clearer (line 287):

      “Both observations likely relate to the different growth dynamics of V. cyclitrophicus ZF270 on digested alginate compared to alginate (Fig. S2A), where cells in digested alginate medium reached their maximal growth rate 7.5 hours earlier and thus showed a shorter lag time (Fig. S2D). As a consequence, the growth rate at the time of RNA extraction (mid-to-late exponential phase) may have differed, even though the maximum growth rates of cells grown in alginate medium and digested alginate medium were not found to be significantly different (Fig. S2E).”

      • The increased expression of transporters for lyases in cells grown on digested alginate (lines 273-274 and 325-328) is very confusing and the explanation provided in lines 412-420 is not very convincing. My two cents on this: Expression of more enzymes and induction of motility might be a strategy to be prepared for more likely future environments (after dispersal, alginate is the most likely carbon source they will find). This would be in line with observed increased chemotaxis towards the polymer rather than the monomer (Similar to C. elegans). 

      This comment is in line with reviewer 2, comment 7. In response to these two comments (and as described above), we expanded our discussion of these results to contextualize these findings (around line 443):

      “One reason for this observation may be that cells primarily rely on intracellular monosaccharide levels to trigger the upregulation of genes associated with polysaccharide degradation and catabolism, as has previously been observed for E. coli across various carbon sources [50,51]. In fact, the majority of carbon sources are sensed by prokaryotes through one‑component sensors inside the cell [50]. In the one‑component internal sensing scheme, the enzymes and transporters for the use of various carbon sources are expressed at basal levels, which leads to an increase in pathway intermediates upon nutrient availability. The pathway intermediates are sensed by an internal sensor, usually a transcription factor, and lead to the upregulation of transporter and enzyme expression [50,51]. This results in a positive feedback loop, which enables small changes in substrate abundance to trigger large transcriptional responses [50,52]. Thus, the presence of alginate breakdown products is likely to result in increased expression of all components of the alginate degradation pathway, including the expression of degrading enzymes. As the gene expression analysis was performed on well-mixed cultures in culture medium containing alginate breakdown products, we therefore expect a strong stimulation of alginate catabolism. In a natural scenario, where cells disperse from a polysaccharide hotspot before its exhaustion, the expression of alginate catabolism genes is likely to decrease again once the local concentration of breakdown products decreases. However, continued production of alginate lyases could also provide an advantage when encountering a new alginate source and continued production of alginate lyases may thus help cells to prepare for likely future environments. Further investigations of bacterial enzyme secretion in changing nutrient environments and at relevant spatial scales are required to improve our understanding of the regulation of enzyme secretion along nutrient gradients.”

      Additionally, we agree with the intriguing comment that continued expression of alginate lyases may also prepare cells for likely future environments. Whether marine bacteria are primed by their growth on one carbon source towards faster re-initiation of degradation on a new particle is an interesting question for further studies. We now address this point in our manuscript (line 458):

      “However, continued production of alginate lyases could also provide an advantage when encountering a new alginate source and continued production of alginate lyases may thus help cells to prepare for likely future environments. Further investigations of bacterial enzyme secretion in changing nutrient environments and at relevant spatial scales are required to improve our understanding of the regulation of enzyme secretion along nutrient gradients.”

      (3) The yield reached by Vibrio on alginate is significantly higher than the yield in digested alginate, not similar, as stated in lines 133-134. Only cell counts are similar. Perhaps the author can correct this statement and speculate on the reason leading to this discrepancy: perhaps cells tend to aggregate in alginate despite the fact that these are well-mixed cultures. 

      We have edited the description of the OD measurements accordingly and agree with the reviewer that aggregation is indeed a possible reason for the discrepancy (line 141):

      “We also observed that the optical density at stationary phase was higher when cells were grown on alginate (Fig. S2B and C). However, colony counts did not show a significant difference in cell numbers (Fig. S3), suggesting that the increased optical density may stem from aggregation of cells in the alginate medium, as observed for other Vibrio species [7].”

      (4) I suggest toning down the importance of the results presented in this study for understanding global carbon cycling. There is a link but at present it is too much emphasized. 

      We have edited our statements regarding the carbon cycle. In the revised manuscript we stress the lack of direct quantifications of carbon cycling. We still refer to carbon flow in the revised manuscript, as we would argue that microbial remineralization of biomass is recognized as an important factor in the marine biological carbon pump (e.g., Chisholm, 2000) and research on marine bacterial foraging investigates how bacterial cells manage to find and utilize this biomass.

      Our revised manuscript contains the following modified statements (line 47 and line 60): “Even though many studies indicate that these degradation-dispersal cycles contribute to the carbon flow in marine systems, we know little about how cells alternate between polysaccharide degradation and motility, and which environmental factors trigger this behavioral switch.”

      “Overall, our findings reveal cellular mechanisms that might also underlie bacterial degradation-dispersal cycles, which influence the remineralization of biomass in marine environments.”

      References

      • Alcolombri, U., Peaudecerf, F. J., Fernandez, V. I., Behrendt, L., Lee, K. S., & Stocker, R. (2021). Sinking enhances the degradation of organic particles by marine bacteria. Nature Geoscience, 14(10), 775–780. https://doi.org/10.1038/s41561-021-00817-x
      • Bassler, B. L., Gibbons, P. J., Yu, C., & Roseman, S. (1991). Chitin utilization by marine bacteria. Chemotaxis to chitin oligosaccharides by Vibrio furnissii. Journal of Biological Chemistry, 266(36), 24268–24275. https://doi.org/10.1016/S0021-9258(18)54224-1
      • Chisholm, S. W. (2000). Stirring times in the Southern Ocean. Nature, 407(6805), 685–686. https://doi.org/10.1038/35037696
      • Chubukov, V., Gerosa, L., Kochanowski, K., & Sauer, U. (2014). Coordination of microbial metabolism. Nature Reviews. Microbiology, 12(5), 327–340. https://doi.org/10.1038/nrmicro3238
      • Clerc, E. E., Raina, J.-B., Keegstra, J. M., Landry, Z., Pontrelli, S., Alcolombri, U., Lambert, B. S., Anelli, V., Vincent, F., Masdeu-Navarro, M., Sichert, A., De Schaetzen, F., Sauer, U., Simó, R., Hehemann, J.-H., Vardi, A., Seymour, J. R., & Stocker, R. (2023). Strong chemotaxis by marine bacteria towards polysaccharides is enhanced by the abundant organosulfur compound DMSP. Nature Communications, 14(1), 8080. https://doi.org/10.1038/s41467-023-43143-z
      • Dal Co, A., van Vliet, S., Kiviet, D. J., Schlegel, S., & Ackermann, M. (2020). Short-range interactions govern the dynamics and functions of microbial communities. Nature Ecology and Evolution, 4(3), 366–375. https://doi.org/10.1038/s41559-019-1080-2
      • D’Souza, G., Ebrahimi, A., Stubbusch, A., Daniels, M., Keegstra, J., Stocker, R., Cordero, O., & Ackermann, M. (2023). Cell aggregation is associated with enzyme secretion strategies in marine polysaccharide-degrading bacteria. The ISME Journal. https://doi.org/10.1038/s41396-023-01385-1
      • D’Souza, G. G., Povolo, V. R., Keegstra, J. M., Stocker, R., & Ackermann, M. (2021). Nutrient complexity triggers transitions between solitary and colonial growth in bacterial populations. The ISME Journal, 15(9), 2614–2626. https://doi.org/10.1038/s41396-021-00953-7
      • D’Souza, G., Schwartzman, J., Keegstra, J., Schreier, J. E., Daniels, M., Cordero, O. X., Stocker, R., & Ackermann, M. (2023). Interspecies interactions determine growth dynamics of biopolymer-degrading populations in microbial communities. Proceedings of the National Academy of Sciences of the United States of America, 120(44), e2305198120. https://doi.org/10.1073/pnas.2305198120
      • Fenchel, T. (2002). Microbial Behavior in a Heterogeneous World. Science, 296(5570), 1068–1071. https://doi.org/10.1126/science.1070118
      • Jiao, N., Luo, T., Chen, Q., Zhao, Z., Xiao, X., Liu, J., Jian, Z., Xie, S., Thomas, H., Herndl, G. J., Benner, R., Gonsior, M., Chen, F., Cai, W.-J., & Robinson, C. (2024). The microbial carbon pump and climate change. Nature Reviews Microbiology. https://doi.org/10.1038/s41579-024-01018-0
      • Keegstra, J. M., Carrara, F., & Stocker, R. (2022). The ecological roles of bacterial chemotaxis. Nature Reviews Microbiology, 20(8), 491–504. https://doi.org/10.1038/s41579-022-00709-w
      • Konishi, H., Hio, M., Kobayashi, M., Takase, R., & Hashimoto, W. (2020). Bacterial chemotaxis towards polysaccharide pectin by pectin-binding protein. Scientific Reports, 10(1), 3977. https://doi.org/10.1038/s41598-020-60274-1
      • Li, Y., Sun, H., Ma, X., Lu, A., Lux, R., Zusman, D., & Shi, W. (2003). Extracellular polysaccharides mediate pilus retraction during social motility of Myxococcus xanthus. Proceedings of the National Academy of Sciences, 100(9), 5443–5448. https://doi.org/10.1073/pnas.0836639100
      • Martínez-Antonio, A., Janga, S. C., Salgado, H., & Collado-Vides, J. (2006). Internal sensing machinery directs the activity of the regulatory network in Escherichia coli. Trends in Microbiology, 14(1), 22–27. https://doi.org/10.1016/j.tim.2005.11.002
      • McDougald, D., Rice, S. A., Barraud, N., Steinberg, P. D., & Kjelleberg, S. (2012). Should we stay or should we go: Mechanisms and ecological consequences for biofilm dispersal. Nature Reviews Microbiology, 10(1), 39–50. https://doi.org/10.1038/nrmicro2695
      • Nguyen, T. T. H., Zakem, E. J., Ebrahimi, A., Schwartzman, J., Caglar, T., Amarnath, K., Alcolombri, U., Peaudecerf, F. J., Hwa, T., Stocker, R., Cordero, O. X., & Levine, N. M. (2022). Microbes contribute to setting the ocean carbon flux by altering the fate of sinking particulates. Nature Communications, 13(1), 1657. https://doi.org/10.1038/s41467-022-29297-2
      • Norris, N., Alcolombri, U., Keegstra, J. M., Yawata, Y., Menolascina, F., Frazzoli, E., Levine, N. M., Fernandez, V. I., & Stocker, R. (2022). Bacterial chemotaxis to saccharides is governed by a trade-off between sensing and uptake. Biophysical Journal, 121(11), 2046–2059. https://doi.org/10.1016/j.bpj.2022.05.003
      • Povolo, V. R., D’Souza, G. G., Kaczmarczyk, A., Stubbusch, A. K., Jenal, U., & Ackermann, M. (2022). Extracellular appendages govern spatial dynamics and growth of Caulobacter crescentus on a prevalent biopolymer. bioRxiv, 2022.06.13.495907. https://doi.org/10.1101/2022.06.13.495907
      • Preheim, S. P., Boucher, Y., Wildschutte, H., David, L. A., Veneziano, D., Alm, E. J., & Polz, M. F. (2011). Metapopulation structure of Vibrionaceae among coastal marine invertebrates. Environmental Microbiology, 13(1), 265–275. https://doi.org/10.1111/j.1462-2920.2010.02328.x
      • Schwartzman, J. A., Ebrahimi, A., Chadwick, G., Sato, Y., Orphan, V., & Cordero, O. X. (2021). Bacterial growth in multicellular aggregates leads to the emergence of complex lifecycles. bioRxiv, 2021.11.01.466752. https://doi.org/10.1101/2021.11.01.466752
      • Singh, P. K., Bartalomej, S., Hartmann, R., Jeckel, H., Vidakovic, L., Nadell, C. D., & Drescher, K. (2017). Vibrio cholerae Combines Individual and Collective Sensing to Trigger Biofilm Dispersal. Current Biology, 27(21), 3359-3366.e7. https://doi.org/10.1016/j.cub.2017.09.041
      • Ulrich, L. E., Koonin, E. V., & Zhulin, I. B. (2005). One-component systems dominate signal transduction in prokaryotes. Trends in Microbiology, 13(2), 52–56. https://doi.org/10.1016/j.tim.2004.12.006
      • Wall, M. E., Hlavacek, W. S., & Savageau, M. A. (2004). Design of gene circuits: Lessons from bacteria. Nature Reviews Genetics, 5(1), 34–42. https://doi.org/10.1038/nrg1244
      • Yawata, Y., Carrara, F., Menolascina, F., & Stocker, R. (2020). Constrained optimal foraging by marine bacterioplankton on particulate organic matter. Proceedings of the National Academy of Sciences, 117(41), 25571–25579. https://doi.org/10.1073/pnas.2012443117
      • Yawata, Y., Cordero, O. X., Menolascina, F., Hehemann, J.-H., Polz, M. F., & Stocker, R. (2014). Competition–dispersal tradeoff ecologically differentiates recently speciated marine bacterioplankton populations. Proceedings of the National Academy of Sciences, 111(15), 5622–5627. https://doi.org/10.1073/pnas.1318943111
      • Zöttl, A., & Yeomans, J. M. (2019). Enhanced bacterial swimming speeds in macromolecular polymer solutions. Nature Physics, 15(6), 554–558. https://doi.org/10.1038/s41567-019-0454-3
    1. Author response:

      The following is the authors’ response to the original reviews.

      Response to Reviewer 1

      (Cys25)PTH(1-84) does not show efficacy surpassing that of the previously used rhPTH(1-34). This needs to be discussed biologically and clinically.

      Thank you very much for your valuable comments, which have helped us improve the manuscript. We have noted that this aspect was not addressed in the discussion and have included the following paragraph in the Discussion section.

      “This biological difference is thought to be due to dimeric R25CPTH(1-34) exhibiting a more preferential binding affinity for the RG versus R0 PTH1R conformation, despite having a diminished affinity for either conformation. Additionally, the potency of cAMP production in cells was lower for dimeric R25CPTH compared to monomeric R25CPTH, consistent with its lower PTH1R-binding affinity (Noh et al., 2024). One of the potential clinical advantages of dimeric R25CPTH(1-34) is its partial agonistic effect in pharmacodynamics. This property may allow for a more fine-tuned regulation of bone metabolism, potentially reducing the risk of adverse effects associated with full agonism, such as hypercalcemia and bone resorption by osteoclast activity. Moreover, the dimeric form may offer a more sustained anabolic response, which could be beneficial in the context of long-term treatment strategies (Noh et al., 2024). Also, the effects of the dimer were prominent, as shown by better bone formation than in the control group.” (2nd paragraph, Discussion section)

      The terms (Cys25)PTH(1-84) and Dimeric R25CPTH(1-34) are being used interchangeably and incorrectly. A unification of these terms is necessary.

      We fully agree with the reviewer. R25CPTH(1-84) denotes the mutated human PTH, whereas rhPTH(1-34) and dimeric R25CPTH(1-34) are synthesized PTH analogs. To clarify the terminology, we have revised it throughout the manuscript; the changes appear in red.

      The figure legend is incorrect. Not all figures are described, and even though there are figures from A to I, only up to E is explained, or the content is different.

      We apologize for our negligence. As suggested by the reviewer, we have corrected the figure legends, which appear before the list of references in the manuscript, as follows.

      “Figure legends

      Figure 1. Micro-CT analysis. (A-D) Experimental design for the controlled delivery of rhPTH(1-34) and dimeric R25CPTH(1-34) in the ovariectomized beagle model. Representative images for injection and placement of titanium implant. (E) Micro-CT analysis. Bone mineral density (BMD), bone volume (TV; mm³), trabecular number (Tb.N; 1/mm), trabecular thickness (Tb.Th; µm), trabecular separation (Tb.Sp; µm). Error bars indicate standard deviation. Data are shown as mean ± s.d. *p<0.05, **p<0.01, ***p<0.001, n.s., not significant. P, posterior. R, right.

      Figure 2. (A-I) Histological analysis of the different groups stained with Goldner’s trichrome. The presence of bone is marked by the green color and soft tissue by red. Red arrows indicate positions with soft tissue and no bone around the implant threads. The area of bone formed was the widest in the rhPTH(1-34)-treated group. In the dimeric R25CPTH(1-34)-treated group, there is a greater amount of bone than in the vehicle-treated group. Green arrows represent the bone formed over the implant. Blue dotted line, margin of bone and soft tissue. Scale bars: 1 mm.

      Figure 3. Histological analysis using Masson trichrome staining in the rhPTH(1-34)- and dimeric R25CPTH(1-34)-treated groups. (A-L) Masson trichrome-stained sections of cancellous bone in the mandibular bone. The formed bone is marked by the color red. Collagen is stained blue. Black dotted box: magnified region of trabecular bone in the mandible. Scale bars, A-C, G-I: 1 mm; D-F, J-L: 200 µm.

      Figure 4. Immunohistochemical analysis using TRAP staining for bone remodeling activity. (A-L) TRAP staining is used to evaluate bone remodeling by staining osteoclasts. Osteoclasts are shown in purple. Black dotted box: magnified region of trabecular bone in the mandible. (M, N) The number of TRAP-positive cells in the mandible of the rhPTH(1-34)- and dimeric R25CPTH(1-34)-treated beagles. Scale bars, A-C, G-I: 1 mm; D-F, J-L: 200 µm. Error bars indicate standard deviation. Data are shown as mean ± s.d. *p<0.05, **p<0.01, n.s., not significant.

      Figure 5. Measurement of biochemical marker dynamics in serum. The serum levels of calcium, phosphorus, P1NP, and CTX across three time points (T0, T1, T2) following treatment with dimeric R25CPTH(1-34), rhPTH(1-34), or control. (A-B) Calcium and phosphorus levels exhibit an upward trend in response to both PTH treatments compared to control, suggesting enhanced bone mineralization. (C) P1NP levels, indicative of bone formation, remain relatively unchanged across time and treatments. (D) CTX levels, associated with bone resorption, show no significant differences between groups. Data points for the dimeric R25CPTH(1-34), rhPTH(1-34), and control are marked by squares, circles, and triangles, respectively, with error bars representing confidence intervals.

      Supplementary Figure. Three-dimensional reconstructed image of the bone surrounding the implants. Three-dimensional reconstructed images of the peri-implant bone depicting the osseointegration after different therapeutic interventions. (A) Represents the bone response to recombinant human parathyroid hormone fragment (rhPTH 1-34) treatment, showing the most robust degree of bone formation around the implant in the three groups. (B) Shows the bone response to a modified PTH fragment (dimeric R25CPTH(1-34)), indicating a similar level of bone growth and integration as seen with rhPTH(1-34), although to a slightly lesser extent. (C) Serves as the control group, demonstrating the least amount of bone formation and osseointegration. The upper panel provides a top view of the bone-implant interface, while the lower panel offers a cross-sectional view highlighting the extent of bony ingrowth and integration with the implant surface.”

      In Figure 5, although the descriptions of T0, T1, T2 are mentioned in the method section, it would be more clear if there was a timeline like in Figure 1.

      Based on the reviewer’s advice, we have indicated the timing of T0, T1, and T2 in the materials & methods section describing the serum biochemical assay, and we have shown a timeline in figure 5.

      In Figure 5, instead of having calcium, phosphorus, P1NP, CTX graphs all under Figure 5, it would be more convenient for referencing in the text to label them as Figure 5A, Figure 5B, Figure 5C, Figure 5D.

      We understand the reviewer’s comment. As the reviewer suggested, we have corrected the labeling in the text for Figure 5 as follows.

      “The levels of calcium, phosphorus, CTX, and P1NP were analyzed over time using RM-ANOVA (Figure 5). There were no significant differences between the groups for calcium and phosphorus at time points T0 and T1 (Figure 5A). However, after the PTH analog was administered at T2 (Figure 5A), the levels were highest in the rhPTH(1-34) group, followed by the dimeric R25CPTH(1-34) group, and lowest in the control group, a difference that was statistically significant (P < 0.05; Figure 5B,C). The differences between the groups over time for CTX and P1NP were not statistically significant (Figure 5D, E).”

      Significance should be indicated in the figure (no asterisk present).

      As the reviewer suggested, we have added asterisks indicating significance to Figure 5.

      Addition of Figures in Text:

      Line 112: change from "figure 2" to "figure 1" / Line 115: mention "figure 1. E"

      Line 120: refer to "figure 1. E" / Line 123: change from "figure 3" to "figure 2"

      Line 128: refer to "figure 2.A-C" / Line 137: mention "figure 3"

      Line 138: refer to "figure 3. A-L" / Line 143: mention "figure 3. A-L"

      Line 144: refer to "figure 3. E,F,K,L" / Line 148: mention "figure 4"

      Line 150: refer to "figure 4 M,N" / Line 152: mention "figure 4. M,N"

      Line 155: refer to "figure 5" / Line 157: mention "figure 5"

      Line 159: refer to "figure 5" / Line 171: mention "figure 1 E"

      Line 175: refer to "figure 2 M, N"/ Line 194: mention "figure 3"

      Thank you for these suggestions. The corrected figure references in the text are marked in red.

      Response to Reviewer 2

      First, the authors should clarify why they compared the effects of rhPTH(1-34) and of dimeric R25C2 PTH(1-34)? In most of the parameters, rhPTH(1-34) seems to be superior to dimeric R25C2 PTH(1-34). Why did the authors insist that the anabolic effects of dimer were prominent? Even though implication of dimeric R25C2 PTH(1-34) was drawn from genetic mutation studies, the authors should describe more clearly in the discussion the potential clinical benefits of the dimeric R25C2 PTH(1-34) compared to rhPTH(1-34), especially if dimeric R25C2 PTH(1-34) has just partial agonistic effect in pharmacodynamics.

      Thank you for your insightful comments and questions regarding our results for rhPTH(1-34) and dimeric R25CPTH(1-34). rhPTH(1-34) is a well-characterized therapy for osteoporosis. Although rhPTH(1-34) generally showed superior outcomes in most parameters tested in this study, the dimeric R25CPTH(1-34) exhibited specific anabolic effects that are not as pronounced with rhPTH(1-34). We therefore regard dimeric R25CPTH(1-34) as an anabolic effector. One of the potential advantages of dimeric R25CPTH(1-34) is its partial agonistic effect in pharmacodynamics. This property may allow for a more fine-tuned regulation of bone metabolism, potentially reducing the risk of adverse effects associated with full agonism, such as hypercalcemia and bone resorption by osteoclast activity. Moreover, the dimeric form may offer a more sustained anabolic response, which could be beneficial in the context of long-term treatment strategies. In addition, based on our results, we note that the effects of the dimer were prominent in the sense that bone formation was better than in the control group. We appreciate your input and acknowledge that this aspect was not addressed in the discussion. We have therefore included the following paragraph in the Discussion section.

      “This biological difference is thought to be due to dimeric R25CPTH(1-34) exhibiting a more preferential binding affinity for the RG versus R0 PTH1R conformation, despite having a diminished affinity for either conformation. Additionally, the potency of cAMP production in cells was lower for dimeric R25CPTH compared to monomeric R25CPTH, consistent with its lower PTH1R-binding affinity (Noh et al., 2024). One of the potential clinical advantages of dimeric R25CPTH(1-34) is its partial agonistic effect in pharmacodynamics. This property may allow for a more fine-tuned regulation of bone metabolism, potentially reducing the risk of adverse effects associated with full agonism, such as hypercalcemia and bone resorption by osteoclast activity. Moreover, the dimeric form may offer a more sustained anabolic response, which could be beneficial in the context of long-term treatment strategies (Noh et al., 2024). Also, the effects of the dimer were prominent, as shown by better bone formation than in the control group.” (2nd paragraph, Discussion section)

      Second, please describe the intermittent and continuous application of PTH analogues. Many of the readers may misunderstand that the authors' daily injection of PTHs were actually to mimic the clinical intermittent application or continuous one. Incorporation of the author's intention for experimental design would be more helpful for readers.

      Thank you for your insightful comments regarding the need for clearer differentiation between intermittent and continuous applications of PTH analogs in this study. We appreciate your concern that the readers may not fully grasp whether our daily injection protocol was intended to mimic clinical intermittent or continuous PTH administration. To address this, we have revised the manuscript to explicitly clarify that the daily injections of rhPTH(1-34) and dimeric R25CPTH(1-34) were designed to simulate the intermittent dosing regimen commonly used in clinical practice. This regimen is known to maximize the anabolic effects on bone while minimizing potential catabolic actions associated with more frequent or continuous hormone exposure. We have added detailed explanations in the Introduction, Methods, and Discussion sections to help readers understand our experimental design and its relevance to clinical settings.

      Introduction section

      “Administration of parathyroid hormone (PTH) analogs can be categorized into two distinct protocols: intermittent and continuous. Intermittent rhPTH(1-34) therapy, typically characterized by daily injections, is clinically used to enhance bone formation and strength. This method leverages the anabolic effects of rhPTH(1-34) without significant bone resorption, which can occur with more frequent or continuous exposure. On the other hand, continuous rhPTH(1-34) exposure, often modeled in research as constant infusion, tends to accelerate bone resorption activities, potentially leading to bone loss (Silva and Bilezikian, 2015; Jilka, 2007). Understanding these differences is crucial for interpreting the therapeutic implications of rhPTH(1-34) in bone health.”

      Silva, B. C., & Bilezikian, J. P. (2015). Parathyroid hormone: anabolic and catabolic actions on the skeleton. Current Opinion in Pharmacology, 22, 41-50.

      Jilka, R. L. (2007). Molecular and cellular mechanisms of the anabolic effect of intermittent PTH. Bone, 40(6), 1434-1446.

      Materials and Methods section

      “Each animal received one injection per day, aimed at replicating the intermittent rhPTH(1-34) exposure proven beneficial for bone regeneration and overall skeletal health in clinical settings (Neer et al., 2001; Kendler et al., 2018). This regimen was chosen to investigate the potential anabolic effects of these specific PTH analogs under conditions closely resembling therapeutic use.”

      Neer, R. M., Arnaud, C. D., Zanchetta, J. R., Prince, R., Gaich, G. A., Reginster, J. Y., Hodsman, A. B., Eriksen, E. F., Ish-Shalom, S., Genant, H. K., Wang, O., and Mitlak, B. H. (2001). Effect of Parathyroid Hormone (1-34) on Fractures and Bone Mineral Density in Postmenopausal Women with Osteoporosis. The New England Journal of Medicine, 344(19), 1434-1441.

      Kendler, D. L., Marin, F., Zerbini, C. A. F., Russo, L. A., Greenspan, S. L., Zikan, V., Bagur, A., Malouf-Sierra, J., Lakatos, P., Fahrleitner-Pammer, A., Lespessailles, E., Minisola, S., Body, J. J., Geusens, P., Moricke, R., & Lopez-Romero, P. (2018). Effects of Teriparatide and Risedronate on New Fractures in Post-Menopausal Women with Severe Osteoporosis (VERO): A Multicenter, Double-Blind, Double-Dummy, Randomized Controlled Trial. The Lancet, 391(10117), 230-240.

      Discussion section

      “The use of daily injections in this study was intended to simulate intermittent PTH therapy, a well-established clinical approach for managing osteoporosis and enhancing bone regeneration. Intermittent administration of PTH, as opposed to continuous exposure, is critical for maximizing the anabolic response while minimizing the catabolic effects that are associated with higher frequency or continuous hormone levels. Our findings support the notion that even with daily administration, both rhPTH(1-34) and dimeric R25CPTH(1-34) promote bone formation and osseointegration, consistent with the outcomes expected from intermittent therapy. It’s important for future research to consider the dosage and timing of administration to further optimize the therapeutic benefits of PTH analogs (Dempster et al., 2001; Hodsman et al., 2005).”

      Dempster, D. W., Cosman, F., Kurland, E. S., Zhou, H., Nieves, J., Woelfert, L., Shane, E., Plavetic, K., Müller, R., Bilezikian, J., & Lindsay, R. (2001). Effects of Daily Treatment with Parathyroid Hormone on Bone Microarchitecture and Turnover in Patients with Osteoporosis: A Paired Biopsy Study. Journal of Bone and Mineral Research, 16(10), 1846-1853.

      Hodsman, A. B., Bauer, D. C., Dempster, D. W., Dian, L., Hanley, D. A., Harris, S. T., Kendler, D. L., McClung, M. R., Miller, P. D., Olszynski, W. P., Orwoll, E., Yuen, C. K. (2005). Parathyroid Hormone and Teriparatide for the Treatment of Osteoporosis: A Review of the Evidence and Suggested Guidelines for Its Use. Endocrine Reviews, 26(5), 688-703.

      Third, please unify the nomenclature. Ensure consistency in the nomenclature throughout the article. Unify the naming conventions for PTH analogues, such as rhPTH(1-34) vs teriparatide and (Cys25)PTH(1-84) vs R25CPTH(1-34) vs R25CPTH(1-34) vs (1-84). Choose one nomenclature for each analogue and use it consistently throughout the article.

      We fully agree with the reviewer. R25CPTH(1-84) denotes the mutated human PTH, whereas rhPTH(1-34) and dimeric R25CPTH(1-34) are synthesized PTH analogs. To clarify the terminology, we have unified it throughout the manuscript; the changes appear in red.

      Response to Reviewer 3

      I would recommend to rewrite the manuscript in a form that is more understandable to the readers. In fact, it appears to me that this work was originally formatted in a way that would need the Materials and Methods to precede the results. As presented (and as requested by the eLife formatting) the Materials and Methods are available only at the end of the reading and, as a consequence, the readers needs to refer to the Materials and Methods to have a general and initial understanding of the study design (i.e. type of treatment for each group, etc are not well specified in the Results section).

      Thank you for your constructive comments and suggestions regarding the manuscript. We appreciate your feedback on the overall organization of the manuscript. As the reviewer mentioned, the Materials and Methods were placed after the Discussion section in accordance with the format of eLife. To give readers an initial understanding of the study design, a description of each experimental group has been added to the Results section as follows. Thank you again for your valuable comments.

      “This study evaluated and compared the efficacy of rhPTH(1-34) and the dimeric R25CPTH(1-34) in promoting bone regeneration and healing in a clinically relevant animal model. Beagle dogs were selected as the model due to their anatomical similarity to human oral structures, suitable size for surgeries, human-like bone turnover rates, and established oral health profiles, ensuring comparable and ethically sound research outcomes. The animals were divided into three groups: a control group injected with normal saline, a group injected with 40 µg/day PTH (Forsteo, Eli Lilly), and a group injected with 40 µg/day of the PTH analog. Animals in each group were injected subcutaneously for 10 weeks.”

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      In this fundamental study, the authors use innovative fine-scale motion capture technologies to study visual vigilance with high-acuity vision, to estimate the visual fixation of free-feeding pigeons. The authors present convincing evidence for use of the fovea to inspect predator cues, the behavioral state influencing the latency for fovea use, and the use of the fovea decreasing the latency to escape of both the focal individual and other flock members. The work will be of broad interest to behavioral ecologists.

      We thank the editor for his interest and feedback on the manuscript. We address the reviewer’s comments below.

      Reviewer #1 (Public Review):

      Summary:

      The authors used an innovative technique to study visual vigilance based on high-acuity vision (the fovea). Combining motion-capture features and the visual space around the head, the authors were able to estimate the visual fixation of free-feeding pigeons at any moment. By simulating predator attacks on screens, they showed that 1) pigeons used their fovea to inspect predator cues, 2) the behavioural state (feeding or head-up) influenced the latency to use the fovea, and 3) use of the fovea decreased the latency to escape of both the individual that foveated on the predator cues and the other flock members.

      Strengths:

      The paper is very interesting, and combines an innovative technique well adapted to studying the importance of high-acuity vision for spotting a predator and for improving the behavioural response (escaping). The results are strong and the models used are well adapted. This paper is a major contribution to our understanding of the use of visual adaptation in a foraging context when at risk. This is also a major contribution to the understanding of individual interaction in a flock.

      Weaknesses:

      I have identified only two weaknesses:

      (1) The authors often mixed the methods and the results, which reduces the readability and fluidity of the manuscript. I would recommend the authors to re-structure the manuscript.

      (2) In some parts, the authors stated that they reconstructed the visual field of the pigeon, which is not true. They identified the foveal positions, but not the visual fields, which involve different sectors (binocular, monocular or blind). Similarly, they sometimes mix-up the area centralis and the fovea, which are two different visual adaptations.

      Thank you for your positive feedback. We addressed these comments by restructuring the methods and result sections as suggested, and by checking the terminology and specific vocabulary used throughout the manuscript.

      Reviewer #1 (Recommendations For The Authors):

      First, I would like to say that I really enjoyed the manuscript. This is a great contribution to the field.

      Thank you for the positive feedback, we highly appreciate it.

      Then, I have some comments that I hope, would help the authors to improve the manuscript.

      Major comments :

      I would recommend the authors to restructure the methods and the results section. In many parts, the models used are presented in the results section, while this should be presented in the methods section.

      Thank you for the suggestion. We have now ensured that the model descriptions are presented in the statistics section of the methods.

      To me, the introduction is too long (more than 5 pages). It would be beneficial to reduce it considerably. Furthermore, the introduction is missing some information about the visual abilities of your species (visual acuity, visual field, temporal resolution, contrast sensitivity, etc.).

      We agree that the introduction was very long and reduced it by removing the “Methodological issues” as well as strongly reducing the “Experimental rationales” to a minimum. We also added the missing information on the visual abilities of the pigeons in the “Experimental rationales” section (see L135-150). Please note, however, that we refer to the temporal resolution of pigeon vision in the method section, to associate it with the information of the used monitor’s resolution.

      Minor comments :

      Lines 37-39: This needs a reference.

      A reference has been added (McFarland, 1977)

      Lines 39-41: But see some papers published recently on Harris's hawks.

      Thank you for the references, we added the citation as well as a few more papers (Kane et al., 2015; Kano et al., 2018; Miñano et al., 2023; Yorzinski & Platt, 2014).

      Lines 41-43: This sentence needs a reference as well.

      A reference has been added (Cresswell, 1994; M. H. R. Evans et al., 2018; Inglis & Lazarus, 1981)

      Lines 56-103: In this paragraph, head down and head up also depends from the retinal map of the birds! Some birds have visual streak that allow them to see a potential threats while foraging. Please add more information about the importance of photoreceptors distribution.

      Thank you for pointing out this issue. We rewrote the sentence at L65-69 as follows to include the importance of retinal structures.

      “In several species, especially those with a broad visual field and specific retinal structures such as the visual streaks, individuals can simultaneously engage in foraging activities while remaining vigilant (Fernández-Juricic, 2012), likely using peripheral vision to detect approaching threats (Bednekoff & Lima, 2005; Cresswell et al., 2003; Kaby & Lind, 2003; Lima & Bednekoff, 1999).”

      Lines 76-79: you wrote : ".... favor alternative hypotheses based on their findings". Which findings? You need to explain.

      We rewrote this part as follows (L80-81).

      “other studies found evidence for the risk dilution (Beauchamp & Ruxton, 2008) and the edge effect (Inglis & Lazarus, 1981) in their study systems.”

      Lines 109-110: It would be good to have a representation of what is an area and a fovea, and how it is placed in the eye, what type of fovea exists and how it is related to visual field. Where does it project?

      We now give a better description of the pigeon’s visual field in the “Experimental rationales” section, which we hope will help the reader understand the key features of pigeon vision (see L135-150). Specifically, we now say in L137-138:

      “they have one fovea centrally located in the retina of each eye, with an acuity of 12.6 c/deg (Hodos et al., 1985). Their fovea projects laterally at ~75° into the horizon in their visual field.”

      Lines 109-113: You might need to see some new papers here about the fovea. See for instance Bringmann 2019.

      Thank you for the suggestion, we now give a more precise definition of the fovea and refer to Bringmann’s paper for more details (L113-114):

      “a pit-like area in the retina with high concentration of cone cells where visual acuity is highest, and is responsible for sharp, detailed, and color vision.”

      Lines 113-120: Please explain how the visual field is related to fovea? Where is the fovea project in the visual fields?

      Similarly to the question above, we now give a more precise description of the pigeon’s visual field (see L135-150).

      Line 131-134: For a non-expert, you would need to explain what is micro, meso and macro scale?

      These sentences have been removed when shortening the introduction and we are not referring to micro, meso and macro scales anymore.

      Lines 134-136: Please explain in one sentence the technique here.

      We now explain in one sentence how motion capture enables the tracking of head and body orientation (L130-132):

      “Motion capture cameras track with high accuracy the 3D position of markers, which, when attached to the pigeon’s head and body, enables to reconstruct the rotations of the head and body in all directions.”

      Line 140: You presented here for the first time the word "foveation". Has this term been used before? If so, please add a reference. If not, please explain what you mean by foveation precisely.

      Thank you for noticing this omission. We now provide the following definition, “directing visual focus to the fovea to achieve the clearest vision”, at the first mention of the term foveation (L149-150).

      Lines 146-148: Please explain why this proves that it is appropriate to not record eyes movements, and is this true for every behaviours?

      We acknowledge that some small eye movements might occur and reduce the accuracy of the method. We account for this error by using a ±10 degree range around the foveas. The lines the reviewer referred to were removed when shortening the introduction, but we added an explanation in the paragraph describing pigeon vision to make it clearer (L147-150):

      “Yet, it should be noted that their eye movement was not tracked in our system, although it is typically confined within a 5 degrees range (Wohlschläger et al., 1993). We thus considered this estimation error of the foveation (directing visual focus to the fovea to achieve the clearest vision) in our analysis, as a part of the error margin (see Methods).”
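      The error-margin logic described above can be illustrated with a minimal sketch. It assumes a head-centred frame (x forward along the beak, y to the left, z up), a foveal axis at ±75° azimuth on the horizon, and the ±10° margin mentioned in the response; the function names and the frame convention are hypothetical and are not taken from the authors' pipeline.

```python
import math

FOVEA_AZIMUTH_DEG = 75.0   # lateral projection of each fovea (from the text)
ERROR_MARGIN_DEG = 10.0    # margin absorbing untracked eye movement (from the text)


def direction(azimuth_deg, elevation_deg):
    """Unit vector in an assumed head frame: x forward, y left, z up."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))


def angle_between_deg(u, v):
    """Angle in degrees between two non-zero 3D vectors."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("vectors must be non-zero")
    cos_angle = sum(a * b for a, b in zip(u, v)) / norm
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return math.degrees(math.acos(cos_angle))


# Assumed foveal axes: left eye at +75 deg azimuth, right eye at -75 deg.
FOVEA_AXES = {
    "left": direction(FOVEA_AZIMUTH_DEG, 0.0),
    "right": direction(-FOVEA_AZIMUTH_DEG, 0.0),
}


def foveating_eye(target_in_head_frame):
    """Return the eye whose foveal region contains the target, else None."""
    for eye, axis in FOVEA_AXES.items():
        if angle_between_deg(axis, target_in_head_frame) <= ERROR_MARGIN_DEG:
            return eye
    return None
```

      For example, a target at 70° azimuth on the horizon falls inside the left foveal region, whereas a target straight ahead of the beak falls in neither region.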

      Lines 161-163: What is the frontal and binocular field for? You would need to explain the different fields of view and what they are supposed to be for.

      Furthermore, does the visual field of pigeon have been studied? If so, you would need to add more information about it.

      This information is now given in the new paragraph describing the pigeon’s vision in the “Experimental rationales” section (see L135-150).

      Figure 1: It is not clear here which panels correspond to a, b or c. Please use some boxes to clarify it.

      Thank you for the comment, we now have made the figure’s sub-panels clearer.

      Lines 193-194: You wrote "... such as foveas (also known as the area centralis)". No, this is not the same.

      (1) In some species, you have two foveas, one placed centrally in the retina, one place temporally. So the fovea is not the area centralis.

      (2) Second, some species do have an area centralis but without a fovea.

      Thank you for pointing out the inaccuracy. In this case, we were referring specifically to the pigeon’s fovea, which is sometimes referred to as the “area centralis”, but we have now changed the sentence as follows to avoid any confusion (L174-175):

      “The initial two hypotheses (Hypotheses 1 and 2) aim to examine whether foveation correlates with predator detection.”

      Lines 192-212: I did not understand the logic of the hypotheses numbers? Why do you have 2.1 but not 3.1 for instance? And if you have two hypotheses for the within a global one (for instance, 2.1 and 2.2), what is the main hypothesis 2? You should explain more here because we get lost here and in the result section as well.

      We recognize this section might have appeared confusing to the reader. In short, we had four main hypotheses: 1) the fovea is used to evaluate predator cues, 2) the latency to foveate is related to vigilance behaviors. These first 2 hypotheses aimed to determine if the latency to foveate on the predator cue could be related to the detection. 3) foveation is related to the escape response of the pigeons and 4) there is a collective influence in the escape response. We further divided some of the hypotheses into 2 sub-hypotheses whenever 2 different tests were used to answer the same question. We have modified this section to be clearer.

      Lines 224-229: Where are the figures and statistics for these results?

      These results are presented in Table S1. We apologize for forgetting to add this reference and have now added it (L211).

      Lines 229-231: This should be in the method section.

      This model explanation (as well as all others mentioned hereafter) has been moved to the method section as suggested.

      Lines 248-252: This should be in the method section. Furthermore, you should better explain the model selection.

      Please see earlier comment. Additionally, we are now better explaining how the model has been built.

      Figure 2: It is not clear on the figure which letters correspond to which panels. Please improve the readability of the figure.

      It was modified accordingly.

      Lines 274-278: This should be in the method section.

      Please see earlier comment.

      Line 281: The "Fig.3" should be mentioned in the previous sentence.

      It was modified accordingly.

      Figure 3: Please explain why the latency to foveate had negative values in Fig.2 but not here, and not in Fig. 4 as well. This again highlights that we missed a number of information in the methods about the transformation of the data and the model selection.

      The variable presented in Fig. 2d is not the latency to foveate but the “Normalized frequency at which the object was observed within foveal regions” (hypothesis 1). It represents the amount of time the object was lying within one of the foveal regions of the individual (“how long the pigeons foveated on it”), further normalized to unit sum to make all objects comparable. This variable was indeed logit-transformed (hence the negative values) to improve residual fit in the model, but this transformation (like all others) is always clearly stated in the axis labels of the graphs. Additionally, we have now improved the statistical analysis section to make the model used for each hypothesis test clearer. Please let us know if you have suggestions for further improving the presentation.
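      To show why a logit-transformed proportion takes negative values, here is a minimal sketch of a unit-sum normalization followed by a logit transform. The durations are hypothetical placeholders, not data from the study, and the clipping constant `eps` is an assumption; the response does not state how proportions of exactly 0 or 1 were handled.

```python
import math


def normalize_to_unit_sum(durations):
    """Scale non-negative durations so that they sum to 1."""
    total = sum(durations.values())
    if total <= 0:
        raise ValueError("total duration must be positive")
    return {name: d / total for name, d in durations.items()}


def logit(p, eps=1e-6):
    """Log-odds of a proportion; eps (assumed) keeps p away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


# Hypothetical time (s) each object spent inside a foveal region.
durations = {"monitor": 1.2, "conspecific": 0.6, "feeder": 0.2}

proportions = normalize_to_unit_sum(durations)
logits = {name: logit(p) for name, p in proportions.items()}
```

      Any proportion below 0.5 maps to a negative logit, so negative values on a logit-scaled axis are expected whenever an object accounts for less than half of the normalized foveation time.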

      Lines 297-301: This should be in the method section.

      Please see earlier comment.

      Lines 301-305: Fig. 3 b and c only referred to the two first factors. Please add more figures for the other factors. This could be in supp. Mat.

      We added the three graphs for the proportion of time foveating on the monitor, the saccade rate, and the proportion of time foveating on conspecifics to the supplementary material (Fig. S6).

      Lines 306-309: This should be in methods, and you should have explained in methods how you performed your model selection.....

      We prefer leaving this paragraph in the result section, as it was intended to give the reader extra information on the predictive power of the different variables (by comparing the effectiveness of the models including one variable at a time, all the rest being equal) and not on the model selection per se. However, we now explain our goal better in the statistics section regarding this analysis (L635-636):

      “We further tested the relative predictive power of the different test variables by comparing the resulting models’ efficiency using AIC scores.”
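      As a rough illustration of this kind of comparison, the sketch below ranks candidate models by AIC, computed as 2k − 2 ln L, where k is the number of estimated parameters and ln L the maximized log-likelihood. The model names and numbers are hypothetical placeholders and do not correspond to the fitted models in the manuscript.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fits: name -> (maximized log-likelihood, number of parameters).
# Each model is assumed to differ only in the single test variable it includes.
fits = {
    "variable_A": (-120.4, 4),
    "variable_B": (-125.1, 4),
    "variable_C": (-126.0, 4),
}

scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
best_score = min(scores.values())

# Delta-AIC relative to the best model; 0 marks the best-supported model.
delta_aic = {name: score - best_score for name, score in scores.items()}
ranking = sorted(scores, key=scores.get)
```

      Because all candidate models here have the same number of parameters, the ranking reduces to a comparison of log-likelihoods; the penalty term matters only when models differ in complexity.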

      Lines 317-319: This should be in the method section.

      Please see earlier comment.

      Lines 320-322: This should be in the method section.

      Please see earlier comment.

      Lines 332-334: This should be in the method section.

      Please see earlier comment.

      Lines 334-336: Then, if this is not significant, you cannot say that.

      Thank you for noticing the inaccuracy, we have now rephrased it as (L298-299):

      “Earlier foveation of the first pigeon was not significantly related to earlier escape responses among the other flock members, although there was a trend (χ2(1) = 3.66, p = 0.0559).”

      Line 336: Please explain why you did different models. We missed a lot of information in the method about your strategy for statistics.

      We have now added a lot more information on the models in the statistics section, according to this comment as well as the previous ones. We hope the explanations of the analyses are now clearer to the reader.

      Lines 339-349: This should be in the method section.

      Please see earlier comment.

      Results section: As you may have understood, there are too many sentences that should be moved into the method section. Furthermore, I would recommend modifying the headings so that they are more biologically oriented, similarly to what you have done in the discussion section.

      Thank you for the comments. We agree with most of them, and have modified the manuscript accordingly. Additionally, we now use the same headings in the results section as the ones used in the discussion to make the text easier to follow.

      Lines 500-501: What were the body weight of the pigeon? At which weight of their full weight they were?

      This information is now added (492 ± 41g; mean ± SD). We did not control the amount of food during our experiments and only ensured 24h without food by feeding the pigeons after the experiment was completed. This information was added as follows (L454-456):

      “On experimental days, they were fed only after the experiment was completed; this ensured 24 hours without food at the time of the experiment, although we did not control the amount of food over the course of the experimental periods.”

      Line 522-523: Those screens are very good for pigeons.

      Thank you for the positive comment. We indeed tried to match bird vision as closely as possible.

      Lines 527-528: At which frequency was produced the moving stimulus? Your screen can display up to 144Hz, which is very good. But can your laptop do it? If not, it is important to mention it as pigeons may have a temporal resolution of vision up to 149Hz.

      Our laptop indeed supports 144Hz display. In addition, we now mention the temporal resolution of pigeon vision (L480-482).

      “We specifically chose a monitor with high temporal resolution to match the pigeon’s Critical Flicker Fusion Frequency (threshold at which a flickering light is perceived by the eye as steady) that reaches up to 143Hz (Dodt & Wirth, 1954).”

      Lines 555-572: Did you use a control shape in your experiment? The pigeons may have escaped because of a moving pattern rather than because of a predator shape.

      We did not use a control shape, as the aim of the experiment was not to directly test the effect of the shape itself. We designed the predator cue to resemble an approaching predator to ensure a response from the pigeons, but it might be that other shapes would have worked as well.

      Lines 588-589: Please explain why the coordinate system of the pigeon's head is considered as the visual field?

      From what I have understood, you did not reconstruct the visual fields, but only the position of the fovea. This should be stated as such, since the visual field involves more than a sphere around the head (binocular and monocular sectors, blind sectors, vertical extension, etc.).

      Thank you for noticing the inaccuracy, we indeed did not consider other sectors of the visual field and therefore rephrased it as (L551): “the location of the objects and conspecifics from the pigeon’s perspective”.

      Lines 601-604: How much does it represent?

      As this was estimated by visual inspection, we do not have the exact percentage of data loss that was caused by grooming. However, because of the number of cameras in the SMART BARN motion capture system, it is reliable in detecting markers inside the space in “ideal” conditions (without occlusion). For example, a similar set-up found marker track loss of only <1% using a model bird (Itahara & Kano 2022).

      Itahara, A., & Kano, F. (2022). “Corvid Tracking Studio”: A custom-built motion capture system to track head movements of corvids. Japanese Journal of Animal Psychology, 72(1), 1–16. https://doi.org/10.2502/janip.72.1.1

      Lines 610-612: You would need to cite Wood 1917 and Hodos et al. 1991 who described the presence of a fovea in this species.

      We added both citations to the manuscript.

      Line 611: Again, the fovea is not equal to the area centralis.

      Thank you, we changed it as well.

      Lines 625-626: you wrote "... in a few instances....". Please explain more. How many? What proportion?

      This happened in 9 observations out of 120. We now specify it in the text as well (L587-589):

      “in a few instances (9 out of 120 observations), pigeons foveated on the model predator after the looming stimulus had disappeared, but these cases were excluded from our analysis.”

      Lines 640-653: A lot of information is missing from the section "statistical analysis". It would be much better if you moved most of the sentences in the Results that describe methods into the Methods section. Furthermore, you need to explain in more detail which statistics you used, which model selection, what type of data transformation, etc.

      We agree this section lacked information, and we moved the information from the result to the statistics section.

      Supplementary materials: the boxplots in Fig. S1 and S2 are too small and impossible to read. Please improve the readability.

      We now have enlarged these plots to make them more readable.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript "Engineering of PAClight1P78A: A High-Performance Class-B1 GPCR-Based Sensor for PACAP1-38" by Cola et al. presents the development of a novel genetically encoded sensor, PAClight1P78A, based on the human PAC1 receptor. The authors provide a thorough in vitro and in vivo characterization of this sensor, demonstrating its potential utility across various applications in life sciences, including drug development and basic research.

      The diverse methods to validate PAClight1P78A demonstrate a comprehensive approach to sensor engineering by combining biochemical characterization with in vivo studies in rodent brains and zebrafish. This establishes the sensor's biophysical properties (e.g., sensitivity, specificity, kinetics, and spectral properties) and demonstrates its functionality in physiologically relevant settings. Importantly, the inclusion of control sensors and the testing of potential intracellular downstream effects such as G-protein activation underscore a careful consideration of specificity and biological impact.

      Strengths:

      The fundamental development of PAClight1P78A addresses a significant gap in sensors for Class-B1 GPCRs. The iterative design process, from PAClight0.1 to the final PAClight1P78A variant, demonstrates compelling optimization. The innovative engineering results in a sensor with a high apparent dynamic range and excellent ligand selectivity, representing a significant advancement in the field. The rigorous in vitro characterization, including dynamic range, ligand specificity, and activation kinetics, provides a critical understanding of the sensor's utility. Including in vivo experiments in mice and zebrafish larvae demonstrates the sensor's applicability in complex biological systems.

      Weaknesses:

      The manuscript shows that the sensor fundamentally works in vivo, albeit in a limited capacity. The titration curves show sensitivity in the nmol range at which endogenous detection might be possible. However, perhaps the sensor is not sensitive enough or there are not any known robust paradigms for PACAP release. A more detailed discussion of the sensor's limitations, particularly regarding in vivo applications and the potential for detecting endogenous PACAP release, would be helpful.

      We thank the reviewer for carefully analyzing our in vivo data and highlighting the limitation of our results regarding the sensor’s applicability in detecting endogenous PACAP. We added several sections discussing future possibilities for optimization in the discussion (see paragraphs 2-4). We agree that a more specific discussion of the limitations of our study is an important addition to help design future experiments.

      There are several experiments with an n=1 and other low single-digit numbers. I assume that refers to biological replicates such as mice or culture wells, but it is not well defined. n=1 in experimental contexts, particularly in Figure 1, raises significant concerns about the exact dynamic range of the sensor, data reproducibility, and the robustness of conclusions drawn from these experiments. Also, ROI for cell cultures, like in Figure 1, is not well defined. The methods mentioned ROIs were manually selected, which appears very selective, and the values in Figure 1c become unnecessarily questionable. The lack of definition for "ROI" is confusing. Do ROIs refer to cells, specific locations on the cell membrane, or groups of cells? It would be best if the authors could use unbiased methods for image analysis that include the majority of responsive areas or an explanation of why certain ROIs are included or excluded.

      We thank the reviewer for the helpful suggestions. We have increased the number of replicates to n=3 for both the HEK293T and neuron data depicted in Fig. 1c. Furthermore, we have added Fig. 1c’, which contains the quantification of the maximum responses obtained in the dataset shown in Fig. 1c and also depicts the single values for each replicate. To clarify the definition of an ROI in our manuscript, we have detailed the process of ROI selection in the Methods section “Cell culture, imaging and quantification”. Additionally, we also increased mouse numbers for in vivo PACAP infusions in mice (see Figure 4g).
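      For readers unfamiliar with the metric, the per-ROI response quantified above can be sketched roughly as follows. This is a minimal illustration, not the authors' analysis code: it assumes the baseline F0 is the mean fluorescence over a pre-stimulus window, and the function names and the `baseline_frames` parameter are hypothetical.

```python
from statistics import mean

def delta_f_over_f0(trace, baseline_frames):
    """Convert a raw fluorescence trace into dF/F0, in percent.

    Assumption: F0 is the mean of the first `baseline_frames` samples,
    i.e. the frames recorded before the ligand was applied.
    """
    f0 = mean(trace[:baseline_frames])
    if f0 == 0:
        raise ValueError("baseline fluorescence is zero; dF/F0 is undefined")
    return [100.0 * (f - f0) / f0 for f in trace]

def max_response(trace, baseline_frames):
    """Maximum dF/F0 (percent) reached in one ROI trace."""
    return max(delta_f_over_f0(trace, baseline_frames))

# Hypothetical ROI trace: three baseline frames, then a response.
roi_trace = [100, 100, 100, 150, 143]
print(max_response(roi_trace, baseline_frames=3))
```

      Reporting one such maximum per replicate, rather than a single pooled value, is what makes the spread between replicates visible.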

      Reviewer #2 (Public Review):

      Summary:

      The PAClight1 sensor was developed using an approach successful for the development of other fluorescence-based GPCR sensors, which is the complete replacement of the third intracellular loop of the receptor with a circularly-permuted green fluorescent protein. When expressed in HEK cells, this sensor showed good expression and a weak but measurable response to the extracellular presence of PACAP1-38 (a ΔF/Fo of 43%). Additional mutation near the site of insertion of the linearized GFP, at the C-terminus of the receptor, and within the second intracellular loop produced a final optimized sensor with ΔF/Fo of >1000%. Finally, screening of mutational libraries that also included alterations in the extracellular ligand-binding domain of the receptor yielded a molecule, PAClight1P78A, that exhibited a high ligand-dependent fluorescence response combined with a high differential sensitivity to PACAP (EC50 30 nM based on cytometric sorting of stably transfected HEK293 cells) compared to its congener VIP (with which PACAP shares two highly related receptors, VPAC1 and VPAC2) as well as several unrelated neuropeptides, and significantly slowed activation kinetics by PACAP in the presence of a 10-fold molar excess of the PAC1 antagonist PACAP6-38. A structurally highly similar control construct, PAClight1P78Actl, showed correspondingly similar basal expression in HEK293 cells, but no PACAP-dependent enhancement in fluorescent properties.

      PAClight1P78A was expressed in neurons of the mouse cortex via AAV9.hSyn-mediated gene transduction. Slices taken from PAClight1P78A-transfected cortex, but not slices taken from PAClight1P78Actl-transfected cortex, exhibited prompt and persistent elevation of ΔF/Fo after 2 minutes of perfusion with PACAP1-38 which persisted for up to 14 minutes and was statistically significant after perfusion with 3000, but not 300 or 30 nM, of peptide. Likewise, microinfusion of 200 nL of 300 µM PACAP1-38 into the cortex of optical fiber-implanted freely moving mice elicited a ΔF/Fo (%) of greater than 15, and significantly higher than that elicited by application of similar concentrations of VIP, CRF, or enkephalin, or vehicle alone. In vivo experiments were carried out in zebrafish larvae by the introduction of PAClight1P78A into single-cell stage Danio rerio embryos using a Tol2 transposase-based plasmid with a UAS promoter via injection (of plasmid and transposase mRNA), and sorting of post-fertilization embryos using a marker for transgenesis carried in the UAS:PAClight1P78A construct. Expression of PAClight1P78A was directed to cells in the olfactory bulb which express the fish paralog of the human PAC1 receptor by using the Tg(GnRH3:gal4ff) line, and fluorescent signals were elicited by intracerebroventricular administration of PACAP1-38 at a single concentration (1 mM), which were specific to PACAP and to the presence of PAClight1P78A per se, as controlled by parallel experiments in which PAClight1P78Actl instead of PAClight1P78A was contained in the transgenic plasmid.

      Major strengths and weaknesses of the methods and results

      The report represents a rigorous demonstration of the elicitation of fluorescent signals upon pharmacological exposure to PACAP in nervous system tissue expressing PAClight1P78A in both mammals (mice) and fish (zebrafish larvae). Figure 4d shows a change in GFP fluorescence activation by PACAP occurring several seconds after the cessation of PACAP perfusion over a two-minute period, and its persistence for several minutes following. One wonders if one is apprehending the graphical presentation of the data incorrectly, or if the activation of fluorescence efficiency by ligand presentation is irreversible in this context, in which case the utility of the probe as a real-time indicator, in vivo, of released peptide might be diminished.

      We thank the reviewer for their careful consideration of our manuscript and agree that the activation of PAClight persisting for several minutes at micromolar concentrations could be a potential limitation for in vivo applications. We added a possible explanation for the persisting sensor activation in response to artificial application of PACAP38 in paragraph 3 of the discussion. We agree that this addition eases the interpretation of PAClight signals detected in vivo. 

      Appraisal of achievement of aims, and data support of conclusions:

      Small cavils with controls are omitted for clarity; the larger issue of appraisal of results based on the scope of the designed experiments is discussed in the section below. An interesting question related to the time dependence of the PACAP-elicited activation of PAClight1P78A is its onset and reversibility, and additional data related to this would be welcome.

      We agree that the reversibility of the sensor’s fluorescence is indeed an important feature, especially for detecting endogenous PACAP release. Our data indicate that the sensor’s fluorescence is reversible when detecting small to medium doses of PACAP38 (see Figure 4d, application of 30-300 nM) that are presumably closer to physiological concentrations than the non-reversible concentration of 3000 nM. Please also see our new discussion on peptide concentrations in paragraph 4 of our discussion. For future experiments, it is indeed advisable to adjust the interval of repeated applications to the decay of the response at the respective concentration. Considering the long-lasting downstream effects of endogenous signaling, longer intervals between ligand applications are generally preferred to match more closely the physiological range in which endogenous PAC1 is most likely effective.

      Discussion of the impact of the work, and utility of the methods and data:

      Increasingly, neurotransmitter function may be observed in vivo, rather than by inferring in vivo function from in vitro, in cellular, or ex vivo experimentation. This very valuable report discloses the invention of a genetically encoded sensor for the class B1 GPCR PAC1. PAC1 is the major receptor for the neuropeptide PACAP, which in turn is a major neurotransmitter involved in brain response to psychogenic stress, or threat, in vertebrates as diverse as mammals and fishes. If this sensor possesses the sensitivity to detect endogenously released PACAP in vivo it will indeed be an impactful tool for understanding PACAP neurotransmission (and indeed PACAP action in general, in immune and endocrine compartments as well) in future experiments.

      However, the sensor has not yet been used to detect endogenously released PACAP. Until this has been done, one cannot answer the question as to whether the levels of exogenously perfused/administered PACAP used here merely to calibrate the sensor's sensitivity are indeed unphysiologically high. If endogenous PACAP levels don't get that high, then the sensor will not be useful for its intended purpose. The authors should address this issue and allude to what kind of experiments would need to be done in order to detect endogenous PACAP release in living tissue in intact animals. The authors could comment upon the success of other GPCR sensors that have been used to observe endogenous ligand release, and where along the pathway to becoming a truly useful reagent this particular sensor is.

      We thank the reviewer for pointing out that we had not made sufficiently clear that detecting endogenous PACAP release was outside the intended scope of this paper. We therefore expanded our discussion to encompass the intended purpose of detecting artificially infused or applied PAC1 agonists, such as conducting fundamental tests of drug specificity and developing new pharmacological ligands to selectively target PAC1. This includes a more detailed discussion of our in vivo findings and a clearer phrasing that stresses the potential application for applied drugs and not endogenous PACAP (see last paragraph in the discussion).

      We also agree that little is known about endogenous concentrations of PACAP in the brain. However, we have supplemented our discussion with several references estimating lower concentrations of PACAP and other peptides in vivo, suggesting average PACAP levels below the detection threshold of the sensor. Importantly, within certain brain regions and in closer proximity to release sites, significantly higher concentrations might be reached. Additionally, our data indicate that the concentrations observed under our current conditions do not saturate the sensor in vivo.  

      We therefore acknowledge the reviewer’s comment on the sensor’s potential limitations under our current experimental conditions. Hence, we expanded our discussion and suggest the use of higher resolution imaging to potentially reveal loci of high PACAP concentrations, which should be validated by future studies (see also our added discussion in paragraph 4). 

      Reviewer #3 (Public Review):

      Summary:

      The manuscript introduces PAClight1P78A, a novel genetically encoded sensor designed to facilitate the study of class-B1 G protein-coupled receptors (GPCRs), focusing on the human PAC1 receptor. Addressing the significant challenge of investigating these clinically relevant drug targets, the sensor demonstrates a high dynamic range, excellent ligand selectivity, and rapid activation kinetics. It is validated across a variety of experimental contexts including in vitro, ex vivo, and in vivo models in mice and zebrafish, showcasing its utility for high-throughput screening, basic research, and drug development efforts related to GPCR dynamics and pharmacology.

      Strengths:

      The innovative design of PAClight1P78A successfully bridges a crucial gap in GPCR research by enabling real-time monitoring of receptor activation with high specificity and sensitivity. The extensive validation across multiple models emphasizes the sensor's reliability and versatility, promising significant contributions to both the scientific understanding of GPCR mechanisms and the development of novel therapeutics. Furthermore, by providing the research community with detailed methodologies and access to the necessary viral vectors and plasmids, the authors ensure the sensor's broad applicability and ease of adoption for a wide range of studies focused on GPCR biology and drug targeting.

      Weaknesses:

      To further strengthen the manuscript and validate the efficacy of PAClight1P78A as a selective PACAP sensor, it is crucial to demonstrate the sensor's ability to detect endogenous PACAP release in vivo under physiological conditions. While the current data from artificial PACAP application in mouse brain slices and microinfusion in behaving mice provide foundational insights into the sensor's functionality, these approaches predominantly simulate conditions with potentially higher concentrations of PACAP than naturally occurring levels.

      We thank the reviewer for their valuable comments and agree that the use of PAClight for detecting endogenous PACAP will be of great interest to the scientific community and should be a goal for future research. Considering the time, equipment and additional animal licenses necessary, we are convinced that these questions go beyond the scope of the current paper and would be better addressed in a follow-up publication. We therefore rephrased the discussion and added more details to further clarify the intended purpose of the current study. Additionally, we added a paragraph in the discussion suggesting experiments needed to validate PAClight for putative future in vivo applications.

      Although the sensor's specificity for the PAC1 receptor and its primary ligand is a pivotal achievement, exploring its potential application to other GPCRs within the class-B1 family or broader categories could enhance the manuscript's impact, suggesting ways to adapt this technology for a wider array of receptor studies. Additionally, while the sensor's performance is convincingly demonstrated in short-term experiments, insights into its long-term stability and reusability in more prolonged or repeated measures scenarios would be valuable for researchers interested in chronic studies or longitudinal behavioral analyses. Addressing these aspects could broaden the understanding of the sensor's practical utility over extended research timelines.

      We extend our gratitude to the reviewer for diligently assessing our results. 

      Indeed, the very high level of sensitivity that we could achieve in PAClight leads us to think that potentially a grafting-based approach, such as the one we’ve recently described for class-A GPCR-based sensors (PMID: 37474807) could also work for the direct generation of multiple class-B1 sensors based on the optimized fluorescent protein module present in PAClight. Unfortunately, considering the amount of work that testing this hypothesis would entail, we are not able to perform these experiments in the context of this revision, and would rather pursue them as a future project. Nevertheless, we have expanded the discussion of the manuscript with a paragraph with these considerations.

      While we lack comprehensive data on the long-term stability of the sensor, our preliminary findings from optimizing the photometry recordings indicate consistent baseline expression of PAClight and PAClight-ctrl over several weeks. Conducting experiments to systematically assess stability would require several months, which is currently impractical due to limitations in tools and licenses for repeated in vivo infusions. Hence, we intend to include these experiments in potential follow-up studies.

      Furthermore, the current in vivo experiments involving microinfusion of PACAP near sensor-expressing areas in behaving mice are based on a relatively small sample size (n=2), which might limit the generalizability of the findings. Increasing the number of subjects in these experimental groups would enhance the statistical power of the results and provide a more robust assessment of the sensor's in vivo functionality. Expanding the sample size will not only validate the findings but also address potential variability within the population, thereby reinforcing the conclusions drawn from these crucial experiments.

      We agree with the reviewer that a sample size of N=2 is not sufficient for in vivo recordings. We therefore increased the sample size and now present recordings with 5 PAClight1P78A and 4 PAClight-control mice. Of note, the new data validate our previous findings and conclusions and give a better idea of the variability in vivo, which we now address in much more detail in the discussion (see paragraph 2).

      Recommendations for the Authors:

      Reviewer #1 (Recommendations For The Authors):

      The lower potency of maxadilan activation might reflect broader implications for ligand-receptor dynamics. Perhaps the authors could discuss the maxadilan binding from a structural perspective, including AlphaFold models. Also, discussing how these findings might influence sensor application in diverse biological contexts would be insightful. Clear definitions and consistent use of these terms are crucial for ensuring that readers understand the methods and results.

      We would like to thank the reviewer for the comments. As part of this work, we did not obtain a dose-response curve for the maxadilan peptide, and only reported the maximal response of the sensor to a high concentration of the peptide (10 µM). Thus, our findings inform us on the maximal efficacy of the peptide, as opposed to its potency towards the PAC1R. Furthermore, we would like to point out that due to the lack of structural details for any GPCR-based sensor published to date, we cannot draw any molecularly accurate conclusion regarding the precise reasons why a different ligand (in this case the sandfly maxadilan) induces a lower maximal efficacy of the response compared to the endogenous cognate ligand of the receptor. We do not believe that AlphaFold models can accurately replace structural information in this regard, especially given that the amino acid linker regions between the GPCR and the fluorescent protein, which are a critical determinant of allosteric chromophore modulation by ligand-induced conformational changes, typically obtain the lowest confidence score in all AlphaFold-predicted structural models of GPCR-based sensors. Finally, we would like to refer the reviewer to a very nice recent publication (PMID: 32047270) which resolved the structures of each of these peptides bound to the PAC1 receptor-Gs protein complex, and which provides accurate molecular details on the different modalities of receptor binding and activation by PACAP1-38 versus maxadilan.

      Reviewer #2 (Recommendations For The Authors):

      The authors are congratulated on the meticulous achievement of their aim, i.e. a fluorescence-based sensor for the detection of PACAP with in vivo utility. Whether or not this sensor will have the requisite sensitivity to detect the release of endogenous PACAP within various regions of the nervous system, in response to specific environmental stimuli or changes in brain or physiological state, remains to be determined.

      We thank the reviewer for the very positive evaluation of our manuscript and for the suggested additions that will improve the strength of our arguments.

      We agree that the in vivo detection of endogenous PACAP will be an important objective for future studies. Due to time, resource and animal license constraints, we are not able to address this objective in our current study, but we now detail possible future experiments in the discussion section. Please also see our earlier responses to the suggested discussion points.

      Reviewer #3 (Recommendations For The Authors):

      To comprehensively assess the sensor's sensitivity and specificity to endogenous PACAP, I recommend conducting additional in vivo experiments where PAClight1P78A is expressed in neurons that endogenously express the Pac1r receptor (using Adcyap1r1-Cre mouse line). These experiments should involve applying sensory or emotional stimuli known to evoke PACAP release or activating upstream PACAP-expressing neurons. Such studies would offer valuable data on the sensor's performance under natural physiological conditions and its potential utility for exploring PACAP's roles in vivo.

      We express our gratitude to the reviewer for providing detailed methodological approaches to examine endogenous PACAP release. These suggestions will prove invaluable for future investigations and are important additions to a follow-up publication. As mentioned earlier, we have incorporated some of these approaches into our discussion. Additionally, we have underscored the existing limitations in detecting endogenous PACAP in vivo and emphasized the relevance of PAClight for drug development purposes.

    1. Author response:

      eLife assessment

      This useful study describes an antibody-free method to map G-quadruplexes (G4s) in vertebrate cells. While the method might have potential, the current analysis is primarily descriptive and does not add substantial new insights beyond existing data (e.g., PMID:34792172). While the datasets provided might constitute a good starting point for future functional studies, additional data and analyses would be needed to fully support the major conclusions and, at the same time, clarify the advantage of this method over other methods. Specifically, the strength of the evidence for DHX9 interfering with the ability of mESCs to differentiate by regulating directly the stability of either G4s or R-loops is still incomplete.

      We thank the editors for their helpful comments.

      Given that antibody-based methods have been reported to leave open the possibility of recognizing partially folded G4s and promoting their folding, we have employed the peroxidase activity of the G4-hemin complex to develop a new method for capturing endogenous G4s that significantly reduces the risk of capturing partially folded G4s. We will be happy to clarify the advantage of our method.

      In Fig. 7, we applied the Dhx9 CUT&Tag assay to identify the G4s and R-loops directly bound by Dhx9 and further characterized the differential Dhx9-bound G4s and R-loops in the absence of Dhx9. Dhx9 is a versatile helicase capable of directly resolving R-loops and G4s or promoting R-loop formation (PMID: 21561811, 30341290, 29742442, 32541651, 35905379, 34316718). Furthermore, we showed that depletion of Dhx9 significantly altered the levels of G4s or R-loops around the TSS or gene bodies of several key regulators of mESC and embryonic development, such as Nanog, Lin28a, Bmp4, Wnt8a, Gata2, and Lef1, and also their RNA levels (Fig. 7I). We consider the above evidence sufficient to support the conclusion that Dhx9 regulates mESC fate at the transcriptional level by directly modulating the G4s or R-loops within key regulators of mESCs.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Non-B DNA structures such as G4s and R-loops have the potential to impact genome stability, gene transcription, and cell differentiation. This study investigates the distribution of G4s and R-loops in human and mouse cells using some interesting technical modifications of existing Tn5-based approaches. This work confirms that the helicase DHX9 could regulate the formation and/or stability of both structures in mouse embryonic stem cells (mESCs). It also provides evidence that the lack of DHX9 in mESCs interferes with their ability to differentiate.

      Strengths:

      HepG4-seq, the new antibody-free strategy to map G4s based on the ability of Hemin to act as a peroxidase when complexed to G4s, is interesting. This study also provides more evidence that the distribution pattern of G4s and R-loops might vary substantially from one cell type to another.

      We appreciate your valuable points.

      Weaknesses:

      This study is essentially descriptive and does not provide conclusive evidence that lack of DHX9 does interfere with the ability of mESCs to differentiate by regulating directly the stability of either G4 or R-loops. In the end, it does not substantially improve our understanding of DHX9's mode of action.

      In this study, we aimed to report new methods for capturing endogenous G4s and R-loops in living cells. Dhx9 has been reported to directly unwind R-loops and G4s or promote R-loop formation (PMID: 21561811, 30341290, 29742442, 32541651, 35905379, 34316718). To identify the G4s and R-loops directly bound by Dhx9, we performed the Dhx9 CUT&Tag assay and analyzed the co-localization of Dhx9-binding sites and G4s or R-loops. We found that 47,857 co-localized G4s and R-loops are directly bound by Dhx9 in the wild-type mESCs and 4,060 of them display significantly differential signals in the absence of Dhx9, suggesting that redundant regulators exist as well. We showed that depletion of Dhx9 significantly altered the RNA levels of several key regulators of mESC and embryonic development, such as Nanog, Lin28a, Bmp4, Wnt8a, Gata2, and Lef1, which coincides with the significantly differential levels of G4s or R-loops around the TSS or gene bodies of these genes (Fig. 7). The comprehensive molecular mechanism of Dhx9 action is indeed not the focus of this study. We will work on it in future studies. Thank you for the comments.

      There is no in-depth comparison of the newly generated data with existing datasets and no rigorous control was presented to test the specificity of the hemin-G4 interaction (a lot of the hemin-dependent signal seems to occur in the cytoplasm, which is unexpected).

      The specificity of hemin-G4-induced peroxidase activity and self-biotinylation has been well demonstrated in previous studies (PMID: 19618960, 22106035, 28973477, 32329781). In Fig. 1A, we compared the hemin-G4-induced biotinylation levels in different conditions. Cells treated with hemin and Bio-An exhibited a robust fluorescence signal, while the absence of either hemin or Bio-An almost completely abolished the biotinylation signals, suggesting a specific and active biotinylation activity. To identify the specific signals, we have included the non-label control and used this control to call confident HepG4 peaks in all HepG4-seq assays.

      The hemin-RNA G4 complex has also been reported to have mimic peroxidase activity and trigger similar self-biotinylation signals as DNA G4s (PMID: 32329781, 31257395, 27422869). Therefore, it is not surprising to observe hemin-dependent signals in the cytoplasm generated by cytoplasmic RNA G4s.

      In the revised version, we will include careful comparison between our data and previous datasets.

      The authors talk about co-occurrence between G4 and R-loops but their data does not actually demonstrate co-occurrence in time. If the same loci could form alternatively either R-loops or G4 and if DHX9 was somehow involved in determining the balance between G4s and R-loops, the authors would probably obtain the same distribution pattern. To manipulate R-loop levels in vivo and test how this affects HEPG4-seq signals would have been helpful.

      Single-molecule fluorescence studies have shown the existence of a positive feedback mechanism of G4 and R-loop formation during transcription (PMID: 32810236, 32636376), suggesting that G4s and R-loops could co-localize on the same molecule. Dhx9 is a versatile helicase capable of directly resolving R-loops and G4s or promoting R-loop formation (PMID: 21561811, 30341290, 29742442, 32541651, 35905379, 34316718). Although depletion of Dhx9 resulted in 6,171 Dhx9-bound co-localized G4s and R-loops with significantly altered levels of G4s or R-loops, only 276 of them (~4.5%) harbored both altered G4s and altered R-loops, suggesting that interacting G4s and R-loops are rare in living cells. Currently, the genome-wide co-occurrence of two factors is mainly inferred by bioinformatic intersection analysis. We agree that heterogeneous distribution between cells can give false-positive co-occurrence patterns. We will carefully discuss this point in the revised version. At the same time, we will make efforts to develop a new method to map co-localized G4s and R-loops on the same molecule in a future study.
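
The intersection analysis mentioned above can be illustrated with a minimal sketch. The peak coordinates and the helper names `overlaps` and `colocalized` are hypothetical and are not taken from the authors' pipeline; only the counts 276 and 6,171 come from the response. The sketch also makes the stated caveat concrete: overlapping population-level peaks says nothing about whether both structures sit on the same molecule in the same cell.

```python
# Minimal sketch of a peak-intersection analysis (hypothetical toy data).
# Peaks are (chromosome, start, end) with half-open coordinates.

def overlaps(a, b):
    """Return True if two peaks on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def colocalized(peaks_a, peaks_b):
    """Return the peaks in peaks_a that overlap any peak in peaks_b."""
    return [p for p in peaks_a if any(overlaps(p, q) for q in peaks_b)]

# Hypothetical G4 and R-loop peak calls, for illustration only.
g4_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
rloop_peaks = [("chr1", 150, 250), ("chr2", 300, 400)]

shared = colocalized(g4_peaks, rloop_peaks)

# Fraction quoted in the response: 276 of 6,171 sites change in both signals.
percent_both = round(100 * 276 / 6171, 1)
```

A quadratic scan like this is only suitable for toy inputs; genome-scale peak sets are normally intersected with sorted-interval tools.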

      This study relies exclusively on Tn5-based mapping strategies. This is a problem as global changes in DNA accessibility might strongly skew the results. It is unclear at this stage whether the lack of DHX9, BLM, or WRN has an impact on DNA accessibility, which might underlie the differences that were observed. Moreover, Tn5 cleaves DNA at a nearby accessible site, which might be at an unknown distance away from the site of interest. The spatial accuracy of Tn5-based methods is therefore debatable, which is a problem when trying to demonstrate spatial co-occurrence. Alternative mapping methods would have been helpful.

      In this study, we used the recombinant streptavidin monomer and anti-GP41 nanobody fusion protein (mSA-scFv) to specifically recognize hemin-G4-induced biotinylated G4s and then recruit the recombinant GP41-tagged Tn5 protein to these G4 sites. Similarly, the recombinant V5-tagged N-terminal hybrid-binding domain (HBD) of RNase H1 specifically recognizes R-loops and recruits the recombinant protein G-Tn5 (pG-Tn5) with the help of an anti-V5 antibody. Therefore, the spatial distance of Tn5 from the target sites is well controlled and very short, and the recruitment of Tn5 is specifically determined by the presence of G4s in HepG4-seq and of R-loops in HBD-seq.

      Reviewer #2 (Public Review):

      Summary:

      In this study, Liu et al. explore the interplay between G-quadruplexes (G4s) and R-loops. The authors developed novel techniques, HepG4-seq and HBD-seq, to capture and map these nucleic acid structures genome-wide in human HEK293 cells and mouse embryonic stem cells (mESCs). They identified dynamic, cell-type-specific distributions of co-localized G4s and R-loops, which predominantly localize at active promoters and enhancers of transcriptionally active genes. Furthermore, they assessed the role of helicase Dhx9 in regulating these structures and their impact on gene expression and cellular functions.

      The manuscript provides a detailed catalogue of the genome-wide distribution of G4s and R-loops. However, the conceptual advance and the physiological relevance of the findings are not obvious. Overall, the impact of the work on the field is limited to the utility of the presented methods and datasets.

      Strengths:

      (1) The development and optimization of HepG4-seq and HBD-seq offer novel methods to map native G4s and R-loops.

      (2) The study provides extensive data on the distribution of G4s and R-loops, highlighting their co-localization in human and mouse cells.

      (3) The study consolidates the role of Dhx9 in modulating these structures and explores its impact on mESC self-renewal and differentiation.

      We appreciate your valuable points.

      Weaknesses:

      (1) The specificity of the biotinylation process and potential off-target effects are not addressed. The authors should provide more data to validate the specificity of the G4-hemin.

      The specificity of hemin-G4-induced peroxidase activity and self-biotinylation has been well demonstrated in previous studies (PMID: 19618960, 22106035, 28973477, 32329781). In Fig. 1A, we compared the hemin-G4-induced biotinylation levels under different conditions. Cells treated with hemin and Bio-An exhibited a robust fluorescence signal, while the absence of either hemin or Bio-An almost completely abolished the biotinylation signals, suggesting specific and active biotinylation.

      (2) Other methods exploring a catalytic dead RNAseH or the HBD to pull down R-loops have been described before. The superior quality of the presented methods in comparison to existing ones is not established. A clear comparison with other methods (BG4 CUT&Tag-seq, DRIP-seq, R-CHIP, etc) should be provided.

      Thank you for the suggestions. We will include the comparisons in the revised version.

      (3) Although the study demonstrates Dhx9's role in regulating co-localized G4s and R-loops, additional functional experiments (e.g., rescue experiments) are needed to confirm these findings.

      Dhx9 has been demonstrated in previous studies to be a versatile helicase capable of directly resolving R-loops and G4s or promoting R-loop formation (PMID: 21561811, 30341290, 29742442, 32541651, 35905379, 34316718). We believe that the current new dataset and previous studies are sufficient to support the capability of Dhx9 to regulate co-localized G4s and R-loops.

      (4) The manuscript would benefit from a more detailed discussion of the broader implications of co-localized G4s and R-loops.

      Thank you for the suggestions. We will include a more detailed discussion in the revised version.

      (5) The manuscript lacks appropriate statistical analyses to support the major conclusions.

      We apologize for this point. Although we applied careful statistical analyses in this study, the lack of some statistical details makes some conclusions hard to follow. We will add the details of all statistical analyses.

      (6) The discussion could be expanded to address potential limitations and alternative explanations for the results.

      Thank you for the suggestions. We will include a more detailed discussion about this point in the revised version.

      Reviewer #3 (Public Review):

      Summary:

      The authors developed and optimized the methods for detecting G4s and R-loops independent of BG4 and S9.6 antibody, and mapped genomic native G4s and R-loops by HepG4-seq and HBD-seq, revealing that co-localized G4s and R-loops participate in regulating transcription and affecting the self-renewal and differentiation capabilities of mESCs.

      Strengths:

      By utilizing the peroxidase activity of G4-hemin complex and combining proximity labeling technology, the authors developed HepG4-seq (high throughput sequencing of hemin-induced proximal labelled G4s), which can detect the dynamics of G4s in vivo. Meanwhile, the "GST-His6-2xHBD"-mediated CUT&Tag protocol (Wang et al., 2021) was optimized by replacing fusion protein and tag, the optimized HBD-seq avoids the generation of GST fusion protein aggregates and can reflect the genome-wide distribution of R-loops in vivo.

      The authors employed HepG4-seq and HBD-seq to establish comprehensive maps of native co-localized G4s and R-loops in human HEK293 cells and mouse embryonic stem cells (mESCs). The data indicate that co-localized G4s and R-loops are dynamically altered in a cell type-dependent manner and are largely localized at active promoters and enhancers of transcriptionally active genes.

      Combined with Dhx9 ChIP-seq and co-localized G4s and R-loops data in wild-type and dhx9KO mESCs, the authors confirm that the helicase Dhx9 is a direct and major regulator that regulates the formation and resolution of co-localized G4s and R-loops.

      Depletion of Dhx9 impaired the self-renewal and differentiation capacities of mESCs by altering the transcription of co-localized G4s and R-loops-associated genes.

      In conclusion, the authors provide an approach to studying the interplay between G4s and R-loops, shedding light on the important roles of co-localized G4s and R-loops in development and disease by regulating the transcription of related genes.

      We appreciate your valuable points.

      Weaknesses:

      As we know, there are at least two structure data of S9.6 antibody very recently, and the questions about the specificity of the S9.6 antibody on RNA:DNA hybrids should be finished. The authors referred to (Hartono et al., 2018; Konig et al., 2017; Phillips et al., 2013) need to be updated, and the authors' bias against S9.6 antibodies needs also to be changed. However, as the authors had questioned the specificity of the S9.6 antibody, they should compare it in parallel with the data they have and the data generated by the widely used S9.6 antibody.

      Thank you for the updated information about the structural data on the S9.6 antibody. We respectfully disagree regarding the specificity of the S9.6 antibody for RNA:DNA hybrids. The structural studies of S9.6 (PMID: 35347133, 35550870) used only one RNA:DNA hybrid to show that S9.6 is more specific for the RNA:DNA hybrid than for dsRNA and dsDNA. However, König et al. reported that the binding affinities of S9.6 for RNA:DNA hybrids exhibit an obvious sequence-dependent bias, ranging from none to the nanomolar range (PMID: 28594954). We will include the comparison between S9.6-derived data and our HBD-seq data in the revised version.

      Although HepG4-seq is an effective G4s detection technique, and the authors have also verified its reliability to some extent, given the strong link between ROS homeostasis and G4s formation, and hemin's affinity for different types of G4s, whether HepG4-seq reflects the dynamics of G4s in vivo more accurately than existing detection techniques still needs to be more carefully corroborated.

      Thank you for pointing out this issue. In the in vitro hemin-G4-induced self-biotinylation assay, parallel G4s exhibit higher peroxidase activities than anti-parallel G4s. Thus, the dynamics of G4 conformation could affect the HepG4-seq signals (PMID: 32329781). In the future, it may be necessary to combine HepG4-seq and BG4-seq to characterize endogenous G4s carefully. We will discuss this point in the revised version.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      (1) Due to the significant difference between the infection timeline of mild (1 day post symptom onset) and severe (10 days post symptom onset) cohort at enrollment, an informative analysis to consider is to compare timepoint 2 from the mild cohort to timepoint 1 from the severe cohort.

      As the reviewer suggested, we completed the analysis comparing timepoint 2 from the mild cohort to timepoint 1 from the severe cohort, which is now included as Figure 4-figure supplement 5. The new text explaining this analysis is on pages 13-14, lines 346-355. We also included a paragraph in the discussion on page 22, lines 595-604. We chose to show this comparison to reinforce the main observation of this work: Natural Killer cytotoxicity pathways are enriched in all of our analyses.

      (2) Alternatively, as this information is available, the authors may group the samples based on the individual's infection timeline as opposed to the recruitment timeline.

      Patients in both groups were enrolled at the peak of their symptoms, and we grouped the patients according to this criterion to generate more significant results. Since these infections occurred naturally, we have no accurate information regarding when each patient was infected. Moreover, if the samples were grouped by individual infection timeline, the analysis would be too statistically weak to support conclusions about the course of COVID-19, as disease progression would not be coordinated. Our grouping approach gave us a good level of confidence, despite the small population evaluated.

      (3) The authors selected three co-regulated network modules based on the size of module membership genes, selecting the three modules containing the largest gene membership. Small co-regulated networks can also offer important biological insights into specific molecular machinery associated with disease outcomes.

      Figure 5 was updated to include two more networks (in addition to blue), for the brown and turquoise modules (5E and 5F). This new information allowed us to understand more deeply the three larger modules, which gave the most significant results owing to the number of genes they include (blue: 704, brown: 508, and turquoise: 712). The new text describing this analysis is included on page 15, lines 388-396. The remaining 7 modules were also analyzed, and the Gene Ontology/pathway enrichments were included in 2 new supplemental figures (Figure 5 - figure supplement 1 and 2). The new text describing this analysis is included on page 15, lines 397-401.

      (4) An alternative selection criterion that can inform biological associations between module genes and disease severity is the strength of the correlation coefficients. It seems from Figure 5B, that yellow, turquoise, and green modules have a moderate positive correlation with severe patients, while brown, blue, and gray modules show a slight positive correlation with mild outpatients. A recommendation for the authors is to consider revising Figure 5C to include the enrichment of these additional modules and include these modules in the interpretation of the results.

      The correlations between the cohorts and the blue, brown, and turquoise modules are clearly identified for severe or mild patients. However, for several smaller modules, correlations are heterogeneous across patients within the cohorts, making it hard to reach a clear conclusion related to severity groups. For this reason, the 7 modules were analyzed as indicated in response #3 above, and the results give an idea of the different transcriptional programs present in different patients at different stages of disease. However, the small number of genes in some modules yields weak GO and pathway enrichment results, which makes them difficult to interpret. The text describing this figure is included on page 15, lines 397-401. Also, the network analyses for the brown and turquoise modules were included in Figure 5 as 5E-F, and the text detailing these figures was included on page 15, lines 388-396.

      (5) In Figures 3E and 3F, the authors present enrichment analyses of differentially expressed genes from day 28. However, earlier in the results (lines 226-228), the authors reported no differentially expressed genes observed between the mild and severe participant cohort at this time point. Can the authors clarify which comparison was performed to obtain the list of differentially expressed genes used in the enrichment analyses in Figures 3E and 3F?

      The discrepancy in this case stems from the separate criteria employed for each comparison. The DEG list from the pairwise comparison differs from that of the longitudinal comparison mentioned afterwards, because for the latter analysis we selected only the genes with different trajectories throughout the study (Figure 3). To clarify this point, we included a new paragraph on page 11, lines 278-285.

      Original:

      “We detected 828 genes that exhibited temporal and quantitative expression level differences during the progression of disease. We discovered additional biological processes and KEGG pathways that were differentially enriched during the COVID-19 progression in mild and severe patients (Figure 3) using the Enrichr platform (G. Chen et al., 2020)”

      Changed to:

      “To do so, we first identified genes that were differentially expressed between severity groups, and second, we chose only those that also showed changes in their trajectories across sampling times. In doing so, we found 828 genes that exhibited temporal differences in expression level during disease progression. Then using the Enrichr platform (G. Chen et al., 2020), we discovered additional biological processes and KEGG pathways that were differentially enriched during the COVID-19 progression in mild and severe patients (Figure 3).”

      (6) Additionally, the authors refer to specific enriched genes in Figure 3 (lines 298-302), but Figure 3 only displays the enriched terms. Can the authors include the results from the enrichment analysis that include gene membership for each enriched term in the supplement?

      Indeed, no figure or table in the initial version included the gene list for this analysis. We have now included supplementary tables 1 and 2, which detail each pathway along with its gene list.

      (7) In line 104, can the authors clarify the parameters used to define well-matched samples?

      Based on the observations made by the reviewers, we decided to change the wording to make it more obvious about the message of this paper. The update was included on page 5, line  as follows:

      Original:

      “Here, we designed a longitudinal investigation using well-matched samples to study how changes in gene expression in distinct immune effector cells changed during the earliest time points after diagnosis and during progression of clinical disease”,

      Changed to:

      “Here, we designed a longitudinal comparison between mild and severe patients, choosing the appropriate samples according to the clinical progression and the unbiased gene expression profile”

      (8) In lines 113-116, can the authors clarify how their approach mitigates noise/potential biases and very briefly, describe what the nature of noise/biases could be?

      The main goal of this paragraph is to show that, while several pathways reached statistical significance in our analyses, we focused on NK cell cytotoxicity because this molecular pathway showed bridges to other relevant immune responses; thus, the pathways were chosen in response to its intricate transcriptional program rather than out of a biased interest. The text was edited and included on page 6, lines 111-131 as follows:

      Original:

      “We used a pairwise comparison of gene expression, gene set enrichment, and weight-correlated gene network analyses to detect differential expression of genes involved with the cytotoxic signaling pathway of Natural Killer (NK) cells in mild verses severe progression of disease. We promoted a broad and integrated point of view throughout the transcriptomic analysis of functional pathways to mitigate noise and potential biases (Bastard et al., 2020; Delorey et al., 2021; Schultze & Aschenbrenner, 2021; S. Zhang et al., 2022). We found close connectivity between NK signaling pathway genes and those of cytokine-cytokine receptor signaling pathways, along with Th1/Th2 cell differentiation genes, as part of the transcriptional circuit executed preferentially among mildly ill patients. Our results detected transcriptional circuits engaging multiple regulatory checkpoints. These findings indicated that the innate NK signaling pathway (cell cytotoxic activity) is beneficial, perhaps a critically-necessary activity needed to effectively eradicate coronavirus. We interpreted that an adaptive immune response that included early cell-mediated immunity was important for reducing disease severity in mild patients. This balance between humoral- and cell-mediated immunity appeared to be less robust in patients presenting with severe COVID-19. These results detected components of the immune response that were significantly associated with the differences in symptom severity observed between mild and severely ill COVID-19 patients.”

      Changed to:

      “Briefly, to gain more insights into our findings and complement their functional context, we used a pairwise comparison of gene expression, gene set enrichment, and weight-correlated gene network analyses. By doing so, we identified pathways of genes involved with the NK cell cytotoxicity enriched in mild patients when compared to severe. Besides focusing on a particular molecular pathway, we investigated the interactions to better comprehend the underlying phenomena of a successful immune response, contributing to an integrated point of view throughout the transcriptomic analyses of functional pathways to mitigate potential biases attributed to focusing the study on a single pathway. In this regard, we revealed that the NK signaling pathway was intricately related to other transcriptional circuits, such as those governing Th1/Th2 cell differentiation and cytokine-cytokine receptor signaling pathways. These interactions highlight the importance of these pathways as bridges between the innate and adaptive immune responses throughout the disease, implying that the innate NK signaling pathway (cell cytotoxic activity) is beneficial, and possibly a critical activity required to effectively eradicate coronavirus. We also concluded that an adaptive immune response including early cell-mediated immunity was significant in lowering disease severity. The link between the primary innate NK cell activity and the transcriptional priming of adaptive Th1 and Th2 cell responses appears to be more robust in mild patients than in severe.”

      (9) In line 120, can the authors clarify which regulatory checkpoints were being referred to?

      The concept of “checkpoint” was changed to “bridges” (line 124), because it offers a clearer idea of the molecular interactions displayed across the different enriched pathways described in our study. In this sense, the bridges show the connection between the innate immune response by NK cells and the adaptive immune response by Th1/Th2 cells.

      (10) In lines 125-126, can the authors refer to specific results to support this observation?

      Lines 111 to 129 summarize the results of the analysis that support the aforementioned phrase. However, the original sentence referred to was modified for better comprehension on page 6, lines 129-131 as follows:

      Original:

      “This balance between humoral- and cell-mediated immunity appeared to be less robust in patients presenting with severe COVID-19”

      Changed to:

      “The link between the primary innate NK cell activity and the transcriptional priming of adaptive Th1 and Th2 cell responses appears to be more robust in mild patients than in severe.”

      (11) In lines 184-185, can the authors clarify what the term "mixed" specifically refers to?

      The original text was modified for better comprehension on page 8, lines 177-179 as follows:

      Original:

      “Interestingly, on day-28, when the majority of patients had recovered, samples from severely ill patients were still mixed compared to those with mild symptoms.”

      Changed to:

      “Interestingly, on day-28, when the majority of patients had recovered, samples from severely ill patients were pooled together with those mild patients who had already recovered”.

      (12) In line 286, can the authors clarify how quantitative expression level differences are distinct from temporal expression level differences?

      Despite the differences in enrollment time between the mild and severe cohorts, enrollment was carried out precisely at the peak of COVID-19 symptoms, as illustrated in Figure 1B. In support of this criterion, the longitudinal analysis outlined in Figure 3 was performed taking into account the changes in gene expression trajectories across all sampling times. This point is significant because the results obtained from it exposed several transcriptional programs that were dynamically executed along disease progression, independently of the pairwise comparison approaches carried out previously.

      (13) In Figure 1C, there seemed to be two data points associated with "M1 0 days" and "M4 28 days" with distinct PC projections. Could these samples be mislabeled?

      The figure was revised and completed. The hexagon symbol for day-28 was changed to a star symbol. The “M1 0 days” and “M4 28 days” samples were labeled correctly. See below figure 1C with changes as follows:

      (14) In Figure 1D caption: could authors clarify if the ranking of 100 genes was based on the log2FC or adjusted p-values?

      The criteria considered were a fold change ≥ 2 and an FDR ≤ 0.05, which are stated in the methodology on page 23, lines 657-660.
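
The selection rule stated above can be sketched as a simple filter. This is an illustration under stated assumptions, not the authors' code: the gene names and values are hypothetical, and treating "fold change ≥ 2" as a two-sided threshold (at least 2-fold up or down, i.e. |log2FC| ≥ 1) is an assumption, since the response does not say whether the cut-off applies in both directions.

```python
import math

def is_deg(fold_change, fdr, min_fc=2.0, max_fdr=0.05):
    """Apply the fold-change and FDR cut-offs to one gene.

    Assumption: the fold-change cut-off is two-sided, so a gene passes
    when it changes at least min_fc-fold in either direction.
    """
    return abs(math.log2(fold_change)) >= math.log2(min_fc) and fdr <= max_fdr

# Hypothetical results: (gene, linear fold change, FDR-adjusted p-value).
results = [
    ("GENE_A", 4.0, 0.01),    # passes both cut-offs
    ("GENE_B", 1.5, 0.001),   # significant, but fold change too small
    ("GENE_C", 0.25, 0.04),   # 4-fold down, passes under the two-sided reading
    ("GENE_D", 3.0, 0.20),    # large change, but fails the FDR cut-off
]

selected = [gene for gene, fc, fdr in results if is_deg(fc, fdr)]
```

Note that this describes how genes are selected, not how they are ordered, so any ranking of the top 100 genes would need an additional sort key.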

      (15) In Figure 4D, can the authors include the expression z score for the healthy participants?

      We could include this information, but we consider that it would not help in understanding this figure, because the current presentation puts the focus on the differential trajectories between mild and severe patients. Also, DEGs for the mild and severe cohorts in this analysis, and in every other analysis in this work, were obtained relative to healthy donors.

      (16) Related to this, can the authors clarify if the expression z scores were computed using the mean and standard deviations of all samples within the study or relative to a specific participant cohort?

      The z-score was computed using the mild and severe patients, calculating the mean and then the standard deviation of each group. A new paragraph was included in Materials and Methods on page 24, lines 662-664.
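
One possible reading of this description is a per-group z-score, in which each expression value is centered on the mean of its own group and scaled by that group's standard deviation. The sketch below follows that reading only as an assumption; the expression values are hypothetical, and the use of the sample standard deviation is also an assumption, as the response does not specify it.

```python
from statistics import mean, stdev

def zscores(values):
    """Center values on their mean and scale by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # assumption: sample (n - 1) standard deviation
    return [(v - m) / s for v in values]

# Hypothetical expression values of one gene in each group.
expression = {
    "mild": [2.0, 4.0, 6.0],
    "severe": [10.0, 20.0, 30.0],
}

# Each group is standardized against its own mean and standard deviation.
z_by_group = {group: zscores(vals) for group, vals in expression.items()}
```

Under this reading, the two groups end up on the same scale even when their raw expression levels differ, so the scores describe within-group trajectories rather than differences between groups.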

      (17) In Figure 5B, can the authors include column annotations for participants and sampling time points?

      Figure 5B was updated and completed with the suggested information.

      (18) In Figure 1 - Figure Supplement 2, can the authors include the volcano plot from the pairwise comparison for day 28 showing no differentially expressed genes between mild and severe participants as reported in the results (lines 226-228)?

      The third volcano plot for day 28 was included in the updated figure 1 supplement 2.

      Reviewer #2 (Recommendations For The Authors):

      The manuscript is generally very well-constructed and well-written. However, the following are the major concerns mostly regarding the study design and participant selection.

      (1) The authors have used enrolment day as D0 which is not reflective of the immune response timeline. Especially when the designated 'D0' for the severe group is 10.0 + 1.8 days post symptom (DPS) onset while the 'D0' for the mild group is 1.2 + 1.3 DPS. In the context of an acute infection discussed herewith, this difference is critical.

      As tempting as it is to conduct longitudinal studies on COVID-19, the authors might do better focusing on specific acute time points (within 10 days post-symptom onset) and convalescent time points (beyond 28 days post-symptom). A better comparison would be D0 severe with D7 mild (aligning the DPS to be between 7-10 days in both groups).

      Despite the differences in enrolment time between the mild and severe cohorts, enrolment was carried out precisely at the peak of COVID-19 symptoms, as illustrated in Figure 1B. In support of this criterion, the longitudinal analysis outlined in Figure 3 was performed taking into account the changes in gene expression trajectories across all sampling times. This point is significant because the results obtained from it exposed several transcriptional programs that were dynamically executed along disease progression, independently of the pairwise comparison approaches carried out previously. Nevertheless, we agree with the reviewer's observation because, as we mentioned in the article, it is difficult to properly compare disease progression between naturally infected patients. So, to better support our findings, we complemented them with a pairwise comparison between day-7 samples from mild individuals and day-0 samples from severely ill individuals, finding GO terms and enriched pathways related to NK cell function in the mild cohort, as seen in Figure 4-figure supplement 5. This result reinforced the main findings of the different analyses carried out in this work, highlighting the relevance of the innate immune response of Natural Killer cells, which correlated with a mild progression of disease. The new paragraph describing this analysis was included on pages 13-14, lines 346-355. We also included a paragraph in the discussion on page 22, lines 595-604.

      (2) Though there are four participants within each group, one of the participants with severe infection (S1) only has the D0 time point which probably undermines the statistical significance of the results.

      This is an accurate observation, as the statistical weight will allow the deeper alterations to be evaluated while the more subtle ones will most likely be excluded from this study. In our analyses, we focused on variations with high statistical significance, which led to the discovery of a distinct Natural Killer response between mild and severe cohorts.

      (3) The authors should also account for any medications administered to the severe group in the ICU before enrolment in the study -immune-dampening drugs or steroids which may alter neutrophil recruitment or other immune functions.

      Only one severe patient received medication both prior to and during the COVID-19 disease. Even though several medications were administered to this patient, their effects have not been found to increase the neutrophil response.

      (4) What was the viral load status at the different time points analyzed - how does this relate to the immune and clinical findings?

      In this recruitment the viral load status was not measured.

      (5) Was any complete blood count or basic immune phenotyping conducted on these samples? Important to know the various cell frequencies in the PBMC mix sent for sequencing to account for contamination of lymphocytes with RBCs/monocytes/neutrophils as well as any lymphopenia.

      This measurement was not done for these samples. However, our PBMC purification protocol has been tested before and showed only small quantities of red blood cell contamination. Furthermore, none of the Gene Ontology or enriched pathway results in our analyses relates to red blood cell genes, which could otherwise add noise to the interpretation of our results.

      (6) The neutrophil/lymphocyte ratio is already skewed during SARS-CoV-2 infection - which could be the reason for higher readings in severe participants? - speculate?

      Indeed, the ratio of several cell types changes during SARS-CoV-2 infection. However, despite this noise in the proportion of immune cells, several functions in our study are more represented in less abundant cells, such as Natural Killer cells. The co-expression modules support the notion that, although the cell types are present in different proportions, a transcriptional program is being executed differentially in the cohorts.

      (7) CD247/ZAP70 also influences the CD16-mediated NK cell ADCC activity which the authors can add to the innate-adaptive bridging section.

      CD16a is highly expressed in NK cells. The circuit involving CD247/ZAP70 and CD16 could explain the cytotoxicity of these cells and how they contribute to establishing a response against SARS-CoV-2 infection. In our study, CD16a (FcgammaRIIIa) expression was similar in the mild and severe cohorts. Because our methodology captures only transcriptional changes, genes that did not change were excluded from our discussion. However, our group's research focuses on this node or bridge between innate and adaptive immune responses, with a particular emphasis on Fc-mediated antibody functions, and this is a topic of interest for future research.

      (8) Some of the figures lacked clarity making it difficult to review. (Eg. Fig 4 A, Fig 4 - supplement 1 A&B, Fig 5).

      Figure 4A was redesigned, and Figure 4-figure supplement 1 is now presented on a full page for better resolution.

      Specific Comments:

      (1) Consider changing "covid-19" in the title of the manuscript to "COVID-19"

      The journal platform probably changes the capitalization. The original title is in capital letters, as the reviewer suggests. In the clinical table, “COVID-19” was changed to capital letters.

      (2) Page 2: Line 24 - Consider revising this line. Not sure what the authors mean by 'early compromise'

      The paragraph was revised and rewritten.

      Original:

      “Mild COVID-19 patients presented an early compromise with NK cell function, whereas severe patients do so with neutrophil function”

      Changed to:

      “Mild COVID-19 patients displayed an early transcriptional commitment with NK cell function, whereas severe patients do so with neutrophil function”

      (3) Page 4: Lines 57 & 58 - Verify the reference. The paper referenced was published in 2016 and is in regard to SARS-CoV, MERS-CoV, and enterovirus D68.

      Indeed, this reference was used to draw parallels with other respiratory viruses. To sharpen the emphasis on SARS-CoV-2, the paragraph has been strengthened with two additional references: Shen 2023 and Wauters 2022.

      (4) Page 10: Lines 229 - 234 - Consider referring to the appropriate figure (i.e., Figure Supplement 2 A or B). The figure associated with D28 DEGs (Volcano plot) is missing in the supplementary. Erroneously referred here as Figure 1C which is a PCA plot?

      The original text was changed because the figure referenced was correct but misunderstood. The final sentence is on page 9, lines 220-223.

      (5) Page 10: Line 224 - Change the sentence to " We found upregulated.." instead of " We found regulated..".

      The text was edited in accordance with this recommendation; the sentence is now on line 232.

      (6) Page 13: Line 326 - Figure 4A referenced here is not clear - unable to review.

      Figure 4A was updated for a better resolution and included in the manuscript.

      (7) Page 15: Line 398 - Consider rewording "after diagnosis" since the days here are "after enrolment".

      This recommendation was considered and the text was rewritten on page 15, lines 404-406:

      Original:

      “We systematically analyzed transcriptomic features of PBMCs from COVID-19 patients with mild and severe symptoms at three sequential time-points (D0, D7, and D28) after diagnosis”

      Changed to:

      “We systematically analyzed transcriptomic features of PBMCs from COVID-19 patients with mild and severe symptoms at three sequential time-points (D0, D7, and D28) during the peak of the symptoms”

      (8) Page 17: Move text from the next page to eliminate blank space.

      Resolved

      (9) Page 32: Figure 1C - Consider changing the symbol for D28 since it looks very similar to the D0 symbol. Use the colors consistently instead of different shades for each group.

      The hexagon symbol for D28 was replaced with a star symbol in Figure 1C. In this figure, each color indicates one of the three groups, and transparency was used to distinguish symbols that lie close together.

      (10) Page 36: Figure 4A - Unable to review.

      This figure was resized for better resolution.

      (11) Page 42-49: Consider relabeling and renumbering the Supplementary figures for consistency and reference the modified numbers in the appropriate location in the main text.

      The supplementary figures were relabeled for consistency and better understanding.

      (12) Pages 44 & 48: Unable to review the figures.

      The figures indicated were resized for better resolution.

      Examples of consistency review:

      (1) Use of D0,D7 / D-0, D-7 throughout the manuscript

      The selected format for the final version of the manuscript is D0, D7, and D28.

      (2) Reporting the source of reagents consistently (Name, Place, Country, Catalog number)

      The reagent sources were reformatted for consistency in lines 626, 628, 632, and 642.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1: The authors may consider moving the supplemental figures into the main body of the paper since they finally would end up with a total of eight figures.

      As we added two more supplementary figures, we left them separated from the main part of the manuscript in the supplement. All of them describe important experimental details but we believe that it is easier to follow if there is a focus on the key results.

      Reviewer #1: In general, the methods and techniques used here are beside some required but important additions described in sufficient detail.

      Reviewer #2: Given the identified importance of glow-discharge treatment of precoated tape to the flat deposition of sections during ATUM, a corresponding schematic or appropriate reference(s) providing more information about the custom-built tape plasma device would likely be a prerequisite for effective reproduction of this technique in other laboratories.

      Thank you for the valuable comments on the missing experimental details, which could affect the ease of establishing ATUM-Tomo in other labs. We will clearly highlight the ATUM-Tomo-specific vs. some general EM processing steps of the workflow in the proposed way. A detailed description of the custom-built tape plasma device will be added to the methods section. In addition, we will reference more explicitly our published protocols, which describe the standard electron microscopy embedding steps in great detail (Kislinger et al., STAR protocols, 2020; Kislinger et al., Meth Cell Biol, 2023).

      Reviewer #1: Concerning the results section: In my opinion, the results section is a bit unbalanced. There is a mismatch between the detailed description of the methodology (experimental approach) and the scientific findings of the paper. The reviewer can see the enormous methodological impact of the paper, which on the other hand is the major drawback of the paper. To my opinion, the authors should also give a more detailed description of their scientific results.

      Concerning the discussion: It would have been nice to give a perspective to which the described methodology can be used not only to describe diverse biological aspects that can be addressed and answered by this experimental approach. For example, how could this method be used to address various questions about the normal and pathologically altered brain?

      In my opinion, the paper has one major drawback which is that it is more methodologically based although the authors included a scientific application of the method. The question here is to balance the methodology vs. the scientific achievement of this paper, a decision hard to take. In other words, one could recommend this paper to more methodologically based journals, for example, Nature Methods.

      Balancing the technological and biological parts is indeed a difficult issue. We agree that this manuscript mainly describes a technical advancement and demonstrates its power to answer previously unsolved scientific questions. We exemplify this in our model system, neuropathology of the blood-brain barrier. The biological impact of ATUM-SEM has been described in detail in Khalin et al., Small, 2022, and is referenced accordingly. Here we describe how ATUM-Tomo can be applied to reveal biological insights exceeding the capabilities of ATUM-SEM and other volume electron microscopy techniques. However, the description of the methodological development far outweighs that of the biological details. We consider eLife’s Tools and Resources (which, in our view, is similar in scope to Nat Methods) an ideal format for this technically focused manuscript, as it reaches eLife’s readership, whose diverse biological interests offer potential applications of the method. In the discussion, we suggested applications in connectomics (for chemical synapses), the study of endocytosis, and the detection of virus particles. Hopefully, this addresses the Reviewer’s concern that having only a single application might seem arbitrary or even suggest a very narrow utility of the technique.

      “While we demonstrate a neuropathology-related application, further biological targets that require high-resolution isotropic voxels and the spatial orientation within a larger ultrastructural context can potentially be studied by ATUM-Tomo. This includes the detection of gap junctions for connectomics or for the study of long-range projections (Holler et al., 2021) and the subcellular location of virus particles (Wu et al., 2022, Roingeard, 2008, Pelchen-Matthews and Marsh, 2007). Thus, ATUM-Tomo opens up new avenues for multimodal volume EM imaging of diverse biological research areas.”

      Reviewer #2: Is the separation of sections from permanent marker-treated tape sensitive to the time interval between deposition/SEM imaging and acetone treatment?

      Thank you for pointing out this important methodological aspect. We have not systematically investigated whether there is a critical time window between microtomy, SEM, and detachment. From the samples generated for this study, we assessed the importance of timing in retrospect:

      “The sections could be recovered even four months after collection and nine months after SEM imaging.”

      Reviewer #2: To what extent is slice detachment from permanent marker-treated tape resin-dependent [i.e. has ATUM-Tomo been tested on resin compositions beyond LX112 (LADD)]?

      We appreciate this comment addressing the broader technical applicability of ATUM-Tomo. We tested the general workflow with tissue embedded in other commonly used resin types (Epon, Durcupan).

      Reviewer #2: Minor corrections to the text and figures.

      Line 83: ((Khalin et al., 2022) should read (Khalin et al., 2022)

      Line 86: 30nm should read 30 nm

      Line 139: "...morphological normal tight junctions..." should read "...morphologically normal tight junctions..."

      Line 283: "....despite glutaraldehyde fixation, a prerequisite for optimal ultrastructural preservation...".

      Line 295: "In contrast, our CLEM approach provides high ultrastructural quality by optimal chemical fixation".

      The concepts of optimal preservation and optimal fixation are arguably context- and application-dependent. These statements should be toned down or their context explicitly stated.

      Thank you for the detailed corrections. We have applied them accordingly.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this study, Faniyan and colleagues build on their recent finding that renal Glut2 knockout mice display normal fasting blood glucose levels despite massive glucosuria. Renal Glut2 knockout mice were found to exhibit increased endogenous glucose production along with decreased hepatic metabolites associated with glucose metabolism. Crh mRNA levels were higher in the hypothalamus, while circulating ACTH and corticosterone were elevated in this model. While these mice were able to maintain normal fasting glucose levels, ablating afferent renal signals to the brain caused low fasting blood glucose levels. In addition, the higher CRH and higher corticosterone levels of the knockout mice were lost following this denervation. Finally, acute phase proteins were altered, plasma Gpx3 was lower, and major urinary protein MUP18 and its gene expression were higher in renal Glut2 knockout mice. Overall, the main conclusion that afferent signaling from the kidney is required for renal Glut2-dependent increases in endogenous glucose production is well supported by these findings.

      Strengths:

      An important strength of the paper is the novelty of the identification of kidney to brain communication as being important for glucose homeostasis. Previous studies had focused on other functions of the kidney modulated by or modulating brain function. This work is likely to promote interest in CNS pathways that respond to afferent renal signals and the response of the HPA axis to glucosuria. Additional strengths of this paper stem from the use of incisive techniques. Specifically, the authors use isotope enabled measurement of endogenous glucose production by GC-MS/MS, capsaicin ablation of afferent renal nerves, and multifiber recording from the renal nerve. The authors also paid excellent attention to rigor in the design and performance of these studies. For example, they used appropriate surgical controls, confirmed denervation through renal pelvic CGRP measurement, and avoided the confounding effects of nerve regrowth over time. These factors strengthen confidence in their results. Finally, humans with glucose transporter mutations and those being treated with SGLT2 inhibitors show a compensatory increase in endogenous glucose production. Therefore, this study strengthens the case for using renal Glut2 knockout mice as a model for understanding the physiology of these patients.

      Weaknesses:

      A few weaknesses exist. Most concerns relate to the interpretation of this study's findings. The authors state that loss of glucose in urine is sensed as a biological threat based on the HPA axis activation seen in this mouse model. This interpretation is understandable but speculative. Importantly, whether stress hormones mediate the increase in endogenous glucose production in this model and in humans with altered glucose transporter function remains to be demonstrated conclusively. For example, the paper found several other circulating and local factors that could be causal. This model is also unable to shed light on how elevated stress hormones might interact with insulin resistance, which is known to increase endogenous glucose production. That issue is of substantial clinical relevance for patients with T2D and metabolic disease. Finally, how these findings can contribute to improving the efficiency of drugs like SGLT2 inhibitors remains to be seen.

      -  We agree with the reviewer’s overall assessment of this manuscript.

      - Confirming the contribution of each secreted protein shown in Fig. 4, whose levels were changed between the two groups of mice, toward causing a compensatory increase in glucose production in response to elevated glycosuria is beyond the scope of this manuscript.

      Reviewer #2 (Public Review):

      Summary:

      The authors previously generated renal Glut2 knockout mice, which have high levels of glycosuria but normal fasting glucose. They use this as an opportunity to investigate how compensatory mechanisms are engaged in response to glycosuria. They show that renal and hepatic glucose production, but not metabolism, is elevated in renal Glut2 male mice. They show that renal Glut2 male mice have elevated Crh mRNA in the hypothalamus, and elevated plasma levels of ACTH and corticosterone. They also show that temporary denervation of renal nerves leads to a decrease in fasting and fed blood glucose levels in female renal Glut2 mice, but not control mice. Finally, they perform plasma proteomics in male mice to identify plasma proteins that are changed (up or down) between the knockouts and controls.

      Strengths:

      The question that is trying to be addressed is clinically important: enhancing glycosuria is a current treatment for diabetes, but is limited in efficacy because of compensatory increases in glucose production.

      Weaknesses:

      (1) Although I appreciate that the initial characterization of the mice in another publication showed that both males and females have glycosuria, this does not mean that both sexes have the same mechanisms giving rise to glycosuria. There are many examples of sex differences in HPA activation in response to threat, for example. There is an unfounded assumption here that males and females have the same underlying mechanisms of glycosuria that undermines the significance of the findings.

      - We agree with the reviewer that although we didn’t observe sex differences in renal Glut2 KO mice in the context of glucose homeostasis, their response (or mechanisms) to elevated glycosuria in enhancing compensatory glucose production may differ between the sexes. Therefore, we have included this limitation in the discussion section.

      (2) The authors state that they induced the Glut2 knockout with tamoxifen as in their previous publication. The methods of that publication indicate that all experiments were completed within 14 days of inducing the Glut2 knockout. This means that the last dose of tamoxifen was delivered 14 days prior to the experimental endpoint of each experiment. This seems like an important experimental constraint that should be discussed in this manuscript. Is the glycosuria that follows Glut2 knockout only a temporary change? If so, then the long-term change in glycosuria that follows SGLT2 inhibition in humans might not be best modelled by this knockout. Please specify in the Methods when the surgeries to implant a jugular catheter or ablate the renal nerves were performed relative to the Glut2 knockout.

      - The reviewer’s statement ‘The methods of that publication indicate that all experiments were completed within 14 days of inducing the Glut2 knockout’ is incorrect. In the referred publication, we had explicitly mentioned in methods, ‘All of the experiments, except those using a diet-induced obesity mouse model or noted otherwise, were completed within 14 days of inducing the Glut2 deficiency.’ Please see figures 5h-l and 6 in the cited publication, which demonstrate that not all of the experiments were completed within 14 days of inducing renal Glut2 deficiency. Per the reviewer’s advice, in the present manuscript we have included the timeline (which in some cases extends 4 months beyond inducing glycosuria) in all the figure legends. In addition, for a separate project (which is unpublished) we have measured glycosuria up to 1 year after inducing renal Glut2 deficiency. Therefore, the glycosuria observed in the renal Glut2 KO mice is not temporary.

      (3) I am still unclear what group was used for controls. Are these wild-type mice who receive tamoxifen? Are they KspCadCreERT2;Glut2loxP/loxP mice who do not receive tamoxifen? This is important and needs to be specified.

      - In our previous response to the reviewer, we had already mentioned which control group was used in this study. Please see our response to the second reviewer’s point 3. As mentioned to the reviewer, we used Glut2loxP/loxP mice as the control group, which is also described multiple times in the figure legends of our previous paper that reported the phenotype of renal Glut2 KO mice. Per the reviewer’s advice, we have provided the information again in a revised version of this manuscript.

      (4) The authors should report some additional control measures for the renal denervation that could also impact blood glucose and perhaps some of their other measures. The control measures, which one would like to see unimpacted by renal denervation, include body weights, food consumption and water intake, and glycosuria itself.

      - Please also see fig. 3 in the present manuscript, which demonstrates that renal afferent denervation doesn’t influence baseline blood glucose or plasma insulin levels. We have now also mentioned in the text that the denervation doesn’t affect food intake or body weight.

      (5) The graphical abstract shows a link between the hypothalamus and the liver that is completely unsupported by any of the current findings. That arrow should be removed.

      - Because we observed an increase in hepatic glucose production in renal Glut2 KO mice (Fig. 1) - which was reduced by 50% after selective afferent renal denervation (Fig. 3) - in the graphical abstract we are suggesting a neural kidney-brain-liver connection or an endocrine factor(s) to account for these changes in blood glucose levels, as also described in the discussion section. We can include a question mark ‘?’ in the graphical abstract to show that further studies are needed to validate these proposed mechanisms; however, we cannot just remove the arrow as advised by the reviewer.

      (6) Though the authors have toned down their language implying a causal link between the HPA measures and compensatory elevation of blood glucose in the face of glycosuria, the title still implies this causal link. It is still the case that their data do not support causation. There are many potential ways to establish a causal link but those experiments are not performed here. The renal afferents are correlated with Crh content of the PVN, but nothing has been done to show that the Crh content is important for elevating blood glucose. In light of this, the title should be toned down. Perhaps something like "Renal nerves maintain blood glucose production and elevated HPA activity in response to glycosuria". The link between HPA and glucose is not shown in this paper.

      - We request the reviewer to take a look at figure 1, showing an increase in glucose production in renal Glut2 KO mice, and figure 3, which demonstrates that an afferent renal denervation reduces blood glucose levels by 50%. The afferent renal denervation (ablation of afferent renal nerves) does reduce blood glucose levels in renal Glut2 KO mice. Therefore, the use of the word ‘promote’ in the title is accurate and appropriate to reflect the role of the afferent renal nerves in contributing to an approximately 50% increase in blood glucose levels in renal Glut2 KO mice.

      - Regarding the reviewer's comment on changes in Crh gene expression, please look at figure 3. Ablation of renal afferent nerves decreases hypothalamic Crh gene expression and other mediators of the HPA axis by 50%. Therefore, the afferent renal nerves do contribute to regulating blood glucose levels, at least in part, via the HPA axis (which is widely known to change blood glucose levels). Words such as ‘required’ or ‘necessary’ in the title might have indicated a causal role or been misleading here; therefore, we have purposely used ‘promote’ in the title to accurately reflect the findings of this study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I have only minor text corrections to add:

      - line 223 "A list"

      - line 253 "independent"

      - line 271 "the body's"

      - line 304 "do not"

      Yes, we have corrected these errors in a revised version of this manuscript.

      Reviewer #2 (Recommendations For The Authors):

      (1) Please report the dilutions used, if any, for the ELISAs. If the samples were run neat, please report this. Many manufacturer's instructions say that the user must determine the correct dilution to use for the samples collected. Also, sometimes when small blood volumes are collected, samples must be diluted to achieve the minimum volume collected for the assay. It is not sufficient to indicate that a reader refers to the manufacturer's instructions.

      - Per the reviewer’s advice, we have included the dilutions used for each assay in the methods.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      Point 1: The authors have demonstrated that Cs9g12620 contains the EBE of PthA4 in the promoter region, to show that PthA4 controls Cs9g12620, the authors need to compare to the wild type Xcc and pthA4 mutant for Cs9g12620 expression. The data in Figure 1 is not enough.

      The data in Figure 1 D and E show a pthA4 Tn5 insertion mutant Mxac126-80 and the expression level of Cs9g12620 in citrus inoculated with the pthA4 mutant.

      Point 2: The authors confirmed the interaction between PthA4 and the EBE in the promoter of Cs9g12620 using DNA electrophoretic mobility shift assay (EMSA). However, Figure 2B is not convincing. The lane without GST-PthA4 also clearly showed a mobility shift. For the EMSA assay, the authors need also to include a non-labeled probe as a competitor to verify the specificity. The description of the EMSA in this paper suggests that it was not done properly. It is suggested the authors redo this EMSA assay following the protocol: Electrophoretic mobility shift assay (EMSA) for detecting protein-nucleic acid interactions PMID: 17703195.

      Thank you very much for your comments. We have repeated the EMSA analysis based on your suggestion. The DNA probe was labeled with Cy5, and a non-labeled probe was included as a competitor (Figure 3B and D; Figure 4B and E).

      Point 3: The authors also claimed that PthA4 suppresses the promoter activity of Cs9g12620. The data is not convincing and also contradicts their own data that overexpression of Cs9g12620 causes canker and silencing of it reduces canker, considering PthA4 is required for canker development. The authors conducted the assays using transient expression of PthA4. It should be done with Xcc wild type, pthA4 mutant, and negative control to inoculate citrus plants to check the expression of Cs9g12620.

      We have measured Cs9g12620 expression in silenced citrus plants inoculated with wild-type Xcc 29-1 (Figure 7F).

      Point 4: Figure 6 AB is not convincing. There are no apparent differences. The variations shown in B are common in different wild-type samples. It is suggested that the authors conduct transgenic instead of transient overexpression.

      It has been shown that transient expression of PthA4 leads to a canker-like phenotype, suggesting that this experiment is effective. However, the result would be more convincing with transgenic plants overexpressing pthA4 and Cs9g12620. We will generate these plants in follow-up research to confirm the phenotype.

      Point 5: Gene silencing data needs more appropriate controls. Figure D seems to suggest canker symptoms actually happen for the RNAi treated. The authors need to make sure the same amount of Xcc was used for both CTV empty vector and the RNAi. It is suggested a blink test is needed here.

      We used the same amount of Xcc to inoculate plants carrying the CTV empty vector and the RNAi construct. For both inoculations, the cultured Xcc cells were suspended in sterile distilled water to a final concentration of 10^8 CFU/mL (OD600 = 0.3).

      Point 6: Figure 1. Please draw a figure to clearly show the location of the EBE in the promoter of Cs9g12620, including the transcription start site, and translational start site.

      The EBE in the Cs9g12620 promoter is underlined in Figure supplement 1. We are not sure about the transcription start site, but the translation start site is labelled.

    1. Author response:

      Reviewer #1 (Public Review):

      Areas of improvement and suggestions:

      (1) "These results suggest the SP targets interneurons in the brain that feed into higher processing centers from different entry points likely representing different sensory input" and "All together, these data suggest that the abdominal ganglion harbors several distinct type of neurons involved in directing PMRs"

      The characterization of the post-mating circuitry has been largely described by the group of Barry Dickson and other labs. I suggest ruling out a potential effect of mSP in any of the well-known post-mating neuronal circuitry, i.e: SPSN, SAG, pC1, vpoDN or OviDNs neurons. A combination of available split-Gal4 should be sufficient to prove this.

      Indeed, we have already tested drivers for some of these neurons and agree that this information is important to distinguish neurons that are direct SP targets from neurons that are involved in directing reproductive behaviors.

      (2) Authors must show how specific is their "head" (elav/otd-flp) and "trunk" (elav/tsh) expression of mSP by showing images of the same constructs driving GFP.

      The expression pattern for tshGAL, which is expressed in the trunk, is already published (Soller et al., 2006). We will add images for “head” expression.

      (3) VT3280 is termed as a SAG driver. However, VT3280 is a SPSN specific driver (Feng et al., 2014; Jang et al., 2017; Scheunemann et al., 2019; Laturney et al., 2023). The authors should clarify this.

      Following the reviewer’s suggestion, we will clarify the specificity of VT3280.

      (4) Intersectional approaches must rule out the influence of SP on sex-peptide sensing neurons (SPSN) in the ovary by combining their constructs with SPSN-Gal80 construct. In line with this, most of their lines targets the SAG circuit (4I, J and K). Again, here they need to rule out the involvement of SPSN in their receptivity/egg laying phenotypes. Especially because "In the female genital tract, these split-Gal4 combinations show expression in genital tract neurons with innervations running along oviduct and uterine walls (Figures S3A-S3E)".

      We agree with this reviewer that we need higher resolution, with expression restricted to a single cell type. However, this is a major task that we will continue in follow-up studies.

      In principle, use of GAL80 is a valid approach to restrict expression, provided that levels of GAL80 are higher than those of GAL4, because GAL80 binds GAL4 to inhibit its activity. Hence, if levels of GAL80 are lower, results could be difficult to interpret.

      (5) The authors separate head (brain) from trunk (VNC) responses, but they don't narrow down the neural circuits involved on each response. A detailed characterization of the involved circuits especially in the case of the VNC is needed to (a) show that the intersectional approach is indeed labelling distinct subtypes and (b) how these distinct neurons influence oviposition.

      Again, we agree with this reviewer that we need higher resolution, with expression restricted to a single cell type. However, this is a major task that we will continue in follow-up studies.

      Reviewer #2 (Public Review):

      Strength:

      The intersectional approach is appropriate and state-of-the-art. The analysis is a very comprehensive tour-de-force and experiments are carefully performed to a high standard. The authors also produced a useful new transgenic line (UAS-FRTstopFRT mSP). The finding that neurons in the brain (head) mediate the SP effect on receptivity, while neurons in the abdomen and thorax (ventral nerve cord or peripheral neurons) mediate the SP effect on oviposition, is a significant step forward in the endeavour to identify the underlying neuronal networks and hence a mechanistic understanding of SP action. Though this result is not entirely unexpected, it is novel as it was not shown before.

      We thank reviewer 2 for recognizing the advance of our work.

      Weakness:

      Though the analysis identifies a small set of neurons underlying SP responses, it does not go the last step to individually identify at least a few of them. The last paragraph in the discussion rightfully speculates about the neurochemical identity of some of the intersection neurons (e.g. dopaminergic P1 neurons, NPF neurons). At least these suggested identities could have been confirmed by straightforward immunostainings against NPF or TH, for which antisera are available. Moreover, specific GAL4 lines for NPF or P1 or at least TH neurons are available which could be used to express mSP to test whether SP activation of those neurons is sufficient to trigger the SP effect.

      We appreciate this reviewer's recognition of our previous work showing that receptivity and oviposition are separable. As pointed out, we have now gone one step further and identified, in a tour de force approach, subsets of neurons in the brain and VNC.

      We agree with this reviewer that we need a higher resolution of expression to only one cell type. As pointed out by this reviewer, determining the neurochemical identity is an excellent suggestion and will help to further restrict expression to just one type of neuron. However, this is a major task that we will continue in follow up studies.

      Reviewer #3 (Public Review):

      Strengths:

      Besides the main results described in the summary above, the authors discovered the following:

      (1) Reduction of receptivity and induction of egg-laying are separable by restricting the expression of membrane-tethered SP (mSP): head-specific expression of mSP induces reduction of receptivity only, whereas trunk-specific expression of mSP induces oviposition only. Also, they identified a GAL4 line (SPR12) that induced egg laying but did not reduce receptivity.

      (2) Expression of mSP in the genital tract sensory neurons does not induce PMR. The authors identified three GAL4 drivers (SPR3, SPR 21, and fru9), which robustly expressed mSP in genital tract sensory neurons but did not induce PMRs. Also, SPR12 does not express in genital tract neurons but induces egg laying by expressing mSP.

      We thank reviewer 3 for recognizing these two important points regarding the SP response that point to a revised model for how the underlying circuitry induces the post-mating response.

      Weaknesses:

      (1) Intersectional expression involving ppk-GAL4-DBD was negative in all GAL4AD lines (Supp. Fig.S5). As the authors mentioned, ppk neurons may not intersect with SPR, fru, dsx, and FD6 neurons in inducing PMRs by mSP. However, since there was no PMR induction and no GAL4 expression at all in any combination with GAL4-AD lines used in this study, I would like to have a positive control, where intersectional expression of mSP in ppk-GAL4-DBD and other GAL4-AD lines (e.g., ppk-GAL4-AD) would induce PMR.

      We will add positive controls for ppk-DBD expression and expand the discussion section.

      (2) The results of SPR RNAi knock-down experiments are inconclusive (Figure 5). SPR RNAi cancelled the PMR in dsx ∩ fru11/12 and partially in SPR8 ∩ fru 11/12 neurons. SPR RNAi in dsx ∩ SPR8 neurons turned virgin females unreceptive; it is unclear whether SPR mediates the phenotype in SPR8 ∩ fru 11/12 and dsx ∩ SPR8 neurons.

      We agree with this reviewer that the interpretation of the SPR RNAi results is complicated by the fact that SP has additional receptors (Haussmann et al. 2013). The results are conclusive for all three intersections when expressing UAS mSP in SPR RNAi with respect to oviposition, i.e. egg laying is not induced in the absence of SPR. For receptivity, the results are conclusive for dsx ∩ fru11/12 and partially for SPR8 ∩ fru 11/12.

      Potentially, SPR RNAi knock-down does not sufficiently reduce SPR levels to completely reduce receptivity in some intersection patterns, likely also because splitGal4 expression is less efficient.

      Why SPR RNAi in dsx ∩ SPR8 neurons turned virgin females unreceptive is unclear, but we anticipate that we need a higher resolution of expression to only one cell type to resolve this unexpected result. However, this is a major task that we will continue in follow up studies.

      SPR RNAi knock-down experiments may also help clarify whether mSP worked autocrine or juxtacrine to induce PMR. mSP may produce juxtacrine signaling, which is cell non-autonomous.

      Whether membrane-tethered SP induces the response in an autocrine manner is an important aspect in the interpretation of the results from mSP expression.

      Removing SPR by SPR RNAi and expression of mSP in the same neurons did not induce egg laying for all three intersections and did not reduce receptivity for dsx ∩ fru11/12 and for SPR8 ∩ fru 11/12. Accordingly, we can conclude that for these neurons the response is induced in an autocrine manner.

      We will add this aspect to the discussion section.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This paper by Beath et al. identifies a potential regulatory role for proteins involved in cytoplasmic streaming and maintaining the grouping of paternal organelles: holding sperm contents in the fertilized embryos away from the oocyte meiotic spindle so that they don't get ejected into the polar body during meiotic chromosome segregation. The authors show by time-lapse video that paternal mitochondria (used as a readout for sperm and its genome) are excluded from yolk granules and maternal mitochondria, even when moving long distances by cytoplasmic streaming. To understand how this exclusion is accomplished, they first show that it is independent of both internal packing and the engulfment of the paternal chromosomes by maternal endoplasmic reticulum creating an impermeable barrier. They then test whether the control of cytoplasmic streaming affects this exclusion by knocking down two microtubule motors, katanin and kinesin-13. They find that the ER ring, which is used as a proxy for paternal chromosomes, undergoes extensive displacement with these treatments during anaphase I and interacts with the meiotic spindle, supporting their hypothesis that the exclusion of paternal chromosomes is regulated by cytoplasmic streaming. Next, they test whether a regulator of maternal ER organization, ATX-2, disrupts sperm organization so that they can combine the double depletion of ATX-2 and KLP-7, presumably because klp-7 RNAi (unlike mei-1 RNAi) does not affect polar body extrusion and they can report on what happens to paternal chromosomes. They find that the knockdown of both ATX-2 and KLP-7 produces a higher incidence of what appears to be the capture of paternal chromosomes by the meiotic spindle (5/24 vs 1/25). However, this capture event appears to halt the cell cycle, preventing the authors from directly observing whether this would result in the paternal chromosomes being ejected into the polar body.

      Strengths: 

      This is a useful, descriptive paper that highlights a potential challenge for embryos during fertilization: when fertilization results in the resumption of meiotic divisions, how are the paternal and maternal genomes kept apart so that the maternal genome can undergo chromosome segregation and polar body extrusion without endangering the paternal genome? In general, the experiments are well-executed and analyzed. In particular, the authors' use of multiple ways to knock down ATX-2 shows rigor. 

      Weaknesses: 

      The paper makes a case that this regulation may be important but the authors should do some additional work to make this case more convincing and accessible for those outside the field. In particular, some of the figures could include greater detail to support their conclusions, they could explain the rationale for some experiments better and they could perform some additional control experiments with their double depletion experiments to better support their interpretations. Also, the authors' inability to assess the functional biological consequences of the capture of the sperm genome by the oocyte spindle should be discussed, particularly in light of the cell cycle arrest that they observe. 

      These general comments are addressed in the more specific critiques below.

      Reviewer #2 (Public Review): 

      Summary 

      In this manuscript, Beath et al. use primarily C. elegans zygotes to test the overarching hypothesis that cytoplasmic mechanisms exist to prevent interaction between paternal chromosomes and the meiotic spindle, which are present in a shared zygotic cytoplasm after fertilization. Previous work, much of which by this group, had characterized cytoplasmic streaming in the zygote and the behavior of paternal components shortly after fertilization, primarily the clustering of paternal mitochondria and membranous organelles around the paternal chromosomes. This work set out to identify the molecular mechanisms responsible for that clustering and test the specific hypothesis that the "paternal cloud" helps prevent the association of paternal chromosomes with the meiotic spindle.

      Strengths 

      This work is a collection of technical achievements. The data are primarily 3- and 4-channel time-lapse images of zygotes shortly after fertilization, which were performed inside intact animals. There are many instances in which the experiments show extreme technical skill, such as tracking the paternal chromosomes over large displacements throughout the volume of the embryo. The authors employ a wide variety of fluorescent reporters to provide a remarkably clear picture of what is going on in the zygote. These reagents and the novel characterization of these stages that they provide will be widely beneficial to the community. 

      The data provide direct visualization of what had previously been a mostly hypothetical structure, the "paternal cloud," using simultaneous labeling of paternal DNA and mitochondria in combination with a variety of maternal proteins including maternal mitochondria, yolk granules, tubulin, and plasma membrane. Together, these images provided convincing evidence of the existence of this specified cytoplasmic domain. They go on to show that the knockdown of the ataxin-2 homolog ATX-2, a protein previously shown to affect ER dynamics, disrupted the paternal cloud, identifying a role for ER organization in this structure.

      The authors then used the system to test the functional consequences of perturbing the cytoplasmic organization. Consistent with the paternal cloud being a stable structure, it stayed intact during large movements the authors generated using previously published knockdowns (of mei-1/katanin and kinesin-13/klp-7) that increased cytoplasmic streaming. They used this data to document instances in which the paternal chromosomes were likely to have been attached to the spindle. They concluded with direct evidence of spindle fibers connecting to the paternal chromatin upon knockdown of ATX-2 in combination with increased cytoplasmic streaming, providing strong, direct support for their overarching hypothesis.

      Weaknesses 

      While the data is convincing, the narrative of the paper could be streamlined to highlight the novelty of the experiments and better articulate the aims. For example, the cloud of paternal mitochondria and membranous organelles was previously shown, but Figures 1-2 largely reiterate that observation. The innovation seems to be that the combination of ER, yolk, and maternal mitochondrial markers makes the existence of a specified domain more concrete. There are also some instances where more description is needed to make the conclusions from the images clear. 

      These general comments are addressed in the more specific critiques below.

      The manuscript intersperses what read like basic characterizations of fluorescent markers that, as written, can distract from the main story. The authors characterized the dynamics of ER organization throughout the substages of meiosis and the permeability of the envelope of ER that surrounds the paternal chromatin, but it could be more clearly established how the ability to visualize these structures allowed them to address their aims.

      We have added the following after the initial description of ER morphology changes: (ER morphology was used to determine cell-cycle stages during live imaging reported below in Fig. 6.)

      More background on what was previously known about ER organization in M-phase and the role of ataxin proteins specifically may help provide more continuity. 

      We have added references to transitions to ER sheets during mitotic M-phase in HeLa cells and Xenopus extracts.

      Reviewer #3 (Public Review): 

      Summary: 

      This study by Beath et al. investigated the mechanisms by which sperm DNA is excluded from the meiotic spindle after fertilization. Time-lapse imaging revealed that sperm DNA is surrounded by paternal mitochondria and maternal ER that is permeable to proteins. By increasing cytoplasmic streaming using kinesin-13 or katanin RNAi, the authors demonstrated that limiting cytoplasmic streaming in the embryo is an important step that prevents the capture of sperm DNA by the oocyte meiotic spindle. Further experiments showed that the Ataxin-2 protein is required to hold paternal mitochondria together and close to the sperm DNA. Finally, double depletion of kinesin-13 and Ataxin-2 suggested an increased risk of meiotic spindle capture of sperm DNA. 

      Overall, this is an interesting finding that could provide a new understanding of how meiotic spindle capture of sperm DNA and its accidental expulsion into the polar body is prevented. However, some conceptual gaps need to be addressed and further experiments and improved data analyses would strengthen the paper. 

      - It would be helpful if the authors could discuss in good detail how they think maternal ER surrounds the sperm DNA

      We have added 2 references to papers about nuclear envelope re-assembly from Shirin Bahmanyar’s lab and suggest the ER envelope is a halted intermediate in nuclear envelope reassembly.

      and why is it not disrupted following Ataxin disruption. 

      We have been attempting to disrupt ER structures in the meiotic embryo for the last 5 years by depleting profilin, BiP, atlastin, ATX-2 and by optogenetically packing ER into a ball in the middle of the oocyte.  None of these treatments prevent envelopment of the sperm DNA by maternal ER.  None of these treatments remove ER from the spindle envelope and none remove ER from the plasma membrane.  These treatments mostly result in “large aggregates” of ER that we have not examined by EM.  Wild speculation: any disruption of the ER strong enough to prevent ER envelopment around chromatin would be sterile because the M to S transition in the mitotic zone of the germline would be blocked.  Rapid depletion of ATX-2 to the extent shown by rigorous data in this manuscript does not prevent ER envelopment around chromatin.  We chose not to speculate about the reasons for this because we do not know why.

      - Since important phenotypes revealed in RNAi experiments (e.g. kinesin-13 and ataxin-2 double depletion) are not very robust, the authors should consider toning down their conclusions and revising some of their section headings. I appreciate that they are upfront about some limitations, but they do nonetheless make strong concluding sentences. 

      We have changed the discussion of the klp-7 atx-2 double depletion to: “The capture of the sperm DNA by the meiotic spindle in ATX-2 KLP-7 double depleted embryos suggests that the integrity of the exclusion zone around the sperm DNA might insulate the sperm DNA from spindle microtubules. However, a much larger number of klp-7(RNAi) singly depleted and atx-2(degron) singly depleted time-lapse sequences are needed to rigorously support this idea.”

      - The discussion section could be improved further to present the authors' findings in the larger context of current knowledge in the field. 

      We have expanded the discussion as suggested.

      - The authors previously demonstrated that F-actin prevents meiotic spindle capture of sperm DNA in this system. However, the current manuscript does not discuss how the katanin, kinesin-13 and Ataxin-2 mechanisms could work together with previously established functions of F-actin in this process. 

      We have added pfn-1(RNAi) to the discussion section.

      - How can the authors exclude off-target effects in their RNAi depletion experiments? Can kinesin-13, katanin, and Ataxin phenotypes be rescued for instance? 

      For ataxin-2 phenotypes, two completely independent controls for off-target effects are shown: GFP(RNAi) on a strain with an endogenous ATX-2::GFP tag vs GFP(RNAi) on a strain with no tag on ATX-2, and ATX-2::AID with or without auxin. For kinesin-13 and katanin, we did not do a rigorous control for off-target effects of RNAi. However, the effects of these depletions on cytoplasmic microtubules have been previously reported by others.

      - How are the authors able to determine if the paternal genome was actually captured by the spindle? Does lack of movement definitively suggest capture without using a spindle marker? 

      mKate::tubulin labels the spindle in each capture event. This can be seen in Video S3 for mei-1(RNAi) and Figure 9 for atx-2 klp-7 double depletions.

      (1) Major issues: 

      The images provided are not convincing that mitochondria are entirely excluded from the regions with yolk granules. Please provide insets of magnified images of the paternal mitochondria in Figure 1E to more clearly show the exclusion even when paternal mitochondria are streaming. Providing grayscale images, individual z-sections and/or some quantification of this data might also be more convincing to this reviewer.

      We have modified Fig. 1 by adding single wavelength magnified insets to more clearly show that paternal mitochondria are in a “black hole” in the maternal yolk granules during  cytoplasmic streaming.

      Figure 2 -This figure can be retitled to highlight that the paternal organelle cloud is impermeable to mitochondria and conserved. 

      The legend has been re-titled as suggested.

      Figure 3B - An image of the DNA within the ring of maternal ER would strengthen the authors' claims, especially since the maternal ER ring is used as a proxy for the paternal chromosomes in later figures.

      We have added a panel showing DAPI-stained DNA in the center of the ER ring and paternal mitochondria cloud. 

      Why is the faster time scale imaging significant? I think this could be more clearly set up in the paper. Perhaps rapid imaging of maternal mito-labeled kca-1(RNAi) embryos would better show the difference in time scale, with the expectation that the paternal cloud forms and persists while the ER invades. 

      We are not sure what the reviewer means.  5 sec time intervals were used throughout the paper.  We are also not sure how kca-1(RNAi) would help.  Movement of the entire oocyte into and out of the spermatheca is what limits the ability to keep a fusing sperm in focus.  kca-1(RNAi) would prevent cytoplasmic streaming but not ovulation movements.

      Figure 4 - The question about the permeability of the ER envelope seems to come out of nowhere as written. It isn't clear how it contributes to the larger story about preventing sperm incorporation in the spindle.

      This section of the results is introduced with: “If the maternal ER envelope around sperm DNA was sealed and impermeable during meiosis, this could both prevent the sperm DNA from inducing ectopic spindle assembly and prevent the sperm DNA from interacting with meiotic spindle microtubules.” 

      The data in Figure 4 would probably not be expected to be in this paper based on the paper title. Maybe the title needs something about ER dynamics, e.g. "ATX-2 but not an ER envelope isolates the paternal chromatin"?

      In Figure 5, it seems that RNAi of klp-7 and mei-1 had slightly different effects on short-axis displacement of the ER envelope (klp-7 affecting it more dramatically than mei-1) and slightly different effects on interaction with the meiotic spindle (capture vs streaming past the spindle). The authors mention in their discussion that the difference in the interaction with the meiotic spindle might reflect the effects that loss of MEI-1 may have on the spindle but could it also be a consequence of the differences in cytoplasmic streaming observed?

      With our current data, the only statistically significant difference between cytoplasmic streaming of the sperm contents in mei-1(RNAi) vs klp-7(RNAi) is that excessive streaming persists longer into metaphase II in klp-7(RNAi).  We have added a sentence describing this difference to the results.  If differences in streaming were the cause of different capture frequencies, then klp-7(RNAi) would cause more capture events than mei-1(RNAi) but the opposite was observed.  We have avoided too much discussion here because the frequency of capture events is too low to demonstrate statistically significant differences between mei-1(RNAi), klp-7(RNAi), and atx-2(degron) + klp-7(RNAi) without a very large increase in the number of time-lapse sequences.  

      Also, the authors should find a way to represent this interaction with the meiotic spindle in a quantitative or table form to allow the reader to observe some of the patterns they report more easily.

      We have added a table to Fig. 9 that summarizes capture data.

      Finally, can the authors report when they observe the closest association with the meiotic spindle: Does it correlate with the period of greatest displacement (AI) or are they unlinked? 

      The low frequency of capture events makes it difficult to test this rigorously.

      Figure 6- 'Endogenously tagged ATX-2 was observed throughout oocytes and meiotic embryos without partial co-localization with ER.' How can the authors exclude co-localization with ER? 

      We have changed the wording to: “Endogenously tagged ATX-2 was observed throughout oocytes and meiotic embryos (Fig. 6A; Fig. S2). ATX-2 did not uniquely co-localize with ER (Fig. S2).”

      The rationale for why the authors think that the integrity of sperm organelles is important to keep the genomes apart is not clear to this reviewer and needs to be explained better. Moving the discussion of the displacement experiments in Figure S3 from the end of the results section to the ATX-2 knockdown section would help accomplish this. 

      We have added the sentence: “The frequency of sperm capture by the meiotic spindle (Fig. 9D) was significantly higher than wild-type controls in klp-7(RNAi) atx-2(AID) double depleted embryos (p=0.011 Fisher’s exact test).   Although the number of single mutant embryos analyzed was too low to demonstrate a significant difference between single and double mutant embryos,  these results qualitatively support the hypothesis that limiting cytoplasmic streaming and maintaining the integrity of the ball of paternal mitochondria are both important for preventing capture events between the meiotic spindle and sperm DNA.”

      It looks like, in the double knockdown of ATX-2 and KLP-7, the spread of paternal mitochondria is less affected than when only ATX-2 is depleted. What effect does this result have on the observation that the incidence of sperm capture appears to increase in the double depletion? What does displacement of the ER ring look like in the double depletion? Is it additive, consistent with their interpretation that both limiting cytoplasmic streaming and maintaining the integrity of the ball of paternal mitochondria is required to keep the genomes separate? 

      We cannot show a significant difference between single and double knockdowns without substantially increasing n. We did not analyze ER ring displacement in the double mutant.

      Is the increased incidence of capture in the double-depleted embryos significant? 

      We have added the sentence: “The frequency of sperm capture by the meiotic spindle (Fig. 9D) was significantly higher than wild-type controls in klp-7(RNAi) atx-2(AID) double depleted embryos (p=0.011 Fisher’s exact test).   Although the number of single mutant embryos analyzed was too low to demonstrate a significant difference between single and double mutant embryos,  these results qualitatively support the hypothesis that limiting cytoplasmic streaming and maintaining the integrity of the ball of paternal mitochondria are both important for preventing capture events between the meiotic spindle and sperm DNA.”
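
      As a minimal illustration of why counts of this size are hard to separate statistically, the sketch below runs a two-sided Fisher's exact test on the 5/24 versus 1/25 capture counts quoted in Reviewer #1's summary. It is not the authors' analysis: the p=0.011 reported above compares double-depleted embryos with wild-type controls, whose counts are not given here, and the function name `fisher_exact_two_sided` is an illustrative assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups; columns are (captured, not captured).
    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Probability that k of the col1 capture events fall in row 1.
        return comb(row1, k) * comb(row2, col1 - k) / total

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small tolerance so ties with the observed table are included.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Counts quoted in the reviewer summary: 5 captures of 24 vs 1 capture of 25.
p_value = fisher_exact_two_sided(5, 19, 1, 24)
print(f"two-sided p = {p_value:.3f}")
```

      With these counts the test gives p of roughly 0.10, above the conventional 0.05 threshold, which is consistent with the authors' statement that many more time-lapse sequences would be needed to distinguish groups with such low capture frequencies.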

      What do the authors make of the cell cycle arrest observed when paternal chromosomes are captured? Is there an argument to be made that this arrest supports the idea that preventing this capture is actively regulated and therefore functionally important? 

      We chose not to discuss the mechanism of this arrest because considerably more work would be required to prove that it is not caused by a combination of imaging conditions and genotype.  The low frequency of these capture + arrest events would make it very difficult to show that the arrest does not occur after depleting a checkpoint protein.

      (2) Minor concerns: 

      Top of page 4: "streaming because depletion tubulin stops cytoplasmic streaming (7)" should be "streaming because depletion of tubulin stops cytoplasmic streaming (7)" 

      The “of” has been inserted.

      Page 6: "This result indicated that the volume of paternal mitochondria excludes maternal mitochondria and yolk granules but not maternal ER." The authors have only shown this for maternal mitochondria, not yolk granules. 

      We have deleted the mention of yolk granules here.

      Page 7: "These results suggest that all maternal membranes are initially excluded from the sperm at fusion." Should be "These results show that maternal ER are initially excluded from the sperm at fusion. Since maternal mitochondria and yolk granules are excluded later, this suggests that all maternal membranes are initially excluded from the sperm at fusion." 

      We have changed this sentence as suggested.

      It's not clear why the authors show other types of movement that might be quantified when cytoplasmic streaming is affected in Figure 5A and only quantify long-axis and short-axis displacement. 

      We have deleted the other types of movement from the schematic.  Although these parameters were quantified, we did not include this data in the results so it would be confusing for the reader to have them in the schematic.

      Bottom of page 7: Mention that the GFP::BAF-1 was maternally provided. 

      We have added “Maternally provided..”

      Missing an Arrow on Figure 1A 9:20. 

      We removed the text citation to an arrow in Fig. 1A because we moved most of the description of the ER ring to Fig. 3 to address other reviewer suggestions.

      Supplemental videos should be labeled appropriately to indicate what structures are labeled. It is currently difficult to understand what is being shown. 

      (3) Issues with the Discussion section: 

      "The simplest explanation is that cytoplasm does not mix during the 45 min from GVBD to pronucleus formation due to the high viscosity of cytoplasm." - Citation page 12. 

      We have changed the sentence to: “The simplest hypothesis is that maternal and paternal cytoplasm might not mix during the 45 min from GVBD to pronucleus formation due to the high viscosity of cytoplasm.” 

      "The higher frequency of capture of the sperm DNA by the meiotic spindle in ATX-2 KLP-7 double depleted embryos compared with either single depletion suggests that the integrity of the exclusion zone around the sperm DNA may insulate the sperm DNA from spindle microtubule" - Pages 12-13 reference the figures. 

      This sentence has been rewritten in response to other comments but the new sentence now references revised Fig. 9.

      "ATX-2 is required to maintain the integrity of the ball of paternal mitochondria around the sperm DNA, but the mechanism is unknown." - Page 13 reference figure. 

      A reference to Figs 7 and 8 has been inserted.

      " In control embryos, the sperm contents rarely came near the meiotic spindle in agreement with a previous study that found that male and female pronuclei rarely form next to each other (6). Streaming of the sperm contents was most commonly restricted to a jostling motion with little net displacement, circular streaming in the short axis of the embryo, or long axis streaming in which the sperm turned away from the spindle before the halfway point of the embryo. Depletion of MEI-1 or KLP-7 resulted in longer excursions of the sperm contents in the long axis of the embryo toward the spindle but frequent capture of the sperm by the spindle was only observed in mei-1(RNAi)." - Page 13, the corresponding figures need to be referenced for these sentences. 

      We have inserted figure references.

      "In capture events observed after double depletion of ATX-2 and KLP-7, a bundle of microtubules was discernible extending from the spindle into the ER envelope surrounding the sperm DNA. Such bundles were not observed in mei-1(RNAi) capture events, likely because of the previously reported low density of microtubules in mei-1(RNAi) spindles (36, 37)." - Pages 13-14 references figures here. 

      We have inserted figure references.

      "The higher frequency of capture of the sperm DNA by the meiotic spindle in ATX-2 KLP-7 double depleted embryos compared with either single depletion suggests that the integrity of the exclusion zone around the sperm DNA may insulate the sperm DNA from spindle microtubules." - This should be toned down since this phenotype is not robust. 

      We have changed this to: “The capture of the sperm DNA by the meiotic spindle in ATX-2 KLP-7 double depleted embryos suggests that the integrity of the exclusion zone around the sperm DNA might insulate the sperm DNA from spindle microtubules. However, a much larger number of klp-7(RNAi) singly depleted and atx-2(degron) singly depleted time-lapse sequences are needed to rigorously support this idea.”

      ATX-2 depletion alters ER morphology but does not impact the maternal ER envelope - could the authors provide a potential explanation for this? 

      In the discussion, we cite papers showing that ATX-2 depletion affects many different cellular processes so the effect we see on paternal mitochondria might have nothing to do with the ER ring.   We have been attempting to disrupt ER structures in the meiotic embryo for the last 5 years by depleting profilin, BiP, atlastin, ATX-2 and by optogenetically packing ER into a ball in the middle of the oocyte.  None of these treatments prevent envelopment of the sperm DNA by maternal ER.  None of these treatments remove ER from the spindle envelope and none remove ER from the plasma membrane.  These treatments mostly result in “large aggregates” of ER that we have not examined by EM.  Wild speculation: any disruption of the ER strong enough to prevent ER envelopment around chromatin would be sterile because the M to S transition in the mitotic zone of the germline would be blocked.  Rapid depletion of ATX-2 to the extent shown by rigorous data in this manuscript does not prevent ER envelopment around chromatin.  We chose not to speculate about the reasons for this because we do not know why.

      It would be good to have representative images of what the altered spindle looks like in MEI-1-depleted oocytes. 

      The structure of MEI-1-depleted spindles has been described in the cited references.

      "Depletion of MEI-1 or KLP-7 resulted in longer excursions of the sperm contents in the long axis of the embryo toward the spindle but frequent capture of the sperm by the spindle was only observed in mei-1(RNAi)" - It is intriguing that this does not happen in the double depletion experiments of kinesin-13 and ATX-2. The authors should perhaps discuss this. 

      This does happen in KLP-7 ATX-2 double depleted embryos as shown in Fig. 9.

      (4) Missing citations: 

      "This analysis was restricted to embryos from anaphase I through anaphase II because our streaming data and that of Kimura 2020 indicate that the sperm contents have not moved significantly before anaphase I." - This needs an appropriate citation. Page 10. 

      We have inserted citations here.

      " The simplest explanation is that cytoplasm does not mix during the 45 min from GVBD to pronucleus formation due to the high viscosity of cytoplasm." - Citation page 12. Not referencing figures in the discussion. 

      We have changed the sentence to: “The simplest hypothesis is that maternal and paternal cytoplasm might not mix during the 45 min from GVBD to pronucleus formation due to the high viscosity of cytoplasm.” 

      "The higher frequency of capture of the sperm DNA by the meiotic spindle in ATX-2 KLP-7 double depleted embryos compared with either single depletion suggests that the integrity of the exclusion zone around the sperm DNA may insulate the sperm DNA from spindle microtubule" - Pages 12-13 reference the figures. 

      A reference to the revised Fig. 9 has been inserted in the revised version of this sentence.

      "ATX-2 is required to maintain the integrity of the ball of paternal mitochondria around the sperm DNA, but the mechanism is unknown." 

      References to Figs. 7 and 8 have been inserted.

      Page 13 reference figure 

      " In control embryos, the sperm contents rarely came near the meiotic spindle in agreement with a previous study that found that male and female pronuclei rarely form next to each other (6). Streaming of the sperm contents was most commonly restricted to a jostling motion with little net displacement, circular streaming in the short axis of the embryo, or long axis streaming in which the sperm turned away from the spindle before the halfway point of the embryo. Depletion of MEI-1 or KLP-7 resulted in longer excursions of the sperm contents in the long axis of the embryo toward the spindle but frequent capture of the sperm by the spindle was only observed in mei-1(RNAi)." Page 13, the corresponding figures need to be referenced for these sentences. 

      We have inserted citations here.

      "In capture events observed after double depletion of ATX-2 and KLP-7, a bundle of microtubules was discernible extending from the spindle into the ER envelope surrounding the sperm DNA. Such bundles were not observed in mei-1(RNAi) capture events, likely because of the previously reported low density of microtubules in mei-1(RNAi) spindles (36, 37)." Pages 13-14 references figures here. 

      We have inserted citations here.

      (5) Referencing wrong figures in the text: 

      Figure 5 - In the figure legend there is a 5C but there is no 5C panel in the figure. 

      A C has been inserted in Fig. 5.

      Figure 6A - "Dark holes were observed suggesting exclusion from the lumens of larger membranous organelles (Fig. 6A; Fig. S2)." Page 10. 

      6A has been changed to 6C.

      Figure 6A is showing background autofluorescence in WT oocytes so I am not certain why it is cited here. 

      The Figure citation has been corrected to 6B, C.

      Figure 8 - I could not find the supplemental data file with the individual mitochondria distance measurements. 

      We are including the Excel file with the revised submission.

      The last sentence of the first paragraph should be re-worded to be more concise ". In C. elegans, the nucleus is positioned away from the site of future fertilization so that the meiosis I spindle assembles at the opposite end of the ellipsoid zygote from the site of fertilization (2-4). " 

      Every word of this sentence is important.

      Last sentence second paragraph typo "These microtubules are thought to drive meiotic cytoplasmic streaming because depletion tubulin stops cytoplasmic streaming (7) and depletion of the microtubule-severing protein katanin by RNAi results in an increased mass of cortical microtubules and an increase in cytoplasmic streaming (8)." Pages 3-4. 

      “of” has been inserted.

      (6) Typos in the introduction should be corrected: 

      Ataxin or kinesin-13 are not mentioned in the introduction but these are a big focus of the paper. 

      Gong et al 2024 written instead of number citation (page 5), no citation in References.

      This has been corrected. 

      Supplemental videos should be labeled appropriately to indicate what structures are labeled. It is currently difficult to understand what is being shown.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #2 (Public Review):

      Summary:

      The authors used four datasets spanning 30 countries to examine funding success and research quality score for various disciplines. They examined whether funding or research quality score were influenced by majority gender of the discipline and whether these affected men, women, or both within each discipline. They found that disciplines dominated by women have lower funding success and research quality score than disciplines dominated by men. These findings are surprising because even the men in women-dominated fields experienced lower funding success and research quality score.

      Strengths:

      - The authors utilized a comprehensive dataset covering 30 countries to explore the influence of the majority gender in academic disciplines on funding success and research quality scores.

      - Findings suggest a systemic issue where disciplines with a higher proportion of women have lower evaluations and funding success for all researchers, regardless of gender.

      - The manuscript is notable for its large sample size and the diverse international scope, enhancing the generalizability of the results.

      - The work accounts for various factors including age, number of research outputs, and bibliometric measures, strengthening the validity of the findings.

      - The manuscript raises important questions about unconscious bias in research evaluation and funding decisions, as evidenced by lower scores in women-dominated fields even for researchers that are men.

      - The study provides a nuanced view of gender bias, showing that it is not limited to individuals but extends to entire disciplines, impacting the funding and the perceived quality or worth of research.

      - This work underscores the need to explore motivations behind gender distribution across fields, hinting at deep-rooted societal and institutional barriers.

      - The authors have opened a discussion on potential solutions to counter bias, like adjusting funding paylines or anonymizing applications, or other practical solutions.

      - While pointing out limitations such as the absence of data from major research-producing countries, the manuscript paves the way for future studies to examine whether its findings are universally applicable.

      Weaknesses:

      - The study does not provide data on the gender of grant reviewers or stakeholders, which could be critical for understanding potential unconscious bias in funding decisions. These data are likely not available; however, this could be discussed. Are grant reviewers in fields dominated by women more likely to be women?

      - There could be more exploration into whether the research quality score is influenced by inherent biases towards disciplines themselves, rather than only being gender bias.

      - The manuscript should discuss how non-binary gender identities were addressed in the research. There is an opportunity to understand the impact on this group.

      - A significant limitation is absence of data from other major research-producing countries like China and the United States, raising questions about the generalizability of the findings. How comparable are the findings observed to these other countries?

      - The motivations and barriers that drive gender distribution in various fields could be expanded on. Are fields striving to reach gender parity through hiring or other mechanisms?

      - The authors could consider if the size of funding awards correlates with research scores, potentially overlooking a significant factor in the evaluation of research quality. Presumably there is less data on smaller 'pilot' funds and startup funds for disciplines where these are more common. Would funding success follow the same trend for these types of funds?

      - The language used in the manuscript at times may perpetuate bias, particularly when discussing "lower quality disciplines," which could influence the reader's perception of certain fields.

      - The manuscript does not clarify how many gender identities were represented in the datasets or how gender identity was determined, potentially conflating gender identity with biological sex.

      Reviewer #3 (Public Review):

      This study seeks to investigate one aspect of disparity in academia: how gender balance in a discipline is valued in terms of evaluated research quality score and funding success. This is important in understanding disparities within academia.

      This study uses publicly available data to investigate covariation between gender balance in an academic discipline and:

      i) Individual research quality scores of New Zealand academics as evaluated by one of 14 broader subject panels.

      ii) Funding success in Australia, Canada, Europe, UK.

      The study would benefit from further discussion of its limitations, and from the clarification of some technical points (as described in the recommendations for the authors).

      Recommendations For The Authors:

      Reviewer #2 (Recommendations For The Authors):

      This is a very nice study as-is. In the following comments, I have mainly put my thoughts as I was reading the manuscript. If there are practical ways to answer my questions, I think they could improve the manuscript but the data required for this may not be available.

      Are there any data on the gender of grant reviewers or stakeholders who make funding decisions?

      The research quality score metrics seem to be more related to unconscious bias. The funding metrics may also, but there are potentially simple fixes (higher paylines for women or remove gender identities from applications).

      We have included some details about PBRF funding panel gender diversity. These panels are usually more gender balanced than the field they represent, but in the extreme cases (Engineering, Education, Mathematics) they are skewed as would be expected. Panel data for other award decision makers were not available.

      I wonder if the research score metric isn't necessarily reflecting on the gender bias in the discipline but rather on the discipline itself? Terms like "hard science" and "soft science" are frequently used and may perpetuate these biases. This is somewhat supported by the data - on line 402-403 the authors state that women in male-dominated fields like Physics have the same expected score as a man. Could it be that Physics has a higher score than Education even if Physics was woman-dominated and Education was man-dominated? Are there any instances in the data where traditionally male- or female-dominated disciplines are outliers and happen to be the opposite? If so, in those cases, do the findings hold up?

      Overall we would love to answer this question! But our data is not enough. We mention these points in the Discussion (Lines 472-466). We have extended this a little to cover the questions raised here.

      How are those with non-binary gender identities handled in this article? If there is any data on the subject, I would be curious to know how this affects research score and funding success.

      These data were either unavailable or the sample size was too small to be considered anonymously (Mentioned on Lines 74-76).

      A limitation of the present article is a lack of data on major research-producing countries like China and the United States. Is there any data relevant to these or other countries? Is there reason to believe the findings outlined in this manuscript would apply or not apply to those countries also?

      We would be very excited to see if the findings held up in other countries, particularly any that were less European based. Unfortunately we could not find any data to include. Maybe one day!

      What are the motivations or other factors driving men to certain fields and women to certain fields over others? What are the active barriers preventing all fields from 50% gender parity?

      Field choice is a highly studied area and the explanations are myriad; we have included a few references in the discussion section on job choice. I usually recommend my students read the blog post at

      https://www.scientificamerican.com/blog/hot-planet/the-people-who-could-have-done-science-didnt/

      It is very thoughtful but unfortunately not appropriate to reference here.

      The authors find very interesting data on funding rates. Have you considered funding rates and the size of funding awards as a factor in research score? Some disciplines like biomedical science receive larger grants than others like education.

      A very interesting thought for our next piece of work. We would definitely like to explore our hypothesis further.

      There are instances where the authors' writing may perpetuate bias. If possible these should be avoided. One example is on line 458-459 where the authors state "...why these lower quality disciplines are more likely..." This could be re-written to emphasize that some disciplines are "perceived" as lower quality. Certainly those in these disciplines would not characterize their chosen discipline as "low quality".

      Well-spotted! Now corrected as you suggest.

      Similar to the preceding comment, the authors should use care with the term "gender". In the datasets used, how many gender identities were captured? How many gender identity options were given in the surveys or data intake forms? Could individuals in these datasets have been misgendered? Do the data truly represent gender identity or biological sex?

      We know that in the PBRF dataset gender was a binary choice and transgender individuals were able to choose which group they identified with. There was no non-binary option (in its defence, the latest dataset is from 2018 and NZ has only recently started updating official forms to be more inclusive) and individuals with gender not stated (a very small number) were excluded. ARC did mention that a small number of individuals were either non-binary or gender not stated; again, these are not included here for reasons of anonymity. This is now mentioned on Lines 74-76. The effects on this group are important and understudied, likely because, as here, the numbers are too small to be included meaningfully.

      Reviewer #3 (Recommendations For The Authors):

      Major revisions:

      Could you add line numbers to the Supplementary Materials for the next submission?

      Yes! Sorry for the omission.

      (1) In the main text L146 and Figure 1, it is not clear why the expected model output line is for a 50 year old male from University of Canterbury only, but the data points are from disciplines in all eight universities in New Zealand. I think it would be more clear and informative to report the trend lines that represent the data points. At the moment it is hard to visualise how the results apply to other age groups or universities.

      As age and institution are linear variables with no interactions, they are only a constant adjustment above or below this line, and the adjustment is small in comparison to the linear trend. Unfortunately, including them graphically does not aid understanding. We agree that including raw data with an adjusted trend line can be confusing, but after a lot of between-author discussion this was the most informative compromise we could find (many people like raw data so we included it).

      (2) Does your logistic regression model consider sample size weighting in pmen? Weighting according to sample sizes needs to be considered in your model. At the moment it is unclear and suggests a proportion between 0 and 1 only is used, with no weighting according to sample size. If using R, you can use glm(cbind(nFem, nMalFem).

      Yes. All data points were weighted by group size exactly as you suggest. We have updated the text on Line 317 to make this clear.
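
      To illustrate what weighting by group size means in a logistic regression, the following is a minimal sketch and not the authors' Matlab analysis. It assumes a single predictor (the proportion of men in a discipline) and grouped counts of successes out of totals; the function name `fit_grouped_logistic` and all numbers are hypothetical. Each group contributes to the likelihood in proportion to its size, so fitting the grouped counts gives the same coefficients as fitting one record per individual.

```python
import math


def fit_grouped_logistic(x, successes, totals, iters=25):
    """Fit logit(p) = b0 + b1 * x to grouped binomial counts.

    Uses Newton-Raphson on the binomial log-likelihood, so a group of
    size n carries n times the weight of a single observation.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # Fisher information (negative Hessian)
        for xi, si, ni in zip(x, successes, totals):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            resid = si - ni * p          # observed minus expected successes
            w = ni * p * (1.0 - p)       # weight scales with group size
            g0 += resid
            g1 += resid * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det  # Newton step: solve H d = g
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1


# Hypothetical example: four disciplines with very different sample sizes.
p_men = [0.2, 0.4, 0.6, 0.8]      # proportion of men in each discipline
funded = [8, 40, 25, 6]           # successful applications
applied = [50, 200, 100, 20]      # total applications

intercept, slope = fit_grouped_logistic(p_men, funded, applied)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

      In R, the reviewer's suggestion corresponds to passing a two-column response of successes and failures to glm with a binomial family, which weights groups in the same way.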

      (3) For PBRF, I think it is useful to outline the 14 assessment panels and the disciplines they consider. Did you include the assessment panel as an explanatory variable in your model too to investigate whether quality is assessed in the same manner between panels? If not, then suggest reasons for not doing so.

      We have now included more detail in main text on the gender split of the panels. They were not included as an explanatory variable. In theory there was some cross-referencing of panel scores to ensure consistency as part of the PBRF quality assurance guidelines.

      (4) There are several limitations which should be discussed more openly:

      Patterns only represent the countries studied, not necessarily academia worldwide.

      Mentioned on Lines 485-487.

      Gender is described as a binary variable.

      Discussed on Lines 74-76.

      The measure of research evaluation as a reflection of academic merit.

      This is acknowledged in the data limitations paragraph at the end of the Discussion.

      Minor revisions:

      (1) L186. Why do you analyse bibliometric differences between individuals from University of Canterbury only? It would be helpful to outline your reasons.

      Although bibliometric data is publicly available it is difficult to collect for a large number of individuals. You also need some private data to match bibliometrics with PBRF data which is anonymous. We were only able to do this for our own institution with considerable internal support.

      (2) How many data records did you have to exclude in L191 because they could not be linked? This is helpful to know how efficient the process was, should anyone else like to conduct similar studies.

      We matched over 80% of available records (384 individuals). We have mentioned this on Line 194.

      (3) Check grammar in the sentence beginning in L202.

      Thank you. Corrected.

      (4) Please provide a sample size gender breakdown for "University of Canterbury (UC) bibliometric data", as you do for the preceding section. A table format is helpful.

      Included on Line 194.

      (5) L377 I think this sentence needs revision.

      Thank you, we have reworked that paragraph.

      (6) L389-392 Is it possible evaluation panels can score women worse than men and that because more women are present in female-biassed disciplines, the research score in these are worse? Women scoring worse between fields, may be a result of some scaling to the mean score.

      No.  This is not possible because women in male-dominated fields score higher.

      (7) L393 Could you discuss explanations for why men outperform women in research evaluation scores more when disciplines are female dominated?

      Unfortunately, we don’t have an explanation for this and can’t get one from our data. We hope it will be an interesting topic for future work.

      (8) Could the figures be improved by having the crosses, x and + scaled, for example, in thickness corresponding to sample size? Alternatively, some description of the sample size variation? Sorting the rows by order of pmen in Table E1 would also be helpful for the reader.

      As with the previous figure, we have tried many ways of presenting it (including this one). Unfortunately nothing helped.

      We have provided Table E1 as a spreadsheet to allow readers to do this themselves.

      (9) Please state in your methods section the software used to aid repeatability.

      This is now in Supplementary Materials (Matlab 2022b).

      (10) It is great to report your model findings into real terms for PBRF and ARC. Please can you extend this to CIHR and EIGE. i.e. describing how a gender skew increase of x associates with a y increase in funding success chance.

      We have added similar explanations for both these datasets comparing the advantage of being male with the advantage of working in a male dominated discipline.

      (11) I would apply care to using pronouns "his" and "her" in L322-L324 and avoid if at all possible, instead, replacing them with "men" and "women".

      We have updated the text to avoid these pronouns in most places.

      The article in general would benefit from a disclosure statement early on conceding that gender investigated here is only as a binary variable, discounting its spectrum.

      See Lines 74-76.

      Please also report how gender balance is defined in the datasets as in the data summary in supplementary materials, within the main text.

      Our definition of gender balance (proportion of researchers who are men) is given on Line 103.

      (12) The data summary Table S1 could benefit from explaining the variables in the first column. It is currently unclear how granularity, size of dataset and quotas/pre-allocation? are defined.

      These lines have been removed as the information they contained is included elsewhere in the table with far better explanations!

      (13) There are only 4 data points for investigating covariation between gender balance and funding success in CIHR. This should be discussed as a limitation.

      The small size of the dataset is now mentioned on Line 348.

      (14) L455 "Research varies widely across disciplines" in terms of what?

      This sentence has been extended.

      (15) L456 Maybe I am missing something but I don't understand the relevance of "Physicists' search for the grand unified theory" to research quality.

      Removed.

      (16) Can you provide more discussion into the results of your bibliographic analysis and Figure 2? An explanation into the relationships seen in the figure at least would be helpful.

      Thank you, we have clarified the relationships seen in each of Figures 2A (Lines 226-235), 2B (Lines 236-252), and 2C (Lines 260-268).

      (17) It would be helpful to include in the discussion a few more sentences outlining:

      - Potential future research that would help disentangle mechanisms behind the trends you find.

      - How this research could be applied. Should there be some effort to standardise?

      We have added a short paragraph to the discussion about implications/applications, and future research (Lines 481-484).

      (18) The introduction could benefit from discussing and explaining their a priori hypotheses for how research from female-biassed disciplines may be evaluated differently.

      While not discussed in the introduction, possible explanations for why and how research in female dominated fields might be evaluated differently are explored in some detail in the Discussion.  We think once is enough, and towards the end is more effective than at the beginning.

      (19) L16 "Our work builds on others' findings that women's work is valued less, regardless of who performs that work." I find this confusing because in your model, there is a significant interaction effect between gender:pmen. This suggests that for female-biassed disciplines, there is even more of a devaluation for women, which I think your lines in figure 1 suggest.

      Correct but men are still affected, so the sentence is correct.  What is confusing is that the finding is counter to what we might expect.

    1. Author response:

      eLife assessment

      This fundamental study provides a near-comprehensive anatomical description and annotation of neurons in a male Drosophila ventral nerve cord, based on large-scale circuit reconstruction from electron microscopy. This connectome resource will be of substantial interest to neuroscientists interested in sensorimotor control, neural development, and analysis of brain connectivity. However, although the evidence is extensive and compelling, the presentation of results in this very large manuscript lacks clarity and concision.

      We thank the reviewers for their detailed and thoughtful feedback and the time that they invested to provide it. Organising this manuscript (which is clearly not a standard research article) was quite challenging as it had to fulfil a number of functions: presenting a guide to the system of annotations and the associated online resources; providing an atlas for the annotated cell types; and showcasing various analyses to illustrate the value of the dataset as well as just a few of the many questions it can be used to address. We gave careful consideration to its structure and attempted to signpost the sections that would be most useful to particular types of readers. Nevertheless we can see that this was not completely successful and we thank the reviewers for their suggestions for improvement.

      We acknowledge that the resulting manuscript was very large and will endeavour to streamline our text in the revision without compromising the accessibility of the data. We do note that there is some precedent for comprehensive and lengthy connectome papers going all the way back to White et al. 1986 which took 340 pages to describe the 302 neurons of the C. elegans connectome. More recently, we can compare the “hemibrain papers” published in eLife: Scheffer et al., 2020, Li et al., 2020, Schlegel et al., 2021, Hulse et al., 2021. These papers would also be difficult to digest at a single sitting but were game-changing for the Drosophila neuroscience field and have already been cited hundreds of times, a testament to their utility. In the same way that these papers provided the first comprehensively proofread and annotated EM connectome for (a large part of) the adult fly brain, our work now provides the first fully proofread and annotated EM connectome for the nerve cord. Given the pioneering nature of this dataset we feel that the lengthy but highly structured atlas sections of the paper are justified and will prove impactful in the long term.

      Whilst no EM dataset is perfect, we have endeavoured to make this one as comprehensive as possible. We found 74.4 million postsynapses and 15,765 neurons of VNC origin, all of which have been carefully proofread, reviewed, annotated and typed. For comparison, the female adult nerve cord dataset (FANC, Azevedo et al., Nature, 2024) contains roughly 45 million synapses and 14,600 neuronal cell bodies of which at the time of writing 5576 have received preliminary proofreading and 222 high quality proofreading. We emphasise that these are highly complementary datasets, given the difference in sex and the fact that each dataset has different artefacts (MANC has poorer preservation of neurons in the leg nerves; FANC is missing part of the abdominal ganglion and has lower synapse recovery). We reconstructed 5484 sensory neurons from the thoracic nerves, 84% of the ~6500 estimated from FANC. The overall recovery rate was ~86.5% if we include the ~1100 sensory neurons from abdominal nerves, which were in excellent condition.

      Reviewer #1 (Public Review):

      Summary:

      The authors present a close to complete annotation of the male Drosophila ventral nerve cord, a critical part of the fly's central nervous system.

      Strengths:

      The manuscript describes an enormous amount of work that takes the first steps towards presenting and comprehending the complexity and organization of the ventral nerve cord. The analysis is thorough and complete. It also makes the effort to connect this EM-centric view of the nervous system to more classical analyses, such as the previously defined hemilineages, that also describe the organization of the fly nervous system. There are many, many insights that come from this work that will be valuable to the field for the foreseeable future.

      We thank the reviewer for acknowledging the enormous collaborative effort represented by this manuscript. We tried to synthesise decades of light-level work by neuroscientists and developmental biologists working in Drosophila and other insects in order to create a standard, systematic nomenclature for >22,000 neurons, most of which had not been typed at light level. We hope that the MANC dataset and this guide to its contents will prove to be useful resources to Drosophila neurobiologists and the wider neuroscience field.

      Weaknesses:

      With more than 60 primary figures, the paper is overwhelming and cannot be read and digested in a single sitting. The result is more like a detailed resource rather than a typical research paper.

      In writing this paper, we had two aims: first, to describe and validate our extensive biological annotation of the connectome and second, to provide interesting illustrative examples of the many analyses that could be carried out on this dataset using the atlas we generated. The resulting paper is intended primarily as a detailed reference rather than a typical research paper. At the end of the Introduction, we outline the structure of the paper and explicitly direct non-specialist readers to focus on the initial and concluding sections for orientation to the dataset so that they would not get bogged down in the details. We will review our section organisation and headings to try to make the paper more straightforward to navigate, and we will add specific figure numbers to the outline.

      Reviewer #2 (Public Review):

      Summary and strengths:

      This massive paper describes the identity and connectivity of neurons reconstructed from a volumetric EM image volume of the ventral nerve cord (VNC) of a male fruit fly. The segmentation of the EM data was described in one companion paper; the classification of the neurons entering the VNC from the brain (descending neurons or DNs) and the motor neurons leaving the VNC was described in a second companion paper. Here, the authors describe a system for annotating the remaining neurons in the VNC, which include intrinsic neurons, ascending neurons, and sensory neurons, representing the vast majority of neurons in the dataset. Another fundamental contribution of this paper is the identification of the developmental origins (hemilineage) of each intrinsic neuron in the VNC. These comprehensive hemilineage annotations can be used to understand the relationship between development and circuit structure, provide insight into neurotransmitter identity, and facilitate comparisons across insect species. Many sensory neurons are also annotated by comparison to past literature. Overall, defining and applying this annotation system provides the field with a standard nomenclature and resource for future studies of VNC anatomy, connectivity, and development. This is a monumental effort that will fundamentally transform the field of Drosophila neuroscience and provide a roadmap for similar connectomic studies in other organisms.

      We thank the reviewer for acknowledging the enormous collaborative effort represented by this manuscript. We tried to synthesise decades of light-level work by neuroscientists and developmental biologists working in Drosophila and other insects in order to create a standard, systematic nomenclature for >22,000 neurons, most of which had not been typed at light level. We hope that the MANC dataset and this guide to its contents will prove to be useful resources to Drosophila neurobiologists and the wider neuroscience field.

      Weaknesses:

      Despite the significant merit of these contributions, the manuscript is challenging to read and comprehend. In some places, it seems to be attempting to comprehensively document everything the authors found in this immense dataset. In other places, there are gaps in scholarship and analysis. As it is currently constructed, I worry that the manuscript will intimidate general readers looking for an entry point to the system, and ostracize specialized readers who are unable to use the paper as a comprehensive reference due to its confusing organization.

      In writing this paper, we had two aims: first, to describe and validate our extensive biological annotation of the connectome and second, to provide interesting illustrative examples of the many analyses that could be carried out on this dataset using the atlas we generated. The resulting paper is intended primarily as a detailed reference rather than a typical research paper. At the end of the Introduction, we outline the structure of the paper and explicitly direct non-specialist readers to focus on the initial and concluding sections for orientation to the dataset so that they would not get bogged down in the details. We will review our section organisation and headings to try to make the paper more straightforward to navigate, and we will add specific figure numbers to the outline.

      The bulk of the 559 pages of the submitted paper is taken up by a set of dashboard figures for each of ~40 hemilineages. Formatting the paper as an eLife publication will certainly help condense these supplemental figures into a more manageable format, but 68 primary figures will remain, and many of these also lack quality and clarity. Without articulating a clear function for each plot, it is hard to know what the authors missed or chose not to show. As an example, many of the axis labels indicate the hemilineage of a group of neurons, but are ordered haphazardly and so small as to be illegible; if the hemilineage name is too small, and in a bespoke order for that data, then is the reader meant to ignore the specific hemilineage labels?

      We will contact eLife professional editing staff to determine whether the paper can be streamlined by moving more material to supplemental without making it difficult to locate the detailed catalogues of neurons that will be of interest to specialist readers. Based on the typical eLife format, we suspect that retaining the dashboard main figures for each hemilineage will be necessary to maintain its utility as a reference. We will, however, shorten the associated main text by, for example, moving background material used to assign the hemilineages to the Methods section and moving specific results to the figure legends where possible.

      We articulated the function for each plot as follows: "Below we describe in more depth every hemilineage that produces more than one or two secondary neurons. For each of these 35 hemilineages, we show (A) the overall morphology of the secondary population, (B) representative individual neurons (as estimated by highest average NBLAST score to other members of the hemilineage), and (C) specific notable examples (which in some cases are primary). We then report (D) the locations of their connectors (postsynapses and presynapses), (E) their upstream and downstream partners by class, and (F) their upstream and downstream partners by finer subdivisions corresponding to their systematic types (secondary hemilineage, target, or sensory modality). We also provide supplementary figures showing the morphology and normalised up- and downstream connectivity of all systematic types for each hemilineage."

      We have plotted every secondary neuron in each hemilineage, every predicted synapse for those neurons with confidence >0.5, every connection to partner neurons by class (no threshold applied), and then the same information organised by hemilineage in a heatmap (and including partners from all birthtimes and partners of unknown hemilineage). Then the supplementary figures show all connectivity, organised in the same way, for every individual cell type assigned to the hemilineage, including both primary and early secondary neurons. We will add more detail to the figure legends to clarify these points.

      We apologise that you were unable to read some of the axis labels in the review copy of the manuscript; we did submit high resolution versions of the figures as a supplemental file, but perhaps this did not reach you; they can also be found at https://www.biorxiv.org/content/10.1101/2023.06.05.543407v2.supplementary-material. The hemilineages are in a conserved (alphanumerical) order for all hemilineage-specific plots and many others. The exceptions arise when neurons are clustered based on their connectivity to hemilineages, in which case the order of the labels necessarily follows the structure of the resulting clusters.

      The text has similar problems of emphasis. It is often meandering and repetitive. Overlapping information is found in multiple places, which causes the paper to be much longer than it needs to be. For example, the concept of hemilineages is introduced three times before the subtitle "Introduction to hemilineage-based organisation". When cell typing is introduced, it is unclear how this relates to serial motif, hemilineage, etc; "Secondary hemilineages" follow the Cell typing title. Like the overwhelming number of graphical elements, this gives the impression that little attention has been paid to curating and editing the text. It is unclear whether the authors intend for the paper to be read linearly or used as a reference. In addition, descriptions of the naming system are often followed by extensive caveats and exceptions, giving the impression that the system is not airtight and possibly fluid. At many points, the text vacillates between careful consideration of the dataset's limitations and overly grandiose claims. These presentation flaws overshadow the paper's fundamental contribution of describing a reasonable and useful cell-typing system and placing intrinsic neurons within this framework.

      Because we intended this paper to be read primarily as a reference, we tried to make each section stand on its own, which we agree resulted in some redundancy (with more details appearing where relevant). However, we will do our best to tighten the text for the version of record.

      Our description immediately under the Cell typing title includes the use of hemilineage, serial (not serial motif, which was not used), and laterality (left-right homologues) in the procedure to assign cell types. We will change this to “Cell typing of intrinsic, ascending, and efferent neurons” for clarity. The “Secondary hemilineages” title marks the start of a new section that serves as a reference for each of the secondary hemilineages; we will change this to “Secondary hemilineage catalogue” or similar for clarity.

      References to past Drosophila literature are inconsistent and references to work from other insects are generally not included; for example, the extensive past work on leg sensory neurons in locusts, cockroaches, and stick insects. Such omissions are understandable in a situation where brevity is paramount. However, this paper adopts a comprehensive and authoritative tone that gives the reader an impression of completeness that does not hold up under careful scrutiny.

      We did not attempt to review the sensory neuron literature in this manuscript but rather cited those specific papers which included the axon morphology data that informed our modality, peripheral origin, and cell type assignments. Most of these came from the Drosophila literature due to the availability of genetic tools used for sparse labelling of specific populations as well as the greatly increased likelihood of conserved morphology. However we certainly agree that decades of sensory neuron work in larger insects were foundational for this subfield and will add a sentence to this effect in the introduction to our sensory neuron typing.

      The paper accompanies the release of the MANC dataset (EM images, segmentation, annotations) through a web browser-based tool: clio.janelia.org. The paper would be improved by distilling it down to its core elements, and then encouraging readers to explore the dataset through this interactive interface. Streamlining the paper by removing extraneous and incomplete analyses would provide the reader with a conceptual or practical framework on which to base their own queries of the connectome.

      We certainly hope that this paper will encourage readers to explore the MANC dataset. Indeed, as we state in the Discussion, "Moreover, its ultimate utility depends on how widely it is leveraged in the future experimental and computational work of the entire neuroscience community. We have only revealed the tip of the iceberg in this report, with a wealth of opportunities now available in this publicly available dataset for forthcoming connectomic analyses that will feed into testable functional hypotheses." In the first few sections of the Results, we include a visual introduction to annotated features, a glossary of annotation terms, a visual guide to our cell typing nomenclature, and two video tutorials on the use of Clio Neuroglancer to query the dataset. To further encourage exploration, we have also included illustrative examples of just a few of the many analyses that can now be performed with this comprehensive and publicly available dataset.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Major comments:

      (1)  Regarding the cell studies of human pediatric bone-derived osteoblast-like cells (HBO), the authors should provide a rationale for their selection of specific cell lines (15,16, 17, 19, 20, 23, 24) in this study. As for animal studies, could the authors clarify which cell lines were utilized in the murine in vivo experiments?

      We appreciate the opportunity to address this. To reduce confusion, we have numbered the patient primary cell lines used in these studies sequentially from 1 – 7. Additionally, we have added “HBO cell lines used for experiments were selected based on the ability of the primary cell line to proliferate and mineralize in culture” to the Methods section. 

      In vivo experiments: “HBO cell lines 2, 6 and 7 from separate individuals were selected for these experiments based on similar growth and passage characteristics.” This statement is included in the Methods section.

      (2)  In this study, the authors performed the murine in vivo experiments using both male and female mice. Could the author clarify if any difference was observed between male and female mice in the findings? This information would contribute to a more comprehensive understanding of the study.

      We agree and have added the following to the Results section: “There was no sex-based difference in regenerated bone volume.”

      (3)  Although the histological results showed an elevated collagen expression in mice treated with BMP2, JAG1, and JAG1 + DAPT compared to those treated with the cells alone, the differences among groups were subtle. The authors should consider the immunohistochemical (IHC) staining for collagen 1 on the samples, allowing for a quantitative assessment of collagen 1 expression.

      Thank you for this comment. The differences between BMP2, JAG1, and JAG1 + DAPT are indeed subtle. We have added Supplementary Figure 5, showing collagen staining of sections from the same FFPE blocks that were sectioned and stained with Masson Trichrome in Figure 2C. 

      Minor Comments:

      (4)  Please specify which cell lines are represented in the staining results shown in Fig.1A and Fig. 5A, respectively.

      In Fig 1A the representative images are of HBO2. Fig 5A representative images are of HBO7. We have added this information to the figure legends for these figures. 

      (5)  There appears to be a discrepancy in the specified size of the critical defect. The manuscript states that the size is 4mm, while Supplemental Figure 3 indicates 3.5mm.

      Thank you for this catch! Yes, it should be 4mm. This has been corrected in Supplementary Figure 3.

      (6)  The scale bar for Figure 2 C is missing.

      Scale bars have been added which also gave us an opportunity to brighten the images equally, allowing for better distinction between the different colors of the Masson Trichrome staining.

      (7)  In the methodological section 2.5 for JAG1 delivery, it would be helpful if the authors could review the initial dosage of JAG1 delivery to confirm if HBO cells were included or not, given that the MicroCT results indicate that all groups incorporated HBO cells. 

      We appreciate this suggestion. In response to another question, we have added Supplementary Figure 4 which includes an “Empty Defect” condition with no HBO cells, making the original method statement accurate.

      Reviewer #2 (Recommendations For The Authors):

      In the current study, using in vitro and in vivo models the authors clearly show that JAG1 can enhance osteogenesis and thus can be helpful in designing new therapeutic approaches in the field of bone regenerative research. The in vivo mouse CF model is very convincing and shows that JAG1 promotes osteogenesis via non-canonical signaling. Mechanistically it seems that JAG1 activates STAT5, AKT, P38, JNK, NF-ĸB, and p70 S6K. However, additional evidence is needed to convincingly conclude that all the non-canonical pathways activated via JAG1 converge at p70 S6K activation. The following concerns need to be addressed.

      (1) In Fig 1A: Even though the Jag1-Fc shows a very significant increase in HBO mineralization, there are no significant increases in cells in osteogenic media when compared to control growth media. Even though the different conditions were subjected to RNAseq analysis in the later figures, qPCR analysis of some osteogenic genes in Figure 1 might be helpful. 

      We appreciate the opportunity to explore this question further. We conducted mineralization experiments in triplicate and performed qRT-PCR, assessing for gene expression of 5 osteogenic genes: ALPL, BGLAP (osteocalcin), COL1A1, RUNX2, and SP7. Results are shown in Figure 1C, and this text was added to Results: “Additionally, PCR analysis of HBO1 cells from a repeat experiment collected at days 7, 14, and 21 showed significantly increased expression of osteogenic genes with JAG1-bds stimulation (Figure 1C). ALPL was significantly expressed at Day 7, with a 3.5-fold increase (p=0.0004) compared to HBO1 cells grown in growth media. In contrast, significant expression levels of COL1A1 and BGLAP were observed at 14 days, with a 5.1-fold increase (p=0.0021) of COL1A1 and a 12.3-fold increase (p=0.0002) of BGLAP when compared to growth media conditions. Interestingly, while some mineralization is observed in the osteogenic media and Fc-bds (Figure 1A) conditions, there were no significant increases in osteogenic gene expression (Figure 1C). Expression of RUNX2 and SP7 was not significantly altered across all conditions and time points (not shown).”

      (2) In Fig 2: even though not needed in respect to the hypothesis, was there any Control group without any cells or JAG1 beads? What were the changes between that group and the cells-only group?

      We have not observed differences between the “Empty Defect” group and the “Cells alone” group.

      We have addressed the reviewer’s comments by adding this comparison in Supplementary Figure 4.

      (3) Transcriptional profiling and ELISA (Fig 3 and 4) show upregulation of NF-ĸB signaling in response to JAG1. In the discussion, the authors have referenced a previous study showing NF-ĸB as prosurvival in human OB cells. However, based on many published reports, NF-ĸB activation has been shown to inhibit OB function. Does JAG1 regulate HBO cell survival via NF-ĸB activation?

      Experimenting using NF-ĸB inhibitor can be helpful to show that JAG1 mediates NF-ĸB activation is anabolic in this experimental setup.

      We thank the reviewer for this excellent suggestion. We are eager to explore this new direction for our research in a subsequent study. We have added this to our future directions. 

      (4) Fig 5: 

      (A)  Condition showing JAG1+ DAPT is needed to compare between JAG1 canonical and noncanonical signaling. 

      Thank you for pointing this out. We have added Supplementary Figure 6, which includes a dose response experiment for JAG1 + DAPT.

      (B)  S6K18 alone seems to be increasing OB mineralization. Is that statistically significant?  

      No, and we have added the statistical analysis for S6K-18 to Figure 5B.

      (C)  Fc alone condition seems to have a very significant increase in OB mineralization. Does Fc alone upregulate OB function? 

      We do see some upregulation of mineralization with Fc in vitro, which we also observed in our previous studies with mouse neural crest cells, but we have not found it to be osteogenic in vivo. We have added a statement to this effect, with references. Additionally, osteogenic gene expression was not upregulated in our in vitro mineralization experiments with Fc.  See Revised Figure 1.

      (D)  Although overall quantification shows that S6K18 partially inhibits HBO mineralization, the representative images do not represent the quantification. Transcriptional analysis (qPCR) is required to validate these findings.

      We performed qRT-PCR on cells from a repeat mineralization assay, collecting cells at 9, 14, and 21 days. We have added the following to the Results: “While inhibition of NOTCH and p70 S6K decreased mineralization in our mineralization assay, there are no statistically significant changes in gene expression for ALPL, COL1A1, or BGLAP (Supplementary Figure 7). These results suggest that the HBO cells phenotypes are maturing into osteocytes and that inhibiting p70 S6K hinders the cellular ability to mineralize but not the cell phenotype progression.”

      (5) Finally, to convincingly conclude the data from Fig 5, the mouse CF model can be helpful to support the authors' claim that JAG1 acts via p70 S6K.

      Thank you for this feedback. We have modified our conclusions to reflect that p70 S6K is one of the non-canonical pathways that JAG1 may be activating in bone regeneration.

      Thank you very much for your consideration of our revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this paper, proteomics analysis of the plasma of human subjects that underwent an exercise training regime consisting of a combination of endurance and resistance exercise led to the identification of several proteins that were responsive to exercise training. Confirming previous studies, many exercise-responsive secreted proteins were found to be involved in the extra-cellular matrix. The protein CD300LG was singled out as a potential novel exercise biomarker and the subject of numerous follow-up analyses. The levels of CD300LG were correlated with insulin sensitivity. The analysis of various open-source datasets led to the tentative suggestion that CD300LG might be connected with angiogenesis, liver fat, and insulin sensitivity. CD300LG was found to be most highly expressed in subcutaneous adipose tissue and specifically in venular endothelial cells. In a subset of subjects from the UK Biobank, serum CD300LG levels were positively associated with several measures of physical activity - particularly vigorous activity. In addition, serum CD300LG levels were negatively associated with glucose levels and type 2 diabetes. Genetic studies hinted at these associations possibly being causal. Mice carrying alterations in the CD300LG gene displayed impaired glucose tolerance, but no change in fasting glucose and insulin. Whether the production of CD300LG is changed in the mutant mice is unclear.

      Strengths:

      The specific proteomics approach conducted to identify novel proteins impacted by exercise training is new. The authors are resourceful in the exploitation of existing datasets to gain additional information on CD300LG.

      Weaknesses:

      While the analyses of multiple open-source datasets are necessary and useful, they lead to relatively unspecific correlative data that collectively insufficiently advance our knowledge of CD300LG and merely represent the starting point for more detailed investigations. Additional more targeted experiments of CD300LG are necessary to gain a better understanding of the role of CD300LG and the mechanism by which exercise training may influence CD300LG levels. One should also be careful to rely on external data for such delicate experiments as mouse phenotyping. Can the authors vouch for the quality of the data collected?

      Thank you for the valuable feedback on our manuscript. We recognize concerns about the specificity of correlative data from open-source datasets and the limitations it presents for understanding CD300LG's role. To address this, we have expanded the manuscript with a paragraph in the discussion regarding the need for targeted experiments to confirm CD300LG’s functions and relationship with glucose metabolism. We also emphasize caution regarding reliance on external data, and we acknowledge the need for generating primary data, including direct phenotyping of mice with CD300LG gene alterations, to better understand its regulatory mechanisms and effects on glucose tolerance. Please see lines 446-456.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript from Lee-Odegard et al reports proteomic profiling of exercise plasma in humans, leading to the discovery of CD300LG as a secreted exercise-inducible plasma protein. Correlational studies show associations of CD300LG with glycemic traits. Lastly, the authors query available public data from CD300LG-KO mice to establish a causal role for CD300LG as a potential link between exercise and glucose metabolism. However, the strengths of this manuscript were balanced by the moderate to major weaknesses. Therefore in my opinion, while this is an interesting study, the conclusions remain preliminary and are not fully supported by the experiments shown so far.

      Strengths:

      (1) Data from a well-phenotyped human cohort showing exercise-inducible increases in CD300LG.

      (2) Associations between CD300LG and glucose and other cardiometabolic traits in humans, that have not previously been reported.

      (3) Correlation to CD300LG mRNA levels in adipose provides additional evidence for exercise-inducible increases in CD300LG.

      Weaknesses:

      (1) CD300LG is by sequence a single-pass transmembrane protein that is exclusively localized to the plasma membrane. How CD300LG can be secreted remains a mystery. More evidence should be provided to understand the molecular nature of circulating CD300LG. Is it full-length? Is there a cleaved fragment? Where is the epitope where the o-link is binding to CD300LG? Does transfection of CD300LG to cells in vitro result in secreted CD300LG?

      (2) There is a growing recognition of specificity issues with both the O-link and somalogic platforms. Therefore it is critical that the authors use antibodies, targeted mass spectrometry, or some other methods to validate that CD300LG really is increased instead of just relying on the O-link data.

      (3) It is insufficient simply to query the IMPC phenotyping data for CD300LG; the authors should obtain the animals and reproduce or determine the glucose phenotypes in their own hands. In addition, this would allow the investigators to answer key questions like the phenotype of these animals after a GTT, whether glucose production or glucose uptake is affected, whether insulin secretion in response to glucose is normal, effects of high-fat diet, and other standard mouse metabolic phenotyping assays.

      (4) I was unable to find the time point at which plasma was collected at the 12-week time point. Was it immediately after the last bout of exercise (an acute response) or after some time after the training protocol (trained state)?

      We acknowledge the importance of understanding the molecular form of CD300LG in circulation. We have expanded the discussion with a paragraph regarding the need for follow-up experiments to determine whether circulating CD300LG is full-length or a cleaved fragment, to identify the epitope for O-link binding, and to assess CD300LG secretion in vitro through transfection experiments. We also discuss the need for targeted mass spectrometry and antibody-based validation of O-link measurements of CD300LG, and the need for more validation experiments on CD300LG-deficient mice. Please see lines 446-456.

      The plasma collected post-intervention reflects the new baseline trained condition of the subjects, 3 days after the last exercise session during the intervention. We have clarified this in our manuscript. The information is updated in lines 491-493.

      Reviewer #1 (Recommendations For The Authors):

      In the present form, the paper raises interest in the potential role of CD300LG in the response to exercise training but unfortunately does not provide clear answers. The authors should focus their efforts on firmly validating the status of CD300LG as an exercise biomarker in humans and carefully examine the function of CD300LG through mechanistic and animal-based studies.

      The authors are encouraged to acquire CD300LG-deficient mice and perform specific experiments to validate hypotheses forthcoming from the analysis of the open-source datasets. In addition, it needs to be validated that the cd300lgtm1a(KOMP)Wtsi mice are actually deficient in CD300LG. It is not uncommon that Tm1a mice have (almost) normal expression of the targeted gene.

      We have now revised the manuscript and added a new section to the discussion regarding the limitations with open-source data, cd300lgtm1a(KOMP)Wtsi mice and the need for more validation experiments on CD300LG-deficient mice. Please see lines 446-456.

      The value of the correlative data presented in Figure 5 is rather limited. The same can be argued for the data presented in Supplementary Figure 2. If CD300LG is expressed in endothelial cells, it stands to reason that its expression is correlated with angiogenesis. Hence, this observation does not really carry any additional value.

      We agree that correlations cannot imply causality. However, similar patterns were observed in several tissues and across different data sets, which at least suggest a role for CD300LG related to angiogenesis. We have included a section in the discussion where we clarify that our observations should only be regarded as indications and that follow-up studies are needed to confirm any causal role for CD300LG on angiogenesis/oxidative capacity. Please see lines 446-456.

      Figure 6 may be better accommodated in the supplement.

      Figure 6 is now moved to the supplement.

      Figure 3A and B are a bit awkward. The description "no overlap" is confusing. Isn't it more accurate to say "no enrichment" or "no over-representation"? There will always be some overlap with certain pathways. However, there may be no enrichment. Furthermore, the use of arrows to indicate No overlap is visually not very appealing. Maybe the numbers can be given a specific color?

      We have now removed the arrows and text, and instead state in the text that there were no enrichments other than for the proteins down-regulated in the overweight group.

      The description of the figure legend of figure 5E-H is incomplete.

      The description is now completed.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: 

      Kroeg et al. describe a novel method for 2D culture human induced pluripotent stem cells (hiPSCs) to form cortical tissue in a multiwell format. The method claims to offer a significant advancement over existing developmental models. Their approach allows them to generate cultures with precise, reproducible dimensions and structure with a single rosette; consistent geometry; incorporating multiple neuronal and glial cell types (cellular diversity); avoiding the necrotic core (often seen in free-floating models due to limited nutrient and oxygen diffusion). The researchers demonstrate the method's capacity for long-term culture, exceeding ten months, and show the formation of mature dendritic spines and considerable neuronal activity. The method aims to tackle multiple key problems of in vitro neural cultures: reproducibility, diversity, topological consistency, and electrophysiological activity. The authors suggest their potential in high-throughput screening and neurotoxicological studies.

      Strengths: 

      The main advances in the paper seem to be: The culture developed by the authors appears to have optimal conditions for neural differentiation, lineage diversification, and long-term culture beyond 300 days. These seem to me as a major strength of the paper and an important contribution to the field. The authors present solid evidence about the high cell type diversity present in their cultures. It is a major point and therefore it could be better compared to the state of the art. I commend the authors for using three different IPS lines, this is a very important part of their proof. The staining and imaging quality of the manuscript is of excellent quality.

      We thank the reviewer for the positive comments on the potential of our novel platform to address key problems of in vitro neural culture, highlighting the longevity and reproducibility of the method across multiple cell lines.

      Weaknesses: 

      (1) The title is misleading: The presented cultures appear not to be organoids, but 2D neural cultures, with an insufficiently described intermediate EB stage. For nomenclature, see: doi: 10.1038/s41586-022-05219-6. Should the tissue develop considerable 3D depth, it would suffer from the same limited nutrient supply as 3D models - as the authors point out in their introduction. 

      We appreciate the opportunity to clarify this point. We respectfully disagree that the cultures do not meet the consensus definition of an organoid. In fact, a direct quote from the seminal nomenclature paper referenced by the reviewer states: “We define organoids as in vitro-generated cellular systems that emerge by self-organization, include multiple cell types, and exhibit some cytoarchitectural and functional features reminiscent of an organ or organ region. Organoids can be generated as 3D cultures or by a combination of 3D and 2D approaches (also known as 2.5D) that can develop and mature over long periods of time (months to years).” (Pasca et al, 2022, doi: 10.1038/s41586-022-05219-6). Therefore, while many organoid types indeed have a more spherical or globular 3D shape, the term organoid also applies to semi-3D or non-globular adherent organoids, such as renal (Czerniecki et al 2018, doi.org/10.1016/j.stem.2018.04.022) and gastrointestinal organoids (Kakni et al 2022, doi.org/10.1016/j.tibtech.2022.01.006). Accordingly, the adherent cortical organoids described in the manuscript exhibit self-organization to single radial structures consisting of multiple cell layers in the z-axis, reaching ~200 µm thickness (therefore remaining within the limits for sufficient nutrient supply), with consistent cytoarchitectural topology and electrophysiological activity, and therefore meet the consensus definition of an organoid.

      (2) The method therefore should be compared to state-of-the-art (well-based or not) 2D cultures, which seems to be somewhat overlooked in the paper, therefore making it hard to assess what the advance is that is presented by this work. 

      It was not our intention to benchmark this model quantitatively against other culture systems. Rather, we have attempted to characterize the opportunities and limitations of this approach, with a qualitative contrast to other culture methods. Compared to state-of-the-art 2D neural network cultures, adherent cortical organoids provide distinct advantages in:

      (1) Higher order self-organized structure formation, including segregation of deeper and upper cortical layers.

      (2) Longevity: adherent cortical organoids can be successfully kept in culture up to 1 year where 2D cultures typically deteriorate after 8-12 weeks.

      (3) Maturity, including the formation of dendritic mushroom spines and robust electrophysiological activity.

      (4) Cell type diversity including a more physiological ratio of inhibitory and excitatory neurons (10% GAD67+/NeuN+ neurons in adherent cortical organoids, vs 1% in 2D neural networks) and the emergence of oligodendrocyte lineage cells.

      On the other hand, limitations of adherent cortical organoids compared to 2D neural network cultures are:

      (1) Culture times for organoids are much longer than for 2D cultures and the method can therefore be more laborious and more expensive.

      (2) Whole-cell patch clamping is not readily feasible in the organoids because of the restrictive dimensions of the 384-well plates.

      (3) Reproducibility is prominently claimed throughout the manuscript. However, it is challenging to assess this claim based on the data presented, which mostly contain single frames of unquantified, high-resolution images. There are almost no systematic quantifications presented. The ones present (Figure S1D, Figure 4) show very large variability. However, the authors show sets of images across wells (Figure S1B, Figure S3) which hint that in some important aspects, the culture seems reproducible and robust. 

      We made considerable efforts to establish quantitative metrics to assess reproducibility. We applied a quantitative scoring system of single radial structures at different time points for multiple batches of all three lines as indicated in Figure S1D. This figure represents a comprehensive dataset in which each dot represents the average of a different batch of organoids containing 10-40 organoids per batch. To emphasize this, we will adapt the graph to better reflect the breadth of the dataset. Additional quantifications are given in Figure S2 for progenitor and layer markers for Line 1 and in Figure S5 for interneurons across all three lines, showing relatively low variability. That being said, we acknowledge the reviewer’s concerns and will modify the text to reduce the emphasis of this point, pending more extensive data addressing reproducibility across a wide range of parameters.

      (4) What is in the middle? All images show markers in cells present around the center. The center however seems to be a dense lump of cells based on DAPI staining. What is the identity of these cells? Do these cells persist throughout the protocol? Do they divide? Until when? Addressing this prominent cell population is currently lacking. 

      A more comprehensive characterization of the cells in the center remains a significant challenge due to the high cell density hindering antibody penetration. However, dye-based staining methods such as DAPI and the LIVE/DEAD panel confirm a predominance of intact nuclei with very minimal cell death. The limited available data suggest that a substantial proportion of the cells in the center are proliferative neural progenitors, indicated by immunolabeling for SOX2 and Ki67. We will add additional figures to support these findings. Furthermore, we are currently optimizing the conditions to perform single cell / nuclear RNA sequencing to further characterize the cellular composition of the organoids.

      (5) This manuscript proposes a new method of 2D neural culture. However, the description and representation of the method are currently insufficient.

      (a) The results section would benefit from a clear and concise, but step-by-step overview of the protocol. The current description refers to an earlier paper and appears to skip over some key steps. This section would benefit from being completely rewritten. This is not a replacement for a clear methods section, but a section that allows readers to clearly interpret results presented later.

      We will revise the manuscript to include a more detailed step-by-step overview of the protocol.

      (b) Along the same lines, the graphical abstract should be much more detailed. It should contain the time frames and the media used at the different stages of the protocol, seeding numbers, etc. 

      As suggested, we will also adapt the graphical abstract to include more detail.

      Reviewer #2 (Public Review): 

      Summary: 

      In this manuscript, van der Kroeg et al have developed a method for creating 3D cortical organoids using iPSC-derived neural progenitor cells in 384-well plates, thus scaling down the neural organoids to adherent culture and a smaller format that is amenable to high throughput cultivation. These adherent cortical organoids, measuring 3 x 3 x 0.2 mm, self-organize over eight weeks and include multiple neuronal subtypes, astrocytes, and oligodendrocyte lineage cells.

      Strengths: 

      (1) The organoids can be cultured for up to 10 months, exhibiting mature dendritic spines, axonal myelination, and robust neuronal activity. 

      (2) Unlike free-floating organoids, these do not develop necrotic cores, making them ideal for high-throughput drug discovery, neurotoxicological screening, and brain disorder studies.

      (3) The method addresses the technical challenge of achieving higher-order neural complexity with reduced heterogeneity and the issue of necrosis in larger organoids. The method presents a technical advance in organoid culture.

      (4) The method has been demonstrated with multiple cell lines which is a strength. 

      (5) The manuscript provides high-quality immunostaining for multiple markers. 

      We appreciate the reviewer’s acknowledgement of the strengths of this novel platform as a technical advance in organoid culture that reduces heterogeneity and shows potential for higher throughput experiments.

      Weaknesses: 

      (1) Direct head-to-head comparison with standard organoid culture seems to be missing and may be valuable for benchmarking, ie what can be done with the new method that cannot be done with standard culture and vice versa, ie what are the aspects in which new method could be inferior to the standard.

      In our opinion, it would be extremely difficult to directly compare methods because of substantial differences. Most notably, whole brain organoids grow to large and irregular globular shapes, while adherent cortical organoids have a highly standardized shape confined by the limits of a 384-well plate. Moreover, it was not our intention to benchmark this model quantitatively against other culture systems. Rather, we have attempted to characterize the opportunities and limitations of this approach, with a qualitative contrast to other culture methods.

      (2) It would be important to further benchmark the throughput, ie what is the success rate in filling and successfully growing the organoids in the entire 384 well plate? 

      Figure S1D shows the success rate of organoid formation and stability of the organoid structures over time. In addition, we will add the number of wells that were filled per plate.

      (3) For each NPC line an optimal seeding density was estimated based on the proliferation rate of that NPC line and via visual observation after 6 weeks of culture. It would be important to delineate this protocol in more robust terms, in order to enable reproducibility with different cell lines and amongst the labs. 

      Figure S1C provides the relationship between proliferation rate and seeding density, allowing estimation of seeding densities based on the proliferation rate of the NPCs. However, we appreciate the reviewer's feedback and will modify the methods to provide more detail.

      Reviewer #3 (Public Review): 

      Summary: 

      Kroeg et al. have introduced a novel method to produce 3D cortical layer formation in hiPSC-derived models, revealing a remarkably consistent topography within compact dimensions. This technique involves seeding frontal cortex-patterned iPSC-derived neural progenitor cells in 384-well plates, triggering the spontaneous assembly of adherent cortical organoids consisting of various neuronal subtypes, astrocytes, and oligodendrocyte lineage cells. 

      Strengths: 

      Compared to existing brain organoid models, these adherent cortical organoids demonstrate enhanced reproducibility and cell viability during prolonged culture, thereby providing versatile opportunities for high-throughput drug discovery, neurotoxicological screening, and the investigation of brain disorder pathophysiology. This is an important and timely issue that needs to be addressed to improve the current brain organoid systems. 

      We thank the reviewer for highlighting the strengths of our novel platform. We appreciate that all three reviewers agree that the adherent cortical organoids presented in this manuscript reliably demonstrate increased reproducibility and longevity. They also commend its potential for higher throughput drug discovery and neurotoxicological/phenotype screening purposes.

      Weaknesses: 

      While the authors have provided significant data supporting this claim, several aspects necessitate further characterization and clarification. Mainly, highlighting the consistency of differentiation across different cell lines and standardizing functional outputs are crucial elements to emphasize the future broad potential of this new organoid system for large-scale pharmacological screening.

      We appreciate the feedback and will add more detail on consistency and standardization of functional outputs.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Although the manuscript is well organized and written, it could be largely improved and therefore made more plausible and easier to read. See my point-by-point comments listed below:

      (1) The introduction section is a bit overloaded with some unnecessary information. For example, the authors discussed the relationship between neurotransmitters in the prefrontal and striatum and substance use/sustained attention. However, the results are related to neither the neurotransmitters nor the striatum. In addition, there is a contradictory description about neurotransmitters there, Nicotine/THC leads to increased neurotransmitters, and decreased neurotransmitters is related to poor sustained attention. Does that mean that the use of Nicotine/THC could increase sustained attention?

      Thanks for this insightful question. We understand your concern regarding the seemingly contradictory statements about neurotransmitters and sustained attention. Previous studies have shown that acute administration of nicotine can improve sustained attention (Lawrence et al., 2002; Potter and Newhouse, 2008; Valentine and Sofuoglu, 2018; Young et al., 2004). On the other hand, the acute effects of smoking cannabis on sustained attention are mixed and depend on factors such as dosage and individual differences (Crean et al., 2011). For instance, a previous study (Hart et al., 2001) found that performance on a tracking task, which requires sustained attention, improved significantly after smoking cannabis with a high dose of THC, albeit in experienced cannabis users. However, chronic substance use, including nicotine and cannabis, has been associated with impaired sustained attention (Chamberlain et al., 2012; Dougherty et al., 2013).

      To address your concerns and improve clarity and succinctness of the Introduction, we have removed the description of neurotransmitters from the Introduction. This revision should make the introduction more concise and focus on the direct relationships pertinent to our study.

      (2) It is a bit hard to follow the story for the readers because the Results section went straight into detail. For example, the authors directly introduced that they used the ICV from the Go trials to index sustained attention without basic knowledge about the task. Why use the ICV of Go trials instead of other trials (i.e., successful stop trials) as an index of sustained attention? I suggest presenting the subjects and task details about the data before the detailed behavioral results. The results section should include enough information to understand the presenting results for the readers, rather than forcing the reader to find the answer in the later Methods section.

      We appreciate your suggestion to provide more context about the task and ICV before diving into the detailed behavioural results.

      We used the ICV derived from the Go trials instead of Successful stop trials as an index of sustained attention, based on the nature of the stop-signal task and the specific data it generates. Previous studies have indicated that reaction time (RT) variability is a straightforward measure of sustained attention, with increasing variability thought to reflect poorer ability to sustain attention (Esterman and Rothlein, 2019). RT variability is defined as ICV, calculated as the standard deviation of Go RT divided by the mean Go RT from Go trials (O'Halloran et al., 2018). The stop signal task includes both Go trials and Stop trials. During Go trials, participants are required to respond as quickly and accurately as possible to a Go signal, allowing for the recording of RT for calculating ICV. In contrast, Stop trials are designed to measure inhibitory control, where successful response inhibition results in no RT or response recorded in the output. Therefore, Go trials are specifically used to assess sustained attention, while Stop trials primarily assess inhibitory control (Verbruggen et al., 2019).

      We acknowledge the importance of providing this contextual information within the Results section to enhance reader understanding. We have added this information before presenting the behavioural results on Page 6.

      Results

      (1) Behavioural changes over time

      Reaction time (RT) variability is a straightforward measure of sustained attention, with increasing variability thought to reflect poor sustained attention. RT variability is defined as intra-individual coefficient of variation (ICV), calculated as the standard deviation of Go RT divided by the mean Go RT from Go trials in the stop signal task. Lower ICV indicates better sustained attention.
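
      As an illustration of the ICV definition above, the following minimal Python sketch computes the ratio of the standard deviation to the mean of one participant's Go reaction times. The function name, the reaction times, and the use of the sample (n - 1) standard deviation are assumptions made for this example; the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def intra_individual_cv(go_rts):
    """Return ICV = SD(Go RT) / mean(Go RT) for one participant.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    if len(go_rts) < 2:
        raise ValueError("need at least two Go reaction times")
    return stdev(go_rts) / mean(go_rts)


# Hypothetical Go-trial reaction times in milliseconds.
steady = [400, 410, 390, 405, 395]
variable = [300, 520, 380, 610, 290]

# The more variable responder has the higher ICV, i.e. poorer sustained attention.
print(intra_individual_cv(steady), intra_individual_cv(variable))
```

      Because ICV is normalized by the mean, it allows variability to be compared between participants with different overall response speeds.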

      (3) The same problem for section 2 in the Results. What are the predictive networks? Are the predictive networks the same as the networks constructed based on the correlation with ICV? My intuitive feeling is that they are the circular analyses here. The positive/negative/combined networks are calculated based on the correlation between the edges and ICV. Then the author used the network to predict the ICV again. The manipulation from the raw networks (I think they are based on PPI) to the predictive network, and the calculation of the predicted ICV are all missing. The direct exposure of the results to the readers without enough detailed knowledge made everything hard to digest.

      We thank the Reviewer for the insightful comment. We agree with the need for more clarity regarding the predictive networks and the CPM analysis before presenting results. CPM, a data-driven neuroscience approach, is applied to predict individual behaviour from brain functional connectivity (Rosenberg et al., 2016; Shen et al., 2017). The CPM analysis used the strength of the predictive network to predict the individual difference in traits and behaviours. CPM includes several steps: feature selection, feature summarization, model building, and assessment of prediction significance (see Fig. S1).

      During feature selection, we assessed whether connections between brain areas (i.e., edges) in a task-related functional connectivity matrix (derived from generalized psychophysiological interaction analysis) were positively or negatively correlated with ICV using a significance threshold of P < 0.01. These positively or negatively correlated connections are regarded as the positive or negative network, respectively. The network strength of the positive network (or negative network) was determined in each individual by summing the connection strength of each positively (or negatively) correlated edge. The combined network strength was determined by subtracting the strength of the negative network from that of the positive network. Next, CPM built a linear model between the network strength of the predictive network and ICV. This model was initially developed using the training set. The predictive networks were then applied to the test set, where network strength was calculated again, and the linear model was used to predict ICV using k-fold cross-validation. Following your advice, we have updated the Results section to include these details on Page 7.

      Results

      (2) Cross-sectional brain connectivity

      This study employed CPM, a data-driven neuroscience approach, to identify three predictive networks (positive, negative, and combined) that predict ICV from brain functional connectivity. CPM typically uses the strength of the predictive networks to predict individual differences in traits and behaviors. The predictive networks were obtained based on connectivity analyses of the whole brain. Specifically, we assessed whether connections between brain areas (i.e., edges) in a task-related functional connectivity matrix derived from generalized psychophysiological interaction analysis were positively or negatively correlated with ICV using a significance threshold of P < 0.01. These positively or negatively correlated connections were regarded as the positive or negative network, respectively. The network strength of the positive (or negative) network was determined for each individual by summing the connection strength of each positively (or negatively) correlated edge. The combined network strength was determined by subtracting the strength of the negative network from that of the positive network. We then built a linear model between network strength and ICV in the training set, and applied the predictive networks and this linear model to the test set to calculate predicted ICV using k-fold cross-validation.

      (4) The authors showed the positive/negative/combined networks from both Go trials and successful stop trials can predict the ICV. I am wondering how the author could validate the specificity of the prediction of these positive/negative/combined networks. For example, how about the networks from the failed stop trials?

      We appreciate the opportunity to clarify the specificity of the predictive networks identified in our study. Here is a more detailed explanation of our findings and their implications.

      To validate the specificity of the sustained attention network identified from CPM analysis, we calculated correlations between the network strength of positive and negative networks and performances from a neuropsychology battery (CANTAB) at each timepoint separately. CANTAB includes several tasks that measure various cognitive functions, such as sustained attention, inhibitory control, impulsivity, and working memory. We found that all positive and negative networks derived from Go and Successful stop trials significantly correlated with a behavioural assay of sustained attention – the rapid visual information processing (RVP) task – at ages 14 and 19 (all P values < 0.028). Age 23 had no RVP task data in the IMAGEN study. There were sporadic significant correlations between constructs such as delay aversion/impulsivity and negative network strength, for example, but the correlations with the RVP were always significant. This demonstrates that the strength of the sustained attention brain network was specifically and robustly correlated with a typical sustained attention task, rather than other cognitive measures. The results are described in the main text on Page 8 and shown in Supplementary materials (Pages 1 and 3) and Table S12.

      In addition, we conducted a CPM analysis to predict ICV using gPPI under Failed stop trials. Our findings showed that positive, negative, and combined networks derived from Failed stop trials significantly predicted ICV: at age 14 (r = 0.10, P = 0.033; r = 0.19, P < 0.001; and r = 0.17, P < 0.001, respectively), at age 19 (r = 0.21; r = 0.18; and r = 0.21, all P < 0.001, respectively), and at age 23 (r = 0.33, r = 0.35, and r = 0.36, respectively, all P < 0.001). Similar results were obtained using a 5-fold CV and leave-site-out CV.

      Our analysis further showed that task-related functional connectivity derived from Go trials, Successful Stop trials, and Failed Stop trials could predict sustained attention across three timepoints. However, the predictive performances of networks derived from Go trials were higher than those from Successful Stop and Failed Stop trials. This suggests that sustained attention is particularly crucial during Go trials when participants need to respond to the Go signal. In contrast, although Successful Stop and Failed Stop trials also require sustained attention, these tasks primarily involve inhibitory control along with sustained attention.

      Taken together, these findings underscore the specificity of the predictive networks of sustained attention. We have updated these results in the Supplementary Materials (Pages 3-5 and Page 7):

      Method

      CPM analysis using Failed stop trials

      We performed another CPM analysis on Failed stop trials, using the gPPI matrix obtained from the second GLM described in the main text. The CPM analysis was conducted using 10-fold CV, 5-fold CV and leave-site-out CV.

      Results

      CPM predictive performance under Failed stop trials

      Positive, negative, and combined networks derived from Failed stop trials significantly predicted ICV: at age 14 (r = 0.10, P = 0.033; r = 0.19, P < 0.001; and r = 0.17, P < 0.001, respectively), at age 19 (r = 0.21; r = 0.18; and r = 0.21, all P < 0.001, respectively), and at age 23 (r = 0.33, r = 0.35, and r = 0.36, respectively, all P < 0.001). We obtained similar results using a 5-fold CV and leave-site-out CV (Table S6).

      Discussion

      Specificity of the prediction of predictive networks

      We found that task-related functional connectivity derived from Go trials, Successful stop trials, and Failed stop trials successfully predicted sustained attention across three timepoints. However, predictive performances of predictive networks derived from Go trials were higher than those derived from Successful stop trials and Failed stop trials. These results suggest that sustained attention is particularly crucial during Go trials when participants need to respond to the Go signal. In contrast, although Successful Stop and Failed Stop trials also require sustained attention, these tasks primarily involve inhibitory control along with sustained attention.

      (5) The author used PPI to define the connectivity of the network. I am not sure why the author used two GLMs for the PPI analysis separately. In the second GLM, Go trials were treated as an implicit baseline. What does this exactly mean? And the gPPI analysis across the entire brain using the Shen atlas is not clear. Normally, as I understand, the PPI/gPPI is conducted to test the task-modulated connectivity between one seed region and the voxels of the whole rest brain. Did the author perform the PPI for each ROI from Shen atlas? More details about how to use PPI to construct the network are required.

      Thank you for your insightful questions. Here, we’d like to clarify how we applied generalized PPI across the whole brain using the Shen atlas and why we used two separate GLMs for the gPPI analysis.

      Yes, PPI is conducted to test the task-modulated connectivity between one seed region and other brain areas. This method can be either voxel-based or ROI-based. In our study, we performed ROI-based gPPI analysis using the Shen atlas with 268 regions. Specifically, we performed the PPI on each seed region of interest (ROI) to estimate the task-related FC between this ROI and the remaining 267 ROIs under a specific task condition. By performing this analysis across each ROI in the Shen atlas, we generated a 268 × 268 gPPI matrix for each task condition. Each matrix was then averaged with its transpose, yielding symmetrical matrices that were subsequently used for CPM analysis.
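
      The symmetrization step described above can be illustrated with a short Python sketch. gPPI estimates are directional (seed to target), so entry (i, j) generally differs from entry (j, i); averaging the matrix with its transpose gives one value per ROI pair. The 3 × 3 values and the zero diagonal below are hypothetical and stand in for the 268 × 268 matrices.

```python
def symmetrize(matrix):
    """Average a square matrix with its transpose: S[i][j] = (M[i][j] + M[j][i]) / 2."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("expected a square matrix")
    return [[(matrix[i][j] + matrix[j][i]) / 2 for j in range(n)] for i in range(n)]


# Hypothetical directional gPPI estimates for three ROIs (rows are seed regions).
gppi = [
    [0.0, 0.2, 0.5],
    [0.4, 0.0, 0.1],
    [0.3, 0.7, 0.0],
]
sym = symmetrize(gppi)
for row in sym:
    print(row)
```

      After this step each undirected edge has a single connection strength, which is the form that the edge-wise correlation with ICV in CPM requires.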

      Regarding the use of two separate GLMs for the gPPI analysis, our study aimed to define the task-related FC under two conditions: Go trials and Successful stop trials. The first GLM, which included Go trials, was built to estimate the gPPI during Go trials. However, due to the high frequency of Go trials in the stop signal task, it is common to regard the Go trials as an implicit baseline, as in previous IMAGEN studies (D'Alberto et al., 2018; Whelan et al., 2012). Therefore, to achieve a more accurate estimation of FC during Successful stop trials, we built a second GLM specifically for these trials. Accordingly, we have updated the Method section in the main text on Page 16.

      Method

      2.5 Generalized psychophysiological interaction (gPPI) analysis

      In this study, we adopted gPPI analysis to generate task-related FC matrices and applied CPM analysis to investigate predictive brain networks from adolescents to young adults. PPI analysis describes task-dependent FC between brain regions, traditionally examining connectivity between a seed region of interest (ROI) and the voxels of the rest of the brain. However, this study conducted a generalized PPI analysis on an ROI-to-ROI basis (Di et al., 2021) to yield a gPPI matrix across the whole brain instead of just a single seed region.

      Given the high frequency of Go trials in the SST, Go trials are commonly treated as an implicit baseline, as in previous IMAGEN studies (D'Alberto et al., 2018; Whelan et al., 2012). Hence, we built a separate GLM for Successful stop trials, which included two task regressors (Failed and Successful stop trials) and 36 nuisance regressors.

      (6) Why did the author use PPI to construct the network, rather than the other similar methods, for example, beta series correlation (BSC)?

      Thanks for your question. PPI is an approach used to calculate the functional connectivity (FC) under a specific task (i.e., task-related FC). Although most brain connectomic research has utilized resting-state FC (e.g., beta series correlation), FC during task performance has demonstrated superiority in predicting individual behaviours and traits, due to its potential to capture more behaviourally relevant information (Dhamala et al., 2022; Greene et al., 2018; Yoo et al., 2018). Specifically, Zhao et al. (2023) suggested that task-related FC outperforms both typical task-based and resting-state FC in predicting individual differences. Therefore, we chose to use task-related FC to predict sustained attention over time. We have updated the Introduction accordingly on Page 5.

      Introduction

      Although most brain connectomic research has utilized resting-state fMRI data, functional connectivity (FC) during task performance has demonstrated superiority in predicting individual behaviours and traits, due to its potential to capture more behaviourally relevant information (Dhamala et al., 2022; Greene et al., 2018; Yoo et al., 2018). Specifically, Zhao et al. (2023) suggested that task-related FC outperforms both typical task-based and resting-state FC in predicting individual differences. Hence, we applied task-related FC to predict sustained attention over time.

      (7) In the section of 'Correlation analysis between the network strength and substance use', the author just described that 'the correlations between xx and xx are shown in Fig5X', and repeated it three times for three correlation results. What exactly are the results? The author should describe the results in detail. And I am wondering whether there are scatter plots for these correlation analyses?

      We’d like to clarify the results in Fig. 5. Fig. 5 illustrates the significant correlations between behaviour and brain activity associated with sustained attention and cigarette and cannabis use (Cig+CB) after FDR correction. Panel A shows the significant correlation between the behavioural level of sustained attention and Cig+CB. Panels B and C show the correlations between brain activity associated with sustained attention and Cig+CB. While Panel B presents the brain activity derived from Go trials, Panel C presents brain activity derived from Successful stop trials. In response to your suggestion, we have described these results in detail on Page 9. We have also included scatter plots for the significant correlations shown in Fig. 5 in the Supplementary materials (Fig. S10).

      Results

      (6) Correlation between behaviour and brain to cannabis and cigarette use

      Figs. 5A-C summarize the results showing the correlation between ICV/brain activity and Cig+CB per timepoint and across timepoints. Fig. 5A shows correlations between ICV and Cig+CB (Tables S14-15). ICV was correlated with Cig+CB at ages 19 (Rho = 0.13, P < 0.001) and 23 (Rho = 0.17, P < 0.001). ICV at ages 14 (Rho = 0.13, P = 0.007) and 19 (Rho = 0.13, P = 0.0003) were correlated with Cig+CB at age 23. Cig+CB at age 19 was correlated with ICV at age 23 (Rho = 0.13, P = 9.38E-05). Fig. 5B shows correlations between brain activity derived from Go trials and Cig+CB (Tables S18-19). Brain activities of positive and negative networks derived from Go trials were correlated with Cig+CB at age 23 (positive network: Rhop = 0.12, P < 0.001; negative network: Rhon = -0.11, P < 0.001). Brain activity of the negative network derived from Go trials at age 14 was correlated with Cig+CB at age 23 (Rhon = -0.16, P = 0.001). Cig+CB at age 19 was correlated with brain activity of the positive network derived from Go trials at age 23 (Rhop = 0.10, P = 0.002). Fig. 5C shows the correlations between brain activity derived from Successful stop trials and Cig+CB (Tables S18-19). Brain activities of positive and negative networks derived from Successful stop trials were correlated with Cig+CB at ages 19 (positive network: Rhop = 0.10, P = 0.001; negative network: Rhon = -0.08, P = 0.013) and 23 (positive network: Rhop = 0.13, P < 0.001; negative network: Rhon = -0.11, P = 0.001).

      (8) Lastly, the labels of (A), (B) ... in the figure captions are unclear. The authors should find a better way to place the labels in the caption and keep them consistent throughout all figures.

      Thank you for this valuable comment. We have revised the figure captions in the main text to ensure the labels (A), (B), etc., are placed more clearly and consistently across all figures.

      Reviewer #2 (Public Review):

      While the study largely achieves its aims, several points merit further clarification:

      (1) Regarding connectome-based predictive modeling, an assumption is that connections associated with sustained attention remain consistent across age groups. However, this assumption might be challenged by observed differences in the sustained attention network profile (i.e., connections and related connection strength) across age groups (Figures 2 G-I, Fig. 3 G-I). It's unclear how such differences might impact the prediction results.

      Thank you for your insightful comment. We’d like to clarify that we did not assume that connections associated with sustained attention remain completely consistent across age groups. Indeed, we expected that connections would change across age groups, due to the developmental changes in brain function and structure from adolescence to adulthood. Our focus was on the consistency of individual differences in sustained attention networks over time, recognising that the actual connections within those networks may change. However, we did show that there is some consistency in the specific connections associated with sustained attention over time. Notably, this consistency markedly increases when comparing ages 19 and 23, when developmental factors are less relevant. We support our reasoning above with the following analyses:

      (1) Supplementary materials (Pages 2 and 5), relevant sections highlighted here for emphasis.

      Method

      Comparison of predictive networks identified at one timepoint versus another

      Steiger’s Z test was employed to compare the predictive performance of networks identified at different timepoints. This analysis involved comparing the R values derived from networks defined at distinct ages to predict ICV at the same age. For example, we compared the R values of brain networks defined at age 14 when predicting ICV at 19 (i.e., positive network: r = 0.25, negative network: r = 0.25, combined network: r = 0.28) with the R values of brain networks defined at age 19 itself (i.e., positive network: r = 0.16, negative network: r = 0.14, combined network: r = 0.16) derived from Go trials using Steiger's Z test (age 14 → age 19 vs. age 19 → age 19). Similarly, comparisons were made between networks defined at age 14 predicting ICV at age 23 and those at age 23 predicting ICV at age 23 (age 14 → age 23 vs. age 23 → age 23), as well as between networks defined at age 19 predicting ICV at age 23 and those at age 23 predicting ICV at age 23 (age 19 → age 23 vs. age 23 → age 23). These comparisons were performed separately for Go trials and Successful stop trials.

      Results

      Comparison of predictive performance at different timepoints

      For positive, negative, and combined networks predicting ICV derived from Go trials at age 19, the R values were higher when using predictive networks defined at 19 than those defined at 14 (Z = 3.79, Z = 3.39, Z = 3.99, all P < 0.00071). Similarly, the R values for positive, negative, and combined networks predicting ICV derived from Go trials at age 23 were higher when using predictive networks defined at age 23 compared to those defined at ages 14 (Z = 6.00, Z = 5.96, Z = 6.67, all P < 3.47e-9) or 19 (Z = 2.80, Z = 2.36, Z = 2.57, all P < 0.005).

      At age 19, the R value for the positive network predicting ICV derived from Successful stop trials was higher when using predictive networks defined at 19 compared to those defined at 14 (Z = 1.54, P = 0.022), while the negative and combined networks did not show a significant difference (Z = 0.85, P = 0.398; Z = 2.29, P = 0.123). At age 23, R values for the positive and combined networks predicting ICV derived from Successful stop trials were higher when using predictive networks defined at 23 compared to those defined at 14 (Z = 3.00, Z = 2.48, all P < 3.47e-9) or 19 (Z = 2.52, Z = 1.99, all P < 0.005). However, the R value for the negative network at age 23 did not significantly differ when using predictive networks defined at 14 (Z = 1.80, P = 0.072) or 19 (Z = 1.48, P = 0.138).

      These results indicate that some specific pairwise connections associated with sustained attention at earlier ages, such as 14 and 19, are still relevant as individuals grow older. However, some connections are not optimal for good sustained attention at older ages. That is, the brain reorganizes its connection patterns to maintain optimal functionality for sustained attention as it matures.
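The Steiger's Z comparison described in the Method excerpt above can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes the common form of Steiger's (1980) test for two dependent correlations that share one variable (here, observed ICV at the target age), and it assumes the correlation between the two sets of predictions (`r_kh`) and the sample size (`n`) are available. All names and example numbers are hypothetical.

```python
import math


def steiger_z(r_jk, r_jh, r_kh, n):
    """Steiger's Z for two dependent correlations sharing variable j.

    r_jk: correlation between observed ICV (j) and predictions from network k
    r_jh: correlation between observed ICV (j) and predictions from network h
    r_kh: correlation between the two sets of predictions
    n:    number of participants
    Returns (Z, two-sided p) using a standard normal reference distribution.
    """
    # Fisher z-transform of the two correlations being compared
    z_jk = math.atanh(r_jk)
    z_jh = math.atanh(r_jh)
    # Pooled correlation used in Steiger's variance estimate
    r_bar = (r_jk + r_jh) / 2.0
    psi = r_kh * (1 - 2 * r_bar ** 2) - 0.5 * r_bar ** 2 * (1 - 2 * r_bar ** 2 - r_kh ** 2)
    c_bar = psi / (1 - r_bar ** 2) ** 2
    z = (z_jk - z_jh) * math.sqrt(n - 3) / math.sqrt(2 - 2 * c_bar)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p


# Hypothetical inputs: same-age network (k) vs. earlier-age network (h)
z_stat, p_value = steiger_z(r_jk=0.28, r_jh=0.16, r_kh=0.5, n=1000)
print(f"Z = {z_stat:.2f}, p = {p_value:.4f}")
```

A positive Z here means the first network's predictions correlate more strongly with observed ICV than the second network's predictions do.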

      (2) Consistency of Individual Differences:

      We found that individual differences in ICV were significantly correlated between the three timepoints (Fig. 1B). In addition, we calculated the correlations of network strength of the predictive networks for sustained attention, derived from Go trials and Successful stop trials, between each pair of timepoints. These correlations were also significant (all P < 0.003). We have updated these results in the main text (Pages 7-8) and Supplementary Materials (Table S7).

      (2) Cross-sectional brain connectivity

      In addition, we found that network strength of positive, negative, and combined networks derived from Go trials was significantly correlated between the three timepoints (Table S7, all P < 0.003).

      In addition, we found that network strength of positive, negative, and combined networks derived from Successful stop trials was significantly correlated between the three timepoints (Table S7, all P < 0.001).

      (3) Predictive networks across timepoints: Predictive networks defined at age 14 were successfully applied to predict ICV at ages 19 and 23. Similarly, predictive networks defined at age 19 were successfully applied to predict ICV at age 23 (Fig. 4). These results reflect the robustness of the brain network associated with sustained attention over time.

      (4) Dice coefficient analysis: We calculated the Dice coefficient to quantify the similarity of predictive networks across the three timepoints. Connections in the sustained attention networks were significantly similar from ages 14 to 23 (Table S13), despite relatively few overlapping edges over time (as discussed in Supplementary Materials on Page 6).

      (5) Global brain activation: Based on these findings, we indicate that sustained attention relies on global brain activation (i.e., network strength) rather than specific regions or networks (see also (Zhao et al., 2021)).

      In summary, brain network connections undergo change and are not completely consistent across time. However, individual differences in sustained attention and its network are consistent across time, as we found that 1) the brain reorganizes its connection patterns to maintain optimal functionality for sustained attention as it matures. 2) ICV and network strength of sustained attention network were significantly correlated between each timepoint. 3) Sustained attention networks identified from previous timepoints could predict ICV in the subsequent timepoint. 4) Dice coefficient analysis indicated that the edges in the sustained attention networks were significantly similar from ages 14 to 23. 5) Sustained attention networks function as a global activation, rather than specific regions or networks.
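The Dice coefficient mentioned in point (4) above measures the overlap between two edge sets as twice the number of shared edges divided by the total number of edges in both sets. The sketch below assumes each predictive network is represented as a list of undirected node-index pairs; this representation and the function name are illustrative assumptions, and the significance testing of the overlap reported in Table S13 is not covered here.

```python
def dice_coefficient(edges_a, edges_b):
    """Dice similarity between two sets of undirected edges.

    Each edge is a pair of node indices; (i, j) and (j, i) count as the same edge.
    Returns a value between 0 (no shared edges) and 1 (identical edge sets).
    """
    a = {tuple(sorted(edge)) for edge in edges_a}
    b = {tuple(sorted(edge)) for edge in edges_b}
    if not a and not b:
        return 0.0  # convention chosen here for two empty networks
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical edge lists for networks defined at two ages
network_age14 = [(1, 2), (3, 4), (5, 6), (7, 8)]
network_age19 = [(2, 1), (3, 4), (9, 10), (11, 12)]
print(dice_coefficient(network_age14, network_age19))  # 2 shared edges out of 4 + 4
```

Because the denominator counts all edges in both networks, a small number of overlapping edges yields a low Dice value even when the overlap exceeds what chance would produce.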

      (2) Another assumption of the connectome-based predictive modeling is that the relationship between sustained attention network and substance use is linear and remains linear over development. Such linear evidence from either the literature or their data would be of help.

      Thanks for your valuable suggestion. We'd like to clarify that while CPM assumes a linear relationship between brain and behaviour (Shen et al., 2017), it does not assume that the relationship between the sustained attention network and substance use remains linear over development.

      Our approach in applying CPM to predict sustained attention across different timepoints was based on previous neuroimaging studies (Rosenberg et al., 2016; Rosenberg et al., 2020), which indicated linear associations between brain connectivity patterns and sustained attention using CPM analysis. These findings support the notion of a linear relationship between brain connectivity and sustained attention. In this study, we performed CPM analysis to identify predictive networks predicting sustained attention, not substance use and used the network strength of these predictive networks to represent sustained attention activity.

      To examine the relationship between substance use and sustained attention, as well as its associated brain activity, we conducted correlation analyses and utilized a latent change score model instead of CPM analysis. This decision was informed by cross-sectional studies (Broyd et al., 2016; Lisdahl and Price, 2012) that consistently reported linear associations between substance use and impairments in sustained attention. Additionally, longitudinal research by (Harakeh et al., 2012) indicated a linear relationship between poorer sustained attention and the initiation and escalation of substance use over time.

      Given these previous findings, we assumed a linear relationship between sustained attention and substance use. Our analyses included calculating correlations between substance use and sustained attention, as well as its associated brain activity, at each timepoint and across timepoints (Fig. 5). Furthermore, we employed a three-wave bivariate latent change score model, a longitudinal approach, to assess the relationship between substance use and both behaviour and brain activity associated with sustained attention (Figs. 6-7). We have added more information in the Introduction (Page 6) to make this clearer.

      Introduction

      Additionally, previous cross-sectional and longitudinal studies (Broyd et al., 2016; Harakeh et al., 2012; Lisdahl and Price, 2012) have shown that there are linear relationships between substance use and sustained attention over time. We therefore employed correlation analyses and a latent change score model to estimate the relationship between substance use and both behaviours and brain activity associated with sustained attention.

      (3) Heterogeneity in results suggests individual variability that is not fully captured by group-level analyses. For instance, Figure 1A shows decreasing ICV (better-sustained attention) with age on the group level, while there are both increasing and decreasing patterns on the individual level via visual inspection. Figure 7 demonstrates another example in which the group with a high level of sustained attention has a lower risk of substance use at a later age compared to that in the group with a low level of sustained attention. However, there are individuals in the high sustained attention group who have substance use scores as high as those in the low sustained attention group. This is important to take into consideration and could be a potential future direction for research.

      Thanks for this valuable comment. We appreciate your observation regarding the individual variability that is not fully captured by group-level analyses to some degree. Fig. 1A shows the results from a linear mixed model, which explains group-level changes over time while accounting for the random effect within subjects. Similarly, Fig. 7 shows the group-level association between substance use and sustained attention. We agree that future research could indeed consider individual variability. For example, participants could be categorized based on their consistent trajectories of ICV or substance use (i.e., keep decreasing/increasing) over multiple timepoints. We agree that incorporating individual-level analyses in the future could provide valuable insights and are grateful for your suggestion, which will inform our future research directions.

      The above-mentioned points might partly explain the significant but low correlations between the observed and predicted ICV as shown in Figure 4. Addressing these limitations would help enhance the study's conclusions and guide future research efforts.

      We have updated the text in the Discussion on Page 13:

      Discussion

      However, there are still some individual variabilities not captured in this study, which could be attributed to the diversity in genetic, environmental, and developmental factors influencing sustained attention and substance use. Future research should aim to explore these variabilities in greater depth to gain better understanding of the relationship between sustained attention and substance use.

      Reviewer #3 (Public Review):

      Weaknesses: It's questionable whether the prediction approach (i.e., CPM), even when combined with longitudinal data, can establish causality. I recommend removing the term 'consequence' in the abstract and replacing it with 'predict'. Additionally, the paper could benefit from enhanced rigor through additional analyses, such as testing various thresholds and conducting lagged effect analyses with covariate regression.

      Thank you for your comment. We have replaced “consequence” by “predict” in the abstract.

      Abstract

      Previous studies were predominantly cross-sectional or under-powered and could not indicate if impairment in sustained attention was a predictor of substance-use or a marker of the inclination to engage in such behaviour.

      Reviewer #3 (Recommendations For The Authors):

      (1) The connectivity analysis predicts both baseline and longitudinal attention measures. However, given the high correlation in attention abilities across the three time-points, it's unclear whether the connectivity predicts shared variations of attention across three time points. It would be insightful to assess if predictions at the 2nd and 3rd-time points remained  significant after controlling for attention abilities at the initial time point.

      Thanks for your comments. We performed the CPM analysis to predict ICV at the 2nd and 3rd timepoint, controlling for ICV at age 14 as a covariate. We found that controlling for ICV at age 14, positive, negative, and combined networks derived from Successful stop trials defined at age 14 still predicted ICV at ages 19 and 23. In addition, positive, negative, and combined networks derived from Successful stop trials defined at age 19 predicted ICV at age 23. In addition, positive, negative, and combined networks derived from Go trials defined at age 19 still predicted ICV at age 23, after controlling for ICV at age 14. However, positive, negative, and combined networks derived from Go trials defined at age 14 had lower predictive performances in predicting ICV at ages 19 and 23, after controlling for ICV at age 14. Notably, controlling for ICV at the initial timepoint did not significantly impact the performances of predictive networks derived from Successful stop trials. Accordingly, we have added this analysis and the results in the Supplementary Materials (Pages 3 and 5).

      Method

      Prediction across timepoints controlling for ICV at age 14

      To examine whether connectivity predicts shared variation in sustained attention across timepoints, we applied predictive models developed at ages 14 and 19 to predict ICV at subsequent timepoints, controlling for ICV at age 14. Specifically, we used predictive models (including parameters and selected edges) developed at age 14 to predict ICV at ages 19 and 23 separately. First, we calculated the network strength using the gPPI matrix at ages 19 and 23 based on the selected edges identified from CPM analysis at age 14. We then estimated the predicted ICV at ages 19 and 23 by applying the linear model parameters (slope and intercept) obtained from CPM analysis at age 14 to the network strength. Finally, we evaluated the predictive performance by calculating the partial correlation between the predicted and observed values at ages 19 and 23, controlling for ICV at age 14. Similarly, we applied models developed at age 19 to predict ICV at age 23, also controlling for ICV at age 14. To assess the significance of the predictive performance, we used a permutation test, shuffling the predicted ICV values and calculating the partial correlation to generate a random distribution over 1,000 iterations.
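The procedure in this Method excerpt (network strength from selected edges, a linear prediction, a partial correlation, and a permutation test) can be sketched roughly as below. This is an illustrative outline under several assumptions: connectivity matrices are plain nested lists, the partial correlation uses the first-order formula for a single covariate, and the permutation p-value is one-sided. Function names and the toy data are hypothetical and are not taken from the authors' code.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def partial_corr(x, y, z):
    """Correlation between x and y, controlling for a single covariate z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


def network_strength(conn, edges):
    """Sum connectivity values over the edges selected at the earlier age."""
    return sum(conn[i][j] for i, j in edges)


def predict_icv(conn_matrices, edges, slope, intercept):
    """Apply the earlier-age linear model to later-age connectivity matrices."""
    return [slope * network_strength(c, edges) + intercept for c in conn_matrices]


def permutation_test(predicted, observed, covariate, n_perm=1000, seed=0):
    """Shuffle predicted values to build a null distribution of partial r."""
    rng = random.Random(seed)
    r_obs = partial_corr(predicted, observed, covariate)
    shuffled = list(predicted)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if partial_corr(shuffled, observed, covariate) >= r_obs:
            exceed += 1
    return r_obs, (exceed + 1) / (n_perm + 1)


# Toy values standing in for predicted ICV, observed ICV, and ICV at age 14
predicted = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
observed = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
icv_age14 = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]
r, p = permutation_test(predicted, observed, icv_age14, n_perm=200)
print(f"partial r = {r:.2f}, permutation p = {p:.3f}")
```

Adding one to both the numerator and the denominator of the permutation p-value keeps it strictly above zero, a common convention for permutation tests.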

      Results

      Predictions across timepoints controlling for ICV at age 14

      Positive and combined networks derived from Go trials defined at age 14 predicted ICV at ages 19 (r = 0.10, P = 0.028; r = 0.08, P = 0.047) but negative network did not (r = 0.06, P = 0.119). Positive network derived from Go trials defined at age 14 predicted ICV at age 23 (r = 0.11, P = 0.013) but negative and combined networks did not (r = 0.04, P = 0.187; r = 0.08, P = 0.056).  Positive, negative, and combined networks derived from Go trials defined at age 19 predicted ICV at age 23 (r = 0.22, r = 0.19, and r = 0.22, respectively, all P < 0.001).

      Positive, negative, and combined networks derived from Successful stop trials defined at age 14 predicted ICV at age 19 (r = 0.08, P = 0.036; r = 0.10, P = 0.012; r = 0.11, P = 0.009) and 23 (r = 0.11, P = 0.005; r = 0.13, P = 0.005; r = 0.13, P = 0.017) respectively. Positive, negative, and combined networks derived from Successful stop trials defined at age 19 predicted ICV at age 23 (r = 0.18, r = 0.18, and r = 0.17, respectively, all P < 0.001).

      (2) In the Results section, a significance threshold of p = 0.01 was used for the CPM analysis. It would be beneficial to test the stability of these findings using alternative thresholds such as p = 0.05 or p = 0.005.

      We appreciate the suggestion to test the stability of our findings using alternative significance thresholds. Indeed, we have already conducted CPM analyses using a range of thresholds, including 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, and 0.0001 (see Table S8 in Supplementary Materials). The results were similar across different thresholds. Following prior studies (Feng et al., 2024; Ren et al., 2021; Yoo et al., 2018) which used P < 0.01 for feature selection, we chose to focus on the threshold of P < 0.01 for our main analysis. Following your suggestion, we have highlighted this in the Method section on Pages 17-18.

      Method

      2.6.1 ICV prediction

      The r value with an associated P value for each edge was obtained, and a threshold P = 0.01 (Feng et al., 2024; Ren et al., 2021; Yoo et al., 2018) was set to select edges.
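The edge-selection step described here can be sketched as follows. To stay dependency-free, this sketch approximates each edge's p-value with a Fisher z normal approximation rather than the exact t-based p-value that a statistics package would report, so values near the threshold may differ slightly. The data layout (a dictionary from edge to per-subject connectivity values) and all names are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def approx_p_value(r, n):
    """Two-sided p-value for r via the Fisher z normal approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def select_edges(edge_values, behaviour, threshold=0.01):
    """Split edges into positive and negative networks at a given p threshold.

    edge_values: dict mapping an edge (i, j) to its connectivity value per subject
    behaviour:   behavioural score per subject (e.g. ICV)
    """
    n = len(behaviour)
    positive, negative = [], []
    for edge, values in edge_values.items():
        r = pearson(values, behaviour)
        if approx_p_value(r, n) < threshold:
            (positive if r > 0 else negative).append(edge)
    return positive, negative


# Toy data: 20 subjects, three edges with strong positive, strong negative, and weak relations
behaviour = [float(i) for i in range(20)]
edge_values = {
    (0, 1): [i + 0.5 * (-1) ** i for i in range(20)],
    (0, 2): [-i + 0.5 * (-1) ** i for i in range(20)],
    (1, 2): [float((-1) ** i) for i in range(20)],
}
for thr in (0.05, 0.01, 0.005):
    pos, neg = select_edges(edge_values, behaviour, threshold=thr)
    print(thr, pos, neg)
```

Looping over several thresholds, as in the last lines, mirrors the kind of stability check across thresholds that the response describes.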

      2.6.2 Three cross-validation schemes

      In addition, we conducted the CPM analysis using a range of thresholds for feature selection and observed similar results across different thresholds (See Supplementary Materials Table S8).

      (3) Could you clarify if you used one sub-sample to extract connectivity related to sustained attention and then used another sub-sample to predict substance use with attention-related connectivity?

      Thank you very much for the question. We used the same sample to extract the brain network strength and estimated the correlation with substance use using both the Spearman correlation and latent change score model across three timepoints. We controlled for covariates including sex, age, and scan site at the same time. Accordingly, we have clarified this in the Method section on Page 20. We note that the CPM analyses were conducted using cross-validation, plus a leave-site-out analysis.

      Method

      2.7.3 Correlation between network strength and substance use

      It is worth noting that all the correlations between substance use and sustained attention were conducted using the same sample across three timepoints.

      (4) Could you clarify whether you have regressed covariates in the lagged effects analysis of part 7?

      Thanks for this question. Yes, we confirm that we controlled for covariates including age, sex, and scan site in the latent change score model. We have now described them more clearly in the Method section (Page 18).

      Method

      2.7.3 Correlation between network strength and substance use

      Additionally, cross-lagged dynamic coupling (i.e., bidirectionality) was employed to explore individual differences in the relationships between substance use and linear changes in ICV/brain activity, as well as the relationship between ICV/brain activity and linear change in substance use. The model accounted for covariates such as age, sex and scan sites.

      References:

      Broyd, S.J., van Hell, H.H., Beale, C., Yucel, M., Solowij, N., 2016. Acute and Chronic Effects of Cannabinoids on Human Cognition-A Systematic Review. Biol Psychiatry 79, 557-567.

      Chamberlain, S.R., Odlaug, B.L., Schreiber, L.R.N., Grant, J.E., 2012. Association between Tobacco Smoking and Cognitive Functioning in Young Adults. The American Journal on Addictions 21, S14-S19.

      Crean, R.D., Crane, N.A., Mason, B.J., 2011. An evidence based review of acute and long-term effects of cannabis use on executive cognitive functions. J Addict Med 5, 1-8.

      D'Alberto, N., Chaarani, B., Orr, C.A., Spechler, P.A., Albaugh, M.D., Allgaier, N., Wonnell, A., Banaschewski, T., Bokde, A.L.W., Bromberg, U., Buchel, C., Quinlan, E.B., Conrod, P.J., Desrivieres, S., Flor, H., Frohner, J.H., Frouin, V., Gowland, P., Heinz, A., Itterman, B., Martinot, J.L., Paillere Martinot, M.L., Artiges, E., Nees, F., Papadopoulos Orfanos, D., Poustka, L., Robbins, T.W., Smolka, M.N., Walter, H., Whelan, R., Schumann, G., Potter, A.S., Garavan, H., 2018. Individual differences in stop-related activity are inflated by the adaptive algorithm in the stop signal task. Hum Brain Mapp 39, 3263-3276.

      Dhamala, E., Yeo, B.T.T., Holmes, A.J., 2022. Methodological Considerations for Brain-Based Predictive Modelling in Psychiatry. Biological Psychiatry.

      Di, X., Zhang, Z.G., Biswal, B.B., 2021. Understanding psychophysiological interaction and its relations to beta series correlation. Brain Imaging and Behavior 15, 958-973.

      Dougherty, D.M., Mathias, C.W., Dawes, M.A., Furr, R.M., Charles, N.E., Liguori, A., Shannon, E.E., Acheson, A., 2013. Impulsivity, attention, memory, and decision-making among adolescent marijuana users. Psychopharmacology (Berl) 226, 307-319.

      Esterman, M., Rothlein, D., 2019. Models of sustained attention. Curr Opin Psychol 29, 174-180.

      Feng, Q., Ren, Z., Wei, D., Liu, C., Wang, X., Li, X., Tie, B., Tang, S., Qiu, J., 2024. Connectome-based predictive modeling of Internet addiction symptomatology. Soc Cogn Affect Neurosci 19.

      Greene, A.S., Gao, S., Scheinost, D., Constable, R.T., 2018. Task-induced brain state manipulation improves prediction of individual traits. Nature Communications 9, 2807.

      Harakeh, Z., de Sonneville, L., van den Eijnden, R.J., Huizink, A.C., Reijneveld, S.A., Ormel, J., Verhulst, F.C., Monshouwer, K., Vollebergh, W.A., 2012. The association between neurocognitive functioning and smoking in adolescence: the TRAILS study. Neuropsychology 26, 541-550.

      Hart, C.L., van Gorp, W., Haney, M., Foltin, R.W., Fischman, M.W., 2001. =. Neuropsychopharmacology 25, 757-765.

      Lawrence, N.S., Ross, T.J., Stein, E.A., 2002. Cognitive mechanisms of nicotine on visual attention. Neuron 36, 539-548.

      Lisdahl, K.M., Price, J.S., 2012. Increased marijuana use and gender predict poorer cognitive functioning in adolescents and emerging adults. J Int Neuropsychol Soc 18, 678-688.

      O'Halloran, L., Cao, Z.P., Ruddy, K., Jollans, L., Albaugh, M.D., Aleni, A., Potter, A.S., Vahey, N., Banaschewski, T., Hohmann, S., Bokde, A.L.W., Bromberg, U., Buchel, C., Quinlan, E.B., Desrivieres, S., Flor, H., Frouin, V., Gowland, P., Heinz, A., Ittermann, B., Nees, F., Orfanos, D.P., Paus, T., Smolka, M.N., Walter, H., Schumann, G., Garavan, H., Kelly, C., Whelan, R., 2018. Neural circuitry underlying sustained attention in healthy adolescents and in ADHD symptomatology. Neuroimage 169, 395-406.

      Potter, A.S., Newhouse, P.A., 2008. Acute nicotine improves cognitive deficits in young adults with attention-deficit/hyperactivity disorder. Pharmacol Biochem Behav 88, 407-417.

      Ren, Z., Daker, R.J., Shi, L., Sun, J., Beaty, R.E., Wu, X., Chen, Q., Yang, W., Lyons, I.M., Green, A.E., Qiu, J., 2021. Connectome-Based Predictive Modeling of Creativity Anxiety. Neuroimage 225, 117469.

      Rosenberg, M.D., Finn, E.S., Scheinost, D., Papademetris, X., Shen, X., Constable, R.T., Chun, M.M., 2016. A neuromarker of sustained attention from whole-brain functional connectivity. Nat Neurosci 19, 165-171.

      Rosenberg, M.D., Scheinost, D., Greene, A.S., Avery, E.W., Kwon, Y.H., Finn, E.S., Ramani, R., Qiu, M., Constable, R.T., Chun, M.M., 2020. Functional connectivity predicts changes in attention observed across minutes, days, and months. Proc Natl Acad Sci U S A 117, 3797-3807.

      Shen, X., Finn, E.S., Scheinost, D., Rosenberg, M.D., Chun, M.M., Papademetris, X., Constable, R.T., 2017. Using connectome-based predictive modeling to predict individual behavior from brain connectivity. Nat Protoc 12, 506-518.

      Valentine, G., Sofuoglu, M., 2018. Cognitive Effects of Nicotine: Recent Progress. Curr Neuropharmacol 16, 403-414.

      Verbruggen, F., Aron, A.R., Band, G.P.H., Beste, C., Bissett, P.G., Brockett, A.T., Brown, J.W., Chamberlain, S.R., Chambers, C.D., Colonius, H., Colzato, L.S., Corneil, B.D., Coxon, J.P., Dupuis, A., Eagle, D.M., Garavan, H., Greenhouse, I., Heathcote, A., Huster, R.J., Jahfari, S., Kenemans, J.L., Leunissen, I., Li, C.S.R., Logan, G.D., Matzke, D., Morein-Zamir, S., Murthy, A., Pare, M., Poldrack, R.A., Ridderinkhof, K.R., Robbins, T.W., Roesch, M.R., Rubia, K., Schachar, R.J., Schall, J.D., Stock, A.K., Swann, N.C., Thakkar, K.N., van der Molen, M.W., Vermeylen, L., Vink, M., Wessel, J.R., Whelan, R., Zandbelt, B.B., Boehler, C.N., 2019. A consensus guide to capturing the ability to inhibit actions and impulsive behaviors in the stop-signal task. Elife 8.

      Whelan, R., Conrod, P.J., Poline, J.B., Lourdusamy, A., Banaschewski, T., Barker, G.J., Bellgrove, M.A., Buchel, C., Byrne, M., Cummins, T.D., Fauth-Buhler, M., Flor, H., Gallinat, J., Heinz, A., Ittermann, B., Mann, K., Martinot, J.L., Lalor, E.C., Lathrop, M., Loth, E., Nees, F., Paus, T., Rietschel, M., Smolka, M.N., Spanagel, R., Stephens, D.N., Struve, M., Thyreau, B., Vollstaedt-Klein, S., Robbins, T.W., Schumann, G., Garavan, H., Consortium, I., 2012. Adolescent impulsivity phenotypes characterized by distinct brain networks. Nat Neurosci 15, 920-925.

      Yoo, K., Rosenberg, M.D., Hsu, W.T., Zhang, S., Li, C.R., Scheinost, D., Constable, R.T., Chun, M.M., 2018. Connectome-based predictive modeling of attention: Comparing different functional connectivity features and prediction methods across datasets. Neuroimage 167, 11-22.

      Young, J.W., Finlayson, K., Spratt, C., Marston, H.M., Crawford, N., Kelly, J.S., Sharkey, J., 2004. Nicotine improves sustained attention in mice: evidence for involvement of the alpha7 nicotinic acetylcholine receptor. Neuropsychopharmacology 29, 891-900.

      Zhao, W., Makowski, C., Hagler, D.J., Garavan, H.P., Thompson, W.K., Greene, D.J., Jernigan, T.L., Dale, A.M., 2023. Task fMRI paradigms may capture more behaviorally relevant information than resting-state functional connectivity. Neuroimage, 119946.

      Zhao, W., Palmer, C.E., Thompson, W.K., Chaarani, B., Garavan, H.P., Casey, B.J., Jernigan, T.L., Dale, A.M., Fan, C.C., 2021. Individual Differences in Cognitive Performance Are Better Predicted by Global Rather Than Localized BOLD Activity Patterns Across the Cortex. Cereb Cortex 31, 1478-1488.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Reviews):

      Summary:

      This paper by Schommartz and colleagues investigates the neural basis of memory reinstatement as a function of both how recently the memory was formed (recent, remote) and its development (children, young adults). The core question is whether memory consolidation processes as well as the specificity of memory reinstatement differ with development. A number of brain regions showed a greater activation difference for recent vs. remote memories at the long versus shorter delay specifically in adults (cerebellum, parahippocampal gyrus, LOC). A different set showed decreases in the same comparison, but only in children (precuneus, RSC). The authors also used neural pattern similarity analysis to characterize reinstatement, though I have substantive concerns about how this analysis was performed and as such will not summarize the results. Broadly, the behavioural and univariate findings are consistent with the idea that memory consolidation differs between children and adults in important ways, and takes a step towards characterizing how.

      Strengths:

      The topic and goals of this paper are very interesting. As the authors note, there is little work on memory consolidation over development, and as such this will be an important data point in helping us begin to understand these important differences. The sample size is great, particularly given this is an onerous, multi-day experiment; the authors are to be commended for that. The task design is also generally well controlled, for example as the authors include new recently learned pairs during each session.

      Weaknesses:

      As noted above, the pattern similarity analysis for both item and category-level reinstatement was performed in a way that is not interpretable given concerns about temporal autocorrelation within the scanning run. Below, I focus my review on this analytic issue, though I also outline additional concerns.

      We thank the reviewer for both the positive and critical appraisal of our paper.

      (1) The pattern similarity analyses were not done correctly, rendering the results uninterpretable (assuming my understanding of the authors' approach is correct).

      a. First, the scene-specific reinstatement index: The authors have correlated a neural pattern during a fixation cross (delay period) with a neural pattern associated with viewing a scene as their measure of reinstatement. The main issue with this is that these events always occurred back-to-back in time. As such, the two patterns will be similar due simply to the temporal autocorrelation in the BOLD signal. Because of the issues with temporal autocorrelation within the scanning run, it is always recommended to perform such correlations only across different runs. In this case, the authors always correlated patterns extracted from the same run, which moreover have temporal lags that are perfectly confounded with their comparison of interest (i.e., from Fig 4A, the "scene-specific" comparisons will always be back-to-back, having a very short temporal lag; "set-based" comparisons will be dispersed across the run, and therefore have a much higher lag). The authors' within-run correlation approach also yields correlation values that are extremely high - much higher than would be expected if this analysis was done appropriately. The way to fix this would be to restrict the analysis to only cross-run comparisons, but I don't believe this is possible unfortunately given the authors' design; I believe the target (presumably reinstated) scene only appears once during scanning, so there is no separate neural pattern during the presentation of this picture that they can use. For these reasons, any evidence for "significant scene-specific reinstatement" and the like is completely uninterpretable and would need to be removed from the paper.

      We thank the reviewer for this important input. We acknowledge that our study design leads to temporal autocorrelation in the BOLD signal when calculating RSA between fixation and scene time windows. We also recognize that we cannot interpret the significance of scene-specific reinstatement compared to zero and have accordingly removed this information. Nevertheless, our primary objective was to investigate changes in scene-specific reinstatement in relation to the different time delays of retrieval. Given that the retrieval procedure is the same over time and presumably similarly influenced by temporal autocorrelations, we argue that our results must be attributed to the relative differences in reinstatement across recent and remote trials. Bearing this in mind, we argue that our results can be interpreted in terms of delay-related changes in reinstatement. This information is discussed in pp. 21, 40 of the manuscript.

      We agree with the reviewer that cross-run comparisons would be extremely interesting. This could be achieved by presenting the same items repeatedly across different runs, which was not possible in our current setup because we were interested in single-exposure retrieval and because of practical time restrictions when scanning children. We have introduced this idea in the Limitations and Discussion sections (pp. 40, 44) of the manuscript to inform future studies.

      Finally, thanks to the reviewer’s comment, we identified a bug in the final steps of our RSA calculation. Fisher’s z-transformation was incorrectly applied to r-1 values, resulting in abnormally high values. We apologize for this error. We have revised the scripts and rectified the bug by correctly applying Fisher’s z-transformation to the r similarity values. We also adjusted the methods description figure accordingly (Figure 5, p. 22). This adjustment led to slightly altered reinstatement indices. Nevertheless, the overall pattern of delay-related attenuation in the scene-specific reinstatement index, observed in both children and adults, remains consistent. Similarly, we observed gist-like reinstatement uniquely in children.
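As a small illustration of the corrected step, the sketch below computes a Pearson similarity between two activity patterns and then applies Fisher's z-transformation (the inverse hyperbolic tangent) directly to r. The pattern values and function names are hypothetical, and the rest of the authors' RSA pipeline is not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length activity patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fisher_z(r):
    """Fisher's z-transformation of a correlation coefficient r."""
    return math.atanh(r)


def similarity_z(pattern_a, pattern_b):
    """Fisher-z-transformed pattern similarity; the transform is applied to r itself."""
    return fisher_z(pearson(pattern_a, pattern_b))


# Hypothetical voxel patterns for a fixation window and a scene window
fixation_pattern = [1.0, 2.0, 3.0, 5.0]
scene_pattern = [2.0, 1.0, 4.0, 3.0]
print(similarity_z(fixation_pattern, scene_pattern))
```

For reference, atanh diverges as its argument approaches -1 or 1, so a transform applied to shifted values such as r - 1 can produce very large magnitudes when r is close to 0.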

      b. From a theoretical standpoint, I believe the way this analysis was performed considering the fixation and the immediately following scene also means that the differences between recent and remote could have to do with either the reactivation (processes happening during the fixation, presumably) or differences in the processing of the stimulus itself (happening during the scene presentation). For example, people might be more engaged with the more novel scenes (recent) and therefore process those scenes more; such a difference would be interpreted in this analysis as having to do with reinstatement, but in fact could be just related to the differential scene processing/recognition, etc.

      Thank you for your insightful comments. We acknowledge the theoretical concerns raised about distinguishing between the effects of reactivation processes occurring during fixation and differential processing of the stimulus itself during scene presentation, specifically the possibility that greater engagement with recent scenes could result in enhanced processing that might be misattributed to memory reinstatement mechanisms.

      We argue, however, that during scene presentation, scenes are processed more “memory-wise” rather than “perception-wise”, since both recent and remote memories are well-learned, as we included only correctly recalled memories in the analysis.

      We concur that scene presentations entail perceptual processing; however, such processing would be consistent across all items, given that they were presented with the same repeated learning procedure, rendering them equally familiar to participants. In addition, we would argue that distinct activation patterns elicited during varying delays are more likely attributable to memory-related processing, since participants actively engaged in a memory-based decision-making task during these intervals. We have incorporated this rationale into the discussion section of our manuscript (p. 40).

      With this in mind, we hypothesized that in the case of “memory-wise” processing, neural engagement during the scene time window should be higher for remote compared to recent items, and that this difference should increase with passing time, as more control and effort should be exerted during retrieval due to the reorganized and distributed nature of memories. If the scenes are processed more “perception-wise”, we would expect higher neural engagement during the retrieval of recent compared to remote items. Our exploratory analysis (detailed overview in supplementary materials, Figure S3, Table S9) revealed higher neural activation for remote compared to recent items in medial temporal, prefrontal, occipital and cerebellar brain regions, supporting the notion of “memory-wise” processes during the scene time window. However, this exploratory analysis cannot provide a direct solution to the reviewer’s concern, as our paradigm per se cannot arbitrate between the “memory-wise” and “perception-wise” nature of retrieval. We added this point to the discussion (see p. 40).

      c. For the category-based neural reinstatement:

      (1) This suffers from the same issue of correlations being performed within the run. Again, to correct this the authors would need to restrict comparisons to only across runs (i.e., patterns from run 1 correlated with patterns for run 2 and so on). With this restriction, it may or may not be possible to perform this analysis, depending upon how the same-category scenes are distributed across runs. However, there are other issues with this analysis, as well.

      (2) This analysis uses a different approach of comparing fixations to one another, rather than fixations to scenes. The authors do not motivate the reason for this switch. Please provide reasoning as to why fixation-fixation is more appropriate than fixation-scene similarity for category-level reinstatement, particularly given the opposite was used for item-level reinstatement. Even if the analyses were done properly, it would remain hard to compare them given this difference in approach.

      (3) I believe the fixation cross with itself is included in the "within category" score. Is this not a single neural pattern correlated with itself, which will yield maximal similarity (Pearson r=1) or minimal dissimilarity (1-Pearson r=0)? Including these comparisons in the averages for the within-category score will inflate the difference between the "within-category" and "between-category" comparisons. These (e.g., forest1-forest1) should not be included in the within-category comparisons considered; rather, they should be excluded, so the fixations are always different but sometimes the comparisons are two retrievals of the same scene type (forest1-forest2), and other times different scene types (forest1-field1).

      (4) It is troubling that the results from the category reinstatement metric do not seem to conceptually align with past work; for example, a lot of work has shown category-level reinstatement in adults. Here the authors do not show any category-level reinstatement in adults (yet they do in children), which generally seems extremely unexpected given past work and I would guess has to do with the operationalization of the metric.

      Thank you for this important input regarding category-based reinstatement.

      (1) The distribution of within-category items across runs was approximately similar and balanced. Additionally, within runs, they were presented randomly without close temporal proximity. Based on this arrangement, we believe that the issue of close temporal autocorrelation, as pointed out by the reviewer in the context of scene-specific reinstatement, may not apply to the same extent here. Again, our focus is not on the absolute level of category-based reinstatement, but on the relative difference across conditions (recent vs. remote short delay vs. remote long delay), which are equally impacted by the autocorrelations.

      (2) We apologize for not motivating this analysis further. Whereas the scene-reinstatement index (i.e., fixation to scene correlation) gives us a measure of the pre-activation of a concrete scene (e.g., a yellow forest in autumn), the gist-like reinstatement gives us a measure of the pre-activation of a whole category of scenes (e.g., forests). Critically, our window of interest is the fixation period for both sets of analyses (in the absence of any significant visual input). The scene-specific reinstatement uses the scene window as a neural template against which the fixation period can be compared, while the gist-like reinstatement compares the similarity of reactivation patterns for trials that belong to the same category but differ in the exact memory content. The reinstatement of more generic, gist-like memory (e.g., forest) across multiple trials should yield more similar neural activation patterns. Significant gist-like reinstatement would suggest that neural patterns for scenes within the same category are more generic, as indicated by higher similarity among them. On the other hand, a more detailed reinstatement of specific types of forests (e.g., a yellow forest in autumn, green pine trees, a bare-leaved forest in spring, etc.) that differ in various dimensions could result in neural activation patterns that are as dissimilar as those seen in the reinstatement of scenes from entirely different categories. Through this methodology, we could distinguish between more generic, gist-like reinstatement and more specific, detailed reinstatement. This is now clarified in the manuscript, see p. 25.

      (3) We apologize for the confusion caused by the figure and analysis description. In our analysis, we indeed excluded the correlation of the fixation cross with itself. Consequently, the diagonal in the figure should be blank to indicate this. This is now revised in the manuscript (Figure 7B and in Methods).

      (4) We appreciate your concern and recognize that the terminology we used might not align perfectly with the conventional understanding of category-based reinstatement. Typically, category-level neural representations (as discussed in Polyn et al., 2005; Jafarpour et al., 2014; among others) are investigated to identify specific brain areas associated with encoding/perception of scenes or faces. Our aim, however, was to explore the mnemonic reinstatement of highly detailed scenes that were elaborately encoded, with the hypothesis that substantial representational transformations would occur over time and vary with age. This hypothesis is based on the memory literature, including the Fuzzy-Trace Theory, the Contextual Binding Theory, and the Trace Transformation Theory (Brainerd & Reyna, 1998; Yonelinas, 2019; Moscovitch & Gilboa, 2023). Therefore, we renamed 'category-based' reinstatement to 'gist-like' reinstatement, which clarifies our concept and better aligns it with existing literature.

      We anticipated that young adults, having the ability to retain detailed narratives post-encoding, would demonstrate a reinstatement of scenes with distinct details, making these scenes dissimilar from each other (see similar findings in Sommer et al., 2021). In contrast, given the anticipated lesser strategic elaboration during learning in children, we hypothesized that they would demonstrate a shallower, more gist-like reinstatement (for instance, children recalling a forest or a field in a general sense without specific details or vivid imagery). This could result in higher category-based similarity, as children might reinstate a more generic forest concept.

      We did not gather additional data on the verbal quality of reinstatement due to the limited scanning time available for children, so these assumptions remain unverified. However, anecdotal observations post-retrieval indicated that adults often reported very vivid scenes associated with clear narrative recall. In contrast, children frequently described more vague memories (e.g., “I know it was a forest”) without specific details. Future studies should include measures to assess the quality of reinstatement, potentially outside the scanning environment.
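
      To make the logic of the gist-like reinstatement comparison in points (2) and (3) above concrete, here is a minimal sketch, assuming the index is computed as mean within-category minus mean between-category Fisher-z similarity of fixation-period patterns. The function name `gist_like_index`, the subtraction, and the array layout are illustrative assumptions, not the authors' actual pipeline. The key point it demonstrates is that self-correlations (the diagonal) are excluded.

```python
import numpy as np

def gist_like_index(patterns, categories):
    """Mean within-category minus mean between-category similarity.

    patterns: (n_trials, n_voxels) fixation-period activation patterns.
    categories: length n_trials sequence of category labels.
    Self-correlations (the diagonal) are excluded from both averages.
    """
    patterns = np.asarray(patterns, dtype=float)
    categories = np.asarray(categories)
    r = np.corrcoef(patterns)  # trial-by-trial Pearson r
    # Strict upper triangle: each trial pair counted once, diagonal dropped.
    upper = np.triu(np.ones(r.shape, dtype=bool), k=1)
    same = categories[:, None] == categories[None, :]
    z = np.arctanh(np.clip(r, -0.999999, 0.999999))  # Fisher z on r values
    within = z[upper & same].mean()     # e.g., forest1-forest2
    between = z[upper & ~same].mean()   # e.g., forest1-field1
    return within - between
```

      Under these assumptions, an index near zero would mean that same-category retrievals are no more alike than different-category retrievals, which is the pattern the response describes for detailed, scene-specific reinstatement.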

      (2) I did not see any compelling statistical evidence for the claim of less robust consolidation in children.

      Specifically in terms of the behavioral results of retention of the remote items at 1 vs 14 days, shown in Figure 2B, the authors conclude that memory consolidation is less robust in children (line 246). Yet they do not report statistical evidence for this point, as there was no interaction of this effect with the age group. Children had worse memory than adults overall (in terms of a main effect - i.e. across recent and remote items). If it were consolidation-specific, one would expect that the age differences are bigger for the remote items, and perhaps even most exaggerated for the 14-day-old memories. Yet this does not appear to be the case based on the data the authors report. Therefore, the behavioral differences in retention do not seem to be consolidation specific, and therefore might have more to do with differences in encoding fidelity or retrieval processes more generally across the groups. This should be considered when interpreting the findings.

      Thank you for highlighting this important issue. We acknowledge that our initial description and depiction of our behavioral findings may not have effectively conveyed the main message about memory consolidation. Therefore, we have revised the behavioral results section (see pp. 12-14) to communicate our message more clearly.

      As detailed in the methods section, we reported retention rates only for those items that were correctly (100%) learned on day 0, day 1, and day 14. This approach meant that different participants had varying numbers of items learned correctly. However, this strategy allowed us to address our primary question: whether memory consolidation, based on all items initially encoded successfully, is comparably robust between groups.

      To illustrate more clearly and straightforwardly how retention rates change over time for recently learned items (i.e., 30 minutes after learning), short delay remote items, and long delay remote items, relative to the initially correctly learned items, we conducted the following analysis: after observing no differences between sessions in either age group for recent items on days 1 and 14, we combined the recent items. This approach enabled us to investigate how the slope of memory retention for initially correctly learned items (with a baseline of 100%) changes over time. We observed a significant interaction between item type (recent, short delay remote, long delay remote) and group (F(3,250) = 17.35, p < .001, ω² = .16). Follow-up analyses of this interaction revealed significantly less robust memory consolidation across all delay times in children compared to young adults. This information has been added to the manuscript on pp. 12-14. We have also updated the figures, incorporating the baseline of 100% correct performance.

      (3) Please clarify which analyses were restricted to correct retrievals only. The univariate analyses states that correct and incorrect trials were modelled separately but does not say which were considered in the main contrast (I assume correct only?). The item specific reinstatement analysis states that only correct trials were considered, but the category-level reinstatement analysis does not say. Please include this detail.

      Thank you for bringing this to our attention. We indeed limited our analysis – including univariate, specific reinstatement, and gist-like analyses – to only correctly remembered items. This decision was made because our goal was to observe delay-related changes in the neural correlates of correct memories, which are potentially stronger. We have incorporated this information into the manuscript.

      (4) To what extent could performance differences be impacting the differences observed across age groups? I think (see prior comment) that the analyses were probably limited to correct trials, which is helpful, but still yields pretty big differences across groups in terms of the amount of data going into each analysis. In general, children showed more attenuated neural effects (e.g., recent/remote or session effects); could this be explained by their weaker memory? Specifically, if only correct trials are considered that means that fewer trials would be going into the analysis for kids, especially for the 14-day remote memories, and perhaps pushing the remote > recent difference for this condition towards 0. The authors might be able to address this analytically; for example, does the remote > recent difference in the univariate data at day 14 correlate with day 14 memory?

      Thank you for pointing this out. Indeed, there was a significant relationship between the remote > recent difference in the univariate data and memory performance at day 14 across both age groups (see Figure 4C-D). The performance of all participants, including children, was above chance level for remote trials on day 14. In addition, although the number of remote trials was lower in children (18 trials on average) than in adults (22 trials on average), we believe that the number of remote trials was neither too low nor too different across groups for the contrast.

      (5) Some of the univariate results reporting is a bit strange, as they are relying upon differences between retrieval of 1- vs. 14-day memories in terms of the recent vs. remote difference, and yet don't report whether the regions are differently active for recent and remote retrieval. For example, in Figure 3A, neither anterior nor posterior hippocampus seem to be differentially active for recent vs. remote memories for either age group (i.e., all data is around 0). This difference from zero or lack thereof seems important to the message - is that correct? If so, can the authors incorporate descriptions of these findings?

      Thank you for this valuable input. When examining recent and remote retrieval separately, both the anterior and posterior regions of the hippocampus indeed exhibited activation significantly different from zero in adults (all p < .0003, FDR-corrected) and children (all p < .014, FDR-corrected, except for recent posterior hippocampus) during all delays. We have included this information in the manuscript (see p. 17) and added it to the supplementary materials (Figure S2, Table S7).

      (6) Please provide more details about the choices available for locations in the 3AFC task. (1) Were they different each time, or always the same? If they are always the same, could this be a motor or stimulus/response learning task? (2) Do the options in the 3AFC always come from the same area - in which case the participant is given a clue as to the gist of the location/memory? Or are they sometimes randomly scattered across the image (in which case gist memory, like at a delay, would be sufficient for picking the right option)? Please clarify these points and discuss the logic/impact of these choices on the interpretation of the results.

      Thank you for pointing this out. During learning and retrieval, we employed the 3AFC (three-alternative forced choice) task.

      The choices for locations varied across scenes but remained the same across time within individuals. There were 18 different key locations for the objects, distributed across the stimulus set. This means the locations of the objects were quite heterogeneous and differed between objects. The location of the object within the task was presented once during encoding and remained consistent throughout learning. Given the location heterogeneity, we believe our task cannot be reduced to a mere “stimulus/response learning task” but is more accurately described as an object-location association task.

      Similar to the previous description, the options for the 3AFC task did not originate from the same area, as there were 18 different areas in total. The correct answer was distributed equally across the three choice options: sometimes it was the left option, sometimes the middle option, and sometimes the right option. Therefore, we believe that the 3AFC task did not provide clues to the location but required detailed and precise memory of the location. Moreover, the options were not randomly scattered but rather presented close together in the scene, demanding a high level of differentiation between choices.

      Taking all the above into consideration, we assert that precise object-location associative memory is necessary for a correct answer. We have added this information to the manuscript (p. 9).

      (7) Often p values are provided but test statistics, effect sizes, etc. are not - please include this information. It is at times hard to tell whether the authors are reporting main effects, interactions, pairwise comparisons, etc.

      Thank you for bringing this to our attention. We realize that including this information in the Tables may not be the most straightforward approach. Therefore, we have incorporated the test statistics, effect sizes, and related details into the text of the results section for clarity.

      (8) There are not enough methodological details in the main paper to make sense of the results. For example, it is not clear from reading the text that there are new object-location pairs learned each day.

      Thank you for pointing this out. We have added this information to the main manuscript. Additionally, we have emphasized this information in the text referring to Figure 1B.

      (9) The retrieval task does not seem to require retrieval of the scene itself, and as such it would be helpful for the authors to explain their reasoning for using this task to measure reinstatement. Strictly speaking, participants could just remember the location of the object on the screen. Was it verified that children and adults were recalling the actual scene rather than just the location (e.g. via self-report)? It's possible that there may be developmental differences in the tendency to reinstate the scene depending on e.g., their strategy.

      Thank you for highlighting this important point. Indeed, the retrieval task included explicit instructions for participants to recall and visualize the scene associated with the object presented during the fixation time window. Participants were also instructed to recollect the location of the object within the scene. Since the location was contextually bound to the scene and each object had a unique location in each scene, the location of the object was always embedded in the specific scene context. We have added this information to both the Methods and Results sections.

      Participants’ self-reports (which unfortunately were not systematically collected on all occasions) indicated that remembering the stories created during strategic encoding greatly aided their recall of the scene and the location. We also concur with your observation that children and young adults may differ in their ability to reinstate scenes, depending on the success of their employed recall strategies. This task was conducted with an awareness of potential developmental differences in the ability to form complex contextual memories. Our elaborative learning procedure was designed to minimize these differences. It is important to note, though, that we did not expect children to achieve performance levels fully comparable to adults. There may indeed be developmental differences in reinstatement, for example due to differences in knowledge availability and accessibility (Brod, Werkle-Bergner, & Shing, 2013). We think that these differences may underlie our findings of neural reinstatement. This is now discussed on pp. 34-35 and 39-43 of the manuscript.

      (10) In general I found the Introduction a bit difficult to follow. Below are a few specific questions I had.

      a. At points findings are presented but the broader picture or take-home point is not expressed directly. For example, lines 112-127, these findings can all be conceptualized within many theories of consolidation, and yet those overarching frameworks are not directly discussed (e.g., that memory traces go from being more reliant on the hippocampus to more on the neocortex). Making these connections directly would likely be helpful for many readers.

      Thank you for bringing this to our attention. We have incorporated a summary of the general frameworks of memory consolidation into the introduction. This addition outlines how our summarized findings, particularly those related to memory consolidation for repeatedly learned information, align with these frameworks (see lines 126-138, 146-150).

      b. Lines 143-153 - The comparison of the Tompary & Davachi (2017) paper with the Oedekoven et al. (2017) reads like the two analyses are directly comparable, but the authors were looking at different things. The Tompary paper is looking at organization (not reinstatement); while the Oedekoven et al. paper is measuring reinstatement (not organization). The authors should clarify how to reconcile these findings.

      Thank you for highlighting this aspect. We have revised how we present the results from Tompary & Davachi (2017). This study examined memory reorganization for memories both with and without overlapping features, and it observed higher neural similarity for memories with overlapping features over time. The authors also explored item-specific reinstatement for recent and remote memories by assessing encoding-retrieval similarity. Since Oedekoven et al. (2017) utilized a similar approach, their results are comparable in terms of reinstatement. We have updated and expanded our manuscript to clarify the parallels between these studies (see lines 157-162).

      c. Line 195-6: I was confused by the prediction of "stable involvement of HC over time" given the work reviewed in the Introduction that HC contribution to memory tends to decrease with consolidation. Please clarify or rephrase.

      Drawing on the Contextual Binding Theory (Yonelinas et al., 2019), as well as the Multiple Trace Theory (Nadel et al., 2000) and supported for instance by evidence from Sekeres et al. (2018), we hypothesized that detailed contextual memories formed through repeated and strategic learning would strengthen the specificity of these memories, resulting in consistent hippocampal involvement for successfully recalled contextualized detailed memories. We have included additional explanatory information in the manuscript to clarify this hypothesis (see lines 217-219).

      d. Lines 200-202: I was a bit confused about this prediction. Firstly, please clarify whether immediate reinstatement has been characterized in this way for kids versus adults. Secondly, don't adults retain gist more over long delays (with specific information getting lost), at least behaviourally? This prediction seems to go against that; please clarify.

      Thank you for raising this important point. Indeed, there are no prior studies that examined memory reinstatement over extended durations in children. The primary existing evidence suggests that neural specificity or patterns of neural representations in children can be robustly observed, while neural selectivity or univariate activation in response to the same stimuli tends to mature later (i.e., Fandakova et al., 2019). Bearing this in mind and recognizing that such neural patterns can be observed in both children and adults, we hypothesized that adults may form stronger detailed contextual memories compared to children. By employing strategies such as creating stories, adults might more easily recall scenes without the need to resort to forming generic or gist-like memories (for example, 'a red fox was near the second left pine tree in a spring green forest'). This assumption aligns with the Fuzzy Trace Theory (Reyna & Brainerd, 1995), which posits that verbatim memories can be created without the extraction of a gist.

      Conversely, we hypothesized that children, due to the ongoing maturation of associative and strategic memory components (as discussed in Shing et al., 2008 and 2010), which are dependent respectively on the hippocampus (HC) and the prefrontal cortex (PFC), would be less adept at creating, retaining, and extracting stories to aid their retrieval process. This could result in them remembering more generic integrated information, like the relationship between a fox and some generic image of a forest. We have added explanatory information to the manuscript to elucidate these points (see lines 225-230).

      Reviewer #1 (Recommendations For The Authors):

      (1) For Figure 3, I would highly recommend changing the aesthetics for the univariate data - at least on my screen they appear to be open boxes with solid vs. dashed lines, and as such look identical to the recent vs. remote distinction in Figure 2B. It also doesn't match the legend for me, which shows the age groups having purple vs. yellow coloring.

      Thank you for this observation. We have adjusted Figure 2 (now Figure 3) (please refer to p. 14) accordingly, now utilizing purple and yellow colors to distinguish between the age groups.

      (2) Lines 329-330, it is not true that "all" indices were significant from zero but this is only apparent if you read the next sentence. Please rephrase to clarify. e.g., "All ... indices with a few exceptions ... were significantly..."?

      Based on the above suggestions and considering our primary focus on time-related changes in scene-specific reinstatement, we will refrain from further interpreting the relative expression of individual scene-specific indices against 0. Consequently, we have removed this information from our analysis.

      (3) It is challenging to interpret some of the significance markers, such as those in Figure 3. For example what effects are being denoted by the asterisks and bars above vs. below the data on panel D? Please clarify and/or note in the legend.

      We have included a note in the legend to clarify the meaning of all significance markers. In addition, we decided to state any significant main and interaction effects in the figure rather than use significance markers.

      (4) For Figures 2 and 3, only the meaning of error bars is described in the caption. It is not explained in the caption what the boxes, lines, and points denote. Please clarify.

      Thank you for highlighting this. We have added explanations to the figure's annotation for clarity. Please note that, in light of the other reviewers’ suggestions, figure plots may have been adjusted or changed, and the explanations in the figure annotation have been adjusted accordingly.

      (5) How were recent and remote interspersed relative to one another? The text says that each run had 10 recent and 10 remote pairs, presented in a "pseudo-random order" - not clear what that (pseudo) means in this case. Please clarify.

      Thank you for raising this point. We provide this information in the Methods section “Materials and Procedure”: 'The jitters and the order of presentation for recent and remote items were determined using OptimizeXGUI (Spunt, 2016), following an exponential distribution (Dale, 1999). Ten unique recently learned pairs (from the same testing day) and ten unique remotely learned items (from Day 0) were distributed within each run (in total three runs) in the order as suggested by the software as the most optimal. There were three runs with unique sets of stimuli each resulting in thirty unique recent and thirty unique remote stimuli overall.'

      (6) Figure 1A, second to last screen on the learning cycles row - what would be presented to participants here, one of these three emojis? What does the sleepy face represent? I see some of these points were mentioned in the methods, but additional clarification in the caption would be helpful.

      Thank you for highlighting this. We have included this information in the figure caption. Specifically, the sleepy face symbol in the figure denotes a 'missed response'.

      (7) Not clear how the jittered fixation time between object presentation and scene test is dealt with in representational similarity analyses.

      Thank you for pointing this out. Beta estimates were obtained from a Least Squares Separate (LSS) regression model. Each event was modeled with its respective onset and duration and, as such, one beta value was estimated per event (with the lags between events differing from trial to trial). We have edited the corresponding section (see p. 53).
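
      As an illustration of the Least Squares Separate idea, the sketch below fits one GLM per event, with one regressor for the event of interest and one pooled regressor for all other events. It is a simplified sketch under stated assumptions: the function name `lss_betas` is hypothetical, the per-event regressors are assumed to be already built from each event's onset and duration (and convolved with a hemodynamic response function), and nuisance regressors are omitted.

```python
import numpy as np

def lss_betas(bold, trial_regressors):
    """Least Squares Separate: one GLM per event.

    bold: (n_timepoints,) or (n_timepoints, n_voxels) signal.
    trial_regressors: (n_timepoints, n_trials), one regressor per event,
        assumed to be pre-built from that event's onset and duration.
    Returns an (n_trials, n_voxels) array with one beta per event and voxel.
    """
    bold = np.asarray(bold, dtype=float)
    if bold.ndim == 1:
        bold = bold[:, None]
    x_all = np.asarray(trial_regressors, dtype=float)
    n_time, n_trials = x_all.shape
    betas = np.empty((n_trials, bold.shape[1]))
    for i in range(n_trials):
        target = x_all[:, i]                               # event of interest
        others = np.delete(x_all, i, axis=1).sum(axis=1)   # all other events pooled
        design = np.column_stack([target, others, np.ones(n_time)])
        coef, *_ = np.linalg.lstsq(design, bold, rcond=None)
        betas[i] = coef[0]                                 # keep only the target beta
    return betas
```

      Because each event gets its own regressor built from its own timing, variable lags between events (such as jittered fixation periods) are absorbed into the design rather than into the pattern estimates.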

      (8) It was a little bit strange to have used anterior vs posterior HPC ROIs separately in univariate analysis but then combined them for multivariate. There are many empirical and theoretical motivations for looking at item-specific and category reinstatement in anterior and posterior HPC separately, so I was surprised not to see this. Please explain this reasoning.

      Thank you for pointing this out. We agree with the reviewer and included the anterior and posterior HC ROIs into the multivariate analysis. Please see the revised results section (pp. 13-15).

      (9) The term "neural specificity" is introduced (line 164) without explanation; please clarify.

      Thank you for bringing this to our attention. The term ‘neural specificity,’ as defined by Fandakova et al. (2019), refers to the distinctiveness of neural representations in the regions that process a given sensory input. We decided, however, to refrain from using this term and instead to use ‘neural representational distinctiveness’, which is more self-explanatory and was also introduced in the manuscript.

      (10) Age range is specified as 5-7 years initially (line 187) and then 6-7 years (line 188).

      We have corrected the age range in line 188 to '5 to 7 years.'

      Reviewer #2 (Public Reviews):

      Schommartz et al. present a manuscript characterizing neural signatures of reinstatement during cued retrieval of middle-aged children compared to adults. The authors utilize a paradigm where participants learn the spatial location of semantically related item-scene memoranda which they retrieve after short or long delays. The paradigm is especially strong as the authors include novel memoranda at each delayed time point to make comparisons across new and old learning. In brief, the authors find that children show more forgetting than adults, and adults show greater engagement of cortical networks after longer delays as well as stronger item-specific reinstatement. Interestingly, children show more category-based reinstatement, however, evidence supports that this marker may be maladaptive for retrieving episodic details. The question is extremely timely both given the boom in neurocognitive research on the neural development of memory, and the dearth of research on consolidation in this age group. Also, the results provide novel insights into why consolidation processes may be disrupted in children. Despite these strengths, there are quite a few important design and analytical choices that derail my enthusiasm for the paper. If the authors could address these concerns, this manuscript would provide a solid foundation to better understand memory consolidation in children.

      We thank the reviewer for both the positive and critical appraisal of our paper.

      Reviewer #2 (Recommendations For The Authors):

      (1) My greatest concern is the difference in memory accuracy that emerges as soon as immediate learning, which undermines the interpretation of any consolidation-related differences. This concern is two-fold. The authors utilize an adaptive learning approach in which participants learn to criteria or stop after 4 repetitions. This type of approach leads to children seeing the stimuli more often during learning compared to adults, which on its own could have consequences for consolidation-related neural markers. Specifically, within adults theoretical and empirical work this shows that repeating information can actually lead to more gist-like representations, which is the exact profile the children are showing. While there could be a strength to this approach because it allows for equivocal memory, the decision to stop repetitions before criteria means that memory performance is significantly lower in the children, which again could have consequences to consolidation-related neural markers. First, the authors do not show any of the learning-related data which would be critical to assess the impact of this design choice. Second, there are likely differences in memory strength at the delay, making it extremely difficult to determine if the neural markers reflect development, worse memory strength, or both. This issue is compounded by the use of a 3-AFC paradigm, wherein "correct responses" included in the analysis could contain a significant amount of guessing responses. I think a partial solution to this problem is to analyze the RT data and include them in the analyses or use a drift-diffusion modeling approach to get more precise estimates of memory strength to control for this feature. An alternative is to sub-select participants in each group to have a sample matched on performance (including # of repetitions) and re-run all the analyses in this sub-sample. Without addressing these concerns it is near impossible to interpret the presented data.

      Thank you for highlighting this point.

      Firstly, we believe that our approach, involving strategic and repeated learning coupled with feedback, enhances the formation of detailed contextual memories. The retrieval procedure also emphasized the need for detailed memory for location. These are critical differences in experimental procedure from previous studies, which enhanced the importance of detailed representations and likely reduced the likelihood of forming gist-like memories.

      Indeed, we ceased further learning after the fourth repetition. Extensive piloting, in which we initially stopped after the seventh repetition, showed no improvement beyond the fourth repetition. In fact, performance tended to decline due to fatigue. Therefore, we limited the number of repetition cycles to the range in which performance could still improve. Even though children exhibited lower final learning performance overall, we believe our procedure helped them reach their maximal performance within the experimental setup.

      To address the reviewer’s concern, we included learning data to illustrate the progression of learning (see Fig. 1C, pp. 9-10 in Results).

      When interpreting the retention rates, it is important to note that we reported retention rates only for items that were correctly learned (100%) on day 0, day 1, and day 14. This approach meant that different participants had varying numbers of items learned correctly. However, this method enabled us to address our primary question: whether memory consolidation, based on all items initially encoded successfully, is comparably robust between the groups. To simultaneously examine the change in retention rate slopes over time for recent (30 minutes after learning), short delay (one night after) remote, and long delay (two weeks after) remote items, we conducted a separate analysis of retention rates for recent items on days 1 and 14. After observing no differences between sessions in both age groups, we combined the data for recent items. This allowed us to investigate how the slope of memory retention for initially correctly learned items (with a baseline of 100%) changes over time. We observed a significant interaction between item type (recent, short delay remote, long delay remote) and group. Analysis of this interaction revealed significantly less robust memory consolidation across all delay times for children compared to young adults. The figures have been adjusted accordingly to incorporate the baseline of 100% correct performance.
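
      The logic of a retention rate conditioned on initially correct items can be sketched as follows. This is a minimal illustration, not the authors' analysis code; the function name and the item labels are hypothetical.

```python
def retention_rate(learned_correct, retrieved_correct):
    """Percentage of initially correctly learned items still correct at retrieval.

    learned_correct   : dict mapping item id -> True if correct at final learning
    retrieved_correct : dict mapping item id -> True if correct at the delayed test
    Items that were not learned correctly are excluded from the denominator,
    so the baseline is 100% by construction.
    """
    learned_items = [item for item, ok in learned_correct.items() if ok]
    if not learned_items:
        raise ValueError("no correctly learned items; retention rate is undefined")
    retained = sum(1 for item in learned_items if retrieved_correct.get(item, False))
    return 100.0 * retained / len(learned_items)


# Hypothetical participant: four of five items learned, three of those retained.
learned = {"a": True, "b": True, "c": False, "d": True, "e": True}
retrieved = {"a": True, "b": False, "c": True, "d": True, "e": True}
rate = retention_rate(learned, retrieved)
print(rate)
```

      Item "c" is answered correctly at retrieval but does not count, because it was never learned correctly in the first place. This is why participants can contribute different numbers of items to the same measure.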

      Following your suggestion, we also employed the drift diffusion model approach to characterize memory strength, calculating drift rate, boundary and non-decision time parameters. We added the results to the Supplementary Materials (section S2.1, Figure S1).
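
      To make the three parameters concrete, the sketch below uses the closed-form EZ-diffusion equations (Wagenmakers et al., 2007), which map accuracy, RT variance, and mean RT onto drift rate, boundary separation, and non-decision time. This is a simplified stand-in chosen for illustration: the response does not state which fitting procedure the authors used, and the EZ model assumes an unbiased two-choice decision, which the 3-AFC task does not strictly satisfy. The function name and the scaling parameter `s = 0.1` are conventions of this sketch.

```python
import math


def ez_diffusion(prop_correct, rt_variance, rt_mean, s=0.1):
    """Closed-form EZ-diffusion estimates.

    prop_correct : proportion of correct responses (not 0, 0.5, or 1)
    rt_variance  : variance of correct response times, in seconds squared
    rt_mean      : mean of correct response times, in seconds
    Returns (drift_rate, boundary_separation, non_decision_time).
    """
    if not 0.0 < prop_correct < 1.0 or prop_correct == 0.5:
        raise ValueError("prop_correct must lie in (0, 1) and differ from 0.5")
    if rt_variance <= 0.0:
        raise ValueError("rt_variance must be positive")
    logit = math.log(prop_correct / (1.0 - prop_correct))
    x = logit * (logit * prop_correct ** 2 - logit * prop_correct
                 + prop_correct - 0.5) / rt_variance
    drift = math.copysign(1.0, prop_correct - 0.5) * s * x ** 0.25
    boundary = s ** 2 * logit / drift
    y = -drift * boundary / s ** 2
    mean_decision_time = (boundary / (2.0 * drift)) * (1.0 - math.exp(y)) / (1.0 + math.exp(y))
    return drift, boundary, rt_mean - mean_decision_time


# Illustrative summary statistics (not data from the study).
drift, boundary, non_decision = ez_diffusion(0.802, 0.112, 0.723)
print(round(drift, 3), round(boundary, 3), round(non_decision, 3))
```

      In this framing, a lower drift rate corresponds to weaker memory evidence, a lower boundary to less evidence required before responding, and a shorter non-decision time to less time spent outside the decision process itself.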

      Generally, our findings indicate a lower overall drift rate in children when considering all items that had to be learned. We also observed that adults show a steeper decline in drift rate across the short and long delays, although their memory strength remains higher than that of children at both delays. Both age groups required a similar amount of evidence to make a decision, and this amount declined with delay, which may indicate an adaptation to weaker memory. Further, we observed shorter non-decision times in children compared to adults, potentially suggesting less error checking, or less thorough processing and strategic memory access, in children.

      Overall, these results indicate weaker memory strength in children as a quantitative measure. It may nevertheless stem from the qualitatively different memory representations that children form, as our RSA findings suggest. We believe that our neural effect reflects the effect of interest (i.e., worse memory due to lower memory strength in children). Controlling for memory strength would therefore remove variance of interest from the neural data, so we will refrain from including memory strength in the model. However, we will include mean RT as an indicator of general response tendencies.

      Given that the paper is already very complex and long, we opted to add the diffusion model results to the Supplementary Materials (section S2.1, Fig. S1), while discussing the results in the discussion (p. 35).

      (2) More discussion of the behavioral task should be included in the results, in particular the nature of the adaptive learning paradigm including the behavioral results as well as the categorical nature of the memoranda. Without this information, it is difficult for the reader to understand what category-level versus item-level reinstatement reflects.

      Thank you for this valuable input. We have incorporated this information into the results section. Please refer to pp. 9-10, 12, 14, 21, 25-26 for the added details.

      (3) Some of the methods for the reinstatement analysis were unclear to me or warranted further adjustment. I believe the authors compared the scene against all other scenes. I believe it would be more appropriate to only compare this against scenes drawn from the same category as opposed to all scenes. Secondly, from my reading, it seems like the reinstatement was done during the scene presentation, rather than the object presentation in which they would retrieve the scene. I believe the reinstatement results would be much stronger if it was captured during the object presentation rather than the re-presentation of the scene. Or perhaps both sets of analyses should be included.

      We apologize for the confusion regarding the analysis method.

      During the review process we have improved the description of this analysis and hope it is easier to follow now. In short, we used both approaches (within and between categories) to suit different goals (i.e., measuring scene-reinstatement and gist-like reinstatement).

      Both types of reinstatement were assessed during the fixation cross to avoid confounds with the object itself being on the screen. We only used the scene window in one analysis (scene-reinstatement index) as a neural template to track its pre-activation during the fixation. So, as the reviewer suggests, our rationale is that the reinstatement indeed starts taking place at the short object presentation window, but importantly, extends to the fixation window. We added this clarifying information to the results section (see pp. 21-27).

      (4) For the univariate results, it was unclear to me when reading the results whether they were focusing on the object presentation portion of the trial or the scene presentation portion of the trial. Again, I think the claims of reinstatement related activity would be stronger if they accounted for the object presentation period.

      Thank you for pointing this out. Indeed, the univariate results were based on the object presentation time window. We added this information to the results section (Fig. 3, pp. 14, 16).

      (5) Further, given the univariate differences shown across age groups, the authors should re-run all analyses for the RSA controlling for mean activation within the ROI.

      Thank you for highlighting this. We re-ran all RSA analyses controlling for the mean activation within the ROI. The results remained unchanged. We have added this information to the results section, with further details in Tables S8 and S11 in the Supplementary Materials.

      (6) The authors should include explicit tests across groups for their brain-behavior analyses if they want to make any developmentally relevant interpretations of the data. Also, It would be helpful to include similar analyses to those using the univariate signals, and not just the RSA results.

      Following the reviewer’s suggestion, we included brain-behavior analyses for univariate data as well as RSA data with explicit tests across groups. These can be found in the Results Section pp. 18-20, 28-32. Due to the interdependence of predefined ROIs and to avoid running a high number of correlation tests, we employed partial least squares correlation analysis for this purpose. This approach focuses on multivariate links between specified Regions of Interest (ROIs) and fluctuations in memory performance over short and long delays across different age cohorts. We argue that this multivariate strategy offers a more comprehensive understanding of the relationships between brain metrics across various ROIs and memory performance, given their mutual dependence and connectivity (refer to Genon et al. (2022) for similar discussions).
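
      The core computation behind a partial least squares correlation can be sketched as a singular value decomposition of the brain-behavior cross-correlation matrix. The code below is a minimal illustration under stated assumptions: the function name, the random toy data, and the matrix sizes are hypothetical, and the permutation and bootstrap steps normally used to assess latent variables are omitted.

```python
import numpy as np


def pls_correlation(brain, behavior):
    """Partial least squares correlation via SVD of the cross-correlation matrix.

    brain    : (n_participants, n_rois) brain measures
    behavior : (n_participants, n_measures) behavioral measures
    """
    brain = np.asarray(brain, dtype=float)
    behavior = np.asarray(behavior, dtype=float)
    n = brain.shape[0]
    # Z-score each column so the cross-product is a Pearson correlation matrix.
    x = (brain - brain.mean(axis=0)) / brain.std(axis=0, ddof=1)
    y = (behavior - behavior.mean(axis=0)) / behavior.std(axis=0, ddof=1)
    cross_corr = y.T @ x / (n - 1)  # shape (n_measures, n_rois)
    u, singular_values, vt = np.linalg.svd(cross_corr, full_matrices=False)
    return {
        "behavior_saliences": u,        # weights of each behavioral measure
        "brain_saliences": vt.T,        # weights of each ROI
        "singular_values": singular_values,
        "explained": singular_values ** 2 / np.sum(singular_values ** 2),
        "brain_scores": x @ vt.T,       # participant scores per latent variable
        "behavior_scores": y @ u,
    }


# Toy data: 12 participants, 4 ROIs, 2 memory measures (random, for shape only).
rng = np.random.default_rng(0)
toy_brain = rng.normal(size=(12, 4))
toy_behavior = rng.normal(size=(12, 2))
result = pls_correlation(toy_brain, toy_behavior)
print(result["explained"])
```

      A single decomposition replaces one correlation test per ROI and measure, which is how this kind of analysis limits the number of separate tests when ROIs are interdependent.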

      (7) There could be dramatic differences in memory processing across 5-7 year olds. I know the sample is a little small for this, but I would like to see regressions done within the middle childhood group in addition to the across-group comparisons.

      We have included information detailing the relationship between memory retention rate and age within the child group (refer to p. 13). In the child group, both recent and short delay remote memory improved with age. However, the retention rate for long-delayed memory did not show a significant improvement with increasing age in children.

      (8) I am concerned that the authors used global-signal as a regressor in their first-level analyses, given that there could be large changes in the amount of univariate activation that occurs across groups. This approach can lead to false positives and negatives that obscure localized differences. The authors should remove this term, and perhaps use the mean sum of the white matter or CSF to achieve the noise regressor they wanted to include.

      We understand the reviewer's concern. However, we believe that our approach is recommended for the pediatric population. Specifically, Graff et al., 2021, found that global signal regression is a highly efficacious denoising technique in their study of 4 to 8-year-old children. This technique was previously suggested for adults by Ciric et al., 2017, and the benefits in terms of motion and physiological noise removal outweigh the potential costs of removing some signal of interest, as indicated by Behzadi et al., 2007. Additionally, we incorporated six anatomical component-based noise correction (CompCor) components to account for WM and CSF signals, as recommended in the pediatric literature.
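
      For readers unfamiliar with global signal regression, the sketch below shows the bare operation: the mean time series across voxels is regressed out of every voxel. This is a standalone illustration, not the authors' first-level model, where the global signal would enter as one confound regressor alongside motion and CompCor terms. The function name and the toy data are assumptions.

```python
import numpy as np


def regress_out_global_signal(data):
    """Remove the global (voxel-averaged) signal from each voxel's time series.

    data : (n_timepoints, n_voxels) array
    Returns the residuals of a per-voxel regression on [intercept, global signal].
    """
    data = np.asarray(data, dtype=float)
    global_signal = data.mean(axis=1)
    design = np.column_stack([np.ones(data.shape[0]), global_signal])
    coef, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ coef


# Toy data: 50 time points, 8 voxels sharing a common fluctuation.
rng = np.random.default_rng(1)
shared = rng.normal(size=(50, 1))
toy_data = 100.0 + 3.0 * shared + rng.normal(size=(50, 8))
cleaned = regress_out_global_signal(toy_data)
print(cleaned.shape)
```

      The residuals are uncorrelated with the global signal by construction. This is both the benefit (shared motion and physiological noise is removed) and the cost the reviewer raises (any true signal shared across voxels is removed as well).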

      (9) The authors discuss the relationship between hippocampal reactivation and worse memory through the lens of Schapiro et al., but a new paper by Tanriverdi et al came out in JOCN recently that is more similar to the authors' findings.

      Thank you for highlighting the recent paper by Tanriverdi et al. in JOCN, which aligns closely with our findings. We appreciate the suggestion and agree that exploring this alignment could further enrich our discussion on the relationship between hippocampal reactivation and memory retention. We have incorporated this work in our revised manuscript.

      Minor Comments

      - I was surprised that the authors did not see any differences in univariate signals for memory retrieval as a function of development, as much of the prior work has shown differences (for example work by Tracy Riggins). I believe this contrast should be highlighted in the discussion.

      - Given the robust differences in sleep patterns across childhood and the role of sleep in systems consolidation framework, I think this feature should be highlighted in either the introduction or discussion.

      - Could the authors report on differences (or lack of differences) in head motion across the groups, and if they are different whether they could include them as a confounding variable.

      I believe we included six motion parameters and their derivatives into the model

      Thank you for your comments.

      First, prior work on univariate signals of memory retrieval focused mostly on remembered vs. forgotten contrasts, whereas our study focused on remote vs. recent items at the short and long delays, restricted to correctly remembered items. This can partially explain the results. We highlighted this information in the discussion section.

      Second, we agree with the reviewer that sleep patterns across childhood should be addressed in the analysis. Therefore, we incorporated them in the discussion section.

      Third, head motion parameters were indeed included in the analysis as confounding variables, as adding them is highly recommended for the developmental population (e.g., Graff et al. 2021). As an example, we observed higher framewise displacement in children compared to adults, t = -16(218), p < .001, as well as in translational y, t = -2.33(288), p = .02.
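
      Framewise displacement is commonly computed as in Power et al. (2012): the sum of absolute frame-to-frame changes in the three translations and the three rotations, with rotations converted to millimetres on a sphere of 50 mm radius. The response does not state which definition was used, so the sketch below is an assumption, as are the function name and the toy motion parameters.

```python
def framewise_displacement(motion_params, head_radius_mm=50.0):
    """Framewise displacement per volume (Power et al., 2012 definition).

    motion_params : sequence of (x, y, z, alpha, beta, gamma) per volume,
                    translations in mm and rotations in radians
    Returns a list with one value per volume; the first volume is set to 0.
    """
    if not motion_params:
        return []
    displacement = [0.0]
    for previous, current in zip(motion_params, motion_params[1:]):
        translation = sum(abs(c - p) for p, c in zip(previous[:3], current[:3]))
        rotation = sum(abs(c - p) for p, c in zip(previous[3:6], current[3:6]))
        # Rotations become arc lengths on a sphere approximating the head.
        displacement.append(translation + head_radius_mm * rotation)
    return displacement


# Hypothetical realignment parameters for three volumes.
params = [
    (0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
    (0.1, 0.0, -0.2, 0.001, 0.0, 0.0),
    (0.1, 0.1, -0.2, 0.001, 0.002, 0.0),
]
fd = framewise_displacement(params)
print(fd)
```

      A per-participant mean of these values is the kind of summary that can be compared between age groups or entered as a covariate.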

      Reviewer #3 (Public Reviews):

      Summary:

      This study aimed to understand the neural correlates of memory recall over short (1-day) and long (14-days) intervals in children (5-7 years old) relative to young adults. The results show that children recall less than young adults and that this is accompanied by less activation (relative to young adults) in brain networks associated with memory retrieval.

      Strengths:

      This paper is one of few investigating long-term memory (multiple days) in a developmental population, an important gap in the field. Also, the authors apply a representational similarity analysis to understand how specific memories evolve over time. This analysis shows how the specificity of memories decreases over time in children relative to adults. This is an interesting finding.

      We thank the reviewer for the appraisal of our manuscript.

      Weaknesses:

      Overall, these results are consistent with what we already know: recall is worse in children relative to adults (e.g., Cycowicz et al., 2001) and children activate memory retrieval networks to a lesser extent than adults (Bauer et al, 2017).

      It seems that the reduced activation in memory recall networks is likely associated with less depth of memory encoding in children due to inattentiveness, reduced motivation, and documented differences in memory strategies. In regard to this, there was consideration of IQ, sex, and handedness but these were not included as covariates as they were not significant although I note p<.16 suggests there was some level of association nonetheless. Also, IQ is measured differently for the children and adults so it's not clear these can be directly contrasted. The authors suggest the instructed elaborative encoding strategy is effective for children and adults but the reference in support of this (Craik & Tulving, 1975) does not seem to support this point.

      Thank you for your review, and we appreciate your valuable feedback. Here are our responses and clarifications:

      Regarding the novelty of the results relative to the existing literature: in contrast to Cycowicz et al. (2001), Bauer et al. (2017), and related work, we assess not only immediate memory after encoding with semantic judgement of abstract associations, but also extend these findings by investigating consolidation-related changes in complex associative and contextual information in a much under-investigated sample of 5-to-7-year-old preschoolers. This also allows us to infer how children's neural representations change over time, providing valuable insights into knowledge formation in this developmental cohort.

      Accordingly, the observed age differences are of less central importance than the time-related changes in mnemonic representations observed in children.

      Regarding the assumption of inattentiveness in children, we want to emphasize that the experimenter was present throughout the learning process, closely supervising the children. We observed prompt responses to every trial in children and noted an increase in accuracy over the encoding-learning cycles, leading us to conclude that the children were indeed attentive to the task. The observed accuracy improvement across learning cycles indicates an increase in remembered information. Furthermore, we took measures to ensure their engagement, including extensive training in both verbal and computerized versions to ensure that they understood and actively created stories to support their learning.

      We collected motivation data after each task execution in children, and the results indicated that they scored high in motivation. Children not only completed the tasks but also expressed their willingness to participate in subsequent appointments, highlighting their active involvement in the study.

      The observed differences in the efficiency of strategy utilization were expected, given developmental differences in the associative and strategic components of memory in children, as noted in prior research (Shing, 2008, 2010).

      We appreciate your point about IQ, sex, and handedness. These variables were indeed included in the behavioral models, and mean brain activation was also included in the brain data models, addressing the potential influence of these factors on our results.

      While it's true that we applied different tests to measure IQ in children and adults, these tests targeted comparable subtests that addressed similar cognitive constructs. As the final IQ values are standardized, we believe it is appropriate to compare them between the two groups.

      Lastly, we agree that the citation Craik & Tulving, 1975 supports the notion of effectiveness of instructed elaborative learning only in adults, but not in children. For this purpose, we added relevant literature for the child cohort (i.e., Pressley, 1982; Pressley et al., 1981; Shing et al., 2008).

      Reviewer #3 (Recommendations For The Authors):

      An additional point for the authors to consider is that the hypotheses were uncertain. The first is that prefrontal, parietal, cerebellar, occipital, and PHG brain regions would have greater activation over time in adults and not children - which is very imprecise as this is basically the whole brain. Moreover, brain imaging data may be in opposition to this prediction: e.g., the hippocampus has a delayed maturational pattern beyond 5-yrs (e.g., Canada 2019; Uematsu 2012) and some cortical data predicts earlier development in these regions.

      Thank you for your feedback, and we appreciate your insights regarding our hypotheses.

      The selection of our regions of interest (ROIs) was guided by prior literature that has demonstrated the interactive involvement of multiple brain areas in memory retrieval and consolidation processes. Additionally, our recent work utilizing multivariate partial least squares correlation analysis (Schommartz, 2022, Developmental Cognitive Neuroscience) has indicated that unique profiles derived from the structural integrity of multiple brain regions are differentially related to short and long-delay memory consolidation.

      Indeed, the literature suggests that the hippocampus may exhibit a more delayed maturational pattern extending into adolescence, as supported by studies such as Canada (2019) and Uematsu (2012), etc. We added this information as well as findings from the literature on cortical development to be more balanced in our review of the literature.

      Given this complexity, we believe it is important to emphasize in our discussion that both the medial temporal lobe, including the hippocampus, and cortical structures, as well as the cerebellum, undergo profound neural maturation. We highlight these nuances in our revised manuscript to provide a more comprehensive perspective on the developmental differences in memory retention over time.

      The writing was challenging to follow - consider as an example on page 9 the sentence that spans 10 lines of text.

      Thank you for bringing this to our attention. We have carefully reviewed the manuscript and have made efforts to streamline the text, ensuring that sentences are not overly long or complex to improve readability and comprehension.

      I found the analysis (and accompanying figures) a bit of a data mine - there are so many results that are hard to digest and in other cases highly redundant one from the other. This may be resolved in part by moving redundant findings to the supplemental. Some were hard to follow - so when there is a line between recent and recent data, that seems confusing to connect data that, I believe, are different sets of items. Later scatterplots (Fig 7) have pale yellow dots that I had a hard time seeing.

      Thank you for bringing up your concerns regarding the analysis and figures in our manuscript. We have carefully considered your feedback and made several improvements to address these issues.

      To alleviate the challenge of digesting numerous results, we have taken steps to enhance clarity and reduce redundancy. Specifically, we have moved some of the redundant findings to the supplementary sections, which should help streamline the main manuscript and make it more reader friendly.

      Regarding the line between 'recent' and 'recent data,' the figures were revised for clarity. Furthermore, we have improved the visibility of certain elements, such as the pale-yellow dots in the scatterplots (Figs. 1, 2, 4, etc.), to ensure that readers can better discern the data points.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review): 

      […] 

      Weaknesses: 

      The question of the physiological relevance of short bouts of ischemia remains.

      The chemical ischemia protocol induces a duration-dependent ATP depletion in acute slices on a time scale of minutes (Pape and Rose 2023). This is about the same time scale as the peri-infarct depolarisation (Lauritzen et al. 2011) that the protocol attempts to model. Of course, such models do not completely replicate the complex situation in vivo. However, the presented analyses of synapse function cannot be performed in vivo. We discuss this now in the manuscript.

      The precise mechanisms underlying the shift between ischemia-induced long-term potentiation and long-term failure of synaptic responses were not addressed. Could this be cell death?

      Thank you for the comment. Yes, we indeed believe that the persistent failure of synaptic transmission is because of neuronal cell death (i.e., of CA1 pyramidal cells) or at least persistent depolarisation. We did not explicitly state that in the original submission but do so in the revised manuscript. It is supported by the unquantified observation of swelling and/or loss of integrity of CA1 pyramidal cell bodies in parallel to postsynaptic failure. It is also in line with many reports from the literature, of which we now cite two (lines 186-198).

      Sex differences are not addressed or considered.

      We have performed all experiments on male mice, as indicated in Material and Methods. We have indeed not addressed sex differences of the observed effects. We consider this, and many other important factors, to be interesting topics for follow-up studies. This is now discussed (lines 413-424).

      Reviewer #2 (Public Review): 

      […]

      Weaknesses: 

      The weaknesses are minor and only relate to the interpretation of some of the data regarding the presynaptic mechanisms causing the potentiation of release. The authors measured the fiber volley, which reflects the extracellular voltage of the compound action potential of the fiber bundle. The half-duration of the fiber volley was increased, which could be due to the action potential broadening of the individual axons but could also be due to differences in conduction velocity. We are therefore skeptical whether the conclusion of action potential broadening is justified.

      These are excellent points. We have added an analysis demonstrating that axonal conduction velocity is unlikely to be affected. Nonetheless, the fiber volley is indeed an indirect measure of what happens in individual axons. We have adjusted our interpretation accordingly and now also discuss alternative explanations of our findings (lines 363-379).

      Reviewer #3 (Public Review): 

      […]

      Weaknesses: 

      The data on fiber volley duration should be supported by more direct measurements to prove that chemical ischemia increases presynaptic Ca2+ influx due to a presynaptic broadening of action potentials. Given the influence that positioning of the stimulating and recording electrode can have on the fiber volley properties, I found this data insufficient to support the assumption of a relationship between increased iGluSnFR fluorescence, action potential broadening, and increased presynaptic Ca2+ levels.

      We have added a new analysis showing that the latency of the fiber volley is unaffected and relatively constant, which strengthens our conclusion. But the fiber volley is indeed an indirect measure of action potential firing in individual axons. The suggested experiment, which would require simultaneous recording of Ca2+ and action potentials in single axons in combination with chemical ischemia, is extremely difficult, if possible at all. Instead, we have extended the discussion and now include further alternative mechanistic explanations (lines 363-379).

      The results are obtained in an ex-vivo preparation, it would be interesting to assess if they could be replicated in vivo models of cerebral ischemia. 

      This would certainly be very interesting but also extremely challenging technically. For a detailed analysis of synaptic changes as presented here, the main difficulty will be to stimulate and visualise glutamate release exclusively in an isolated population of synapses while recording postsynaptic responses in a stroke model.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors): 

      […]

      Labelling of experimental groups of 2-minute and 5-minute chemical ischemia is more accurate than "metabolic stress" and "with postsynaptic failure". The critical difference between these two conditions is lost with this nomenclature. The reader could be misled to believe that the two groups form a heterogenous population of responses from the same experimental manipulation which is incorrect.

      We had stated in the manuscript that we ‘ … grouped combined iGluSnFR and electrophysiological recordings according to the effect of chemical ischemia on the synaptic response: ‘chemical ischemia with postsynaptic failure’ if the postsynaptic response did not recover to above 50% of the baseline level and ‘chemical ischemia’ when it did (as indicated in Fig. 1H). …’. The recordings were not grouped according to chemical stress duration but according to the effect on the postsynaptic response. We have revised the text explaining this (lines 125-135) and illustrate that now also in Fig. 1H. We hope this is easier to follow now.
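
      The grouping rule described here depends only on the outcome of each recording, not on the duration of chemical ischemia. A minimal sketch of that rule follows; the function name, the amplitude values, and the choice of what counts as the "recovered" response are assumptions of this illustration.

```python
def classify_recording(baseline_amplitude, recovered_amplitude, threshold=0.5):
    """Assign a recording to a group by how far the postsynaptic response recovers.

    A recording counts as 'chemical ischemia' only if the response recovers to
    above `threshold` (50%) of its baseline level; otherwise it is grouped as
    'chemical ischemia with postsynaptic failure'.
    """
    if baseline_amplitude == 0:
        raise ValueError("baseline amplitude must be non-zero")
    recovery = recovered_amplitude / baseline_amplitude
    if recovery > threshold:
        return "chemical ischemia"
    return "chemical ischemia with postsynaptic failure"


# Hypothetical field response amplitudes (mV) before and after chemical ischemia.
recordings = [(-1.20, -1.10), (-0.90, -0.20), (-1.00, -0.50)]
groups = [classify_recording(base, rec) for base, rec in recordings]
print(groups)
```

      Two recordings exposed to the same ischemia duration can therefore fall into different groups, which is the distinction between grouping by outcome and grouping by protocol.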

      More details on the long-term impact of 5-minute ischemia on cell viability would be enlightening regarding the specific mechanism separating these two conditions. With 2 minutes it would appear that cells remain alive (i.e. intact post-synaptic responses), 5 minutes however, inducing cell death. 

      Yes, our observations, although not quantified, are in line with cell death as CA1 pyramidal cell bodies appeared swollen and/or lost their integrity when chemical ischemia was followed by postsynaptic failure. This is also in line with reports from the literature. We have revised the results section accordingly (lines 186-201).

      In the paragraph titled "glutamate uptake is unaffected after acute chemical ischemia", there are two erroneous citations of Figure S3 that should be Figure S4.

      Thank you. We corrected this mistake.

      The sex of animals is not given. This is essential information. 

      We used male mice as indicated in the initial version of the manuscript (Material and Methods). We have added a statement regarding the role of sex to the final section of the Discussion.

      Reviewer #2 (Recommendations For The Authors):

      We propose addressing the weaknesses mentioned in the public review. As said, the fibre volley is a very indirect measure of action potential broadening. Based on the iGluSnFR data, the authors predict that the potentiation is mediated by depolarization, action potential broadening, and increased presynaptic calcium influx. The latter could be tested experimentally, but this does not seem necessary if the data are interpreted more cautiously. For example, other explanations for the broadened fiber volley could be mentioned, such as a slowing and/or dispersion of the action potential propagation speed. Furthermore, depolarization could cause elevated resting calcium concentrations, which could potentiate release independently of action potential broadening. Finally, classical forms of presynaptic potentiation of the release machinery that occur during homeostatic plasticity or Hebbian plasticity may operate independently of calcium dynamics.

      Thank you for this comment. The discussion of the mechanism was indeed too short. We have added an analysis of the fiber volley delay after stimulation, which was not affected. Presynaptic action potential broadening is, in our opinion, a very likely explanation for our observations but we did not perform direct experiments. Directly recording presynaptic action potentials and Ca2+ transients in the chemical ischemia model over extended periods of time is a major technical challenge and certainly of interest in the future. As suggested, we have expanded the discussion section and now mention various alternative explanations (lines 363-379).

      There are the following minor suggestions:

      Add line numbers.

      We have added line numbers.

      We would suggest providing exact P values instead of asterisks in the figures. 

      We agree that having exact P values in the figure panels can be very helpful. However, in the present figures they are hard to integrate without overcrowding the already complex panels and thereby obscuring other important details. All p-values are included in the figure legends and/or main text.

      Abstract: "We also observed an unexpected hierarchy of vulnerability of the involved mechanisms and cell types." This sentence is hard to understand and cell types were not directly compared (i.e. axons of CA3 and axons of CA1 neurons were not compared).

      We have revised this statement and removed the reference to cell types.

      In Figure 1G there seems to be an increase in the fiber volley. Is this significant? Could this be due to swelling of the slice during chemical ischemia? Or an increase in excitability? Maybe this could be discussed. 

      The effect was analysed in the context of Fig. 2. A significant increase of the fiber volley amplitude was detected in chemical ischemia (Fig. 2H) but also under control conditions (Fig. 2F). We therefore consider this a change that is detectable but not related to chemical ischemia and not a potential explanation for increased glutamate release (lines 157-160). Also, no significant fiber volley increase was detected in chemical ischemia with postsynaptic failure (Fig. 2H) and in the experiments illustrated in Fig. 4E. Our interpretation is that the fiber volley unspecifically increases in some experiments over the time course of the experiment (~ 60 min) but this is unrelated to chemical ischemia.

      In the results: "A fully separate set of experiments..." Please explain better what this means. 

      We have revised the entire section to explain more clearly how recordings were grouped (lines 125-135).

      In the results: "...(Syková and Nicholson, 2008) (Figure S3). However, this was not observed for chemical ischemia without postsynaptic failure (Figure S3), in which the increased glutamate transients were observed." This should probably refer to Figure S4. 

      Thank you for spotting this mistake. We corrected it.

      The last sentence in the results "... most likely by increased presynaptic Ca2+ influx, and, at the same time, the postsynaptic response." This is difficult to understand. Does "at the same time" refer to another mechanism or the consequence of more Ca2+? 

      We revised this part of the results section to improve clarity and toned down our conclusions (lines 328-335 and 363-379).

      Reviewer #3 (Recommendations For The Authors): 

      There are a few points that the author needs to clarify: 

      The authors do not discuss the different behaviour of iGlu F0 during chemical ischemia and chemical ischemia with postsynaptic failure shown in Figure 2, panels D and E. In the first case, during the application of the solution to induce ischemia, iGluF0 decreases while in the other case, it strongly increases before falling down. In both cases, the fEPSP slope is decreased. How does the author explain this observation? 

      We attribute the transient increase of extracellular glutamate during prolonged chemical ischemia to the increase of synaptic glutamate release observed previously under such conditions (Hershkowitz et al. 1993; Tanaka et al. 1997) and other mechanisms reviewed by us (Passlick et al. 2021) (e.g., glial glutamate release, transiently reduced glutamate uptake), which we could not detect during shorter chemical ischemia. The initial drop of the fEPSP slope is most likely due to postsynaptic depolarisation, which is followed by a repolarisation if the chemical stress duration is short. We now explain this in more detail in lines 185-200 of the revised manuscript. Although we focussed on the bi-directional effect on longer timescales in this manuscript, this transient phase during chemical ischemia is very interesting for further investigations.

      On page 8, first line, I think that the authors meant Figure S4, not Figure S3 when they mentioned results on ECS diffusivity and ECS fraction. 

      Yes, thank you for spotting this. We corrected the mistake.

      In Supplementary Figure 5, panel B, it seems that PPR is significantly reduced upon chemical ischemia (asterisk on green columns) but the authors claimed in the paper on page 10 that "Analysing the paired-pulse ratio (PPR) of postsynaptic response and iGluSnFR transients revealed no consistent changes after chemical ischemia (Figure S5).". Did the authors refer to the data normalized in panel D? In this case, I do not see the need to normalize raw data that have been already shown in a previous panel and that give different statistical results, probably due to the different tests used (paired in panel B and not paired in panel D).

      We have clarified this point in the supplementary material (Figure S5, legend). There is a relevant difference between the analyses presented in panels B and D. The paired test presented in B analyses the change of the electrophysiological PPR in response to chemical ischemia. The test in D on the electrophysiological PPR asks if the reduction in B is significantly different from the changes seen under control conditions. Because it is not, we conclude that chemical ischemia has no relevant effect on the electrophysiological PPR and, in combination with the results on the iGluSnFR PPR, also not on short-term plasticity, as tested here.

      References

      Hershkowitz N, Katchman AN, Veregge S. Site of synaptic depression during hypoxia: a patch-clamp analysis. Journal of Neurophysiology 69: 432–441, 1993.

      Lauritzen M, Dreier JP, Fabricius M, Hartings JA, Graf R, Strong AJ. Clinical Relevance of Cortical Spreading Depression in Neurological Disorders: Migraine, Malignant Stroke, Subarachnoid and Intracranial Hemorrhage, and Traumatic Brain Injury. J Cereb Blood Flow Metab 31: 17–35, 2011.

      Pape N, Rose CR. Activation of TRPV4 channels promotes the loss of cellular ATP in organotypic slices of the mouse neocortex exposed to chemical ischemia. The Journal of Physiology 601: 2975–2990, 2023.

      Passlick S, Rose CR, Petzold GC, Henneberger C. Disruption of Glutamate Transport and Homeostasis by Acute Metabolic Stress. Front Cell Neurosci 15: 637784, 2021.

      Tanaka E, Yamamoto S, Kudo Y, Mihara S, Higashi H. Mechanisms Underlying the Rapid Depolarization Produced by Deprivation of Oxygen and Glucose in Rat Hippocampal CA1 Neurons In Vitro. Journal of Neurophysiology 78: 891–902, 1997.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Review:

      Reviewer #1 (Public Review):

      In 'Systems analysis of miR-199a/b-5p and multiple miR-199a/b-5p targets during chondrogenesis', Patel et al. present a variety of analyses using different methodologies to investigate the importance of two miRNAs in regulating gene expression in a cellular model of cartilage development. They first re-analysed existing data to identify these miRNAs as one of the most dynamic across a chondrogenesis development time course. Next, they manipulated the expression of these miRNAs and showed that this affected the expression of various marker genes as expected. An RNA-seq experiment on these manipulations identified putative mRNA targets of the miRNAs which were also supported by bioinformatics predictions. These top hits were validated experimentally and, finally, a kinetic model was developed to demonstrate the relationship between the miRNAs and mRNAs studied throughout the paper.

      I am convinced that the novel relationships reported here between miR-199a/b-5p and target genes FZD6, ITGA3, and CAV1 are likely to be genuine. It is important for researchers working on this system and related diseases to know all the miRNA/mRNA relationships but, as the authors have already published work studying the most dynamic miRNA (miR-140-5p) in this biological system I was not convinced that this study of the second miRNA in their list provided a conceptual advance on their previous work.

      We believe this study is an enhancement on our previous work for two reasons, which have been alluded to in new text within the introduction. Firstly, our previous work used experimental and bioinformatic analysis to identify microRNAs with significant regulatory roles during chondrogenesis. This new manuscript additionally uses a systems biology approach to identify novel miRNA-mRNA interactions and capture these within an in silico model. Secondly, this work was initiated by the analysis of our previously generated data, using a novel tool we developed for this type of data (Bioconductor - TimiRGeN).

      I was also concerned with the lack of reporting of details of the manipulation experiments. The authors state that they have over-expressed miR-199a-5p (Figure 2A) and knocked down miR-199b-5p (Figure 2B) but they should have reported their proof that these experiments had worked as predicted, e.g. showing the qRT-PCR change in miRNA expression. Similarly, I was concerned that one miRNA was over-expressed while the other was knocked down - why did the authors not attempt to manipulate both miRNAs in both directions? Were they unable to achieve a significant change in miRNA expression or did these experiments not confirm the results reported in the manuscript?

      We agree with the reviewer that some additional data were needed to demonstrate the effective regulation of miR-199-5p. Hence, Supplementary Figure 1 is now included which provides validation of the effects of miR-199a-5p overexpression (Supplementary Figure 1A) and inhibition of miR-199a/b-5p (Supplementary Figure 1B). Within the main manuscript, Figure 2B has been amended to include the consequences of inhibition of miR-199a-5p, with 2C showing the consequences of miR-199b-5p inhibition. Further, we include new data with regards to miR-199a/b-5p inhibition on CAV1 (Figure 4A).

      I had a number of issues with the way in which some of the data was presented. Table 1 only reported whether a specific pathway was significant or not for a given differential expression analysis but this concealed the extent of this enrichment or the level of statistical significance reported. Could it be redrawn to more similarly match the format of Figure 3A? The various shades of grey in Figure 2 and Figure 4 made it impossible to discriminate between treatments and therefore identify whether these data supported the conclusions made in the text. It also appeared that the same results were reported in Figure 3B and 3C and, indeed, Figure 3B was not referred to in the main text. Perhaps this figure could be made more concise by removing one of these two sets of panels.

      We agree with all points made here and have amended these within the manuscript. Figure 1A is now pathway enrichment plots from the TimiRGeN R Bioconductor package, and the table which previously showed the pathways enriched at each time point is now in the supplementary materials (supp. Table 1). Figure 2 and 4 now have color instead of shades of grey. Figure 3C has now been moved to supplementary materials (Supplementary Figure 2) and is referenced in the text. 

      Overall, while I think that this is an interesting and valuable paper, I think its findings are relatively limited to those interested in the role of miRNAs in this specific biomedical context.

      Reviewer #2 (Public Review):

      Summary:

      This study represents an ambitious endeavor to comprehensively analyze the role of miR199a/b-5p and its networks in cartilage formation. By conducting experiments that go beyond in vitro MSC differentiation models, more robust conclusions can be achieved.

      Strengths:

      This research investigates the role of miR-199a/b-5p during chondrogenesis using bioinformatics and in vitro experimental systems. The significance of miRNAs in chondrogenesis and OA is crucial, warranting further research, and this study contributes novel insights.

      Weaknesses:

      While miR-140 and miR-455 are used as controls, these miRNAs have been demonstrated to be more relevant to Cartilage Homeostasis than chondrogenesis itself. Their deficiency has been genetically proven to induce Osteoarthritis in mice. Therefore, the results of this study should be considered in comparison with these existing findings.

      We agree with the reviewer's comments. miR-455-null mice develop normally but miR-140-null (or mutated) mice and humans do have skeletal abnormalities (e.g. Nat Med. 2019 Apr;25(4):583-590. doi: 10.1038/s41591-019-0353-2), indicating a role in chondrogenesis. We have made an addition in the description to point towards the need to assess the roles miR-199a/b-5p may play during skeletogenesis and OA. We anticipate miR-199a/b-5p to be relevant in OA and have ongoing additional work for this, but this is beyond the scope of this manuscript.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      Beyond the issues raised in the public review, I had a few minor recommendations that are largely designed to help improve the understanding of the manuscript as it is currently written.

      (1) Please provide the statistical tests used to obtain p-values in the Figure 2 and 4 legends.

      We have now added statistical test information to the figure legends of figures 2 and 4.

      (2) It is stated on p. 9 that both miRNAs may share a functional repertoire because 25 and 341 genes are intersected between their inhibition experiments. Please provide statistical support that this overlap is an enrichment over the null background in this experiment. Total DE genes – chi squared. Expected / Observed.

      A chi-squared test is now presented in the manuscript which shows that the number of significant genes which were found in common between miR-199a-5p knockdown and miR-199b-5p knockdown were significantly more than expected for day 0 or day 1 of the experiments. 
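
      As a rough illustration of the kind of overlap test described above (a minimal sketch, not the analysis performed in the manuscript), the expected overlap of two gene lists under independence can be compared with the observed overlap in a 2x2 contingency table. The function name `overlap_chi_squared` and all counts in the usage line are hypothetical placeholders and are not taken from the manuscript; the sketch assumes a Pearson chi-squared test with one degree of freedom and no continuity correction.

```python
import math


def overlap_chi_squared(n_a, n_b, n_shared, n_background):
    """Pearson chi-squared test (1 d.o.f., no continuity correction)
    for the overlap between two gene lists drawn from one background.

    n_a, n_b      -- sizes of the two differentially expressed gene lists
    n_shared      -- number of genes present in both lists
    n_background  -- total number of genes tested
    Returns (expected_overlap, chi2_statistic, p_value).
    """
    # 2x2 table: rows = in list A / not in A, columns = in list B / not in B
    observed = [
        [n_shared, n_a - n_shared],
        [n_b - n_shared, n_background - n_a - n_b + n_shared],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]

    # Expected counts if membership in A and B were independent
    expected = [[r * c / n_background for c in col_totals] for r in row_totals]

    chi2 = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # Survival function of the chi-squared distribution with 1 d.o.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected[0][0], chi2, p_value


# Hypothetical placeholder counts, for illustration only
expected_overlap, chi2, p = overlap_chi_squared(500, 300, 25, 15000)
print(f"expected={expected_overlap:.1f}, chi2={chi2:.2f}, p={p:.2e}")
```

      With small expected counts, an exact test (for example Fisher's exact test) may be the more appropriate choice; the chi-squared form is shown here only because it is the test named in the response.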

      (3) The final sentence on p. 12 (beginning 'Size of the points reflect...') seemed out of place - is it part of a legend?

      Thank you for pointing out this mistake - it was part of figure 3C and now is in the supplementary materials.

      (4) A sentence on p. 14 reads that 'FZD6 and ITGA3 levels increased significantly' but this should read decreased, rather than increased. Quite an important typo!

      Thank you for pointing this error out. It has been corrected.

      (5) Theoretical transcripts are mentioned in the legend of Figure 5A but these were not present in the figure. Please include these or remove them from the legend.

      This error has been removed from Figure 5A.

      (6) On p 20, the references 22 and 27 should I think be moved to earlier in the sentence (after 'miR-199a-5p-FZD6 has been predicted previously'). Currently, it reads as if these references support your luciferase assays which you claim are the first evidence for this target relationship.

      We agree with this change and have corrected the manuscript.

      (7) The reference to Figure 5D on p. 20 should be a reference to Figure 5C.

      Thank you for pointing this error out – this has been corrected.

      Reviewer #2 (Recommendations For The Authors):

      (1) The paper is based on the importance of miR-140 and miR-455 as miRNAs in chondrogenesis, citing only Barter, M. J. et al. Stem Cells 33, (2015). Considering the scope and results of this study, this citation is insufficient.

      We agree with this reviewer's comments. For many years miR-140 and miR-455 have been experimented on and their importance in OA research has become apparent. We included additional references within the introduction to address this.

      (2) Analyzing chondrogenesis solely through differentiation experiments from MSCs is inadequate. It is essential to perform experiments involving the network within normal cartilage tissue and/or the generation of knockout mice to understand the precise role of miR199a/b-5p in chondrogenesis.

      We have added an additional paragraph in the discussion to state this, and do believe it is highly important that miR-199a/b-5p be tested in OA samples – however this would be beyond the intended scope of this article.

      (3) In light of the above points, it is imperative to investigate the role of miR-199a/b-5p beyond the in vitro differentiation model from MSCs, encompassing mouse OA models or human disease samples.

      In line with our previous response, we agree with the premise and believe additional experiments should be performed to gain more insight into the mechanism by which miR-199a/b-5p regulate OA. But development of a new mouse line to investigate this is not in the scope of this manuscript.

    1. Author response:

      eLife assessment

      This important study describes the crystallographic screening of a number of small molecules against a viral enzyme critical for the 5' capping of SARS-CoV-2 RNA and viral replication. While the high-quality crystal structures and complementary biophysical assays in this study provide solid evidence to support the major claims regarding how these small molecule compounds bind to the viral enzyme, the mismatch between the antiviral activity and binding to the viral enzyme of several small molecule compounds could have been more thoroughly investigated or discussed. This paper would be of interest to the fields of coronavirus biology, structural biology, and drug discovery.

      We do fully agree that the antiviral assay results could be brought better into context, clarifying that the antiviral effects of tubercidin and its derivatives are due to off-target effects.

      Reviewer #1 (Public Review):

      Summary:

      This manuscript describes the crystallographic screening of a number of small molecules derived from the natural substrates S-adenosyl methionine (SAM) and adenine, against the SARS-CoV-2 2'-O-methyltransferase NSP16 in complex with its partner NSP10. High-quality structures for several of these are presented together with efforts to evaluate their potential biophysical binding and antiviral activities. The structures are of high quality and the data are well presented but do not yet show potency in biophysical binding. They only offer limited insights into the design of inhibitors of NSP16/10.

      Strengths:

      The main strengths of the study are the high quality of the structural data, and associated electron density maps making the structural data highly accurate and informative for future structure-based design. These results are clearly presented and communicated in the manuscript. Another strength is the authors' attempts to probe the binding of the identified fragments using biophysical assays. Although in general the outcome of these experiments shows negative data or very weak binding affinities the authors should be commended for attempting several techniques and showing the data clearly. This study is also useful as an example of the complexities associated with drug discovery on a bi-substrate target such as a methyltransferase: several of the observed binding poses were unexpected, with compounds that are relatively similar to substrates binding in different parts of the active site or other unexpected orientations. This serves as an example of how experimental structural information is still of crucial importance to structure-based drug design. In general, the claims in the manuscript are well supported by the data.

      Weaknesses:

      The main limitations of the study are that the new structures generated in the study are fairly limited in terms of chemical space being similar to either SAM or RNA-CAP analogues. It feels a little bit of a lost opportunity to expand this to more diverse ligands which may reveal potential inhibitors that are distinct from current methyltransferase inhibitors based on SAM analogues and truly allow a selective targeting of this important target.

      It is true that it makes sense to screen for more diverse compounds to expand to a more diverse ligand set and we do hope our study motivates others to do so. Given the limited number of crystal structures of nsp10-16 with potential drug molecules, the aim of this study was to upgrade the database with new complex structures to have a pool of complex structures for future compound designs with increased selectivity. Furthermore, some of the hits are known inhibitors of similar enzymes and the most prominent and potent methyltransferase inhibitors are structurally related to SAM, like sinefungin and tubercidin. We do think that knowing which SAM compounds or fragments of SAM are able to bind in the nsp10-16 active site is highly valuable for further specific and optimized inhibitor design.

      Another limitation is the potentially misleading nature of the antiviral assays. It is not possible to say if these compounds display on-target activity in these assays or even if the inhibition of NSP16/10 would have any effect in these assays. Whilst the authors do mention these points I think this should be emphasized more strongly.

      That is a very valid point and we do not believe that the antiviral activity is based on on-target effects. We do agree that the way it is currently presented can be considered misleading and we indeed clarify this point in the revised version.

      Minor critical points:

      The authors state that their crystals and protein preps have co-purified SAM occupying the active site of the crystals. Presumably, this complicates the interpretation of electron density maps as many of the ligands share overlap with the existing SAM density making traditional analysis of difference maps challenging. The authors did not utilize the PanDDA analysis for this step, perhaps this is related to the presence of SAM in the ground state datasets? Also, occupancies are reported in the manuscript in some cases to two significant figures, this seems to be an overestimation of the ability of refinement to determine occupancy based on density alone and the authors should clarify how these figures were reached.

      We have used PanDDA in parallel for hit finding. However, we did not see any advantages for this target over the hit finding results from the visual inspection. This is probably, as mentioned, because SAM is present in the “ground state”, which complicates the PanDDA map calculations.

      Regarding the occupancies, we fully agree with this comment and will change them to a reasonable number of digits and clarify how the figures were reached.

      The molecular docking approach to pre-selection of library compounds to soak did not appear to be successful. Could the authors make any observations about the compounds selected by docking or the docking approach used that may explain this?

      Yes, it is a good point to give possible explanations why the docking approach was not successful to facilitate similar approaches in future studies.

      Reviewer #2 (Public Review):

      Summary:

      The study by Kremling et al. describes a study of the nsp16-nsp10 methyl transferase from SARS CoV-2 protein which is aimed at identifying inhibitors by x-ray crystallography-based compound screening. A set of 234 compounds were screened resulting in a set of adenosine-containing compounds or analogues thereof that bind in the SAM site of nsp16-nsp10. The compound selection was mainly based on similarity to SAM and docking of commercially available libraries. The resulting structures are of good quality and clearly show the binding mode of the compounds. It is not surprising to find that these compounds bind in the SAM pocket since they are structurally very similar to portions of SAM. Nevertheless, the result is novel and may be inspirational for the future design of inhibitors. Following up on the crystallographic screen the identified compounds were tested for antiviral activity and binding to nsp16-nsp10. In addition, an analysis of similar binding sites was presented.

      Strengths:

      The crystallography is solid and the structures are of good quality. The compound binding constitutes a novel finding.

      Weaknesses:

      The major weakness is the mismatch between antiviral activity and binding to the target protein. Only one of the compounds could be demonstrated to bind to the nsp16-nsp10 protein. By performing a displacement experiment using ITC Sangivamycin is concluded to bind with a Kd > 1mM. However, the same compound displays antiviral activity with an EC50 of 0.01 microM. Even though the authors do not make specific claims that the antiviral effect is due to inhibition of nsp16-nsp10, it is implicit. If the data is included, it should state specifically that the effect is not likely due to nsp16-nsp10 inhibition.

      We do believe that the antiviral data are valuable and should be published within this work. We also agree with the comment that it should be clearly stated that the antiviral effect is not likely because of nsp10-16 inhibition and we will optimize that accordingly.

      The structure of the paper and the language needs quite a lot of work to bring it to the expected quality.

      We will go through the manuscript again and further improve the structure and language as much as possible.

      Technical point:

      Refinement of crystallographic occupancies to single digit percentage is not normally supported by electron density.

      We agree with that point and correct it in the revised version.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study defines a fundamental aspect of protein kinase signalling in the protist parasite Toxoplasma gondii that is required for acute and chronic infections. The authors provide compelling evidence for the role of SPARK/SPARKEL kinases in regulating cAMP/cGMP signalling, although evidence linking the loss of these kinases to changes in the phosphoproteome is incomplete. Overall, this study will be of great interest to those who study Toxoplasma and related apicomplexan parasites.

      We thank the reviewers for their thoughtful and positive evaluation of our work. Below, we have addressed all of the public reviews and recommendations for the authors in point-by-point responses. Additionally, we include with this resubmission RT-qPCR data where we observe no significant change in transcript levels for the relevant AGC kinases, supporting the hypothesis that SPARK/SPARKEL–regulation is post-translational.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Herneisen et al characterise the Toxoplasma PDK1 orthologue SPARK and an associated protein SPARKEL in controlling important fate decisions in Toxoplasma. Over recent years this group and others have characterised the role of cAMP and cGMP signalling in negatively and positively regulating egress, motility, and invasion, respectively. This manuscript furthers this work by showing that SPARK and SPARKEL likely act upstream, or at least control the levels of the cAMP and cGMP-dependent kinases PKA and PKG, respectively, thus controlling the transition of intracellular replicating parasites into extracellular motile forms (and back again).

      The authors use quantitative (phospho)proteomic techniques to elegantly demonstrate the upstream role of SPARK in controlling cAMP and cGMP pathways. They use sophisticated analysis techniques (at least for parasitology) to show the functional association between cGMP and cAMP signalling pathways. They therefore begin to unify our understanding of the complicated signalling pathways used by Toxoplasma to control key regulatory processes that control the activation and suppression of motility. The authors then use molecular and cellular assays on a range of generated transgenic lines to back up their observations made by quantitative proteomics that are clear in their design and approach.

      The authors then extend their work by showing that SPARK/SPARKEL also control PKAc3 function. PKAc3 has previously been shown to negatively regulate differentiation into bradyzoite forms and this work backs up and extends this finding to show that SPARK also controls this. The authors conclude that SPARK could act as a central node of regulation of the asexual stage, keeping parasites in their lytic cell growth and preventing differentiation. Whether this is true is beyond the scope of this paper and will have to be determined at a later date.

      Strengths:

      This is an exceptional body of work. It is elegantly performed, with state-of-the-art proteomic methodologies carefully being applied to Toxoplasma. Observations from the proteomic datasets are masterfully backed up with validation using quantitative molecular and cellular biology assays.

      The paper is carefully and concisely written and is not overreaching in its conclusions. This work and its analysis set a new benchmark for the use of proteomics and molecular genetics in apicomplexan parasites.

      Weaknesses:

      This reviewer did not identify any weaknesses.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Herneisen et al. examines the Toxoplasma SPARK kinase orthologous to mammalian PDK1 kinase. The extracellular signals trigger cascades of the second messengers and play a central role in the apicomplexan parasites' survival. In Toxoplasma, these cascades regulate active replication of the tachyzoites, which manifests as acute toxoplasmosis, or the development into drug-resilient bradyzoites characteristic of the chronic stage of the disease. This study focuses on the poorly understood signaling mechanisms acting upstream of such second messenger kinases as PKA and PKG. The authors showed that similar to PDK1, Toxoplasma SPARK appears to regulate several AGC kinases.

      Strengths:

      The study demonstrated a strong association of the SPARK kinase with an elongin-like SPARKEL factor and an uncharacterized AGC kinase. Using a set of standard assays, the authors determined the SPARK/SPARKEL role in parasite egress and invasion. Finally, the study presented evidence of the SPARK/SPARKEL involvement in the bradyzoite differentiation.

      Weaknesses:

      Although the study can potentially uncover essential sensing mechanisms operating in Toxoplasma, the evidence of the SPARK/SPARKEL mechanisms is weak. Specifically, due to incomplete data analysis, the SPARK/SPARKEL-dependent phosphoregulation of AGC kinases cannot be evaluated. The manuscript requires better organization and lacks guidance on the described experiments. Although the study is built on advanced genetics, at times, it is unnecessarily complicated, raising doubts rather than benefiting the study.

      The evidence for the SPARK/SPARKEL interaction is demonstrated through diverse experimental approaches that are internally consistent. Five separate mass spectrometry experiments, with replicates and appropriate controls, with tags on either SPARK or SPARKEL, showed that SPARK and SPARKEL form a strong interaction (Figure 1A, 1D, 1E; Figure 1—figure supplement 1). Global mass spectrometry experiments assessing the impact of  SPARK or SPARKEL depletion showed similar features (a reduction in PKG and PKA abundance and up-regulation of bradyzoite-associated proteins; Figure 3C–D). The phenotypes associated with SPARK and SPARKEL depletion phenocopy one another in all cell biological assays we tested (Figure 2A, 2D and PMID: 35484233; Figure 2E–J; Figure 4E–F; Figure 6A–B). Measuring the abundance of SPARK and SPARKEL in unenriched samples was challenging, but immunoblotting and proteomics suggest that depletion of one factor leads to down-regulation of the other (Figure 2B, 2C; Figure 3—figure supplement 1), which explains the genetic and cell biological phenocopying described above. We note that “further biochemical studies are required to discern the regulatory interactions between SPARK and SPARKEL” (first submission lines 590-591) and are beyond the scope of this work.

      The evidence for SPARK/SPARKEL regulation of AGC kinase activity is demonstrated through diverse experimental approaches that are also internally consistent. PKA C1 and PKG abundance levels decrease in parasites depleted of SPARK/SPARKEL, as measured by mass spectrometry (Figure 3A and 3C) and cell-based assays for PKA C1/R (Figure 4D–F). Comparisons of the global SPARK-, PKA R-, PKG-, and PKA C3-depleted phosphoproteomes suggest that PKA and PKG activity is reduced upon SPARK depletion whereas the activity of an unrelated factor (PP1) is unaffected (Figure 4G–H, Figure 4—figure supplement 1, Figure 5D–E, Figure 7I–J). Parasites depleted of SPARK are hypersensitized to a PKG inhibitor (Figure 5B–C). SPARK, PKA, and PKG are proximal in cellulo (Figure 3I) and SPARK co-purifies with PKA C3 (Figure 7A). The kinetic-phase phenotypes associated with SPARK and SPARKEL depletion (PMID: 32379047, Figure 2A, 2D–2J) are consistent with reduced PKG activity (PMID: 28465425) and only develop after PKG has been depleted as shown by proteomics experiments (Figure 2E-J and Figure 3C). Other studies have shown that the effects of reduced PKG activity are dominant to reduced PKA C1 activity (PMID: 29030485). The replicative-phase phenotypes associated with SPARK and SPARKEL depletion are consistent with reduced PKA C3 activity (PMID: 27247232 and herein). Mechanistically, PKG and PKA C1 activity must be lower in SPARK-depleted parasites because the abundances of these kinases are lower (Figure 3A, 3C). The mechanism of regulation may be more complex in the case of PKA C3, as SPARK depletion did not cause a reduction in PKA C3 abundance as measured by cellular assays (Figure 7B–F), but PKA C3 activity decreased (Figure 7I–K). 
      We concede that multiple mechanisms may lead to the reduction in PKA C1 and PKG abundances, such as decreased activation loop phosphorylation and autophosphorylation at other stabilizing sites or enhanced ubiquitin ligase activity leading to active degradation of the kinases; we have moved speculation regarding such mechanisms to the Discussion.

      Although the reviewer commented that the manuscript “requires better organization” in the public review, no specific recommendations were provided to the authors. Therefore, we did not change the organization of the manuscript. We added an additional paragraph to the Discussion to reiterate key findings: “A prior study identified SPARK as a regulator of parasite invasion and egress following 24 hours of kinase depletion (Smith et al., 2022). Unexpectedly, we observed that three hours of SPARK or SPARKEL depletion were insufficient to impact T. gondii motility or calcium-dependent signaling, indicating that the phenotypes associated with SPARK and SPARKEL depletion develop over time. Quantitative proteomics revealed that PKA and PKG abundances began to decrease after more than three hours of SPARK depletion. Proximity labeling experiments also suggested that SPARK, PKA, and PKG are spatially associated within the parasite cell. We propose a model in which SPARK down-regulation coincides with reduced PKG and PKA activity due to diminished protein levels.” This work built upon genetic and proteomic approaches recently described by our group, which we cited in the text and extensive methods section. We added additional experimental detail where noted in the reviewer’s recommendations to the authors.

      The study utilizes advanced genetics because biochemical tools for eukaryotic parasites are limited. For example, no antibodies for T. gondii SPARK, PKA subunits, or PKG exist; to say nothing of phosphosite-specific antibodies, which are common in the mammalian cell signaling field. Therefore, to measure the relationship between SPARK, SPARKEL, and PKA subunits, we had to generate strains in which multiple proteins were tagged with epitopes for downstream analysis. The genetic experiments included appropriate controls and were internally consistent with results obtained using orthogonal approaches, such as mass spectrometry.

      Reviewer #3 (Public Review):

      Summary:

      This paper focuses on the roles of a Toxoplasma protein (SPARKEL) with homology to elongin C and of the kinase SPARK, with which it interacts. They demonstrate that the two proteins regulate the abundance of PKA and PKG, that depletion of SPARKEL reduces invasion and egress (previously shown with SPARK), and that their loss also triggers spontaneous bradyzoite differentiation. The data are overall very convincing and will be of high interest to those who study Toxoplasma and related apicomplexan parasites.

      Strengths:

      The study is very well executed with appropriate controls. The manuscript is also very well and clearly written. Overall, the work clearly demonstrates that SPARK/SPARKEL regulate invasion and egress and that their loss triggers differentiation.

      Weaknesses:

      (1) The authors fail to discriminate between SPARK/SPARKEL acting as negative regulators of differentiation as a result of an active role in regulating stage-specific transcription/translation or as a consequence of a stress response activated when either is depleted.

      We demonstrate a novel function for SPARK and SPARKEL as negative regulators of differentiation. The pathways leading to differentiation are being actively studied. Up-regulation of a positive transcriptional regulator of chronic differentiation, BFD1, is sufficient to trigger differentiation in vitro in the absence of other stressful growth conditions (PMID: 31955846). SPARK or SPARKEL depletion results in up-regulation of proteins that are up-regulated upon BFD1 overexpression. Whether BFD1 overexpression or SPARK and SPARKEL depletion triggers cellular stress pathways is beyond the scope of the current work, which focused instead on the immediate effect of these pathways on AGC kinases. Study of the effect of the various kinases on the parasite phosphoproteome shows that the putative targets of PKA C3 are specifically downregulated upon SPARK knockdown, indicating PKA C3 activity is indeed decreased in the latter condition.

      (2) The function of SPARKEL has not been addressed. In mammalian cells, Elongin C is part of an E3 ubiquitin ligase complex that regulates transcription and other processes. From what I can tell from the proteomic data, homologs of the Elongin B/C complex were not identified. This is an important issue as the authors find that PKG and PKA protein levels are reduced in the knockdown strains.

      Our experiments suggest that SPARK and SPARKEL form a complex, and down-regulation of one complex member leads to down-regulation of the other. Thus in all tested assays, knockdown of SPARK and SPARKEL phenocopy one another. Further biochemical and structural work will be required to determine the mechanism by which SPARKEL regulates SPARK.

      Nearly all studies of the function of elongin C have been conducted in mammalian cells. Proteins with elongin C domains may serve alternative and unexplored functions in unicellular eukaryotes. We searched for the presence of Elongin A/B and known Elongin C complex members in the T. gondii genome and were unable to identify orthologs, explaining why these proteins were not identified in mass spectrometry experiments. Please see our response in Recommendations for the Authors, Reviewer 3 point 2.

      Beyond the concerns raised by the review team, we have identified and corrected the following errors or omissions in the first submission of the manuscript:

      - Line 176 of the first submission referred to a “peptide sequence match (PSM)”, which we have changed to “peptide-spectrum match”.

      - We recolored and relabeled the lines in Figure 5A so that it is easier to match a specific peptide with a specific line; and also corrected a mislabeling.

      - Figure 7B SPARK panel was incorrectly centered. The raw files can be viewed in Figure 7—source data 2.

      - Figure 7—figure supplement 1D was missing an x-axis label.

      - Line 1172 referred to “Supplementary File X”, which we corrected to “Supplementary File 3”.

      - We have updated references to preprints that have since been published, including PMID: 38093015, 37933960, 37966241, and 37610220.

      Editors' comments:

      The proteomics data reported in this study underpin the major findings and are very comprehensive. As noted in the reviews, it is strongly recommended that the authors normalize the levels of detected phosphopeptides against the levels of the parent protein in the different mutant lines in order to identify changes in protein phosphorylation that are linked to protein kinase activity rather than protein degradation. A focus on changes that occur at early time points following protein knock-down may also help to identify the main targets of each kinase.

      Please see our response to Reviewer 2 Recommendations for the Authors, points 1 and 2.

      Reviewer #1 (Recommendations For The Authors):

      During my reading, I only found one small mistake. In Figure 7F, the x-axis is missing the word 'PKA'.

      We have updated the x-axis to read “SPARK-AID/PKA C3-mNG (h. + IAA)”.

      All information, code, and reagents are clearly explained.

      Reviewer #2 (Recommendations For The Authors):

      How the phosphoproteome was analyzed needs to be clarified. The normalization step, computing the ratio of the phosphopeptide to the protein (peptide) intensity, appears omitted. It is the most critical step of the analysis. The minor shifts between protein and phosphosite intensity seem negligible, as seen in Figure 4 AB. The significant changes can only be deduced by calculating this ratio. In the current state, the presented results are inconclusive. The manuscript contains overreaching and often unsupported statements because the data has not been appropriately filtered. Related to this topic, it is advisable to use well-accepted terminology and complete words when describing proteome and phosphoproteome. The interexchange of a "peptide" and a "phosphopeptide" in the text confuses and misleads.

      To clarify the phosphoproteome analysis:

      We cite a previous description of the phosphoproteomics sample preparation workflow (lines 1124-1125 of the first submission for example). Our quantitative phosphoproteomics experiments comprise two datasets generated from the same multiplexed samples. The samples were split at the point of phosphopeptide enrichment. Ninety-five percent of the samples were subjected to phosphopeptide enrichment (titanium dioxide followed by nickel affinity chromatography; “enriched samples”). Five percent of the samples were reserved as a reference for the non-enriched proteome (“non-enriched samples”). To clarify this point, we have added the sentences “Approximately 95% of the proteomics sample was used for phosphopeptide enrichment” and “The remaining 5% of the sample was not subjected to the phosphopeptide enrichment protocol” to the Methods sections, after describing the multiplexing steps.

      The samples were fractionated separately and run separately on an LC-MS system, which is described in the Methods section, for example lines 1130-1149 of the first submission. Raw files of the phosphopeptide-enriched and unenriched samples were analyzed separately, which is described in the Methods section, for example lines 1151-1158 of the first submission. To clarify this point, we have added the sentence “Raw files of the phosphopeptide-enriched and unenriched samples were analyzed separately” to the Methods sections. Many of the search parameters and descriptions of normalization and protein abundances were described in lines 1085-1093 of the first submission in reference to the 24h SPARK depletion proteome. We added this information to the description of the SPARK depletion time course phosphoproteome data analysis: “The allowed mass tolerance for precursor and fragment ions was 10 ppm and 0.02 Da, respectively. False discovery was assessed using Percolator with a concatenated target/decoy strategy using a strict FDR of 0.01, relaxed FDR of 0.05, and maximum Delta CN of 0.05. Only unique peptide quantification values were used. Co-isolation and signal-to-noise thresholds were set to 50% and 10, respectively. Normalization was performed according to total peptide amount. In the case of the unenriched samples, protein abundances were calculated from summation of non-phosphopeptide abundances.”
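
      As an illustration of the two data-processing steps named above (normalization by total peptide amount, and protein abundances from summed non-phosphopeptide abundances), the following is a minimal sketch. It is not the search software's implementation; the function names, data layout, and example values are hypothetical.

```python
# Illustrative sketch only: hypothetical helpers, not the pipeline used in the study.

def normalize_by_total(channels):
    """Scale each channel so that its summed peptide abundance equals the mean channel total."""
    totals = {name: sum(values) for name, values in channels.items()}
    target = sum(totals.values()) / len(totals)
    return {
        name: [value * target / totals[name] for value in values]
        for name, values in channels.items()
    }


def protein_abundance(peptides):
    """Sum non-phosphopeptide abundances per protein.

    peptides: iterable of (protein_id, is_phosphopeptide, abundance) tuples.
    """
    summed = {}
    for protein_id, is_phosphopeptide, abundance in peptides:
        if not is_phosphopeptide:
            summed[protein_id] = summed.get(protein_id, 0.0) + abundance
    return summed


if __name__ == "__main__":
    # Hypothetical two-channel example: channel "b" was loaded with twice the material.
    print(normalize_by_total({"a": [1.0, 3.0], "b": [2.0, 6.0]}))
    print(protein_abundance([("P1", False, 2.0), ("P1", True, 5.0), ("P1", False, 3.0)]))
```

      After scaling, both hypothetical channels carry the same total signal, and the phosphopeptide is excluded from the protein-level sum.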

      We hope that this clarifies how the unenriched sample protein-level abundances were calculated. When we discuss “protein abundance”, we are referencing the unenriched sample summed non-phosphopeptide abundance. Our phosphoproteome analysis was based only on phosphopeptides, as our phosphopeptide enrichment resulted in 99% efficiency, and peptides lacking phosphorylation sites were filtered out before subsequent analyses. We used “peptide” and “phosphopeptide” interchangeably because the only peptide-level analysis performed was based on phosphopeptide abundances. We have changed any mention of “peptide” to “phosphopeptide” in the main text. 

      “The normalization step, computing the ratio of the phosphopeptide to the protein (peptide) intensity, appears omitted. It is the most critical step of the analysis.”:

      Unlike common differential gene expression analysis pipelines, proteomics analysis pipelines are not settled. Many analyses do not perform peptide-to-parent-protein corrections; some normalize phosphopeptide abundances to parent protein abundances calculated from summing non-phosphopeptides or a combination of phosphopeptide and non-phosphopeptides on an ad hoc basis; some calculate global normalization factors based on regressions of protein and phosphopeptide abundances or other pairwise comparisons. A caveat of protein normalization of phosphopeptides is that it over-corrects cases in which protein abundance and phosphorylation are interdependent, as is the case for auto-phosphorylation and some activation loop phosphorylations (PMID: 37394063). We used the approach that retained the greatest complexity of the data, which is to not normalize abundances across different mass spectrometry experiments and discard information that was not in the overlap. We have updated Supplementary File 3.3 to include protein-level quantification values (from Supplementary File 3.2) if measured.
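
      The over-correction caveat can be made concrete with a small sketch of the parent-protein correction that the reviewer requests, which is not the approach taken in this study. The function names and abundance values are hypothetical and chosen only to illustrate the arithmetic.

```python
import math


def log2_fold_change(treated, control):
    """log2 ratio of a treated abundance to its control abundance."""
    return math.log2(treated / control)


def protein_corrected_log2_fc(phos_treated, phos_control, prot_treated, prot_control):
    """Phosphopeptide log2 change minus the parent-protein log2 change."""
    return log2_fold_change(phos_treated, phos_control) - log2_fold_change(
        prot_treated, prot_control
    )


if __name__ == "__main__":
    # Hypothetical case 1: the phosphosite halves while the protein level is unchanged.
    print(protein_corrected_log2_fc(50.0, 100.0, 100.0, 100.0))
    # Hypothetical case 2: the phosphosite and the protein both halve, as could happen
    # if loss of a stabilizing phosphorylation leads to loss of the protein.
    print(protein_corrected_log2_fc(50.0, 100.0, 50.0, 100.0))
```

      In the second case the corrected change is zero, so a site whose phosphorylation and protein stability are interdependent would be scored as unchanged after correction.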

      We clarified that the phosphopeptide abundances and protein-level abundances were derived from different datasets that were each internally normalized (globally centered by total peptide amount). Protein-level abundances were summed from non-phosphopeptide abundances. The calculated log2 changes are based on the globally centered data within each dataset. We analyzed the kinetic profiles of changing phosphopeptide abundances relative to a control using approaches similar to those described for several recent temporally resolved T. gondii phosphoproteomes (e.g. PMID: 37933960, 35976251, 36265000, 29141230) and as described in the Methods. The approach does not first correct for unenriched-sample parent protein abundance—in some applications, unenriched samples are not collected at all; instead, phosphopeptide ratios are median-normalized to non-phosphopeptide ratios (quantified due to inefficient phosphopeptide enrichment) and are individually tested against the null distribution of non-phosphopeptide ratios (e.g. PMID: 36265000, 29141230). We did not use this approach because our phosphopeptide enrichment was 99% efficient (18518 phosphopeptides of 18758 peptides with quantification values). In several cases using our approach, parent protein abundance is not quantified in the unenriched proteome dataset, but phosphopeptides are reliably quantified in the enriched proteome dataset. We note that phosphopeptide abundance changes can be difficult to interpret in such cases, e.g. in the first submission lines 178-186 and 193-194. We have added similar text to the results noting that in the case of PKA and PKG, both unenriched parent protein and enriched phosphopeptide abundances decreased (see below). We have also moved speculation about whether SPARK phosphorylates the activation loop of PKA and PKG, or whether the down-regulation of PKA and PKG arises from indirect effects, to the Discussion.
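
      The enrichment efficiency quoted above follows directly from the two peptide counts. A short check, using only the numbers given in the text (the variable names are illustrative):

```python
# Counts taken from the text above; names are illustrative.
PHOSPHOPEPTIDES = 18518       # phosphopeptides with quantification values
QUANTIFIED_PEPTIDES = 18758   # all peptides with quantification values


def enrichment_efficiency(phosphopeptides, total_peptides):
    """Fraction of quantified peptides that carry a phosphorylation site."""
    return phosphopeptides / total_peptides


efficiency = enrichment_efficiency(PHOSPHOPEPTIDES, QUANTIFIED_PEPTIDES)
non_phosphopeptides = QUANTIFIED_PEPTIDES - PHOSPHOPEPTIDES

if __name__ == "__main__":
    print(f"Enrichment efficiency: {efficiency:.1%}")
    print(f"Non-phosphopeptides remaining: {non_phosphopeptides}")
```

      The ratio is about 98.7%, which rounds to the reported 99%, and leaves 240 non-phosphopeptides, too few to serve as a robust null distribution for median normalization.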

      We have moved comparisons of protein and phosphopeptide abundances from the Results to the Discussion. We added the following sentences to the result section Clustering of phosphopeptide kinetics identifies seven response signatures: “Because non-phosphopeptide and phosphopeptide abundances were quantified in different mass spectrometry experiments, it is challenging to compare the rates of phosphopeptide and parent protein abundance changes, especially when phosphorylation status and protein stability are interconnected. In general, both PKA C1, PKA R, and PKG protein and phosphosite abundances decreased following SPARK depletion (Figure 3—figure supplement 1), as discussed further below. We also observed down-regulation of phosphosite and protein abundances of a MIF4G domain protein.” Figure 3—figure supplement 1E is a new panel that shows PKA C1, PKA R, and PKG phosphopeptide and parent protein abundances along with global changes in phosphopeptide and parent protein abundances in the cases in which both were quantified. We changed lines 278-282 in the first submission to “The SPARK depletion time course phosphoproteome showed a reduction in the abundance of PKA C1 T190 and T341, which are located in the activation loop and C-terminal tail, respectively (Figure 4A). Several phosphosites residing in the N terminus of PKA R (e.g. S17, S27, and S94) also decreased following SPARK depletion (Figure 4B).” We changed lines 313-315 in the first submission to “The SPARK depletion time course phosphoproteome showed a reduction in the abundance of several phosphosites residing in the N terminus of PKG as well as T838, which corresponds to the activation loop (Figure 5A). By contrast, S105 did not greatly decrease, and S40 abundance slightly increased.”

      The description of experiments should be more detailed. For example, the 3, 8, and 24 h treatments were used reversely; thus, they should be emphasized as time points before natural egress. Consequently, it seems that 3h treatment should be prioritized, given the SPARK/SPARKEL role in egress/invasion. Unexpectedly, the study draws more attention to a 24-hour treatment. If the AID-SPARK/SPARKEL is eliminated within 1h, parasites undoubtedly accumulate numerous secondary defects during a prolonged 23h deprivation. Since the SPARK pathway activates kinase/phosphatase cascades, the 24h data is likely overwhelmed with the consequences of the long-term complex degradation, making it a poor source of the putative SPARK substrates. Likewise, the downregulation of PKA observed in the 8 hours after SPARK depletion may be an indirect effect of the SPARK degradation. The direct effects and immediate substrates should be detectable within 2-3h of auxin treatment of the nearly egressing cultures.

      The first submission described how parasites were harvested at 32 hours post-infection with 0, 3, 8, or 24 hours of IAA treatment (lines 157-160, 1097-1110, and Figure 3B). To reiterate this experimental detail, we have added “harvested 32 hours post-infection” to the sentence “...quantitative proteomics with tandem mass tag multiplexing that included samples with 0, 3, 8, and 24 hours of SPARK or SPARKEL depletion” and similarly in the figure legend. The time points are unrelated to natural egress because the experiment was terminated at 32 hours post-infection, which is earlier than the window typically used to study natural egress under these conditions (40-48 hours post-infection). We chose to terminate the experiment before natural egress to better localize phosphopeptide changes related to SPARK depletion. The phosphoproteome undergoes dramatic reorganization during egress due to the activity of myriad kinases and phosphatases (see PMID: 35976251, 37933960, and 36265000), which would have likely complicated the signal.
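
      The timing of the experiment can be laid out as simple arithmetic. The sketch below uses only the values stated above (harvest at 32 hours post-infection, IAA durations of 0, 3, 8, and 24 hours, and a natural egress window of 40-48 hours post-infection); the variable and function names are illustrative.

```python
HARVEST_HPI = 32                   # hours post-infection at harvest
IAA_DURATIONS_H = (0, 3, 8, 24)    # hours of IAA treatment before harvest (0 = no IAA)
EGRESS_WINDOW_HPI = (40, 48)       # window typically used to study natural egress


def iaa_addition_time(duration_h, harvest_hpi=HARVEST_HPI):
    """Hours post-infection at which IAA is added for a given treatment duration."""
    return harvest_hpi - duration_h


schedule = {duration: iaa_addition_time(duration) for duration in IAA_DURATIONS_H}

if __name__ == "__main__":
    for duration, added_at in schedule.items():
        print(f"{duration:>2} h IAA: added at {added_at} h post-infection")
    print("Harvest precedes egress window:", HARVEST_HPI < EGRESS_WINDOW_HPI[0])
```

      All samples are therefore the same age at harvest, and only the time of IAA addition differs between conditions.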

      A pivotal result motivating time-course experiments and analysis was that SPARK/SPARKEL's role in egress and invasion emerges only after an extended depletion period (Figure 2E–J, first submission lines 126-145). The 24h depletion was used in the experimental system that first identified SPARK as a regulator of egress, which motivated our initial experiments, as stated in the first submission lines 126-144 and 149-151. We draw attention to the observation that SPARK and SPARKEL phenotypes develop over time in the first submission, lines 137-145. The role for SPARK/SPARKEL in egress/invasion does not manifest at 3h depletion; it manifests at 24h depletion. To ensure that this point is not overlooked by the reader, we have created a new heading in the Results section (SPARK and SPARKEL depletion phenotypes develop over time) for the paragraph that was previously lines 137-145. The remainder of the manuscript integrates data from proteomic, genetic, and cell-based assays across temporal dimensions to build a working model of how the phenotypes associated with SPARK depletion develop over time.

      Underpinning this comment is an assumption that phosphopeptides that decrease the most rapidly following a kinase’s depletion are direct substrates, whereas phosphopeptides that decrease with slower kinetics are not. This is not always the case. Consider a kinase that phosphorylates sites on substrate A and substrate B. The site on substrate A is also the target of a phosphatase, whereas the site on substrate B is recalcitrant to phosphatase activity. If the kinase were inhibited, then the site on substrate A would be actively dephosphorylated. As measured by a phosphoproteomics experiment, the abundance of the substrate A phosphopeptide would drop rapidly due to the inactivity of the kinase and activity of the phosphatase. In the text, we called such sites “constitutively regulated” or dynamic—they are actively dephosphorylated and phosphorylated within a short timeframe. The phosphosite on substrate B is comparatively static; once it is phosphorylated by the kinase, it is unaffected by subsequent inhibition of the kinase. Only newly synthesized substrate B molecules would be affected by kinase inhibition. As measured by a phosphoproteomics experiment, the abundance of the substrate B phosphopeptide would drop more gradually after kinase inhibition, as the unphosphorylated peptide is found only on newly synthesized proteins that were not previously exposed to kinase activity. An example of the scenario described for substrate A would be that of yeast Cdk1 T14/Y15, which is phosphorylated by Wee1 and dephosphorylated by Cdc25 (e.g. PMID: 7880537). An example of the scenario described for substrate B would be that of the human PKA C activation loop T197, which is phosphorylated by PDK1 and is phosphatase-resistant under physiological conditions (e.g. PMID: 22493239, 15533936).
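
      The contrast between substrate A and substrate B can be sketched with a toy model. This is an illustration under strong assumptions, not a fit to the data: the kinase is treated as fully inactive from time zero, the phosphatase rate constant is a hypothetical value, and the protein pool is assumed to double every 8 hours with no degradation, so that pre-existing phosphorylated molecules are only diluted by newly synthesized, unphosphorylated ones.

```python
import math


def dynamic_site(t_h, k_phosphatase_per_h=2.0):
    """Substrate A: fraction of the phosphosite remaining when a phosphatase removes it.

    k_phosphatase_per_h is a hypothetical first-order rate constant.
    """
    return math.exp(-k_phosphatase_per_h * t_h)


def static_site(t_h, doubling_time_h=8.0):
    """Substrate B: phosphatase-resistant site, diluted only by new protein synthesis.

    Assumes the protein pool doubles every doubling_time_h hours with no degradation.
    """
    return 2.0 ** (-t_h / doubling_time_h)


if __name__ == "__main__":
    for t in (3, 8, 24):
        print(
            f"{t:>2} h after kinase loss: "
            f"substrate A {dynamic_site(t):.3f}, substrate B {static_site(t):.3f}"
        )
```

      Under these assumptions the substrate A site is almost fully lost by 3 hours, whereas the substrate B site declines gradually over 8 to 24 hours, although both are direct targets of the same kinase.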

      Both substrate A and B may be “direct” and functionally relevant targets of the kinase. Categorizing substrates as “immediate” is comparatively less informative in this context (although it may be relevant when studying fast, synchronized processes with high temporal resolution, such as induced Plasmodium spp. gametocyte activation or stimulation of T. gondii secretion). Furthermore, our earlier experiments had shown that the role for SPARK/SPARKEL in motility manifests after 3h depletion and is complete by 24h depletion. By this logic, we were most interested in the candidates showing differences at these time points. We conducted proximity labeling experiments to identify the overlap of proteins that exhibited SPARK-dependent decreases in the global proteomics and were also proximal to SPARK in space (first submission Figure 3I and lines 260-275), thus revealing a prioritized list of candidates, which included PKG and PKA. When technically feasible, we included a temporal dimension to follow-up experiments, rather than relying on a 24h terminal comparison (e.g. Figure 4E–H, Figure 5D–E, Figure 7D–F, Figure 7I–K; all first submission).

      Fig. 2 (B and C). What antibodies had been used to detect tagged proteins? There is a concern regarding the use of multiple tags attached to the same protein to the point that it doubles the size of the studied protein. The switch of the mobility of the SPARK and SPARKEL on the WB due to a change in MW adds to the confusion. Furthermore, the study did not use all the fused epitopes (e.g., HA). At the same time, the same V5 tag was used to detect two factors in the same parasite. Although the controls are provided, they do not eliminate the possibility that the second band on the WB results from degradation of one protein rather than the presence of two individual proteins. Different tags should be used to confirm the co-expression of two proteins. Panel E is missing the X-axis label.

      Figure 2B was incorrectly labeled; the labels corresponding to SPARK and SPARKEL were switched. We corrected this error in the revised figures. The antibodies used were mouse monoclonal anti-V5 as described in the key resources table of the first submission. We added “V5” to Figure 2A and 2B. Regarding the effect of the tagging payload attached to the proteins, we have included in all assays a control relative to a parental strain (TIR1) without a tagging payload, and additionally included internal controls within tagged strains to calculate dependency of a phenotype on IAA treatment. The western blots in Figure 2B and 2C are from two different strains and experiments. The strains and experiments are described in the first submission main text (lines 113-124), the figure legend (lines 1847-1850), the key resources table, and the methods (lines 650-664, 872-891). A description of the SPARK-AID/SPARKEL-mNG strain was included in the key resources table but omitted in the methods. We therefore added the following section to the Methods:

      “SPARKEL-V5-mNG-Ty/SPARK-V5-mAID-HA/RHΔku80Δhxgprt/TIR1

      The HiT vector cutting unit gBlock for SPARKEL (P1) was cloned into the pALH193 HiT empty vector. The vector was linearized with BsaI and co-transfected with the pSS014 Cas9 expression plasmid into SPARK-V5-mAID-HA/RHΔku80Δhxgprt/TIR1 parasites. Clones were selected with 1 µM pyrimethamine and isolated via limiting dilution to generate the SPARKEL-V5-mNG-Ty/SPARK-V5-mAID-HA/RHΔku80Δhxgprt/TIR1 strain. Clones were verified by PCR amplification and sequencing of the junction between the 3′ end of SPARKEL (5’-GGGAGGCCACAACGGCGC-3’) and 5′ end of the protein tag (5’-gggggtcggtcatgttacgt-3’).”

      To clarify the expected MW of each species, we have added the following text to the Methods:

      “The expected molecular weight of SPARKEL-V5-HaloTag-mAID-Ty is 66 kDa, from the 42.7 kDa tagging payload and 23.3 kDa protein sequence. The expected molecular weight of SPARK-V5-mCherry-HA is 89.7 kDa, from the 31.9 kDa tagging payload and 57.8 kDa protein sequence. The expected molecular weight of SPARK-V5-mAID-HA is 71.3 kDa, from the 13.5 kDa tagging payload and 57.8 kDa protein sequence. The expected molecular weight of SPARKEL-V5-mNG-Ty is 55.2 kDa, from the 31.9 kDa tagging payload and 23.3 kDa protein sequence.”
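
      The expected molecular weights above are the sum of the tagging payload and the untagged protein sequence. A short check of that arithmetic, using only the values given in the text (the dictionary layout and function name are illustrative):

```python
# Molecular weights in kDa, taken from the text above.
PROTEIN_KDA = {"SPARK": 57.8, "SPARKEL": 23.3}

# Construct name -> (untagged protein, tagging payload in kDa)
CONSTRUCTS = {
    "SPARKEL-V5-HaloTag-mAID-Ty": ("SPARKEL", 42.7),
    "SPARK-V5-mCherry-HA": ("SPARK", 31.9),
    "SPARK-V5-mAID-HA": ("SPARK", 13.5),
    "SPARKEL-V5-mNG-Ty": ("SPARKEL", 31.9),
}


def expected_mw(construct):
    """Expected molecular weight: protein sequence plus tagging payload, to 0.1 kDa."""
    protein, payload_kda = CONSTRUCTS[construct]
    return round(PROTEIN_KDA[protein] + payload_kda, 1)


if __name__ == "__main__":
    for name in CONSTRUCTS:
        print(f"{name}: {expected_mw(name)} kDa")
```

      The tagged SPARK and SPARKEL species in each strain differ by at least 16 kDa, which is the basis for distinguishing them on a blot probed with a single anti-V5 antibody.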

      SPARK and SPARKEL are expressed at low levels, which may have been compounded by basal degradation due to the AID tag (see for example Figure 3—figure supplement 1D of the first submission). We attempted several immunoblot conditions and antibodies, and only the V5 antibody proved effective in recognizing these proteins above the limit of detection. For this reason, we included an additional single-tagged control in each immunoblot experiment. Uncropped images of the blots are included in the first submission as Figure 2—figure supplement 1D and E and as Figure 2 source data. We added the following statement to the results section of the text:

      “However, SPARK and SPARKEL abundances are low and approach the limit of detection. We could only detect each protein by the V5 epitope. Although our experiments included single-tagged controls, we cannot formally eliminate the possibility that SPARK-AID yields degradation products that run at the expected molecular weight of SPARKEL. More sensitive methods, such as targeted mass spectrometry, may be required to measure the absolute abundance and stoichiometries of SPARK and SPARKEL.”

      We added “h +IAA” to the x-axis of panel 2E.

      Fig. 3. There is plentiful proteomic data on the factor-depleted parasites. Can it be used to confirm the co-degradation of the SPARK/SPARKEL complex components? This figure mainly includes quality control data that can be moved to Supplement. Did you detect SPARKEL in the TurboID experiment described in panel I? The plot shows only an AGC kinase.

      SPARK and SPARKEL are expressed at low levels, and we often do not detect SPARK or SPARKEL peptides with quantification values in complex samples (such as global depletion proteomes and phosphoproteomes; IPs and streptavidin pull-downs are comparatively less complex, with IPs being the least complex samples). We discussed this caveat in the first submission lines 178-186. To additionally clarify this point, we have added “We were unable to measure SPARK or SPARKEL abundances in this proteome” earlier in the text.

      We consider the figure panels relevant to the discussion in the text.

      SPARKEL was not quantified in the SPARK-TurboID experiment (Supplementary File 2). We have added “SPARKEL was not quantified in this experiment” to the text. “Not quantified” is a different outcome from “quantified but not enriched”. The interaction between SPARK and SPARKEL is supported by five other independent interaction experiments in which SPARKEL was quantified (Figure 1A, 1D, 1E; and Figure 1—figure supplement 1). The added insight from the SPARK proximity labeling experiments comes from integration with the global proteomics, which suggests that AGC kinases are in proximity to SPARK and exhibit SPARK-dependent stability and hence activity. The logic of the proximity labeling experiment is described in lines 258-275 of the first submission.

      Fig. 6G is missing a ∆BFD1 control for unbiased evaluation of the SPARK KD effect.

      The logic of this experiment was to evaluate whether excess differentiation caused by SPARK and PKA C3 depletion (Figure 6A and 6B) was dependent on the BFD1 circuit. The ∆bfd1 phenotype is well-established under these experimental conditions: parasites lacking BFD1 do not differentiate under spontaneous or alkaline conditions (e.g. PMID: 31955846, 37081202, 37770433). Parasites lacking BFD1 do not differentiate when SPARK and PKA C3 are depleted, suggesting that differentiation caused by SPARK or PKA C3 depletion occurs through the BFD1 circuit. If differentiation caused by SPARK or PKA C3 depletion did not depend on the BFD1 circuit, we might have observed differentiation in the SPARK- and PKA C3-AID/∆bfd1 mutants.

      To clarify this point, we have changed the first sentences of the last paragraph in the results section Depletion of SPARK, SPARKEL, or PKA C3 promotes chronic differentiation: “To assess whether excess differentiation caused by SPARK and PKA C3 depletion is dependent on a previously characterized transcriptional regulator of differentiation, BFD1 (Waldman et al., 2020), we knocked out the BFD1 CDS with a sortable dTomato cassette in the SPARK- and PKA C3-AID strains (Figure 6–figure supplement 1). The resulting SPARK- and PKA C3-AID/∆bfd1 mutants failed to undergo differentiation as measured by cyst wall staining (Figure 6G–H), suggesting that differentiation caused by depletion of these kinases depends on the BFD1 circuit.”

      Lines 239-242. The logic behind the categories of "constitutively regulated sites" and "newly synthesized proteins dependent on SPARK activation" is odd. The former (3h treatment) represents the SPARK-specific events (even though it should be shortened to 1-2h), while an 8h treatment is already contaminated with secondary effects. Since Toxoplasma divides asynchronously, the "newly synthesized" proteins will be present at the time. Also, the protein phosphorylation does not always lead to substrate activation; it can be repressive, too.

      We describe the logic in response to a comment above (substrate A vs. substrate B). It is correct that T. gondii divides asynchronously, with a cell cycle of approximately 8 hours, and 60% of parasites in G1 at a given time (PMID: 11420103). The proteomics experiments measure peptide and protein abundances at a population level. Newly synthesized proteins will be present at all time points; but the proportion of proteins synthesized after SPARK depletion relative to proteins synthesized before SPARK depletion will increase over time.

      We moved lines 238-243 from the first submission to the Discussion.

      It is accurate that phosphorylation does not always lead to substrate activation; it can also be repressive or not change substrate behavior. However, in the case of protein kinases, activation loop phosphorylation is highly correlated with activation (e.g. PMID: 15350212, 31521607).

      Line 250-252: Because the SPARK degradation did not affect intracellular replication, SPARK is unlikely to affect cell cycle-specific phosphorylation.

      To parallel the prior sentences describing different SPARK-dependent down-regulated clusters, we truncated this sentence to “The final cluster of depleted phosphopeptides, Cluster 4, only exhibits down-regulation at 8h of IAA treatment.”

      SPARKEL depletion did not significantly affect intracellular replication under the time frames investigated here (approximately 25 hours post-invasion; Figure 2D). A prior study reported that SPARK depletion did not affect intracellular replication measured on a similar timescale (PMID: 35484233).

      The opening sentence of the Discussion: Typically, we refer to the newly discovered proteins as the orthologs of the previously discovered counterparts and not the vice versa. Thus, calling Toxoplasma SPARK the ortholog of mammalian PDK1 would be more appropriate.

      We changed the opening sentence of the Discussion to “SPARK is an ortholog of PDK1, which is considered a key regulator of AGC kinases”.

      Reviewer #3 (Recommendations For The Authors):

      (1) Authors should show alignment of SPARKEL with Elongin C. Are key residues conserved?

      We have added an alignment of the SKP1/BTB/POZ domains of Homo sapiens elongin C, S. cerevisiae elongin C, and T. gondii SPARKEL as Figure 1—figure supplement 1B. This panel highlights elongin B interface, cullin binding sites, and target protein binding sites based on the human elongin C annotation. As discussed below, these interfaces may not be functionally conserved in T. gondii. Ultimately, future mechanistic and structural studies beyond the scope of the current work will be required to determine how SPARK and SPARKEL physically interact. The Discussion states, “further biochemical studies are required to discern the regulatory interactions between SPARK and SPARKEL” (lines 590-591).

      (2) The failure to identify other Elongin B/C complex members should be addressed by direct IP analysis.

      Indeed, elongin C has traditionally been characterized as a component of multisubunit complexes comprising Elongin A/B/C or Elongin BC/cullin/SOCS that regulate transcription or function as ubiquitin ligases, respectively (for a review, PMID: 22649776). We see two major issues when attempting to generalize these results to apicomplexan parasites. First, nearly all studies of the function of elongin C have been conducted in a single eukaryotic supergroup (the opisthokonts, including yeast and metazoans). The majority of eukaryotic diversity exists in other supergroups, including the SAR supergroup to which apicomplexans such as T. gondii belong (PMID: 31606140). Proteins with elongin C domains may serve alternative and unexplored functions in non-opisthokont unicellular eukaryotes. Second (in support of the first), we were unable to find orthologs of many of the opisthokont complex members in T. gondii, as systematically described below.

      By BLAST, the most similar protein to SPARKEL in S. cerevisiae is ELC1 (YPL046C), with a BLAST E = 0.003. The next most similar protein was SCF ubiquitin ligase subunit SKP1 (YDR328C) with an E value of 0.62. ELC1 is 99 amino acids. The Elongin C (IPR039948) and SKP1/BTB/POZ superfamily domains (IPR011333) span most of this sequence. SPARKEL is 216 amino acids; the Elongin C and  SKP1/BTB/POZ superfamily domains occupy the C-terminal half of the protein. The N-terminal domain of SPARKEL may be important for its function; however, future work is required to address this hypothesis.

      Elongin B: Elongin B is not found universally amongst even opisthokonts; fungi and choanoflagellates lack obvious orthologs. The most similar T. gondii protein to human Elongin B (Q15370) by BLAST is TGME49_223125 (E = 0.017), an apicoplast ubiquitin-like protein PUBL (PMID: 28655825, 33053376). TGME49_223125 has a C-terminal ubiquitin-like domain (IPR000626) but no ELOB domain (IPR039049); indeed, no T. gondii protein has an ELOB domain that can be identified by sequence searching. Given the lack of similarity between EloB and TGME49_223125, as well as this protein’s possible red algal endosymbiont origin, we consider it an unlikely ortholog of EloB and topologically unlikely to  interact with the SPARK/SPARKEL complex. We did not detect TGME49_223125 in SPARK or SPARKEL IPs (Supplementary File 1).

      Elongin A: T. gondii appears to lack a human elongin A ortholog (Q14241) on the basis of sequence similarity. The most similar T. gondii protein to yeast Elongin A (O59671) by BLAST is TGME49_299230 (E = 0.022). Yeast EloA is 263 amino acids. TGME49_299230 is 1101 amino acids and does not have an EloA domain (IPR010684), suggesting it is not a true EloA ortholog.

      Suppressor of cytokine signaling (SOCS): T. gondii appears to lack human SOCS1 or SOCS2 orthologs (O15524 and O14508) on the basis of sequence similarity. We were unable to identify T. gondii proteins with SOCS domains (PF07525, SM00253, SM00969, and SSF158235).

      Von Hippel-Lindau tumor suppressor (VHL): T. gondii appears to lack a human VHL ortholog (P40337) on the basis of sequence similarity.  We were unable to identify T. gondii proteins with VHL domains (IPR024048, IPR024053, PF01847, and SSF49468).

      Cul-2/5: Cullins appeared early in the eukaryotic radiation (PMID: 21554755), and thus T. gondii possesses several. Since the ELC complex has been best characterized with human cullin-2 (Q13617) and cullin-5 (Q93034), we searched for orthologs of these proteins and identified TGME49_289310, TGME49_289310, and TGME49_316660. TGME49_289310 functionally resembles cullin-1 of the SCF complex (PMID: 31348812). None of these proteins were enriched in the SPARK or SPARKEL IPs (Supplementary Table 1).

      Rbx1: We searched for human Rbx1 orthologs (P62877) and identified TGME49_213690, which functionally resembles Rbx1 of the SCF complex (PMID: 31348812); as well as several other RING proteins (TGME49_267520, TGME49_277740, TGME49_261990, and TGME49_232160) that were not found in the SPARK or SPARKEL IPs (Supplementary File 1).

      Rbx2: We searched for human Rbx2 orthologs (Q9UBF6) and identified several RING proteins (TGME49_285190, TGME49_254700, TGME49_292340, TGME49_226740, TGME49_244610, and TGME49_304460) that were not found in the SPARK or SPARKEL IPs (Supplementary File 1). No T. gondii protein has an Rbx2 domain (cd16466) that can be identified by sequence searching.

      In conclusion, we conducted “direct IP analysis” (Figure 1A, 1D; Figure 1-supplement 1A) of the SPARK and SPARKEL complex in the first submission of the manuscript. The observation that SPARK and SPARKEL form strong interactions was validated in cellulo via proximity labeling (Figure 1E; Figure 1-supplement 1B) in the first submission of the manuscript. These results are described together in the results section SPARK complexes with an elongin-like protein, SPARKEL (lines 75-110, first submission of manuscript). The failure to identify an interaction between SPARKEL and Elongin B/C complex members in T. gondii may be due to the observation that Elongin B and several ELC complex members do not exist in most eukaryotes, including T. gondii. We added the sentences “The function of proteins with Elongin C-like domains has not been widely investigated in unicellular eukaryotes” to the Results and “However, the SPARK and SPARKEL IPs and proximity experiments failed to identify obvious components of ubiquitin ligase complexes” to the Discussion.

      (3) PKA and PKG half-lives should be measured as well as their transcript abundances.

      The finding that PKA C1 and PKG protein abundances decreased upon SPARK/SPARKEL depletion was internally consistent across experiments. This down-regulation may be due to transcriptional, translational, or post-translational mechanisms. We measured PKG and PKA C1 transcript abundances in SPARK-AID and TIR1 parasites after 24 hours of IAA treatment using RT-qPCR. We did not detect significant differences in transcript levels of the queried kinases. These findings suggest that SPARK depletion leads to PKG and PKA down-regulation through post-transcriptional mechanisms. Translational control is normally enacted globally, for example through regulation of eukaryotic translation factors (PMID: 15459663). The rapid and specific down-regulation of PKG and PKA C1 would suggest that the kinase abundance levels are regulated by non-global translational mechanisms (e.g. mRNA-specific) or rather post-translational mechanisms.

      Substantial additional work is required to determine protein half-lives in eukaryotic parasites. In our discussion of possible mechanisms and models, we were agnostic as to the cause of reduced PKG and PKA abundances upon SPARK depletion. We note in the discussion, “The cause for reduction of PKA C1 and PKG levels requires further study” (lines 541-542).

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #2:

      (1) P-values should be reported adjusted for multiple tests or, at the very least, note that they are unadjusted to alert the reader that they may be biased by winner's curse.

      Throughout the manuscript, we applied a false discovery rate threshold to declare which results were statistically relevant for discussion. For reporting in the abstract, however, we believe raw p-values are the most straightforward, as we only reported the most important and robust results, and considering that 1) multiple testing correction does not change the ranking of the p-values; 2) p-value adjustment depends on both the method and the number of hypotheses tested; and 3) all reporting of the most significant discovery results is prone to winner’s curse, but in the context of our study the GFI1 finding was confirmatory in nature, so the raw p-value allows for a direct comparison with existing studies.

      We have taken the suggestion to quote the FDR-adjusted p-values throughout the manuscript for meta-analyzed results, and we discuss how the impact of FDR correction differed between the EWAS and the MRS associations as a result of the number of hypotheses in each context:

      “For each EWAS or meta-analysis, the false discovery rate (FDR) adjustment was used to control multiple testing and we considered CpGs that passed an FDR-adjusted p-value < 0.05 to be relevant for maternal smoking.”

      “An FDR adjustment was used to control the multiple testing of meta-analyzed association between MRS and 25 (or 23, depending on the number of phenotypes available in the cohort) outcomes, and we considered association that passed an FDR-adjusted p-value < 0.05 to be relevant.”
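      The FDR adjustment described above can be sketched as follows. This is a minimal illustration that assumes the Benjamini-Hochberg procedure, which the response does not name explicitly; the function name and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the response only says "FDR adjustment"; BH is used
    here as the most common choice, not as the authors' stated method.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five tests (illustration only).
raw = [0.0005, 0.009, 0.02, 0.045, 0.30]
adj = benjamini_hochberg(raw)
# Flag associations that pass an FDR-adjusted p-value < 0.05.
relevant = [p < 0.05 for p in adj]
```

      In this sketch the adjusted values never reorder the tests, but the threshold a raw p-value must meet depends on how many hypotheses are tested, which is why the same raw p-value can pass in the 25-outcome MRS analysis and fail in an epigenome-wide scan.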

      (2) The odds ratios and p-values reported in the abstract for associations of the MRS with smoking status and smoking exposure per week appear to be missing from the results section of the manuscript or (supplementary) tables.

      The results for smoking status during pregnancy were added to the results:

      “As a result, the epigenetic maternal smoking score was strongly associated with smoking status during pregnancy (OR=1.09 [1.07,1.10], p=5.5×10⁻³³) in the combined European cohorts.”

      The exposure association was reported in the result section and Supplementary Table 8. We do note the typo in the cohort specific p-values, which now has been corrected.

      (3) It is misleading to report a lack of MRS associations with maternal smoking in South Asians without also stating that there were only two smokers.

      We agree with the reviewer that an association test would not be justified given the lack of smoking in the present South Asian cohort. We also removed the p-value of association for the START cohort in Figure 3, based on this and comment #4 from reviewer #3. The relevant results have been revised as follows:

      “The HM450 MRS was significantly associated with maternal smoking history in CHILD and FAMILY (n = 397), but we failed to meaningfully validate the association in START (n = 503; Figure 3) – not surprisingly – due to the low number of ever-smokers (n = 2).”

      (4) It is potentially confusing to report MRS associations with maternal smoking by ethnicity but then report associations with birth size and length combined without any explanation. The most novel result of this study is that there is virtually no maternal smoking among the South Asians and yet the MRS is associated with birth weight and size and with height at age 2. This result is buried in the combined analysis. I would suggest reporting the MRS associations with height and weight separately as has been done for maternal smoking behavior.

      We thank the reviewer for this suggestion. These results have now been added as the new Table 3, which shows the cohort-specific and meta-analyzed effect sizes. In the revision, we highlighted the ethnicity-specific MRS associations, such as those with smoking exposure at various ages (1 and 3 years) and with skinfold thickness, which were present in the European cohorts but not in the South Asian cohort, as well as associations that were more homogeneous, such as the birth weight and body size associations in the combined cohorts. In particular, the MRS in the South Asian cohort exhibited a consistent association with body size at various time points (at birth and at 1, 2, and 5 years) with similar effect sizes. The following was added to the results:

      “A higher maternal smoking MRS was significantly associated with smaller birth size (-0.37±0.12, p = 0.0023; Table 3) and height at 1, 2, and 5 year visits in the South Asian cohort (Table 3). We observed similar associations with body size in the white European cohorts (heterogeneity p-values> 0.2), collectively, the MRS was associated with a smaller birth size (-0.22±0.07, p=0.0016; FDR adjusted p = 0.019) in the combined European and South Asian cohorts (Table 3). Meanwhile, a higher maternal smoking MRS was also associated with a lower birth weight (-0.043±0.013, p = 0.001; FDR adjusted p = 0.011) in the combined sample, though the effect was weaker in START (-0.03±0.02; p = 0.094) as compared to the white European cohorts.

      The meta-analysis revealed no heterogeneity in the direction nor the effect size of associations for body size and weight between populations at birth or at later visits (heterogeneity p-values = 0.16–1; Supplementary Table 8).”

      Reviewer #3:

      (1) You mention that the 450K Score performs best even though only 10/143 are included for some populations. Did you explore recalibration of the MRS using only those 10 CpGs?

      We thank the reviewer for this comment. Due to an error in transferring results, the number of overlapping CpGs between the 450K score and the targeted array was reported incorrectly; it is in fact 26. This error only affected results for the FAMILY study using the HM450K score and did not materially change our results or conclusions. We have accordingly updated Table 3, Suppl. Tables 5, 8, and 9, Figure 3-B, and Suppl. Figures 5, 6-B, 7-B, and 7-D, as well as the meta-analyzed MRS associations throughout the manuscript.

      The subset of 26 CpGs using the originally derived weights was expected to perform worse than the original HM450K score using the full 143 CpGs. When we did restrict the methylation score construction to these 26 CpGs, the performance in CHILD was worse than the original score, but comparable to FAMILY (updated Suppl. Table 5). These 26 CpGs did overlap with the targeted score derived in CHILD (13 out of 15 present) and in FAMILY (19 out of 63 present), suggesting moderate agreement between the array platforms as well as across studies.

      In other words, while the subset of 26 CpGs had reasonable performance in both CHILD and FAMILY, both studies could benefit by inclusion of the additional CpGs in the original score. We have included a sentence to discuss the choice of validation study and the trade-off between sample size and # of CpGs under response to Reviewer 3 comment # 2.

      (2) Could the internal validation performance be driven by sample size of the training, providing support for the need for larger training sizes? Should this be discussed in the study?

      The validation study, CHILD, has the smaller sample size of the two European cohorts. While both candidate validation datasets had small sample sizes, we chose CHILD (n=347) rather than FAMILY (n=397) because it had better coverage of the CpGs from the discovery EWAS, i.e. the training data (# of associated CpGs = 3,092, n = 5,647). Beyond the signals of association, the validation performance also depends on a mix of overall sample size and the proportion of current smokers. Given the proportion of current smokers, the effective sample sizes for a direct comparison, i.e. the equivalently powered sample sizes of a balanced (50% cases, 50% controls) design, are 41.7 and 104.7 for CHILD and FAMILY, respectively. While we are unable to directly test whether a larger effective sample size produces a better-performing score, we believe this to be the case, and thus a larger validation study would boost the performance of the methylation score. We have added the following to the discussion:

      “Given the proportion of current smokers, the effective sample size for a direct comparison between CHILD and FAMILY, i.e. equivalently-powered sample size of a balanced (50% cases, 50% controls) design, were 41.7 and 104.7, respectively. While CHILD had a lower effective sample size, we ultimately chose it for validating the methylation score to better cover the CpGs that were significant in the discovery EWAS. A larger validation study will likely further boost the performance of the methylation score and be considered in future research.”
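      The notion of an equivalently powered balanced design can be made concrete with a short sketch. It assumes the commonly used definition of effective sample size for a case-control comparison, 4 / (1/n_cases + 1/n_controls); the response does not state which formula was used, and the counts below are hypothetical rather than the cohorts' actual smoker counts.

```python
def effective_sample_size(n_cases, n_controls):
    """Size of a balanced (50/50) design with equivalent power.

    Assumed definition: 4 / (1/n_cases + 1/n_controls), written here in
    the algebraically equivalent product form.
    """
    return 4.0 * n_cases * n_controls / (n_cases + n_controls)


# Hypothetical cohorts of 100 participants each (illustration only).
balanced = effective_sample_size(50, 50)    # balanced design keeps its full size
unbalanced = effective_sample_size(10, 90)  # few cases shrink the effective size
```

      Under this definition a cohort with few smokers contributes far less information than its nominal size suggests, which is the trade-off the response describes between CHILD and FAMILY.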

      (3) Figure 1: It is very helpful to have an overview diagram, but this should then follow the flow of the manuscript to aid the reader. Currently, the diagram does not follow the flow of the manuscript and thus is rather confusing - for instance, the figure starts with the MRS but initially an EWAS is conducted in the manuscript itself. I suggest to adapt the overview figure accordingly. Moreover, a description for (A), (B), (C) is not provided in the figure legends. Figure 1 could thus be improved further.

      We thank the reviewer for the suggestion to improve the key figure that summarizes the manuscript. The EWAS workflow for the primary, secondary and tertiary outcomes, as well as the European cohorts meta-analysis, has been added to the updated sub-figure A). The description for each subfigure has also been added to the figure legends as follows:

      “Figure 1-A) shows the epigenome-wide association studies conducted in the European cohorts (CHILD and FAMILY); Figure 1-B) illustrated the workflow for methylation risk score (MRS) construction using an external EWAS (Joubert et al., 2016) as the discovery sample and CHILD study as the external validation study, while Figure 1-C) demonstrates the evaluation of the MRS in two independent cohorts of white European (i.e. FAMILY) and South Asian (i.e. START). The validated MRS was then tested for association with smoking specific, maternal, and children phenotypes in CHILD, FAMILY, and START, as shown in Figure 1-D).”

      (4) Figure 3: The readability and information content in this figure, and other figures containing boxplots (e.g., Supplementary Figure 5), could be improved. I would suggest to justify X axis labels to the axis rather than overlapping, and importantly, show individual data points wherever possible (e.g., overlaying the box plots). In c), the ANOVA is not justified given the sample size in START. In general, it is worth excluding the START cohorts from this analysis on the justification of a too small sample size for maternal smokers.

      We thank the reviewer for their thoughtful points for improvement. The axis labels have been wrapped to avoid overlapping, and the data points added to the boxplots. The ANOVA p-value for START was removed from the figure and throughout the manuscript due to the low count of smokers. However, we retained START in Figure 3 and other boxplots to show the distribution of the score for non-smokers to benchmark with the European cohorts.

      (5) In addition to boxplots, it may be helpful to show AUC diagrams for ROC curves (e.g. Figure 3). AUCs are reported in the Tables but not shown. Additionally, all AUC results should include 95% Confidence intervals.

      This is a great suggestion and we have added the corresponding ROC, annotated with AUC (95% CI) to Figure 3. The 95% CI for all AUC results were added to the Tables and main text. The following was added to Methods:

      “The reported 95% confidence interval for each estimated AUC was derived using 2,000 bootstrap samples.”
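      A bootstrap interval of this kind can be sketched as below. This is a minimal illustration that assumes a percentile bootstrap over participants; the response does not say which bootstrap variant or software was used, and the function names and simulated scores are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def bootstrap_auc_ci(scores, labels, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed variant)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        # Resample participants with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that lack cases or controls; AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((1 - level) / 2 * n_boot)]
    upper = stats[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Simulated scores for 15 smokers and 85 non-smokers (illustration only).
rng = random.Random(1)
labels = [1] * 15 + [0] * 85
scores = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]
point = auc(scores, labels)
lo, hi = bootstrap_auc_ci(scores, labels)
```

      With few smokers the interval is typically wide, which is one reason reporting it alongside the point estimate is informative.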

      (6) Supplementary Figure 6: It could be helpful to discuss the amount of overlap between the different MRS.

      Most of the scores were derived using the Joubert et al., (2016) EWAS as the discovery sample, including ours, and thus there will be overlap between the scores. The exception was the GondaliaScore, which contained only 3 CpGs that do not overlap with any other scores.

      While different scores might not have selected completely identical sets of CpGs, the mapped genes are highly consistent across the scores. We have added to the discussion and results the extent of overlap between the top scores:

      “In particular, scores that were derived using the Joubert EWAS as the discovery sample, including ours, had higher pairwise correlation coefficients across the birth cohorts, with many of the CpGs mapping to the same genes, such as AHRR, MYO1G, GFI1, CYP1A1, and RUNX3.”

      (7) Supplementary Figure 7: This figure is never referenced in the text and from the legend itself it is not too clear what it is trying to show. Please refer to it in the main text with some additional context.

      Supplementary Figure 7 was referenced in the Results under subsection “Methylation Risk Score (MRS) Captures Maternal Smoking and Smoking Exposure”, following the Methods subsection “Statistical analysis” where we wanted to examine a systematic difference. We revised the main text to clarify the analysis:

      “For the derived MRS, we empirically assessed whether a systematic difference existed in the resulting score with respect to all other derived scores. This was examined via pairwise mean differences between the HM450 and other score using a two-sample t-test and an overall test of mean difference using an ANOVA F-test, among all samples and the subset of never smokers.”

      (8) Tables: Tables are currently challenging to read and perhaps more formatting could be done to improve readability.

      We thank the reviewer for the suggestion. Main tables have been reformatted to a landscape layout and each numeric cell moved to the centre to improve readability.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      (1) Data on SSCs are published from a previous report (Fig. 1C). These should be deleted or marked as such.

      We acknowledge the need for clarification regarding our study population for the germ cell stainings. As stated in our Materials and Methods section, our current study population includes the cohort from our previous publication (Vereecke et al., 2020), supplemented by nine additional participants, totaling n=106 trans women. Fig. 1C incorporates both previous and new data on germ cells, and this was further clarified in the Materials and Methods section.

      (2) Many micrographs are suboptimal and need to be replaced by better photos presenting cellular details more clearly. 

      The Figures were remade to solve the suboptimal resolution.

      (3) Table 2 would benefit from a column indicating the target cell or organelle.

      This column was added to Table 2.

      (4) The pubertal status is poorly defined by pre- and peripubertal terms. The authors should add more informative clinical scores. 

      We included information on the Tanner stages of the trans women in our cohort (all G5), as well as details on the selection criteria for our controls and their pubertal status.

      (5) The characterization of Leydig cells is incomplete. Several better markers would validate the findings. 

      As briefly touched upon in the discussion, the marker delta-like homolog 1 would indeed be valuable to assess the presence of truly immature Leydig cells. Unfortunately, our attempts to optimize the immunofluorescence protocol for this marker were unsuccessful, resulting in a double staining instead of a triple staining for the Leydig cells. This statement was also added to the Discussion.  

      (6) The selection bias for datasets is obvious. It seems that the authors try to create nice stories but do not always refer to less compelling datasets. Here a more critical view may be necessary to gain a more realistic view and may open alternative explanations. 

      We would appreciate clarification on which datasets may have been insufficiently reviewed and how our selection of highlights may have introduced bias to the interpretation and conclusion of the study. It is important to note that we did not select any patients/ data; all patient data were incorporated into our results section.

      (7) The term rejuvenation for the stem cell niche/germ cell complement is misleading in the title and text. Could the authors consider another term, e.g. restoration or (de)differentiation? Alternatively, define the term rejuvenation in a more substantial manner.

      We did not change the term “partial rejuvenation” as we believe it best describes our findings. We did however introduce the term in a more substantial manner in our Abstract and Discussion.

      Reviewer #2 (Recommendations For The Authors):

      (1) The authors provided a lot of scattered data, but it would be useful to formulate clear criteria (hormonal therapy, age, end points, etc.) that the material must meet so that it can be used for research into prepubertal processes. 

      We have added these criteria to our Discussion. However, our current results do not yet reveal how these tissues behave in vitro. Ongoing research is addressing this question and will be presented in a future paper.

      (2) Is there any research on the preservation of functions of testicular cells from trans women? This data would be very useful, for example, for models for drug testing.

      Yes: a reference to this paper was added to our Discussion.

      (3) It is recommended to present the data in a table reflecting the correlations found by the authors and the correlations from the literature between cellular changes and hormone levels and age. 

      After careful consideration, we have decided to proceed without incorporating these suggested changes. Our paper focuses on original findings rather than synthesizing existing literature. As such, we have chosen to emphasize our novel results and to compare them to the existing literature in the discussion section.

      (4) The authors can also provide data on clinical standards for hormone levels depending on gender and age. 

      This was added as Supplementary Tables 1-6.

      (5) It is recommended to add links to sources from which information about cellular prepubertal, pubertal and adult markers was taken. 

      This information was added throughout the manuscript.  

      (6) Is it known which cells within the wall of the seminiferous tubules in adults express AMH? Please clarify. 

      It has been shown that AMH receptor type 2 starts to be expressed in peritubular mesenchymal cells within the tubular walls during puberty and it remains so throughout adulthood (Sansone et al., 2020). AMH bound to this receptor may help explain the observed AMH signal in the tubular wall of peripubertal and adult controls. This information was added to our Discussion.

      (7) How was the degree of hyalinization assessed? It's not obvious from the pictures.

      This was further clarified in the Materials & Methods section.

      (8) Why were inhibin B and AMH not measured in all patients? 

      Inhibin B and AMH levels were not available for all patients due to the retrospective nature of these analyses. The measurements were not consistently recorded for all individuals within the historical dataset upon which our research relies.

      (9) Why does picture 3A present few SOX9 on adult Sertoli cells, although this is their typical marker?

      SOX9 was present in the adult Sertoli cells. However, this signal appears to be more "diluted" in adults due to their ongoing spermatogenesis.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      Ewing sarcoma is an aggressive pediatric cancer driven by the EWS-FLI oncogene. Ewing sarcoma cells are addicted to this chimeric transcription factor, which represents a strong therapeutic vulnerability. Unfortunately, targeting EWS-FLI has proven to be very difficult, and a better understanding of how this chimeric transcription factor works is critical to achieving this goal. Towards this perspective, the group had previously identified a DBD-α4 helix (DBD) in FLI that appears to be necessary to mediate EWS-FLI transcriptomic activity. Here, the authors used multi-omic approaches, including CUT&tag, RNAseq, and MicroC to investigate the impact of this DBD domain. Importantly, these experiments were performed in the A673 Ewing sarcoma model where endogenous EWS-FLI was silenced, and EWS-FLI-DBD proficient or deficient isoforms were re-expressed (isogenic context). They found that the DBD domain is key to mediating EWS-FLI cis activity (at msat) and to generating the formation of specific TADs. Furthermore, cells expressing DBD-deficient EWS-FLI display very poor colony-forming capacity, highlighting that targeting this domain may lead to therapeutic perspectives.

      We thank Reviewer 1 for their strong summary of Ewing sarcoma background and accurate description of our experimental approaches and findings.

      Strengths:

      The group has strong expertise in Ewing sarcoma genetics and epigenetics and also in using and analyzing this model (Theisen et al., 2019; Boone et al., 2021; Showpnil et al., 2022).

      We thank the reviewer.  

      They aim at better understanding how EWS-FLI mediated its oncogenic activity, which is critical to eventually identifying novel therapies against this aggressive cancer.

      We are happy to see that our overall aim was also appreciated by Reviewer 1.

      They use the most recent state-of-the-art omics methods to investigate transcriptome, epigenetics, and genome conformation methods. In particular, Micro-C enables achieving up to 1kb resolved 3D chromatin structures, making it possible to investigate a large number of TADs and sub-TADs structures where EWS-FLI1 mediates its oncogenic activity.

      We thank Reviewer 1 for their acknowledgement of our approaches and the resolution achieved with our Micro-C experiments.  

      They performed all their experiments in an Ewing sarcoma genetic background (A673 cells) which circumvents bias from previously reported approaches when working in non-orthologous cell models using similar approaches.

      We agree with the reviewer about the importance of using model systems that accurately capture features of the disease being studied. As we have added an additional cell line in the revision we should note that this second model also represents a Ewing sarcoma genetic background while representing tumors expressing another oncogenic fusion found in this disease. 

      Weaknesses:

      The main weakness comes from the poor reproducibility of Micro-C data. Indeed, it appears that the distances/clustering observed between replicates are typically similar or even larger than between biological conditions. For instance, in Figure 1B, I do not see any clustering when considering DBD1, DBD2, DBD+1, DBD+2.

      Lines 80-83: "KD replicates clustered together with DBD replicate 1 on both axes and with DBD replicate 2 on the y-axis. DBD+ replicates, on the other hand, clustered away from both KD and DBD replicates. These observations suggest that the global chromatin structure of DBD replicates is more similar to KD than DBD+ replicates."

      When replacing DBD replicate 1 with DBD replicate 2, their statement would not be true anymore.

      Additional replicates to clarify this aspect seem absolutely necessary since those data are paving the way for the entire manuscript.

      These are valid concerns, and we thank the reviewers for highlighting the poor clustering of Micro-C replicates on the MDS plot. We account for this variability between replicates when identifying differentially interacting regions. By using an adjusted p-value < 0.05, we aim to ensure that, on repeating the experiments, we would discover the same differentially interacting regions with a false discovery rate of 5%.
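To make the link between an adjusted p-value cutoff and the false discovery rate concrete, the following is a minimal sketch of one common adjustment, the Benjamini-Hochberg procedure. It is an assumption that this is the adjustment in use; the response does not name the method, and the p-values below are hypothetical rather than taken from the Micro-C analysis.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five candidate interacting regions.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20]
adjusted = benjamini_hochberg(p_values)

# Regions called at an adjusted p-value < 0.05 (expected FDR of 5%).
significant = [p < 0.05 for p in adjusted]
```

In this toy input, two raw p-values just under 0.05 no longer pass after adjustment, which illustrates why the adjusted cutoff is the more conservative criterion when replicates are variable.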

      We would also like to note that the replicates cluster much more closely on the PCA plot of RNA-seq data (Supplementary Figure 1C) as well as on the PCA plot of H3K27ac CUT&Tag data (Figure 4A). Notably, the RNA-seq result has now been reproduced when performed with different sets of hands across multiple studies (Boone, et. al., 2021 and this report), as well as in a second cell line (as reported in this manuscript revision). These observations suggest that the cells of these replicates are functionally similar to each other at a population level. Chromatin organization detected by Micro-C is highly heterogeneous among the cells of a population (Misteli, et. al., 2020). Moreover, despite the increased resolution of Micro-C over Hi-C, the conventional sequencing depth at which Micro-C is performed makes resolving finer-scale 3D interactions, particularly between enhancers and promoters, challenging (Goel, et. al., 2023). Thus, biologically relevant interactions driving EWSR1::ETS transcriptional regulation through de novo enhancers may have relatively weak signal in Micro-C. Both the strength of the signal and the heterogeneous chromatin state present in bulk samples could affect the average signal, leading to poorly clustering replicates (Hafner and Boettiger, 2022).

      Importantly, rather than add an additional replicate of a single cell line, we repeated our study in an additional cell line, TTC466, and largely reproduced our high-level findings for transcription, enhancer formation, and 3D chromatin. Specific limitations of the TTC466 study are addressed in the Discussion section (392-420). The reproduction of weak/moderate clustering in the MDS plot in both A673 and TTC466 cell lines suggests the α4 helix of EWSR1::ETS fusions is important for reshaping 3D chromatin. However, higher resolution analyses focused on specific EWSR1::ETS-bound loci are likely an important area of future study required to fully understand the role of the α4 helix in chromatin regulation in Ewing sarcoma.

      Similarly:

      - In Figure 1C, how would the result look when comparing DBD2/KD2/DBD+2? Same when comparing DBD 1 with KD1 and DBD+1. Would the difference go in the same direction?

      This is a great point. We added distance decay plots of individual replicates in Supplementary Figure 2 and added discussion of these results in lines 88-89 of the text.

      - Figure 1D-E. What would these plots look like when comparing the replicates to each other? How much difference would be observed when comparing, for instance, DBD1/DBD2? Or DBD1/DBD+1?

      Unfortunately, separate replicates are required to conduct Differentially Interacting Region analysis as it determines statistically significant interactions. Therefore, we are unable to plot these analyses with individual replicates. 

      - Figure 2: again, what would these analyses look like when performing the analysis with only DBD1/DBD+1/KD1 or DBD2/DBD+2/KD?

      This is a good suggestion, and such an analysis is possible. However, we would lose resolution to the extent that we might not accurately detect TADs, especially smaller TADs. Therefore, we decided to combine the biological replicates.

      Another major question is the stability of EWS-FLI DBD vs EWS-FLI DBD+ proteins. In the WB, FLAG intensities seem also higher (2/3 replicates) in DBD+ condition compared to the DBD condition (Figure S1B).

      This is a valid concern with the shRNA knock-down/rescue system, and we regularly validate new constructs to ensure that they have similar expression levels to rescue with the wildtype fusion before proceeding to more exhaustive experimental workups. We would note that while we have not tested for differences in protein stability, for these constructs we largely see similar expression levels across multiple experiments, multiple cell lines, and multiple sets of hands. There may be some variation in expression level from experiment to experiment, but western blotting is a semiquantitative assay and it is also not possible to rule out that slight differences in band intensity result from error in gel loading. For this reason, alongside western blotting for construct expression, we also validate construct function using RNA-seq and colony formation assays (as reported in this manuscript), and these show good agreement across biological replicates.

      Indeed, it seems that they have more FLAG (i.e., EWS-FLI) peaks in the DBD+ condition compared to the DBD condition (Figure 2B). 

      We appreciate the comment since the legend of Figure 2B led to a misunderstanding. Figure 2B depicts the number of TADs detected in DBD and DBD+ conditions (height of the bar graphs) and the proportion of those TADs overlapped with FLAG, CTCF, both or neither peaks on y-axis. The number of FLAG peaks is actually lower in DBD+ as compared to DBD as shown in Figure 5A-B.  We clarified our Figure 2 legend to accurately describe the various proportions (color coded section) of TADs bound by DBD/DBD+ FLAG and CTCF.

      Would it be possible that DBD+ is just more expressed or more stable than DBD? The higher stability of the re-expressed DBD+ could also partially explain their results independently of the 3D conformational change. In other words, can they exclude that DBD+ and DBD binding are not related to their respective protein stability or their global re-expression levels?

      It is possible that the DBD+ protein is overexpressed or more stable than DBD. With our current set of data, we cannot conclusively exclude the possibility that binding by DBD and DBD+ is related to their expression level or stability. We would note, as above, that western blots, RNA-seq, and agar assays have largely reproduced across experiments, hands, and cell lines and that western blot is an imperfect assay for assessing protein stability.

      Surprisingly, WB FLI bands in DBD+ conditions are systematically (3/3 replicates) fainter than in DBD conditions (Figure S1B). How do the authors explain these opposite results between FLI and FLAG in the WB?

      This is an excellent observation that highlights one of the intricacies of studying EWSR1::FLI1 in our KD/rescue system. Often the limiting factor for an experiment is whether or not the KD condition maintains KD through a second viral transduction for rescue and selection. We have observed over many years of working with this system that rescue conditions which are fully functional (i.e. wildtype EWSR1::FLI1, DBD+, etc.) tend to maintain better KD of endogenous EWSR1::FLI1. Constructs that don’t rescue EWSR1::FLI1 function sometimes maintain KD to a lesser degree, though frequently to a functional degree (i.e. cells are not transformed and EWSR1::FLI1 transcriptional regulation is not rescued). We suspect that this observation, also raised by Reviewer 1, results from selection of cells in which more endogenous EWSR1::FLI1 escapes KD in DBD conditions, owing to selective pressures during expansion in tissue culture.

      We should note that the antibody used for detecting FLI recognizes residues that are deleted in DBD and DBD+ constructs, such that the FLI1 blot in Supplementary Figure 1B does not detect either construct. It only detects endogenous EWSR1::FLI1 and the 3X-FLAG-EWSR1::FLI1 construct in the middle lane that runs at a slightly higher molecular weight. The FLAG antibody is the only antibody that detects all three rescue constructs.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Bayanjargal et al. entitled "The DBD-alpha4 helix of EWS::FLI is required for GGAA microsatellite binding that underlies genome regulation in Ewing sarcoma" reports on the critical role of a small alpha helix in the DNA binding domain (DBD) of the FLI1 portion of EWS::FLI1 that is critical for binding to repetitive stretches of GGAA-motifs, i.e. GGAA microsatellites, which serve as potent neoenhancers in Ewing sarcoma.

      We thank Reviewer 2 for their succinct and accurate summary of our manuscript. 

      Strengths:

      The paper is generally well-written and easy to follow, and the data presented are of high quality, well-described and underpin the conclusions of the authors. The report sheds new light on how EWS::FLI1 mechanistically binds to and activates GGAA microsatellite enhancers, which is of importance to the field.

      We appreciate the reviewer’s assessment of our work. 

      Weaknesses:

      While there are no major weaknesses in this paper, there are a few minor issues that the authors may wish to address before publication:

      (1) While the official protein symbol for the gene EWSR1 is indeed EWS, the protein symbol for the gene FLI1 is identical, i.e. FLI1. The authors nominate the fusion oncoprotein EWS::FLI (even in the title) but it appears more adequate to use EWS::FLI1.

      We appreciate the reviewer for bringing this to our attention. Indeed, the most recent guideline for fusion proteins nomenclature is to use the full gene symbols separated by double colons. Therefore, the accurate nomenclature is EWSR1::FLI1. We replaced instances of EWS::FLI with EWSR1::FLI1 and have used the EWSR1::ERG nomenclature in our revised manuscript.  

      (2) The used cell lines should be spelled according to their official nomenclature (e.g. A-673 instead of A673).

      Corrected, thanks!

      (3) It appears as if the vast majority of results were generated in a single Ewing sarcoma cell line (A-673) which is an atypical Ewing sarcoma cell line harboring an activating BRAF mutation and may be genomically quite unstable as compared to other Ewing sarcoma cell lines (Kasan et al. 2023 preprint at bioRxiv https://www.biorxiv.org/content/10.1101/2023.11.20.567802v1). Hence, it may be supportive for the paper to recapitulate/cross-validate a few key results in other Ewing sarcoma cell lines, e.g. by using EWS::ERG-positive cell lines. Perhaps the authors could make use of available published data.

      We thank Reviewer 2 for this helpful comment. We replicated the experiments in TTC-466 cells containing EWSR1::ERG fusion and found that as for A-673 cells the DBD-α4 helix is important for transcriptional, enhancer, and 3D chromatin regulation (Supplementary Figures 9-18).  

      (4) Figure 6 and Supplementary Figure 5 are very interesting but focus on two selected target genes of the fusion (FCGRT and CCND1). It would be interesting to see whether these findings also extend to common EWS::ETS transcriptional signatures that have been reported. The authors could explore their data and map established consensus EWS::ETS signatures to investigate which other hubs might be affected at relevant target genes.

      We expanded our analysis to other genes demonstrated to be regulated by EWSR1::FLI1 nucleated transcriptional hubs (Chong, et. al., 2018) and included NKX2-2 and GSTM4 gene regions in Supplementary Figures 7-8 in A-673 cells. We also investigated the same gene regions of FCGRT, CCND1, NKX2-2, and GSTM4 in TTC466 cells and report them in Supplementary Figures 14-17. For the purpose of brevity, we decided to include only the above examples. We may need to develop different tools to conduct further analysis to understand the gene regulatory networks driven by DBD and DBD+ in relation to hub formation. Although mapping such a network is a great suggestion, it may be outside the scope of this manuscript. We thank the reviewer for bringing such a good point to our attention.

      (5) Table 1 is a bit hard to read. In my opinion, it is not necessary to display P-values with up to 8 decimal positions. The gene symbols should be displayed in italic font.

      Suggestions are adopted, thanks!

      Reviewing Editor (Recommendations For The Authors):

      We would draw the authors' attention to the following issues that would best benefit from additional revision.

      As indicated by Referee 1, an important issue concerns the apparent poor reproducibility of Micro-C data. In Figure 1B, the clustering of the DBD1, DBD2, DBD+1, and DBD+2 is poor.

      It appears that the distances/clustering observed between replicates are typically similar or even larger than between biological conditions. Lines 80-83: "KD replicates clustered together with DBD replicate 1 on both axes and with DBD replicate 2 on the y-axis. DBD+ replicates, on the other hand, clustered away from both KD and DBD replicates." If one replaced DBD replicate 1 with DBD replicate 2, this statement would no longer be true. The referees believe that it is important to fully account for these potential discrepancies. Most of the study is based on analyses of these data sets, so if there are issues with them it has repercussions on the entire study. We note however that in Figure 4A the clustering of the H3K27ac data is much more convincing. The referees also feel that it is important to show immunoblots of the expression of DBD and DBD+ levels in the experiments performed here. While this was previously shown in the Boone et al publication in 2021, it could be illustrated again here.

      We thank the editors for concisely summarizing the main weaknesses of the paper and underscoring the importance of the Micro-C data in the rest of the paper. While the Editors note tighter clustering of the H3K27ac data (Figure 4A), we would like to note that the replicates also cluster much more closely on the PCA plot of RNA-seq data (Supplementary Figure 1C). Notably, the RNA-seq result has now been reproduced when performed with different sets of hands across multiple studies (Boone, et. al., 2021 and this report), as well as in a second cell line (as reported in this manuscript revision). Though not as tight, the H3K27ac CUT&Tag also reproduces in TTC466 cells. Thus, we interpret these findings to indicate that our replicates are functionally similar to each other. As discussed above in the response to Reviewer 1 in more detail, there are several factors that could affect how these functional similarities are represented in Micro-C data. Micro-C is ultimately a readout of the chromatin organization in a heterogeneous population of cells (Misteli et al., 2020). Additionally, sequencing depth limitations in conventional Micro-C experiments limit the ability to faithfully assess the enhancer-promoter interactions that may be relevant for our model system (Goel, et. al., 2023). Thus, both the strength of the biologically relevant signal and the heterogeneous chromatin state present in bulk samples could affect the average signal and lead to poorly clustering replicates (Hafner and Boettiger, 2022).

      To address these important concerns about rigor and reproducibility of the analyses, we repeated our study in an additional cell line, TTC466, and largely reproduced our high-level findings for transcription, enhancer formation, and 3D chromatin. These additional studies were not without their own limitations and these are addressed in the Discussion section (392-420). The reproduction of weak/moderate clustering in the MDS plot in both A673 and TTC466 cell lines suggests the α4 helix of EWSR1::ETS fusions is important for reshaping 3D chromatin. However, additional genomic analyses geared toward higher resolution at specific EWSR1::ETS-bound loci are likely an important area of future study required to fully understand the role of the α4 helix in chromatin regulation in Ewing sarcoma. Live cell imaging, as performed by Chong, et. al., 2018, and additional biochemical techniques may also be informative and are beyond the scope of this report.

      With regards to concerns about construct expression, we have included immunoblots of the rescue constructs in both cell lines (Supplementary Figure 1B and 9A) and discussed Reviewer 1’s specific concerns in detail above.  

      The referees also raise the issue of using an additional cell line to make a more general message. Although it would perhaps be asking too much to repeat the MicroC experiments, consolidation of the observations could be performed by focusing on specific loci such as FCGRT and CCND1 that were analyzed in this study. Could the authors use 4C-type experiments to reproduce the conclusions in an additional cell line? It would also be pertinent to consolidate the findings at these loci by 4C-type approaches even in the cell line used here. For the moment, all conclusions are based on the same set of data and a single technical approach.

      We repeated the experiments in TTC466 cells and analyzed the data using the same cut-offs used in A-673 cells. This allows us to compare the two cell lines. We hope this new set of experiments and analyses addresses the reviewers’ concerns.

      Reviewer #1 (Recommendations For The Authors):

      All the data are performed in A673 cells. Knowing the transcriptomic and epigenetic heterogeneity of Ewing sarcoma cells, some of the experiments supporting their findings should be replicated in at least another Ewing sarcoma model.

      Per our discussion above, we have replicated our experiments in an additional cell line model of Ewing sarcoma. Importantly, the TTC466 cell line used expresses the EWSR1::ERG fusion found in 10-15% of Ewing sarcoma cases.  

      Supplementary Figure 2B. Proportion of TAD boundaries bound by FLAG (i.e., EWS-FLI1) and CTCF. The number/proportion of FLAG (i.e., EWS-FLI) peaks observed at CTCF peak/TAD boundaries seems unexpectedly high. How do they explain this result since EWS-FLI peaks are rather intra-TAD to mediate their enhancer function?

      In our previous study, we showed that EWSR1::FLI1 binding can be detected at boundaries of TADs (Showpnil, et. al., 2022). We think therefore it is likely that EWSR1::FLI1 binding is able to mediate enhancer function both inside TADs as well as at the borders of TADs and may, in some cases, function as an insulator between TADs.  

      For the >50kb loop analysis, what was the low-range threshold? Up to 15-20 kb, contact frequency interactions may be caused by PFA crosslink (did they use a 5kb threshold?). Were those excluded from that analysis?

      We acknowledge that we did not use a lower threshold to exclude those short-range loop interactions. In our previous study, we observed that EWSR1::FLI1 binding reduces long-range interactions in favor of short-range interactions (Showpnil, et. al., 2022) and wanted to be able to capture short-range loops in our analysis.  

      In Figure 2D, they observed that within TADs containing FLAG peaks at GGAA microsatellites, the intensity of the DBD+ FLAG peaks was higher compared to DBD FLAG peaks. How would this analysis look when considering the ETS FLAG peaks (i.e., EWS-FLI rather repressive peaks)? Could they compare TAD with GGAA msat vs TAD with ETS peaks?

      We agree that this is an interesting observation. In our prior analyses we found no discernible relationship between EWSR1::FLI1 binding and changes in 3D chromatin associated with repression (Showpnil, et. al., Nucleic Acids Research, 2022). In contrast, EWSR1::FLI1-bound super-enhancers had greater H3K27ac deposition when overlapping both a bound GGAA repeat and a non-microsatellite site. While there have been several additional reports about the relevance of EWSR1::FLI1 binding at non-microsatellite peaks, motifs at these loci have not yet been rigorously defined as GGAA repeats were by Johnson, et. al. in PLoS One, 2017. Each ETS factor binds different motifs containing the core 5’-GGAA-3’ with varying affinities depending on the flanking residues. There may be >100-fold difference in sequence-specific binding affinity for “high” vs. “low” affinity motifs. Better defining the types of ETS motifs bound by EWSR1::FLI1 and the functional changes associated with them thus represents an interesting area of future study.

      Figure 1F: What is the biological meaning of these results (29.7, 39.5, and 54Mbp)? These distances are typically the size of a chromosome arm and clearly beyond classical chromatin loop/TAD structures in which EWS-FLI mediates its cis-activity.

      We agree with the referee here. This panel has been removed from our revised manuscript.

      How do DBD, KD, and DBD+ conditions compare with WT parental cells in the omics data? (Figures 1B, 4A). Do DBD+ conditions overlap with WT conditions? It would be nice to have these analyses also for Micro-C and Cut&Tag data. To be acknowledged here, the transcriptome data showing this aspect in Figure S1C are very convincing.

      This is a fair point. We were not able to sequence the wtEF Micro-C libraries to a depth similar to that of KD, DBD, and DBD+ because a disproportionate share of the wtEF libraries was used in troubleshooting. Therefore, we decided to exclude the wtEF condition from these analyses.

      EWS-FLI cis-regulation at CCND1 also occurs through a much closer EWS-FLI peak (~-20kb msat upstream of CCND1 TSS) which was not taken into consideration. EWS-FLI peak intensity in both DBD and DBD+ at this msat seems similar. How would this fit into their model?

      The referee is correct. The closest peak upstream of the CCND1 TSS is ~19kb away. We highlighted this peak with the dashed boxes near the CCND1 TSS (Supplementary Figure 6). Peak intensity of DBD+ FLAG is slightly higher compared to DBD. Nonetheless, we acknowledge that the difference is small. We suspect that the DBD-α4 helix is affecting binding dynamics at GGAA repeats, but these genomics approaches are not well suited to detect small, but significant, changes in binding affinity or dynamics. In this case a more biochemical approach may be needed. Even though both proteins can still bind the same microsatellites, it is possible that they differ in their stability of binding or in the recruitment of additional proteins. These possibilities are discussed in the Discussion section (444-463).

      For the Micro-C, they sequenced only 7 to 8 million reads per condition. This coverage seems particularly low, especially for their analyses using 1-5kb bins. How does this compare with other published Micro-C data? Can this explain the variability observed between replicates?

      We apologize for the inconsistent verbiage of sequencing coverage that may have caused confusion. 7 to 8 million reads were used for shallow sequencing and QC analysis. Once a sample passed QC, we then sequenced 300 million reads per sample. 300M is now changed to 300 million to prevent a misunderstanding at line 598.  

      They mention:

      "In our recent studies of EWS::FLI, we found a small alpha helix in the DNA binding domain DBD-α4, to be required for transcription and regulation by the fusion protein (Boone et al., 2021). Interestingly, this study did not find any change in chromatin accessibility (ATAC-Seq) and genome localization of EWS::FLI constructs (CUT&RUN) when DBD-α4 helix was deleted leaving the mechanistic basis for the requirement of DBD-α4 in transcription regulation unclear. "

      And

      "To assay the enhancer landscape, we collected H3K27ac CUT&Tag data from KD, DBD, and DBD+ cells. Principal component analysis of H3K27ac localization shows that the DBD replicates were clustered closer to the KD replicates while being in between the KD and the DBD+ replicates (Figure 4A), suggesting that DBD-α4 helix is required to reshape the enhancer landscape."

      But now H3K27ac CUT&Tag show strong differences which were not observed in ATAC seq. How to explain this discrepancy?

      Though both H3K27ac and ATAC signal are associated with enhancers and promoters in euchromatin, they are not exactly measurements of the same thing. H3K4me2 is a mark more closely associated with ATAC signal than H3K27ac (Henikoff, et. al., 2020). Nonetheless, there are clear differences between the prior publication (Boone, et. al., 2021) and this work with regards to similar ATAC signal for each replicate and differences in H3K27ac. We suspect this may be related to a tighter association of EWSR1::FLI1-mediated genome regulation with H3K27ac than with ATAC. Notably, there were very few differentially accessible regions between EWSR1::FLI1-depleted cells and conditions with EWSR1::FLI1 expression (either endogenous or wildtype rescue) using the A673 KD/Rescue system in Boone, et. al., 2021. In contrast, other A673 KD-rescue studies have reported differences in H3K27ac in EWSR1::FLI1 expressing conditions relative to EWSR1::FLI1-depleted conditions (Theisen, et. al., 2021).

      The authors mention:

      "Our study thus uncovered a surprising role for FLI DBD in the process of hub formation which is usually attributed to the EWS low complexity domain."

      Not sure this can be claimed, hubs are composed of many other factors that are not investigated here. Furthermore, promoter enhancer hubs/loops often include combined ETS and mSat chains to generate transcriptional hubs which have not been considered here. None of these points were discussed here.

      We replaced “uncovered” with “suggest” in our revised manuscript at line 476.  

      What are the barcode patterns in Supp 5, are those frequently observed in their Micro-C data, likely mapping artifacts, do they have any impact on their analyses?

      The barcode patterns, now in Supplementary Figure 6, are blind spots in the hg19 genome assembly. Since they are few in number, we don’t expect these blind spots to impact our analysis.

    1. Author response:

      Reviewer #1 (Public Review): 

      Summary: 

      The authors use fluorescence lifetime imaging (FLIM) and tmFRET to resolve resting vs. active conformational heterogeneity and free energy differences driven by cGMP and cAMP in a tetrameric arrangement of CNBDs from a prokaryotic CNG channel. 

      Strengths: 

      The excellent data provide detailed measures of the probability of adopting resting vs. activated conformations with and without bound ligands. 

      Weaknesses: 

      Limitations are that only the cytosolic fragments of the channel were studied, and the current manuscript does not do a good job of placing the results in the context of what is already known about CNBDs from other methods that yield similar information. 

      In the revision, we will put our results into the context of previous work on CNBD channels where possible.

      Reviewer #2 (Public Review): 

      The authors investigated the conformational dynamics and energetics of the SthK Clinker/CNBD fragment using both steady-state and time-resolved transition metal ion Förster resonance energy transfer (tmFRET) experiments. To do so, they engineered donor-acceptor pairs at specific sites of the CNBD (C-helix and β-roll) by incorporating a fluorescent noncanonical amino acid donor and metal ion acceptors. In particular, the authors employed two cysteine-reactive metal chelators (TETAC and phenM). This allowed them to coordinate three transition metals (Cu2+, Fe2+, and Ru2+) to measure both short (10-20 Å, Cu2+) and long distances (25-50 Å, Fe2+, and Ru2+). By measuring tmFRET with fluorescence lifetimes, the authors determined intramolecular distance distributions in the absence and presence of the full agonist cAMP or the partial agonist cGMP. The probability distributions between conformational states without and with ligands were used to calculate the changes in free energy (ΔG) and differences in free energy change (ΔΔG) in the context of a simple four-state model. 

      Overall, the work is conducted in a rigorous manner, and it is well-written. I greatly enjoyed reading it. 

      Nonetheless, I do not see the novelty that the authors claim. 

      We will try to highlight the novelty in the revision. (See below for examples).

      In terms of methodology, this work provides further support to steady-state and time-resolved tmFRET approaches previously developed by the authors of the present work to probe conformational rearrangements by using a fluorescent noncanonical amino acid donor (Anap) and transition metal ion acceptor (Zagotta et al., eLife 2021; Gordon et al., Biophysical Journal 2024; Zagotta et al., Biophysical Journal 2024). 

      This work is the first use of the time-resolved tmFRET method to obtain intrinsic ΔG (of an apo conformation) and ΔΔG values for different ligands, and the first application of this approach to a protein other than MBP.
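As an illustration of how ΔG and ΔΔG values follow from state probabilities, the sketch below assumes a simple two-state (resting vs. active) equilibrium with ΔG = -RT ln(P_active / P_resting). The probabilities, the temperature, and the function name `delta_g` are hypothetical choices for this example and are not values or methods taken from the manuscript, which fits a four-state model.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g(p_active, temperature_k=298.15):
    """Free energy (kcal/mol) of the resting -> active transition.

    Assumes a two-state equilibrium, so P_resting = 1 - P_active.
    """
    if not 0.0 < p_active < 1.0:
        raise ValueError("p_active must lie strictly between 0 and 1")
    k_eq = p_active / (1.0 - p_active)  # equilibrium constant
    return -R_KCAL * temperature_k * math.log(k_eq)


# Hypothetical active-state fractions without and with ligand.
dg_apo = delta_g(0.10)     # intrinsic ΔG of the apo protein (positive: resting favored)
dg_ligand = delta_g(0.90)  # ΔG with ligand bound (negative: active favored)

# ΔΔG: how much the ligand shifts the resting/active equilibrium.
ddg = dg_ligand - dg_apo
```

A negative ΔΔG in this convention means the ligand stabilizes the active conformation relative to the apo protein; comparing ΔΔG between two ligands is one way a full and a partial agonist could be distinguished.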

      Regarding cyclic nucleotide-binding domain (CNBD)-containing ion channels, I disagree with the authors when they state that "the precise allosteric mechanism governing channel activation upon ligand binding, particularly the energetic changes within domains, remains poorly understood". On the contrary, I would say that the literature on this subject is rather vast and based on a significantly large variety of methodologies. This is a non-exhaustive list of papers: Zagotta et al., Nature 2003; Craven et al., JGP, 2004; Craven et al., JBC, 2008; Taraska et al., Nature Methods, 2009; Puljung et al., JBC, 2013; Saponaro et al., PNAS 2014; Goldschen-Ohm et al., eLife, 2016; Bankston et al., JBC, 2017; Hummert et al., PLoS Comput Biol., 2018; Porro et al., eLife, 2019; Ng et al., JGP, 2019; Porro et al., JGP, 2020; Evans et al., PNAS, 2020; Pfleger et al., Biophys J. 2021; Saponaro et al., Mol Cell, 2021; Dai et al., Nat Commun. 2021; Kondapuram et al., Commun Biol. 2022. These studies were conducted either on the isolated Clinker/CNBD fragments or on the entire full-length proteins. As is evident from the above list, the authors of the present work have significantly contributed to the understanding of the allosteric mechanism governing the ligand-induced activation of CNBD-containing channels, including a detailed description of the energetic changes induced by ligand binding. Particularly relevant are their works based on DEER spectroscopy. In DeBerg et al., JBC 2016, the authors described, in atomic detail, the conformational changes induced by different cyclic nucleotides on the HCN CNBD fragment and derived energetics associated with ligand binding to the CNBD (ΔΔG). In Collauto et al., Phys Chem Chem Phys. 2017, they further detailed the ligand-CNBD conformational changes by combining DEER spectroscopy with microfluidic rapid freeze quench to resolve these processes and obtain both equilibrium constants and reaction rates, thus demonstrating that DEER can quantitatively resolve both the thermodynamics and the kinetics of ligand binding and the associated conformational changes. 

      Despite this vast literature, some of which is our own work, there is no consensus about the energetics and coupling of domains that underlies the allosteric mechanism in any CNBD channel. Our approach addresses energetics of the CNBD upon ligand binding, which we aim to later expand to a more complete assessment of the allosteric mechanism in the intact channel.

      Suggestions: 

      - In light of the above, I suggest the authors better clarify the contribution/novelty that the present work provides to the state-of-the-art methodology employed (steady-state and time-resolved tmFRET) and to the understanding of CNBD-containing ion channels. In particular, it would be nice to have a comparison with the conformational dynamics and energetics reported in the previous works of the authors based on DEER spectroscopy (DeBerg et al., JBC 2016, Collauto et al., Phys Chem Chem Phys. 2017 and Evans et al., PNAS, 2020) and with Goldschen-Ohm et al., eLife, 2016, where single-molecule events (FRET-based) of cAMP binding to HCN CNBD were measured and kinetic rate constants were modeled in the context of a simple four-state model, reminiscent of the model employed in the present work.

      In the revision, we will put our results into the context of previous work on CNBD channels where possible.

      - Even considering the bacterial SthK channel, cryo-EM has significantly advanced the atomistic understanding of its ligand-dependent regulation (Rheinberger et al., eLife, 2018). More recently, the authors of the present work have elegantly employed DEER on full-length SthK protein to reveal ligand-dependent conformational rearrangements in the C-linker region (Evans et al., PNAS, 2020). In light of the above, what is the contribution/novelty that the present work provides to the SthK biophysics?

      Neither of the papers mentioned above (structure or DEER) reported energetics for SthK. This work describes an approach that will allow us to get a more complete picture of the energetics of SthK.

      - The authors decided to use the C-linker/CNBD fragment of SthK. On the basis of the above-cited work (Evans et al., PNAS, 2020) the authors should clarify why they have decided to work on the isolated C-linker/CNBD fragment and not on the full-length protein. I assume that the use of the C-linker/CNBD fragment was necessary to isolate tetramers with only one labelled subunit (fSEC and MP were used to confirm this) to avoid inter-subunit cross-talk. However, I am not clear if this is correct.

      We chose to start on the C-terminal fragment to provide a technically more tractable system for validating our approach using time-resolved tmFRET before moving to the full-length membrane protein.

      - What is the advantage of using the C-linker/CNBD fragment of a bacterial protein and not one of HCN channels, as already successfully employed by the authors (see above citations)?

      SthK is a useful model system that allows us to later express full-length channels in bacteria.

      Reviewer #3 (Public Review): 

      Summary: 

      This manuscript aims to provide insights into conformational transitions in the cyclic nucleotide-binding domain of a cyclic nucleotide-gated (CNG) channel. The authors use transition metal FRET (tmFRET) which has been pioneered by this lab and previously led to detailed insights into ion channel conformational changes. Here, the authors not only use steady-state measurements but also time-resolved, fluorescence lifetime measurements to gain detailed insights into conformational transitions within a protein construct that contains the cytosolic C-linker and cyclic nucleotide-binding domain (CNBD) of a bacterial CNG channel. The use of time-resolved tmFRET is a clear advancement of this technique and a strength of this manuscript. 

      In summary, the present work introduced time-resolved tmFRET as a novel tool to study conformational distributions in proteins. This is a clear technological advance. At this stage, conclusions made about energetics in CNG channels are overstated. However, it will be interesting to see in the future how results compare to similar measurements on full-length channels, for example, reconstituted into nanodiscs. 

      Strengths: 

      The results capture known differences in promoting the open state between different ligands (cAMP and cGMP) and are consistent across three donor-acceptor FRET pairs. The calculated distance distributions further are in reasonable agreement with predicted values based on available structures. The finding that the C-helix is conformationally more mobile in the closed state as compared to the open state quantitatively increases our understanding of conformational changes in these channels. 

      Weaknesses: 

      While the use of a truncated construct of SthK is justified, it also comes with certain limitations. The construct is missing the transmembrane part including the pore for ions. However, the pore is the central part of every ion channel and is crucial to describe conformational transitions and energetics that lead to ion channel gating. Two observations in the present study disagree with the results for the full-length channel protein. Here, under apo conditions, the CNBD can adopt an 'open' conformation, and second, cooperativity of channel opening is lost. These differences need to be weighed carefully when judging the impact of the presented results for understanding allostery in CNG channels. Qualitatively, the results can describe movements of the C-helix in CNBDs, but detailed energetics as calculated in this study, need to be limited to the truncated protein construct used. The entire ion channel is an allosteric system and detailed, energetic conclusions cannot be made for the full-length channel when working with only the cytosolic domains. Similarly, the statement "These results demonstrate that time-resolved tmFRET can be utilized to obtain energetic information on the individual domains during the allosteric activation of SthK." is misleading. The data only describe movements of the C-helix. Upon ligand binding, the C-helix moves upwards to coordinate the ligand. Thus, the results are ligand-induced conformational changes (as the title states). Allosteric regulation usually involves remote locations in the protein, which is not the case here. 

      We agree that the full-length channel is more complicated than the C-terminal fragment studied here, but we disagree that there isn’t relevant energetic information from the individual domains. For example, the ΔΔG values measured for the C-helix movement in the isolated fragment should be the same as those of the intact channel. In the future we aim to make direct comparisons of the energetics between the fragment and the intact channel.

    1. Author response:

      We thank the editors and the reviewers for their considered comments and helpful suggestions.

      In our revision, we plan to focus on tightening the relationship between the bias-variance tradeoff theory and the empirical analyses that follow.

      We will also work to better communicate what we argue—and what is beyond our scope—with respect to GxE in complex traits. For example, our language is currently insufficiently clear as it suggested to the editor and reviewers that we are developing a method to characterize polygenic GxE here. Developing a new method that does so (let alone evaluating performance in extensive scenarios) is beyond the scope of this manuscript.

      Similarly, we use amplification only as an example of a mode of GxE that is not adequately characterized by current approaches. We do not wish to argue it is an omnibus explanation for all GxE in complex traits. In many cases, a mixture of polygenic GxE relationships seems most fitting (as observed, for example, in Zhu et al., 2023, for GxSex in human physiology).

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public Review):

      Summary: The global decline of amphibians is primarily attributed to deadly disease outbreaks caused by the chytrid fungus, Batrachochytrium dendrobatidis (Bd). It is unclear whether and how skin-resident immune cells defend against Bd. Although it is well known that mammalian mast cells are crucial immune sentinels in the skin and play a pivotal role in immune recognition of pathogens and orchestrating subsequent immune responses, the roles of amphibian mast cells during Bd infections are largely unknown. The current study developed a novel way to enrich X. laevis skin mast cells by injecting the skin with recombinant stem cell factor (SCF), a KIT ligand required for mast cell differentiation and survival. The investigators found an enrichment of skin mast cells provides X. laevis substantial protection against Bd and mitigates the inflammation-related skin damage resulting from Bd infection. Additionally, the augmentation of mast cells leads to increased mucin content within cutaneous mucus glands and shields frogs from the alterations to their skin microbiomes caused by Bd.

      Strengths: This study underscores the significance of amphibian skin-resident immune cells in defenses against Bd and introduces a novel approach to examining interactions between amphibian hosts and fungal pathogens. 

      We thank the reviewer for recognizing the significance and the novelty of our work.

      Weaknesses: The main weakness of the study is the lack of functional analysis of X. laevis mast cells. Upon activation, mast cells have the characteristic feature of degranulation to release histamine, serotonin, proteases, cytokines, and chemokines, etc. The study should determine whether X. laevis mast cells can be degranulated by two commonly used mast cell activators, IgE and compound 48/80, for IgE-dependent and -independent pathways. This can be easily done in vitro. It is also important to assess whether in vivo these mast cells are degranulated upon Bd infection using avidin staining to visualize vesicle releases from mast cells. Figure 3 only showed rSCF injection caused an increase in mast cells in naïve skin. They need to present whether Bd infection can induce mast cell increase and rSCF injection under Bd infection causes a mast cell increase in the skin. In addition, it is unclear how the enrichment of mast cells provides the protection against Bd infection and alterations to skin microbiomes after infection. It is important to determine whether skin mast cells release any contents mentioned above.

      We would like to thank the reviewer for taking the time to review our work and providing us with valuable feedback.

      Please note that, as indicated in our previous rebuttal to reviewers, amphibians do not possess the IgE antibody isotype (1).

      To our knowledge, there are no published works describing the application of approaches used to study mammalian mast cell degranulation to amphibian mast cells. While there are commercially available kits and reagents for examining mammalian mast cell granule content, most of these do not cross-react with amphibian counterparts. This is especially true of cytokines and chemokines, which diverged quickly with evolution and thus do not share substantial protein sequence identity across species as diverged as frogs and mammals. We would also like to highlight the fact that several studies suggest that amphibian mast cells lack histamine (2-5) and serotonin (2, 6). While following up on these findings would be possible, we would like to respectfully emphasize that adopting approaches used in mammalian research to comparative immunology work is not always straightforward.

      As we highlight in our manuscript, frog mast cells upregulate their expression of interleukin-4 (IL4), a hallmark cytokine associated with mammalian mast cells (7). The additional findings presented in our revised manuscript indicate that mast cells respond to Bd by upregulating IL4 expression in vitro and in vivo. Together, this suggests that IL4 may be a central means by which frog mast cells confer protection against Bd, by counteracting Bd-elicited inflammation, including minimizing neutrophil infiltration, maintaining skin integrity, and promoting cutaneous mucus production. Please find that these additional results are presented in Figure 8 and are described in the results and discussion sections of our revised manuscript.

      Our attempts to elicit degranulation of frog mast cells using compound 48/80 have so far not been successful. This may reflect technical issues with assays optimized for mammalian mast cells or biological differences between frog and mammalian mast cells, such as species differences in mas-related G-protein coupled receptors, through which compound 48/80 acts (8). We will continue to explore means to study frog mast cell degranulation both in vitro and in vivo but also respectfully point out that while degranulation is a feature commonly associated with mammalian mast cells, this is not the only means by which the mammalian mast cells confer their immunological effects. Indeed, our studies suggest that frog mast cell IL4 production may be a key means by which these cells offer anti-Bd protection.

      Please note that we successfully adopted an avidin staining approach to visualize mast cell heparin content in vitro and to evaluate cutaneous mast cell numbers in vivo in control and mast cell-enriched, mock- and Bd-infected animals. This additional work is depicted in Figure 4 and addressed in the results and discussion sections of our revised manuscript.

      Reviewer #2 (Public Review):

      Summary: In this study, Hauser et al investigate the role of amphibian (Xenopus laevis) mast cells in cutaneous immune responses to the ecologically important pathogen Batrachochytrium dendrobatidis (Bd) using novel methods of in vitro differentiation of bone marrow-derived mast cells and in vivo expansion of skin mast cell populations. They find that bone marrow-derived myeloid precursors cultured in the presence of recombinant X. laevis Stem Cell Factor (rSCF) differentiate into cells that display hallmark characteristics of mast cells. They inject their novel (r)SCF reagent in the skin of X. laevis and find that this stimulates expansion of cutaneous mast cell populations in vivo. They then apply this model of cutaneous mast cell expansion in the setting of Bd infection and find that mast cell expansion attenuates skin burden of Bd zoospores and pathologic features including epithelial thickness and improves protective mucus production and transcriptional markers of barrier function. Utilizing their prior expertise with expanding neutrophil populations in X. laevis, the authors compare mast cell expansion using (r)SCF to neutrophil expansion using recombinant colony stimulating factor 3 (rCSF3) and find that neutrophil expansion in Bd infection leads to greater burden of zoospores and worse skin pathology. Combining these two observations, they demonstrate that mast cell expansion using rSCF attenuates cutaneous neutrophilic infiltration. They further show that mast cell expansion correlates to cutaneous IL-4 expression, and that treatment with exogenous rIL-4 reduces neutrophilic infiltration and restores markers of epithelial health, offering a mechanism by which mast cell expansion protects from Bd infection. 

      Strengths: The authors report a novel method of expanding amphibian mast cells utilizing their custom-made rSCF reagent. They rigorously characterize expanded mast cells in vitro and in vivo using histologic, morphologic, transcriptional, and functional assays. This establishes solid footing with which to then study the role of rSCF-stimulated mast cell expansion in the Bd infection model. This appears to be the first demonstration of exogenous use of rSCF in amphibians to expand mast cell populations and may set a foundation for future mechanistic studies of mast cells in the X. laevis model organism. Building on prior work, they are able to contrast mast cell expansion with their neutrophil expansion model, allowing them to infer a mechanistic link between mast cell expansion and IL-4 production and subsequent suppression of neutrophil infiltration and cutaneous dysbiosis. 

      We thank the reviewer for recognizing the rigorousness and utility of the studies presented in our manuscript.

      Weaknesses: The main weaknesses derive from technical limitations inherent to the Xenopus model at this time. For example, in mice a mechanistic study would be expected to use IL-4 knockouts, preferably mast cell-specific, to prove the link between mast cell expansion and IL-4 production being necessary and sufficient to suppress neutrophils. However, the novel reagents in this manuscript present a compelling technical advance and a step forward in the tools available to study amphibian biology. 

      We agree with the reviewer that an IL4 knock-out animal model would be a great way to support our findings. Unfortunately, working with a non-mammalian model such as X. laevis poses limitations that include lack of knock-out lines for immunology research. Moreover, as mentioned in our manuscript, we do not believe that IL4 is the sole mast cell-produced component responsible for the conferred antifungal protection. We thank the reviewer for acknowledging the limitations of our model system and recognizing the novelty, technical advances, and merits of the work presented in our manuscript.

      In addition to their discussion, one open question from the revised manuscript is how a single treatment with rSCF leads to a peak in mast cell numbers and then decline to baseline in mock-infected frogs, while Bd infection either sustains rSCF-boosted mast cells or leads to steady mast cell increase over time in control-treated frogs. Whether this is mediated by endogenous SCF or some other factor remains unexplored.

      This is an interesting question that we hope to explore in future studies. We did not see significant differences in skin SCF gene expression at 21 days post Bd infection. This does not rule out the possibility that the observed Bd-mediated effects on frog skin mast cell composition are due to changes in skin SCF gene expression at earlier infection times, alone or in combination with other host- or pathogen-derived factors. We know that other factors are responsible for homing/retention of antimicrobial and immunosuppressive granulocyte subsets within frog skin (9) and we postulate that some of these may be distinct mast cell types. Additionally, Bd is known to produce a myriad of immunomodulatory factors (10), which may well also directly affect frog skin mast cell composition. Mammalian mast cells are heterogeneous and are homed or recruited into tissues by an extensive array of host as well as microbiome-derived components (11, 12). Undoubtedly, the frog skin mast cell composition is likewise complex, dynamic, and contingent on a plethora of host-, cutaneous microbial flora-, and in this case also Bd-produced factors.

      References

      (1) Flajnik, M.F. A cold-blooded view of adaptive immunity. Nat Rev Immunol 18, 438-453 (2018).

      (2) Mulero, I., Sepulcre, M.P., Meseguer, J., Garcia-Ayala, A. & Mulero, V. Histamine is stored in mast cells of most evolutionarily advanced fish and regulates the fish inflammatory response. Proc Natl Acad Sci U S A 104, 19434-19439 (2007).

      (3) Reite, O.B. A phylogenetical approach to the functional significance of tissue mast cell histamine. Nature 206, 1334-1336 (1965).

      (4) Reite, O.B. Comparative physiology of histamine. Physiol Rev 52, 778-819 (1972).

      (5) Takaya, K., Fujita, T. & Endo, K. Mast cells free of histamine in Rana catasbiana. Nature 215, 776-777 (1967).

      (6) Galli, S.J. New insights into "the riddle of the mast cells": microenvironmental regulation of mast cell development and phenotypic heterogeneity. Lab Invest 62, 5-33 (1990).

      (7) Babina, M., Guhl, S., Artuc, M. & Zuberbier, T. IL-4 and human skin mast cells revisited: reinforcement of a pro-allergic phenotype upon prolonged exposure. Archives of dermatological research 308, 665-670 (2016).

      (8) Hermans, M.A.W. et al. Human Mast Cell Line HMC1 Expresses Functional Mas-Related G-Protein Coupled Receptor 2. Front Immunol 12, 625284 (2021).

      (9) Hauser, K. et al. Discovery of granulocyte-lineage cells in the skin of the amphibian Xenopus laevis. FACETS 5, 571 (2020).

      (10) Rollins-Smith, L.A. & Le Sage, E.H. Batrachochytrium fungi: stealth invaders in amphibian skin. Curr Opin Microbiol 61, 124-132 (2021).

      (11) Halova, I., Draberova, L. & Draber, P. Mast cell chemotaxis - chemoattractants and signaling pathways. Front Immunol 3, 119 (2012).

      (12) West, P.W. & Bulfone-Paus, S. Mast cell tissue heterogeneity and specificity of immune cell recruitment. Front Immunol 13, 932090 (2022).


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The global decline of amphibians is primarily attributed to deadly disease outbreaks caused by the chytrid fungus, Batrachochytrium dendrobatidis (Bd). It is unclear whether and how skin-resident immune cells defend against Bd. Although it is well known that mammalian mast cells are crucial immune sentinels in the skin and play a pivotal role in the immune recognition of pathogens and orchestrating subsequent immune responses, the roles of amphibian mast cells during Bd infections are largely unknown. The current study developed a novel way to enrich X. laevis skin mast cells by injecting the skin with recombinant stem cell factor (SCF), a KIT ligand required for mast cell differentiation and survival. The investigators found an enrichment of skin mast cells provides X. laevis substantial protection against Bd and mitigates the inflammation-related skin damage resulting from Bd infection. Additionally, the augmentation of mast cells leads to increased mucin content within cutaneous mucus glands and shields frogs from the alterations to their skin microbiomes caused by Bd.

      Strengths:

      This study underscores the significance of amphibian skin-resident immune cells in defenses against Bd and introduces a novel approach to examining interactions between amphibian hosts and fungal pathogens. 

      We thank the reviewer for acknowledging the novelty and importance of the work presented in our manuscript.

      Weaknesses:

      The main weakness of the study is the lack of functional analysis of X. laevis mast cells. Upon activation, mast cells have the characteristic feature of degranulation to release histamine, serotonin, proteases, cytokines, and chemokines, etc. The study should determine whether X. laevis mast cells can be degranulated by two commonly used mast cell activators, IgE and compound 48/80, for IgE-dependent and -independent pathways. This can be easily done in vitro. It is also important to assess whether in vivo these mast cells are degranulated upon Bd infection using avidin staining to visualize vesicle releases from mast cells. Figure 3 only showed rSCF injection caused an increase in mast cells in naïve skin. They need to present whether Bd infection can induce mast cell increase and rSCF injection under Bd infection causes a mast cell increase in the skin. In addition, it is unclear how the enrichment of mast cells provides protection against Bd infection and alterations to skin microbiomes after infection. It is important to determine whether skin mast cells release any contents mentioned above.

      We would like to thank the reviewer for taking the time to review our work and providing us with valuable feedback. We feel that we have successfully incorporated the reviewer’s suggestions into our revised manuscript, thereby improving this work.

      Please note that amphibians do not possess the IgE antibody isotype (1).

      To our knowledge, there has been no published work applying approaches used to study mammalian mast cell degranulation to amphibian mast cells. While there are commercially available kits and reagents for examining mammalian mast cell granule content, most of these reagents do not cross-react with amphibian counterparts. This is especially true of cytokines and chemokines, which diverged quickly with evolution and thus do not share substantial protein sequence identity across species as diverged as frogs and mammals. Additionally, several studies suggest that amphibian mast cells lack histamine (2-5) and serotonin (2, 6). Respectfully, while following up on these findings is possible, we would not consider adopting approaches used in mammalian research to comparative immunology work as easy.

      As noted in our manuscript, frog mast cells upregulate their expression of interleukin-4 (IL4), which is a hallmark cytokine associated with mammalian mast cells (7). The additional findings presented in our revised manuscript indicate that mast cells respond to Bd by upregulating IL4 expression in vitro and in vivo. In turn, our work indicates that IL4 may be a central means by which frog mast cells confer protection against Bd, by counteracting Bd-elicited inflammation, including minimizing neutrophil infiltration, maintaining skin integrity, and promoting mucus production by skin mucus glands. Please find that these additional findings are presented in Figure 8 of our revised manuscript and are described in the results and discussion sections of the paper.

      Our attempts to elicit degranulation of frog mast cells using compound 48/80 have so far not been successful. This may reflect technical issues with assays optimized for mammalian mast cells or biological differences between frog and mammalian mast cells, such as species differences in mas-related G-protein coupled receptors, through which compound 48/80 acts (8). We will continue to explore means to study frog mast cell degranulation both in vitro and in vivo but would also like to respectfully point out that while mast cell degranulation is a feature most associated with mammalian mast cells, this is not the only means by which the mammalian mast cells confer their immunological effects. Indeed, our additional studies suggest that mast cell IL4 production may be a key means by which these cells offer anti-Bd protection.

      Please find that we have adopted an avidin-staining approach to visualize mast cell heparin content in vitro and to evaluate mast cell numbers in vivo in the skins of control and mast cell-enriched, mock- and Bd-infected animals. This additional work is depicted in Figure 4 of our revised manuscript and addressed in the results and discussion sections of our revised paper.

      Reviewer #2 (Public Review):

      Summary:

      In this study, Hauser et al investigate the role of amphibian (Xenopus laevis) mast cells in cutaneous immune responses to the ecologically important pathogen Batrachochytrium dendrobatidis (Bd) using novel methods of in vitro differentiation of bone marrow-derived mast cells and in vivo expansion of skin mast cell populations. They find that bone marrow-derived myeloid precursors cultured in the presence of recombinant X. laevis Stem Cell Factor (rSCF) differentiate into cells that display hallmark characteristics of mast cells. They inject their novel (r)SCF reagent into the skin of X. laevis and find that this stimulates the expansion of cutaneous mast cell populations in vivo. They then apply this model of cutaneous mast cell expansion in the setting of Bd infection and find that mast cell expansion attenuates the skin burden of Bd zoospores and pathologic features including epithelial thickness and improves protective mucus production and transcriptional markers of barrier function. Utilizing their prior expertise with expanding neutrophil populations in X. laevis, the authors compare mast cell expansion using (r)SCF to neutrophil expansion using recombinant colony-stimulating factor 3 (rCSF3) and find that neutrophil expansion in Bd infection leads to greater burden of zoospores and worse skin pathology. 

      Strengths:

      The authors report a novel method of expanding amphibian mast cells utilizing their custom-made rSCF reagent. They rigorously characterize expanded mast cells in vitro and in vivo using histologic, morphologic, transcriptional, and functional assays. This establishes solid footing with which to then study the role of rSCF-stimulated mast cell expansion in the Bd infection model. This appears to be the first demonstration of the exogenous use of rSCF in amphibians to expand mast cell populations and may set a foundation for future mechanistic studies of mast cells in the X. laevis model organism. 

      We thank the reviewer for recognizing the breadth and extent of the undertaking that culminated in this manuscript. Indeed, this manuscript would not have been possible without considerable reagent development and adaptation of techniques that had previously not been used for amphibian immunity research. In line with the reviewer’s sentiment, to our knowledge this is the first report of using molecular approaches to augment amphibian mast cells, which we hope will pave the way for new areas of research within the fields of comparative immunology and amphibian disease biology.

      Weaknesses:

      The conclusions regarding the role of mast cell expansion in controlling Bd infection would be stronger with a more rigorous evaluation of the model, as there are some key gaps and remaining questions regarding the data. For example: 

      (1) Granulocyte expansion is carefully quantified in the initial time courses of rSCF and rCSF3 injections, but similar quantification is not provided in the disease models (Figures 3E, 4G, 5D-G). A key implication of the opposing effects of mast cell vs neutrophil expansion is that mast cells may suppress neutrophil recruitment or function. Alternatively, mast cells also express notable levels of csf3r (Figure 2) and previous work from this group (Hauser et al, Facets 2020) showed rG-CSF-stimulated peritoneal granulocytes express mast cell markers including kit and tpsab1, raising the question of what effect rCSF3 might have on mast cell populations in the skin. Considering these points, it would be helpful if both mast cells and neutrophils were quantified histologically (based on Figure 1, they can be readily distinguished by SE or Giemsa stain) in the Bd infection models.

      We thank the reviewer for this insightful suggestion. Please find that we successfully adopted an in situ hybridization approach to evaluate neutrophil numbers in the skins of control and mast cell-enriched, mock- and Bd-infected animals based on expression of the neutrophil marker, myeloperoxidase (MPO) (9). Please find these results are presented in Figures 6 and 8 of our revised manuscript and addressed in the appropriate sections of our revised paper.

      Our findings suggest that rSCF administration results in the accumulation of mast cells that are polarized such that they ablate the inflammatory response elicited by Bd infection, for example through IL4 production. Mammalian mast cells, including peritonea-resident mast cells, express csf3r (10, 11). For this reason, we used MPO expression to visualize neutrophil skin infiltration in Figures 6 and 8 of our revised work. While the X. laevis animal model does not permit nearly the degree of immune cell resolution afforded by mammalian animal models, we do know that the adult X. laevis peritonea contain a myriad of immune cell subsets. We anticipate that the high kit expression reported by Hauser et al., 2020 in the rCSF3-recruited peritoneal leukocytes reflects the presence of mast cells therein.

      Please find that we have used avidin-staining and MPO in situ hybridization to respectively visualize and enumerate mast cells and neutrophils in the skin of control and mast cell-enriched, mock- and Bd-infected animals. Indeed, our results show interesting, experimental condition-dependent changes in both the skin neutrophil and mast cell numbers. The results of these additional studies are presented in Figures 4, 6 and 8 of the revised manuscript and addressed in the results and discussions sections of our revised paper.

      (2) Epithelial thickness and inflammation in Bd infection are reported to be reduced by rSCF treatment (Figure 3E, 5A-B) or increased by rCSF3 treatment (Figure 4G) but quantification of these critical readouts is not shown.

      We thank the reviewer for this suggestion. We scored epithelial thickness under the distinct conditions described in our manuscript and presented the quantified data in Figures 5 and 8 of the revised paper.

      (3) Critical time points in the Bd model are incompletely characterized. Mast cell expansion decreases zoospore burden at 21 dpi, while there is no difference at 7 dpi (Figure 3E). Conversely, neutrophil expansion increases zoospore burden at 7 dpi, but no corresponding 21 dpi data is shown for comparison (Figure 4G). Microbiota analysis is performed at a third time point,10 dpi (Figure 5D-G), making it difficult to compare with the data from the 7 dpi and 21 dpi time points. Reporting consistent readouts at these three time points is important to draw solid conclusions about the relationship of mast cell expansion to Bd infection and shifts in microbiota.

      We thank the reviewer for noting this discrepancy. Please find that we have repeated our mast cell-enrichment, Bd-challenge studies, examining days 10 and 21 post infection. Our new findings indicate that compared to control animals, mast cell-enrichment does result in significant reduction in Bd loads at both 10 and 21 dpi. The difference in Bd loads between r-ctrl and rSCF-treated animals at 10 dpi corroborates the other parameters that are altered between the two treatment groups at this experimental time point.

      Our question regarding the roles of inflammatory granulocytes/neutrophils during Bd infections was that of ‘how’ rather than ‘when’ these cells affect Bd infections. Thus, and because the central focus of this work was mast cells and not other granulocyte subsets, when we saw that rCSF3-recruited granulocytes adversely affect Bd infections at 7 days, we did not pursue the kinetics of these observations further. We plan to explore the roles of inflammatory mediators and immune cell subsets during the course of Bd infections but feel that these future studies are more peripheral to the central thesis of the present manuscript regarding the roles of frog mast cells during Bd infections.

      (4) Although the effect of rSCF treatment on Bd zoospores is significant at 21 dpi (Figure 3E), bacterial microbiota changes at 21 dpi are not (Figure S3B-C). This discrepancy, how it relates to the bacterial microbiota changes at 10 dpi, and why 7, 10, and 21 dpi time points were chosen for these different readouts (Figure 5F-G), is not discussed.

      Please find that our additional studies indicate that compared to control animals, frog skin mast cell-enrichment results in significant reduction in Bd loads at 10 dpi. This corroborates our other findings, including the observation that at 10 dpi, control animals exhibit reduced microbial richness whereas mast cell-enriched frogs were protected from this disruption of their microbiome. The amphibian microbiome serves as a major barrier to these fungal infections (ref. 12) and we anticipate that Bd-mediated disruption of microbial richness facilitates host skin colonization by this pathogen. In turn, we anticipate that frog mast cells are conferring the observed anti-Bd protection in part by preventing microbial disassembly and thus interfering with optimal Bd colonization and growth on frog skins. Please find that we acknowledge and discuss these notions in our revised manuscript.

      (5) The time course of rSCF or rCSF3 treatments relative to Bd infection in the experiments is not clear. Were the treatments given 12 hours prior to the final analysis point to maximize the effect? For example, in Figure 3E, were rSCF injections given at 6.5 dpi and 20.5 dpi? Or were treatments administered on day 0 of the infection model? If the latter, how do the authors explain the effects at 7 dpi or 21 dpi given mast cell and neutrophil numbers return to baseline within 24 hours after rSCF or rCSF3 treatment, respectively?

      Please find that in our revised manuscript, we underlined the kinetics of our animal treatments and Bd-infections. In brief, for mast cell-enrichment, animals were injected with r-ctrl or rSCF, challenged 12 hours later with Bd and examined after 10 (per reviewers’ suggestions) and 21 days of infection. For neutrophil enrichment, animals were injected with r-ctrl or rCSF3, challenged 12 hours later with Bd and examined after 7 days of infection.

      The title of the manuscript may be mildly overstated. Although Bd infection can indeed be deadly, mortality was not a readout in this study, and it is not clear from the data reported that expanding skin mast cells would ultimately prevent progression to death in Bd infections.

      We acknowledge this point. The revised manuscript will be titled: “Amphibian mast cells: barriers to chytrid fungus infections”.

      Reviewer #3 (Public Review):

      Summary:

      Hauser et al. provide an exceptional study describing the role of resident mast cells in amphibian epidermis that produce anti-inflammatory cytokines that prevent Batrachochytrium dendrobatidis (Bd) infection from causing harmful inflammation, and also protect frogs from changes in skin microbiomes and loss of mucin in glands and loss of mucus integrity that otherwise cause changes to their skin microbiomes. Neutrophils, in contrast, were not protective against Bd infection. Beyond the beautiful cytology and transcriptional profiling, the authors utilized elegant cell enrichment experiments to enrich mast cells by recombinant stem cell factor, or to enrich neutrophils by recombinant colony-stimulating factor-3, and examined respective infection outcomes in Xenopus.

      Strengths:

      Through the use of recombinant IL4, the authors were able to test and eliminate the hypothesis that mast cell production of IL4 was the mechanism of host protection from Bd infection. Instead, impacts on the mucus glands and interaction with the skin microbiome are implicated as the protective mechanism. These results will press disease ecologists to examine the relative importance of this immune defense among species, the influence of mast cells on the skin microbiome and mucosal function, and open the potential for modulating mucosal defense.

      We thank the reviewer for recognizing the utility of the work presented in our manuscript.

      Weaknesses:

      A reduction of bacterial diversity upon infection, as described at the end of the results section, may not always be an "adverse effect," particularly given that anti-Bd function of the microbiome increased. Some authors (see Letourneau et al. 2022 ISME, or Woodhams et al. 2023 DCI) consider these short-term alterations as encoding ecological memory, such that continued exposure to a pathogen would encounter an enriched microbial defense. Regardless, mast cell-initiated protection of the mucus layer may negate the need for this microbial memory defense.

      We thank the reviewer for their insightful comment. We have revised our discussion to include this notion.

      While the description of the mast cell location in the epidermal skin layer in amphibians is novel, it is not known how representative these results are across species ranging in chytridiomycosis susceptibility. No management applications are provided such as methods to increase this defense without the use of recombinant stem cell factor, and more discussion is needed on how the mast cell component (abundance, distribution in the skin) of the epidermis develops or is regulated.

      We thank the reviewer for this suggestion. Please find that we have added a paragraph to our revised manuscript to address possible source(s) of skin mast cells and a statement acknowledging that greater understanding of mast cell biology across distinct amphibian species may be used to develop future strategies for management of amphibian diseases.

      We are very thankful to the reviewer for this excellent suggestion but would like to point out that the work presented in our manuscript was driven by comparative immunology questions more than by conservation biology. As such, and considering just how little is known about mast cells outside of mammals, we chose not to speculate too much on possible utilities of altering amphibian skin mast cell composition and instead to focus our discussion on the immediate takeaways of the work presented in our paper.

      References

      (1) Flajnik, M.F. A cold-blooded view of adaptive immunity. Nat Rev Immunol 18, 438-453 (2018).

      (2) Mulero, I., Sepulcre, M.P., Meseguer, J., Garcia-Ayala, A. & Mulero, V. Histamine is stored in mast cells of most evolutionarily advanced fish and regulates the fish inflammatory response. Proc Natl Acad Sci U S A 104, 19434-19439 (2007).

      (3) Reite, O.B. A phylogenetical approach to the functional significance of tissue mast cell histamine. Nature 206, 1334-1336 (1965).

      (4) Reite, O.B. Comparative physiology of histamine. Physiol Rev 52, 778-819 (1972).

      (5) Takaya, K., Fujita, T. & Endo, K. Mast cells free of histamine in Rana catasbiana. Nature 215, 776-777 (1967).

      (6) Galli, S.J. New insights into "the riddle of the mast cells": microenvironmental regulation of mast cell development and phenotypic heterogeneity. Lab Invest 62, 5-33 (1990).

      (7) Babina, M., Guhl, S., Artuc, M. & Zuberbier, T. IL-4 and human skin mast cells revisited: reinforcement of a pro-allergic phenotype upon prolonged exposure. Archives of dermatological research 308, 665-670 (2016).

      (8) Hermans, M.A.W. et al. Human Mast Cell Line HMC1 Expresses Functional Mas-Related G-Protein Coupled Receptor 2. Front Immunol 12, 625284 (2021).

      (9) Buchan, K.D. et al. A transgenic zebrafish line for in vivo visualisation of neutrophil myeloperoxidase. PLoS One 14, e0215592 (2019).

      (10) Aponte-Lopez, A., Enciso, J., Munoz-Cruz, S. & Fuentes-Panana, E.M. An In Vitro Model of Mast Cell Recruitment and Activation by Breast Cancer Cells Supports Anti-Tumoral Responses. Int J Mol Sci 21 (2020).

      (11) Jamur, M.C. et al. Mast cell repopulation of the peritoneal cavity: contribution of mast cell progenitors versus bone marrow derived committed mast cell precursors. BMC Immunol 11, 32 (2010).

      (12) Walke, J.B. & Belden, L.K. Harnessing the Microbiome to Prevent Fungal Infections: Lessons from Amphibians. PLoS Pathog 12, e1005796 (2016).

      Reviewer #2: (Recommendations For The Authors): 

      We thank the reviewer for their excellent suggestions, their time reviewing this work and their help with this manuscript.

      While we were not able to incorporate some of these changes, please find that we have significantly altered our manuscript in accordance with the reviewer’s suggestions from their public review. We feel that we have substantially altered our paper, including providing considerable additional data, supporting the key findings therein.

      (1) The heatmap in Figure 1I appears to be scaled data, similar to Figure 4A, in which case the indicated scale numbers are not correct (e.g. they should be -2 to 2, or -3 to 3) 

      Thank you for the suggestion. Please find that we have changed this figure accordingly.

      (2) For Figure 1, additional curated gene lists might better illustrate the difference in cell types, e.g. include the data for a panel of mast cell genes in a heatmap (mcpt1, tpsab1, etc.) and another panel of curated neutrophil genes (e.g. lyz) in a heatmap. If the authors still have leftover RNA, qPCR verification of some of the critical genes (e.g. kit) would add to the rigor of the analysis, as this study is the foundation of a new method for culturing amphibian mast cells. 

      We thank the reviewer for this suggestion. Unfortunately, we do not have leftover RNA/cDNA and we have not been able to locate mcpt1 or tpsab1 in our DEGs. We anticipate that this issue may stem from the suboptimal annotation of the Xenopus laevis genome. We agree that curating more mast cell/neutrophil genes would be ideal but feel that we have adequately highlighted those genes that are differentially expressed between the two populations in our analysis.

      (3) The presentation of counts in Figure 2 is a bit hard to interpret. Although it is mentioned that everything is statistically significant, explicitly showing statistics for each gene would be better. One possibility would be to use a volcano plot (p-value vs log2 fold change) and highlight the genes shown in Figure 2, potentially with an accompanying heat map to show replicate variability. 

      We thank the reviewer for this suggestion. We entertained presenting the data as volcano plots or heat maps, but in the end felt that the bar graphs better conveyed the information that we are hoping to get across. Please note that the error bars in the bar graph depict the replicate variability. Please also note that to highlight that all the depicted genes were differentially expressed, we italicized the statement in the corresponding figure legend: “All depicted genes were significantly differentially expressed between the two populations”.

      (4) Narratively, it might make more sense to put Figure 4A-C with Figure 3. 

      We thank the reviewer for this suggestion. Please find that we significantly revised most of our figures to better convey the content therein. We combined the content of Figure 4A-C with Figure 5A-C and added data on epidermal thickness under different conditions into this figure; Figure 5 of our revised manuscript.

      (5) If possible, complementing the skin RNA-seq from rSCF treatment in Bd infection with skin RNA-seq from rCSF3 treatment to compare effects on transcriptional programs of barrier function, etc would elevate this study and add additional insights into cutaneous inflammation in the setting of Bd infection. 

      We thank the reviewer for this suggestion. We anticipate that the skin inflammation caused by Bd infection is not due solely to neutrophil infiltration and artificially altering the frog skin neutrophil content would thus not recapitulate chytridiomycosis progression. We completely agree that it would be valuable to examine barrier functions in control and mast cell-enriched, Bd-infected frogs. This is something that we hope to pursue further in future studies but feel that together with our additional findings, we are presenting a significant amount of data to constitute a stand-alone story.

      (6) In Figure S1A, analyzing only 3 AMP genes by qPCR is perhaps too focused. As a control, it would be useful to also test some genes known to be functionally important in neutrophil anti-microbial responses, e.g. lyz. Expanding on this experiment by performing RNA-seq on Bd-treated, bone-marrow-derived mast cells and neutrophils would be a great addition to the manuscript and an important resource for future studies in the field. The fact that the use of rSCF (or rCSF3) enables the differentiation of these cells in large numbers of pure populations presents this unique opportunity. Although IL-4 did not end up affecting mucus production, clues to the mediator(s) of this mast cell-dependent effect may be found with unbiased RNA-seq after exposure to Bd. 

      We thank the reviewer for this suggestion but would like to point out that our manuscript is focused on mast cells rather than neutrophils. We also believe that in vitro exposure of leukocytes to Bd is not the most physiologically relevant model of what would happen to skin-resident and incoming immune cell subsets, since Bd primarily infects top-most keratinocytes. We anticipate that rather than coming into direct contact with the fungus, cells like mast cells and neutrophils are responding to Bd-produced and infected cell-produced products. For this reason, we did not perform RNA-seq analysis of in vitro derived mast cells or neutrophils stimulated with Bd. As we develop more X. laevis-specific reagents, we hope to revisit the question of infected skin mast cell and neutrophil gene expression profiles but are not in a position to ask these questions at this time.

      This work is also guided by a finite budget, and we feel that together with our significant additional findings described in our revised manuscript, we are presenting a substantial amount of work to constitute a stand-alone story and manuscript.

      Reviewer #3 (Recommendations For The Authors): 

      The following are minor edits needed in the text and figure legends: 

      Standardize terms such as IL4 instead of il4 or ril4 vs rIL4 throughout. Also, r-SCF vs rSCF. 

      Thank you. Please find that we have standardized such terms throughout our revised manuscript. Please note that we are adhering to the convention that gene names are in lower case, protein names are in upper case and recombinant protein names are preceded by an ‘r’.

      Pg 9 Change "In contract" to "In contrast". 

      Thank you and changed accordingly.

      Fig 4 - Perhaps indicate if results in addition to 7dpi are also available. 

      Please find that we analyzed Bd loads in control and mast cell-enriched, infected frogs after 10 dpi. This data is presented in Figures 3 and 4 of our revised manuscript.

      Similarly in Fig. 5, are results other than 10dpi available in the supplement? 

      Please find that the results from the microbiome studies are presented in supplemental figure 3 (Fig. S3). Please note that the results presented in original manuscript Fig. 5A-C - revised manuscript Fig. 5B-E depict data for 21 dpi, which is the longest examined infection timepoint. We present data from 1 and 10 dpi in Fig. 4 of our revised manuscript.

      Indicate why these days were chosen in the methods. 

      Please find that we indicated why the experimental timepoints were chosen, in the methods section of our revised manuscript.

      Fig S1 legend has errors in describing which panels are for which asterisks. 

      Fig. S3 legend indicates panels F and G. 

      Thank you. Please find that we revised our supplemental figures and amended the corresponding figure legends.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The study entitled "Rifampicin tolerance and growth fitness among isoniazid-resistant clinical Mycobacterium tuberculosis isolates: an in-vitro longitudinal study" by Vijay et al. provides valuable insights into the association of rifampicin tolerance and growth fitness with isoniazid resistance among clinical isolates of M. tuberculosis. Antibiotic tolerance in M. tuberculosis is an important topic since it contributes to the lengthy and complicated treatment required to cure tuberculosis disease and may portend the emergence of antibiotic resistance. The authors found that rifampicin tolerance was correlated with bacterial growth, rifampicin minimum inhibitory concentrations, and isoniazid-resistance mutations.

      Strengths:

      The large number of clinical isolates evaluated and their longitudinal nature during treatment for TB (including exposure to rifampin) are strengths of the study.

      Weaknesses:

      Some of the methodologies are not well explained or justified and the association of antibiotic tolerance with growth rate is not a novel finding. In addition, the molecular mechanisms underlying rifampicin tolerance only in rapidly growing isoniazid-resistant isolates have not been elucidated and the potential implications of these findings for clinical management are not immediately apparent.

      We thank the reviewer for the comments. We have modified the methods section and Figure 1 to clarify the method, as suggested by the reviewer.

      Although we agree that previous studies have shown the association of slow growth rate with antibiotic tolerance, ours is the most comprehensive assessment of rifampicin tolerance among clinical isolates, to our knowledge. In particular, we show that the degree of tolerance in clinical isolates can vary over several orders of magnitude, which had not been previously documented or appreciated. Furthermore, the association of high tolerance with IR isolates is a new finding, and given the potential for tolerance to increase the risk of de novo drug resistance, our study suggests that IR isolates with high rifampicin tolerance may present a risk for development of MDR-TB.

      In addition, we have also analysed the longitudinal isolates and the genetic variants emerging in them that are associated with an increase in rifampicin tolerance. This analysis reveals multiple possible pathways to increased rifampicin tolerance among clinical M. tuberculosis isolates. A possible clinical implication is that the combination of high rifampicin tolerance and isoniazid resistance may be a risk factor for tuberculosis treatment failure. This study helps to inform further clinical studies evaluating the role of rifampicin tolerance in IR isolates in treatment outcome. We have focused on these aspects in the discussion of the revised manuscript.

      Reviewer #2 (Public Review):

      Summary:

      This study by Vijay and colleagues addresses a clinically important, and often overlooked, aspect of TB treatment. Variation in the level of antibiotic tolerance amongst otherwise antibiotic-susceptible isolates is difficult to screen for routinely, and consequently such screening is not performed. The authors present a convincing argument that there is indeed significant variation in the susceptibility of isoniazid-resistant strains to killing by rifampicin, in some cases at the same tolerance levels as bona fide resistant strains. On the whole, the study is easy to follow and the results are justified. This work should be of interest to the wider TB community at both a clinical and basic level.

      Weaknesses:

      The manuscript is long, repetitive in places, and the figures could use some amending to improve clarity (this could be a me-specific issue as they look ok on my screen, yet the colour is poor when printed).

      We thank the reviewer for the comments. We have modified the revised manuscript as per the reviewer’s suggestions.

      It would have been great to have seen some correlation between increased rifampicin tolerance and treatment outcome, although I'm not sure if this data is available to the researchers. I agree with the researchers the use of a single media condition is a limitation. However, this is true of a lot of studies. Rifampicin tolerance and treatment outcome analysis.

      We agree with the reviewer that correlation between rifampicin tolerance and treatment outcome is important. This needs to be performed in future studies with better design to correlate rifampicin tolerance with treatment progression or outcome data.  

      Reviewer #3 (Public Review):

      Summary:

      The authors have initiated studies to understand the molecular mechanisms underlying the development of multi-drug resistance in clinical Mtb strains. They demonstrate the association of isoniazid-resistant isolates by rifampicin treatment supporting the idea that selection of MDR is a microenvironment phenomenon and involves a group of isolates.

      Strengths:

      The methods used in this study are robust and the results support the authors' claims to a major extent.

      Weaknesses:

      The manuscript needs a thorough vetting of the language. At present, the language makes it very difficult to comprehend the methodology and results.

      We thank the reviewer for the comments. We have revised the manuscript as per the reviewer’s suggestions.

      Reviewer #1 (Recommendations For The Authors):

      Major comments:

      (1) Methods: The authors attempt to differentiate between "fast"- and "slow"-growing bacteria in order to determine if the growth rate is associated with rifampicin tolerance. This is accomplished by assessing growth on solid agar at 15 and 60 days post-incubation, respectively. However, mycobacterial growth rate is not a binary phenomenon but rather a continuous variable. Moreover, it is not clear why 15 and 60 days were selected. Also, instead of a "slow growth" phenotype, the 60-day time point might simply reflect a longer lag phase. Were the plates examined at any interval time points? It would be interesting to know whether colony growth was delayed overall in the populations observed only at 60 days, or simply if the appearance of microcolonies visible to the naked eye was delayed (with normal growth afterwards).

      We thank the reviewer for the comments. We want to clarify that we did not use agar plates but rather the most-probable number (MPN) method to determine the survival fraction after antibiotic treatment. We have clarified this in the revised manuscript and in the revised Figure 1. The MPN method is a binary measure (growth/no growth) and therefore cannot differentiate between a long lag time and other mechanisms. In our original analysis, we included an intermediate time point of 30 days, but these data (included as Supp. Fig. 1) cannot address the issue of lag phase directly. Since the 30-day time point did not add to the overall analysis and interpretation, we had not included these data in the original submission.

      (2) Methods/Results/Discussion: Some important clinical information is missing: how were the patients treated who had IR isolates? Did they receive the standard regimen for DS TB or was another drug substituted for isoniazid? Exposure to different drugs could affect the rifampicin-tolerant populations during the intensive phase (Figure 5).

      Thank you for this comment. We have included the information regarding the treatment regimen in the revised manuscript.

      Were there differences in microbiological (sputum culture conversion rate at 8 weeks or time to culture negativity) or clinical outcomes based on isoniazid susceptibility? Perhaps more importantly, were there differences in microbiological/clinical outcomes based on the proportion of bacterial subpopulations with rifampicin tolerance for a particular isolate? There should be more discussion on the potential clinical implications of the study's findings.

      We agree with the reviewer that correlation between rifampicin tolerance and treatment progression or outcome is important. This needs to be performed in future studies with better design to correlate rifampicin tolerance with treatment progression or outcome data.  

      (3) Results (Figure 3A): Although an interesting finding, the increased rifampicin tolerance observed only in the "rapidly" growing populations of isoniazid-resistant isolates (IR) vs. isoniazid-susceptible (IS) isolates is not explained. In contrast, equally, increased rifampicin tolerance is seen in the "slowly" growing populations of both IR and IS isolates. It would be interesting to know if these slowly growing populations show specific tolerance to rifampicin or if, as expected, slow growth confers tolerance to a range of different bactericidal antibiotics.

      We thank the reviewer for the suggestions. We agree that these would be interesting to investigate in a future study, but they are outside the scope of the current study.

      (4) Results (Figure 3B): The basis for the classification into tertiles is not clear and appears somewhat arbitrary: does this represent the survival of a particular isolate following rifampicin exposure relative to the other isolates based on isoniazid susceptibility (IS or IR) or the % growth relative to other populations for the same isolate? Figure 3B is missing a y-axis label. Is it a log10 MPN ratio?

      We thank the reviewer for pointing this out. We want to clarify that for the classification into tertiles, we first pooled both groups of isolates, isoniazid susceptible (IS) and isoniazid resistant (IR), into a single population. Subsequently, we categorized this unified population into three distinct groups (low, medium, and high) based on their survival fraction following rifampicin treatment. Consequently, the 'low,' 'medium,' and 'high' tertiles represent the survival of each isolate following rifampicin exposure relative to the total number of isolates, combining both IS and IR isolates.

      For clarity, we provide a breakdown of the criteria for each tertile:

      +Low tertile: Consists of isolates with the lowest survival fraction (bottom 25%).

      +Medium tertile: Encompasses isolates with survival fractions that fall between the bottom 25% and the top 25%.

      +High tertile: Comprises isolates with the highest survival fractions (top 25%). We have modified the revised manuscript to clarify this.
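      The pooling and grouping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function name, the isolate identifiers, and the survival fractions are hypothetical, and the cut-offs are exposed as parameters that default to the 25% and 75% boundaries quoted above (equal thirds would use 1/3 and 2/3 instead).

```python
def classify_by_survival(survival, low_cut=0.25, high_cut=0.75):
    """Label each isolate 'low', 'medium' or 'high' by its rank in the pooled population.

    survival: dict mapping isolate ID -> survival fraction after rifampicin treatment.
    low_cut / high_cut: rank-based boundaries (assumed defaults: bottom and top 25%).
    """
    # Rank all isolates together, regardless of isoniazid susceptibility.
    ranked = sorted(survival, key=survival.get)
    n = len(ranked)
    labels = {}
    for rank, isolate in enumerate(ranked):
        position = (rank + 0.5) / n  # mid-rank position in the pooled population, in (0, 1)
        if position <= low_cut:
            labels[isolate] = "low"
        elif position > high_cut:
            labels[isolate] = "high"
        else:
            labels[isolate] = "medium"
    return labels


# Hypothetical survival fractions for pooled IS and IR isolates (illustrative values only).
pooled = {
    "IS-1": 0.001, "IS-2": 0.02, "IS-3": 0.004, "IS-4": 0.3,
    "IR-1": 0.5, "IR-2": 0.0005, "IR-3": 0.08, "IR-4": 0.15,
}
labels = classify_by_survival(pooled)
for isolate in sorted(pooled, key=pooled.get):
    print(isolate, pooled[isolate], labels[isolate])
```

      Because the ranking is done on the pooled population, an isolate's label reflects its survival relative to all isolates rather than relative to its own IS or IR group.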

      We have also modified Figure 3B to correct the y-axis label.

      (5) Results (lines 185-186): For correlating relative growth in the absence of antibiotics, 19 clinical isolates "outliers" were removed without explanation.

      We have added an explanation for the “outliers”, which were removed earlier due to deviation from a normal distribution. We have also provided Supplementary Figure 3, which includes these outliers.

      (6) Results (lines 203-211): The authors attempted to investigate a potential association between the mechanism of M. tuberculosis isoniazid resistance and the degree of rifampicin tolerance. However, the vast majority of IR clinical isolates (n=71) had a katG_S315X mutation and only 8 isolates had alternative mutations (inhA_I21T and fabG1_C-15X). Given the wide range of rifampicin tolerance observed within these isoniazid-resistant isolates, they concluded that other genetic or epigenetic determinants must be playing a role. WGS of longitudinally collected isolates from the same patients during TB treatment yielded non-synonymous SNPs in a list of genes previously reported to be associated with persistence, tolerance, and mycobacterial survival. However, precise mechanisms (including, e.g., expression of efflux pumps) are not investigated.

      We thank the reviewer for summarising the findings. Yes, we agree that investigating the precise mechanism of rifampicin tolerance is beyond the scope of the current work.

      Minor comments:

      (1) Abstract (line 41): The nonstandard abbreviations "IR" and "IS" have not been introduced prior to this usage.

      We have modified this in the abstract.

      (2) Introduction (line 60): Insert "phenomena" or "mechanisms" after "two".

      We have modified this in the introduction.

      (3) Introduction (lines 66-69): This sentence is confusing, especially the second part ("supporting this studies...").

      We have modified the lines to clarify.

      (4) Introduction (line 84): In the current text, it appears as if "IR" is the abbreviation for "isoniazid". Therefore, I recommend changing "resistance to isoniazid" to "isoniazid resistance".

      We have modified this in the revised manuscript.

      (5) Results (line 141): Insert "the" before "rest".

      We have modified this in the revised manuscript.

      (6) Results (line 187): Replace "did not had" with "did not have".

      We have modified this in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Abstract:

      The abstract is long and repetitive. It needs reworking and shortening to improve clarity and highlight the main takeaway message.

      We thank the reviewer for the suggestions and have modified this in the revised manuscript.

      The introduction is interesting and contains relevant information. However, it is long and takes a while to get to the point of the study. It needs re-writing to emphasise key prior results and the purpose of this study.

      We thank the reviewer for the suggestions and have modified this in the revised manuscript.

      Results:

      As the study relies predominately on the use of MPN, I think a simple schematic of how the experiment is performed would be informative. Could this be added to Figure 1?

      We have revised Figure 1 in the manuscript to include the schematic representation.

      Some of the differences in MDK90, whilst they may be significant, are small so it would at least provide context as to the relevance of these differences. This may also alleviate my confusion as to how the authors can measure the time required to achieve MDK90 as 1.23-1.31 days when the first time point that is taken is day 2 (the data in Figure 2). They have Fig. S6 but this is small and hard to follow.

      We thank the reviewer for this suggestion. We have modified this in the revised manuscript and in Figure S6.
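
      To illustrate how an MDK90 value can fall before the first sampled time point, the sketch below interpolates log10 survival linearly between consecutive time points, starting from day 0. This is only an assumption about one common way to estimate such a value; the response above does not state how MDK90 was computed, and the function name and the survival fractions used in the example are hypothetical.

```python
import math


def mdk_by_interpolation(times, survival_fractions, kill_fraction=0.90):
    """Estimate the time at which survival first falls to 1 - kill_fraction.

    Hypothetical helper: interpolates log10(survival) linearly between
    consecutive sampled time points. Returns None if the target is never reached.
    """
    target = math.log10(1.0 - kill_fraction)
    logs = [math.log10(s) for s in survival_fractions]
    points = list(zip(times, logs))
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        # The target is crossed inside this interval if survival drops through it.
        if y0 >= target >= y1 and y0 != y1:
            return t0 + (target - y0) * (t1 - t0) / (y1 - y0)
    return None


# Illustrative (made-up) survival fractions at days 0, 2 and 5.
example_mdk90 = mdk_by_interpolation([0, 2, 5], [1.0, 0.01, 0.0001])
```

      With these made-up values the 90% kill level is crossed between day 0 and day 2, so the estimate lands before the first post-treatment sample.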

      Figure 2:

      Would be helpful to have -1 on the Y axis.

      The grey dots don't print very well (Might be my printer)

      We have modified this in Figure 2 of the revised manuscript.

      Line 142: The authors note a difference in RIF tolerance at day 15 that disappeared by day 60. I assume they are referring to the day 5 timepoint although this isn't clear as written.

      Yes, it is referring to the day 5 time point and we have clarified this in the revised manuscript.

      The section starting at line 148 (fig 3) is interesting, but it is difficult to read and follow what the difference is between this data and the prior data in Figure 2. It also wasn't until about line 165 that the purpose became clear. Overall the conclusions are sound and interesting.

      We have modified this in the revised manuscript.

      Line 154: What are the early and late time recovery time points?

      Is Figure 3A the same data as Figure 2?

      We have clarified this in the revised manuscript; Figure 3A shows the same data as Figure 2.

      I found Figure 6 hard to follow. I'm not sure how better to present this data, but it should be improved. Some further clarification in the text would be helpful.

      We thank the reviewer for the suggestions. We have added more explanation in the text to clarify Figure 6.

      Conclusions:

      The conclusions are sound, based on the data presented. The clinical relevance is highlighted, yet appropriately phrased to not be too far-reaching.

      Again, I think the conclusions could be condensed considerably. It is repetitive in places, which dilutes the main outcomes of this otherwise interesting and important study. The authors appropriately highlight some of the limitations of their study.

      We thank the reviewer for these comments and have modified this in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors):

      The manuscript "Rifampicin tolerance and growth fitness among isoniazid-resistant clinical Mycobacterium tuberculosis isolates: an in-vitro longitudinal study" by Srinivasan et al. details the identification/development of isoniazid-resistant strains in clinical isolates following treatment with rifampicin. This is an important aspect of understanding MDR development in TB strains. The results are promising and gel well with the hypothesis. However, the manuscript requires a thorough language modification. While the overall idea is clear, the methodology does not come out clearly.

      Specific comments:

      (1) It is not clear whether rifampicin treatments were given for 2 and 5 days before kill curves or for 15 and 60 days. The methodology needs to be phrased clearly. Why were the time intervals of 15 days and 60 days chosen? Is there a rationale for this?

      We thank the reviewer for the suggestions. We have modified the methods and Figure 1 to clarify this in the revised manuscript.

      (2) A concentration of 2 µg/mL was used for in vitro culture in this study. While the authors themselves indicate that this is well above the MIC, this might represent a non-natural dose and hence may force the evolution of strains. What will be the scenario in the natural course of antibiotic treatment (dose at MIC or less than MIC)?

      We observed no significant emergence of resistance up to day 5; resistance emerges only after day 5. We therefore avoided determining the survival fraction after resistance emergence, so the kill curve mostly represents the tolerant subpopulation. Pharmacokinetic studies of rifampicin dosing suggest that peak concentrations of >2-32 µg/mL are typical for standard doses of the drug; we therefore believe the chosen concentration of 2 µg/mL to be physiologically relevant.

      (3) As described in line 155, the survival spanned a broad distribution, across a million times in difference. This is rather surprising that 5 days of rifampicin treatment would lead to such a spread in resistance patterns. Did the authors study the different populations to understand this phenomenon? This is important given the scale of resistance developed in this short time.

      We want to clarify that the broad range of survival fractions reflects differences in the rifampicin-tolerant subpopulations, not the resistant subpopulations, because survival is determined after rifampicin treatment in rifampicin-free media. This has been clarified in the revised Figure 1.

      Overall, the manuscript is a detailed study with new insights into the development of multi-drug resistance by Mtb. A thorough vetting for language is essential for a greater impact of the study.

      We thank the reviewer and have attempted to improve the clarity of the language to increase the potential impact of our findings.

    1. Author response:

      The following is the authors' response to the current reviews.

      Reviewer #1 (Public Review):

      I'll begin by summarizing what I understand from the results presented, and where relevant how my understanding seems to differ from the authors' claims. I'll then make specific comments with respect to points raised in my previous review (below), using the same numbering. Because this is a revision I'll try to restrict comments here to the changes made, which provide some clarification, but leave many issues incompletely addressed.

      As I understand it the main new result here is that certain recurrent network architectures promote emergence of coordinated grid firing patterns in a model previously introduced by Kropff and Treves (Hippocampus, 2008). The previous work very nicely showed that single neurons that receive stable spatial input could 'learn' to generate grid representations by combining a plasticity rule with firing rate adaptation. The previous study also showed that when multiple neurons were synaptically connected their grid representations could develop a shared orientation, although with the recurrent connectivity previously used this substantially reduced the grid scores of many of the neurons. The advance here is to show that if the initial recurrent connectivity is consistent with that of a line attractor then the network does a much better job of establishing grid firing patterns with shared orientation.

      Beyond this point, things become potentially confusing. As I understand it now, the important influence of the recurrent dynamics is in establishing the shared orientation and not in its online generation. This is clear from Figure S3, but not from an initial read of the abstract or main text. This result is consistent with Kropff and Treves' initial suggestion that 'a strong collateral connection... from neuron A to neuron B... favors the two neurons to have close-by fields... Summing all possible contributions would result in a field for neuron B that is a ring around the field of neuron A.' This should be the case for the recurrent connections now considered, but the evidence provided doesn't convincingly show that attractor dynamics of the circuit are a necessary condition for this to arise. My general suggestion for the authors is to remove these kinds of claims and to keep their interpretations more closely aligned with what the results show.

      We would like to clarify that the simple (flexible) attractor is a weaker condition than the ones previously used to align grid cells. However, by no means do we claim that it is a necessary condition for grid maps to align. Other architectures, certainly more complex ones but perhaps even simpler ones, can align grid maps in our model.

      Major (numbered according to previous review)

      (1) Does the network maintain attractor dynamics after training? Results now show that 'in a trained network without feedforward Hebbian learning the removal of recurrent collaterals results in a slight increase in gridness and spacing'. This clearly implies that the recurrent collaterals are not required for online generation of the grid patterns. This point needs to be abundantly clear in the abstract and main text so the reader can appreciate that the recurrent dynamics are important specifically during learning.

      We respectfully disagree with the interpretation of this result. In this model cells self-organize to produce aligned grid maps. In such systems it makes sense to characterize the equilibrium states of the system. We turned learning off in Figure S3 to show that the recurrent connections have a contractive effect on grid spacing. But artificially turning off learning means that one can no longer make claims about the equilibrium states of the system, since it can no longer evolve freely. In a functional network, if the recurrent attractor is removed, the system will evolve towards poor gridness and no alignment no matter what the starting point is, as also shown in Figure S3. Several experimental results invite us to think of grid cells as the equilibrium solution of a series of constraints that is ready to change at any time: Barry et al, 2012; Yoon et al, 2013; Carpenter et al, 2015; Krupic et al, 2015; Krupic et al, 2018; Jayakumar et al, 2019.

      One point in which we perhaps agree with the reviewer is that information about the hexagonal maps is kept in the feedforward weights, while behavior and the recurrent collaterals act as constraints of which these feedforward weights are the equilibrium solution.

      (2) Additional controls for Figure 2 to test that it is connectivity rather than attractor dynamics (e.g. drawing weights from Gaussian or exponential distributions). The authors provide one additional control based on shuffling weights. However, this is far from exhaustive and it seems difficult on this basis to conclude that it is specifically the attractor dynamics that drive the emergence of coordinated grid firing.

      Again, we do not claim that this is the only way in which grid maps can be aligned, but it is the simplest one proposed so far. We were asked if it was the specific combination of input weights to a cell rather than the organization provided by the attractor which resulted in aligned maps. By shuffling the inputs to a cell we keep the combination of inputs invariant but lose the attractor architecture. Since grid maps in this new situation are not aligned, we can safely conclude that it is not the combination of inputs per se, but the specific organization of these inputs that allows grid alignment. It is not fully clear to us what ‘exhaustive’ means in this context.
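
      The logic of the shuffle control can be sketched as follows. This is a minimal illustration, assuming that shuffling means permuting each cell's incoming weights among its presynaptic partners; the triangular weight profile, the network size and all function names are hypothetical and are not taken from the manuscript.

```python
import random


def ring_weights(n, width):
    """Hypothetical 1D ring connectivity: w[i][j] decays with circular distance."""
    w = []
    for i in range(n):
        row = []
        for j in range(n):
            d = min(abs(i - j), n - abs(i - j))  # circular distance on the ring
            row.append(0.0 if i == j else max(0.0, 1.0 - d / width))
        w.append(row)
    return w


def shuffle_inputs(w, seed=0):
    """Permute each cell's incoming weights among the other cells.

    The set of input weights per cell is unchanged; only their assignment
    to presynaptic cells (the ring organization) is destroyed.
    """
    rng = random.Random(seed)
    shuffled = []
    for i, row in enumerate(w):
        others = [j for j in range(len(row)) if j != i]
        values = [row[j] for j in others]
        rng.shuffle(values)
        new_row = [0.0] * len(row)
        for j, v in zip(others, values):
            new_row[j] = v
        shuffled.append(new_row)
    return shuffled


def neighbor_weight_fraction(w, k=1):
    """Fraction of total weight that sits on ring neighbours within k steps."""
    n = len(w)
    near = sum(w[i][(i + s) % n] for i in range(n) for s in range(-k, k + 1) if s != 0)
    total = sum(sum(row) for row in w)
    return near / total


ring = ring_weights(40, 5)
shuffled = shuffle_inputs(ring)
```

      Comparing `neighbor_weight_fraction(ring)` with `neighbor_weight_fraction(shuffled)` shows the point of the control: each row keeps exactly the same weights, but they are no longer concentrated on ring neighbours.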

      (3) What happens if recurrent connections are turned off? The new data clearly show that the recurrent connections are not required for online grid firing, but this is not clear from the abstract and is hard to appreciate from the main text.

      This point is related to (1). Absent this constraint, Figure S3 shows that the system evolves toward larger spacing, with poorer gridness and no alignment.

      (4) This is addressed, although the legend to Fig. S2D could provide an explanation / definition for the y-axis values.

      We have now added: Mean input fields are the sum of all inputs of a given kind entering a neuron at a given moment in time, averaged across cells and time.

      (5) Given the 2D structure of the network input it perhaps isn't surprising that the network generates 2D representations and this may have little to do with its 1D connectivity. The finding that the networks maintain coordinated grids when recurrent connections are switched off supports my initial concern and the authors' explanation, to me at least, remains confusing. I think it would be helpful to consider that the connectivity is specifically important for establishing the coordinated grid firing, but that the online network does not require attractor dynamics to generate coordinated grid firing.

      This point is related to (1) and (3). We agree with the reviewer that the input lies within a 2D manifold, but this is not something that the network has to find out because it receives one datapoint of information at a time. This alone is not enough to form aligned grid cells, since each grid cell can find a roughly equivalent equilibrium in a different direction. It is only the constraint imposed by the recurrent collaterals that aligns grid maps, and, as we show, this constraint does not need to be constructed ad hoc to work on 2D, as previously thought. When recurrent connections are switched off, the system evolves toward unaligned grid maps, with larger spacing and lower gridness. Regarding the results obtained after modifying the network and turning off learning, we think they have a very limited scope (in this case showing the contractive effect of recurrent collaterals on grid spacing), given that the system is artificially being kept out of its natural equilibrium.

      (6) Clarity of the introduction. This is somewhat clearer, but I wonder if it would be hard for someone not familiar with the literature to accurately appreciate the key points.

      We have made our best effort to improve the clarity of the introduction.

      (7) Remapping. I'm not sure why this is ill posed. It seems the proposed model can not account for remapping results (e.g. Fyhn et al. 2007). Perhaps the authors could just clearly state this as a limitation of the model (or show that it can do this).

      We view our model as perfectly consistent with Fyhn et al, 2007. Remapping is not triggered by the network itself, though, but rather by a re-arrangement of the inputs requiring the network to learn new associations. Different simulations of the same model with identical parameters can be interpreted as remapping experiments.

      Reviewer #3 (Public Review):

      Summary:

      The paper proposes an alternative to the attractor hypothesis, as an explanation for the fact that grid cell population activity patterns (within a module) span a toroidal manifold. The proposal is based on a class of models that were extensively studied in the past, in which grid cells are driven by synaptic inputs from place cells in the hippocampus. The synapses are updated according to a Hebbian plasticity rule. Combined with an adaptation mechanism, this leads to patterning of the inputs from place cells to grid cells such that the spatial activity patterns are organized as an array of localized firing fields with hexagonal order. I refer to these models below as feedforward models.

      It has already been shown by Si, Kropff, and Treves in 2012 that recurrent connections between grid cells can lead to alignment of their spatial response patterns. This idea was revisited by Urdapilleta, Si, and Treves in 2017. Thus, it should already be clear that in such models, the population activity pattern spans a manifold with toroidal topology. The main new contributions in the present paper are (i) considering a form of recurrent connectivity that was not directly addressed before; (ii) applying topological analysis to simulations of the model; and (iii) interpreting the results as a potential explanation for the observations of Gardner et al.

      We wanted to note that we do not see this paper as proposing an alternative to the attractor hypothesis, given that we use attractor networks, but rather as an exploration of possibilities not yet visited by this hypothesis.

      Strengths:

      The exploration of learning in a feedforward model, when recurrent connectivity in the grid cell layer is structured in a ring topology, is interesting. The insight that this not only aligns the grid cells in a common direction but also creates a correspondence between their intrinsic coordinate (in terms of the ring-like recurrent connectivity) and their tuning on the torus is interesting as well, and the paper as a whole may influence future theoretical thinking on the mechanisms giving rise to the properties of grid cells.

      Weaknesses:

      (1) In Si, Kropff and Treves (2012) recurrent connectivity was dependent on the head direction tuning, in addition to the location on a 2d plane, and therefore involved a ring structure. Urdapilleta, Si, and Treves considered connectivity that depends on the distance on a 2d plane. The novelty here is that the initial connectivity is structured uniquely according to latent coordinates residing on a ring.

      The recurrent architectures in the cited works are complex and require arranging cells in a 2D manifold to calculate connectivity based on their relative 2D position. In other words, the 2D structure is imprinted in the architecture, as in our 2D condition. In this work the network is much simpler and only requires neighboring relations in 1D. Such relationships have been shown to spontaneously emerge in the hippocampal formation (Pastalkova et al, 2008; Gonzalo Cogno et al, 2024).

      (2) The paper refers to the initial connectivity within the grid cell layer as one that produces an attractor. However, it is not shown that this connectivity, on its own, indeed sustains persistent attractor states. Furthermore, it is not clear whether this is even necessary to obtain the results of the model. It seems possible that (possibly weaker) connections with ring topology, that do not produce attractor dynamics but induce correlations between neurons with similar locations on the ring would be sufficient to align the spatial response patterns during the learning of feedforward weights.

      Regarding the first part of the comment, the recurrent collaterals create one or at times multiple bumps of activity in the network so that neighboring (interconnected) cells activate together. An initial random state of activity rapidly falls into this dynamic, constrained by the attractor. To us this is not surprising given that this connectivity is the classical means of creating a continuous attractor. Perhaps there is some deeper meaning in this comment that we are not fully grasping.

      Regarding the second part of the comment, we fully agree with the reviewer. We are presenting what so far is the simplest connectivity that can align grid maps, but by no means do we claim that it is the simplest possible one. Regarding weaker connections with ring topology, we show in Figure S2 that a ring attractor with too weak or too strong connections is incapable of aligning grids, since a balance between feedforward and feedback inputs is required.

      (3) Given that all the grid cells are driven by an input from place cells that span a 2d manifold, and that the activity in the grid cell network settles on a steady state which is uniquely determined by the inputs, it is expected that the manifold of activity states in the grid cell layer, corresponding to inputs that locally span a 2d surface, would also locally span a 2d plane. The result is not surprising. My understanding is that this result is derived as a prerequisite for the topological analysis, and it is therefore quite technical.

      We understand that the reviewer is referring to the motivation behind studying local dimensionality. We agree that the topological analysis approach is quite technical, but it provides unique insights. The theorem of closed surfaces, which allows us to deduce a toroidal topology from Betti numbers (1,2,1), only applies to closed surfaces. One thus needs to show that the point cloud is a surface (local dimensionality of 2) and is closed (no borders or singularities). If borders or singularities were present, a toroidal topology could not be claimed from these Betti numbers. Thus, it is a crucial step of the analysis.
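
      The step from Betti numbers to a torus can be written out as below. This is a sketch of the standard classification argument for closed connected surfaces, not a derivation taken from the manuscript. It assumes homology is computed with coefficients in a field of characteristic other than 2; the coefficient field is not stated here, and with $\mathbb{Z}_2$ coefficients a Klein bottle would also give (1,2,1).

```latex
\begin{align*}
  % Euler characteristic from the observed Betti numbers (1, 2, 1)
  \chi &= b_0 - b_1 + b_2 = 1 - 2 + 1 = 0 \\
  % Closed, connected, orientable surface of genus g
  (b_0, b_1, b_2) &= (1,\, 2g,\, 1), \qquad \chi = 2 - 2g \\
  % Closed, connected, non-orientable surface with k cross-caps
  (b_0, b_1, b_2) &= (1,\, k - 1,\, 0), \qquad \chi = 2 - k \\
  % b_2 = 1 excludes the non-orientable family; b_1 = 2 then fixes the genus
  b_2 = 1,\; b_1 = 2 \;&\Longrightarrow\; g = 1 \quad \text{(torus)}
\end{align*}
```

      The argument only applies once the point cloud has been shown to be a closed surface, which is why the local dimensionality and the absence of borders or singularities are checked first.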

      (4) The modeling is all done in planar 2d environments, where the feedforward learning mechanism promotes the emergence of a hexagonal pattern in the single neuron tuning curve. Under the scenario in which grid cell responses are aligned (i.e. all neurons develop spatial patterns with the same spacing and orientation) it is already quite clear, even without any topological analysis that the emerging topology of the population activity is a torus.

      However, the toroidal topology of grid cells in reality has been observed by Gardner et al also in the wagon wheel environment, in sleep, and close to boundaries (whereas here the analysis is restricted to a sub-region of the environment, far away from the walls). There is substantial evidence based on pairwise correlations that it persists also in various other situations, in which the spatial response pattern is not a hexagonal firing pattern. It is not clear that the mechanism proposed in the present paper would generate toroidal topology of the population activity in more complex environments. In fact, it seems likely that it will not do so, and this is not explored in the manuscript.

      We agree that our work was constrained to exploration in 2D and that the situations posed by the reviewer are challenging, but we do not see them as unsurmountable. The wagon wheel shows a preservation of toroidal topology locally, where the behavior of the animal is rather 2-dimensional. Globally, hexagonal maps are lost, which is compatible with some flexibility in the way grid maps are formed. If sleep meant that all inputs are turned off, our model would predict a dynamic dictated by the architecture (1D for the ring attractor, for example), but we do not really know that this is the case. In the future, we intend to explore predictive activity along the linear attractor, which could both result in path integration and in some level of preservation of the activity when inputs are completely turned off.

      Regarding boundaries, as we have argued before, the cited work chooses to filter away what looks like more than half of the overall explained variance through PCA, and this is only before applying a non-linear dimensionality reduction algorithm. It is specifically shown that the analyzed components are the ones with global periodicity throughout the environment. Thus, it is conceivable that through this approach, local irregularities found only at the borders are disregarded in favor of a clearer global picture. While using a different methodology, our approach follows a similar spirit, albeit with far less noisy data.

      (5) Moreover, the recent work of Gardner et al. demonstrated much more than the preservation of the topology in the different environments and in sleep: the toroidal tuning curves of individual neurons remained the same in different environments. Previous works, that analyzed pairwise correlations under hippocampal inactivation and various other manipulations, also pointed towards the same conclusion. Thus, the same population activity patterns are expressed in many different conditions. In the present model, this preservation across environments is not expected. Moreover, the results of Figure 6 suggest that even across distinct rectangular environments, toroidal tuning curves will not be preserved, because there are multiple possible arrangements of the phases on the torus which emerge in different simulations.

      We agree with this observation. A symmetry in our implementation means that the system falls into the preferred solution only ~50% of the time, and the rest of the time it falls into other local minima. Whether this result is at odds with current observations can be debated on the basis of probabilities. However, we believe that the symmetry we found is purely circumstantial, and that it can be broken by elements such as head direction modulation or other ingredients used to achieve path integration. In other words, we acknowledge that symmetry is an issue of the implementation we show here (which has been kept as simple as possible to serve as a proof-of-principle) but we do not think that it is a defining feature of flexible attractors in general. We expect that future implementations that incorporate path integration capabilities will not present this kind of symmetry in the space of solutions.

      Regarding the rigid phase translation across modalities, while this effect is very clear in Gardner et al, it is less so in other datasets. The analyses shown in Hermansen et al (2024) can rather be interpreted as lying somewhere between perfect rigid translation and fully randomized phases across navigation modalities.

      (6) In real grid cells, there is a dense and fairly uniform representation of all phases (see the toroidal tuning of grid cells measured by Gardner et al). Thus, the highly clustered phases obtained in the model (Fig. S1) seem incompatible with the experimental reality. I suspect that this may be related to the difficulty in identifying the topology of a torus in persistent homology analysis based on the transpose of the matrix M.

      We partly agree with this observation and note that a pattern of ordered phases is an issue not only for the 1D attractor but also for the 2D one, which appears much more uniform than in experimental data. The low number of neurons we used for computational economy and the full connectivity could be key ingredients to generate these phase patterns. To show that this is not a defining feature of flexible attractors, apart from the fact that these patterns appear also with non-flexible 2D architectures, we included in Figure S1 simulations with ‘fragmented 1D’ architectures. In this case the architecture is a superposition of 20 random 1D stripe-like attractors. While the alignment of maps achieved with this architecture is almost at the same level as the one obtained with 1D and 2D attractors, the phases are much more similar to what has been observed experimentally, and less uniform than what is obtained with 2D attractors.

      (7) The motivations stated in the introduction came across to me as weak. As now acknowledged in the manuscript, attractor models can be fully compatible with distortions of the hexagonal spatial response patterns - they become incompatible with these spatial distortions only if one adopts a highly naive and implausible hypothesis that the attractor state is updated only by path integration. While attractor models are compatible with distortions of the spatial response pattern, it is very difficult to explain why the population activity patterns are tightly preserved across multiple conditions without a rigid two-dimensional attractor structure. This strong prediction of attractor models withstood many experimental tests - in fact, I am not aware of any data set where substantial distortions of the toroidal activity manifold were observed, despite many attempts to challenge the model. This is the main motivation for attractor models. The present model does not explain these features, yet it also does not directly offer an explanation for distortions in the spatial response pattern.

      Some interesting examples are experiments in 3D, where grid cells presumably communicate with each other through the same recurrent collaterals, but global periodicity is lost and only some local order is preserved even away from boundaries (Ginosar et al, 2021; Grieves et al, 2021). While these datasets have not been explored using topological analysis, they serve as strong motivators to understanding 2D grid cells as one equilibrium solution that arises under some set of constraints, but belongs to a wider space of possible solutions that may arise as well under more flexible constraints. Even (and especially) if one adheres to the hypothesis that grid cells are pre-wired into a 2D torus, a concept like flexible attractors might become useful to understand how their activity is rendered in 3D. Another strong motivation is our lack of understanding of how a perfectly balanced 2D structure is formed and maintained. Simpler architectures could be thought of as alternatives, but also as an intermediate step towards it.

      Regarding the rigid phase translation across modalities, while this effect is very clear in Gardner et al, it is less so in other datasets. The analyses shown in Hermansen et al (2024) can rather be interpreted as lying somewhere between perfect rigid translation and fully randomized phases.

      On a separate point, although it might not be strictly related to the comment, we do not fully share the idea that persistent activity patterns during sleep are necessary or sufficient conditions for attractor dynamics, although we do agree that attractors could be the mechanism behind them and any alternative is at least as complex as attractors. On the necessity side, attractors in the hippocampus are not constantly engaged (Wills et al, 2005). For sufficiency, one should prove that no other network is capable of reproducing the phenomenon, and to the best of our knowledge we are still far from that point.

      (8) There is also some weakness in the mathematical description of the dynamics. Mathematical equations are formulated in discrete time steps, without a clear interpretation in terms of biophysically relevant time scales. It appears that there are no terms in the dynamics associated with an intrinsic time scale of the neurons or the synapses (a leak time constant and/or synaptic time constants). I generally favor simple models without lots of complexity, yet within this style of modelling, the formulation adopted in this manuscript is unconventional, introducing a difficulty in interpreting synaptic weights as being weak or strong, and a difficulty in interpreting the model in the context of other studies.

      We chose to keep the model as simple as possible and in line with the previous publications that developed it. However, we see the usefulness of putting it in what in the meantime has become a canonical framework. Fortunately this has been done by D’Albis and Kempter (2017). In our simplified version of the model there is no leak term and adaptation on its own brings down activity in the absence of input, but we agree that such a term could be added, albeit not without modifying all other network parameters.

      In my view, the weaknesses discussed above limit the ability of the model, as it stands, to offer a compelling explanation for the toroidal topology of grid cell population activity patterns, and especially the rigidity of the manifold across environments and behavioral states. Still, the work offers an interesting way of thinking on how the toroidal topology might emerge.

      Reviewer 1:

      Reviewer #1 (Recommendations For The Authors):

      See comments above. In addition:

      (1) Abstract: '...interconnected by a two-dimensional attractor guided by path integration'. This is unclear. I think the intended meaning might be along the lines of '...their being computed by a 2D continuous attractor that performs path integration'?

      'path integration allowing for no deviations from the hexagonal pattern' This is incorrect. Local modulation of the gain of the speed input to a standard CAN would distort the grid pattern.

      'Using topological data analysis, we show that the resulting population activity is a sample of a torus' Activity in the model?

      'More generally, our results represent a proof of principle against the intuition that the architecture and the representation manifold of an attractor are topological objects of the same dimensionality, with implications to the study of attractor networks across the brain' I guess one might hold this intuition, but it strikes me as obvious that if you impose a sufficiently strong n-dimensional input on a network then its activity could have the same dimensionality. I don't really see this as being a point worth highlighting. Perhaps the more interesting point is that during learning the recurrent connectivity aligns the grid fields of neurons in the network, and this may be a specific function of the 1D attractor dynamics, although I don't think the authors have made this point convincingly.

      'The flexibility of this low dimensional attractor allows it to negotiate the geometry of the representation manifold with the feedforward inputs'. See above for comments on the use of 'negotiate'.

      'while the ensemble of maps preserves features of the network architecture'. I don't understand this. What is the 'ensemble of maps' and what are the features referred to.

      We have reviewed the abstract considering these points. Regarding the ‘strong n-dimensional input’, we want to point out that it is not the input itself that generates a torus (the no attractor condition does not lead to a torus) but rather the interplay between the input and the attractor.

      ‘Perhaps the more interesting point …’, we do not fully understand how this sentence deviates from our own conclusions. We here show that a strong n-dimensional input is not enough to align grid cells (produce a n-torus), it is the interplay between inputs and attractor dynamics that does so, even if the attractor is not n-dimensional in terms of architecture.

      The ensemble of maps refers to the transpose of the population activity matrix, where each point in the cloud is a map, and the features refer to the persistent homology.

      (2) The manuscript still fails to clarify the difference between a model that path integrates in two dimensions and a model that simply represents information with a given dimensionality. The argument that it's surprising that a network with 1D architecture represents a higher dimensional input strikes me as incorrect and an unnecessary attempt to argue for conceptual importance. At least to me this isn't surprising. It would be surprising if the 1D network could path integrate but this doesn't seem to be the case.

      In response to the reviewer’s concerns, we have made clear in the introduction and discussion that this model has no path integration capabilities, although we aim to develop a model capable of path integration using the kind of simple architecture presented here. We want to highlight here that equating attractor dynamics with path integration would be a conceptual mistake.

      (3) Other wording also seems to make unnecessary conceptual claims. E.g. The repeated use of 'negotiate' implies some degree of intelligence, or at least an exchange of information, that isn't shown to exist. I wonder if more precise language could be used? As I understand it the dimensionality is bounded by the inputs on the one hand, and the network connectivity on the other, with the actual dimensionality being a function of the recurrent and feedforward synaptic weights. There's clearly some role for the relative weights and the properties of plasticity rules, but I don't see any evidence for a negotiation.

      An interesting observation in Figure S2 is that grid maps are aligned only if the relative strength of feedforward and recurrent inputs is similar. If one of them can impose over the other, grid maps do not align. This equilibrium can metaphorically be thought of as a negotiation instance, where the negotiation is an emergent property of the system rather than something happening at an individual synapse.


      The following is the authors’ response to the original reviews.

      Reviewer #1:

      Reviewer #1 (Recommendations For The Authors):

      Major

      (1) What is the evidence that, after training, the 1D network maintains its attractor dynamics when feedforward inputs are active? If the claim is that it does then it's important to provide evidence, e.g. responses to perturbations, or other tests. The alternative is that after training the recurrent inputs are drowned out by the feed forward spatial inputs.

      We agree with the reviewer on the importance of this point. In our model, networks are always learning, and the population activity represented by aligned grid maps in a trained network is a dynamic equilibrium that emerges from the interplay between feedforward and collateral constraints. If Hebbian learning is turned off, one gets a snapshot of the network at that moment. We now show in Fig. S3 that in a trained network without feedforward Hebbian learning the removal of recurrent collaterals results in a slight increase in gridness and spacing. The expansion is due to the fact that, as we argue in the Results section, the attractor has a contractive effect on grid maps, which could relate to observations in novel environments (Barry et al, 2007). If Hebbian learning is turned on in the same situation, the maps, no longer constrained by the attractor, drift toward the equilibrium solution of the ‘No attractor’ condition, with significantly larger spacing, no alignment and lower individual gridness. Thus, the attractor is the force preventing them from doing so when feedforward Hebbian learning is on.

      These observations point to the key role played by the attractor not only in forming but also in sustaining grid activity. The dynamic equilibrium framework fits well-known properties of the system, such as its capacity to recalibrate very fast (Jayakumar et al, 2019), although this particular feature cannot be modeled with the current version of our model, which lacks path integration capabilities.

      (2) It would be useful to include additional control conditions for Figure 2 to test the hypothesis that it is simply connectivity, rather than attractor dynamics, that drives alignment.

      This could be achieved by randomly assigning strengths to the recurrent connections, e.g. drawing from exponential or Gaussian distributions.

      We agree and have included Fig. S2b-d, showing that the same distribution of collateral input weights entering each neuron, but lacking the 1D structure provided by the attractor, does not align grid maps. This is achieved by shuffling rows in the connectivity matrix, while avoiding self connections to make the comparison fair (self connections substantially alter the dynamics of the network, making it much more rigid). We observed that individual grid maps have very low gridness levels, even lower than in the no-attractor condition. In contrast, they have levels of population gridness slightly higher than in the no-attractor condition, but closer to 0 than to levels achieved with attractors. Our interpretation of these results is that irregular connectivity achieves some alignment in a few arbitrary directions and/or locations, which improves the coordination between maps at the expense of impairing rather than improving hexagonal responses of individual cells. Such observations stand in clear contrast to what is observed with continuous attractors with an orderly architecture.

      These results suggest that it is the structure of the attractor that allows grid cells to be aligned rather than the mere presence of recurrent collateral connections.
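
      The shuffling control described in this response can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for Fig. S2b-d: the convention that row i of W holds the incoming weights of neuron i, the Gaussian ring profile, the network size and the function name shuffle_incoming_weights are all hypothetical placeholders.

```python
import numpy as np

def shuffle_incoming_weights(W, rng):
    """Permute each neuron's incoming weights while keeping the diagonal fixed.

    Assumes W[i, j] is the weight from neuron j onto neuron i (hypothetical
    convention), so each row keeps its weight distribution but loses its order.
    """
    n = W.shape[0]
    W_shuffled = W.copy()
    for i in range(n):
        off_diag = np.delete(np.arange(n), i)  # all presynaptic indices except i
        W_shuffled[i, off_diag] = rng.permutation(W[i, off_diag])
    return W_shuffled

# Placeholder 1D ring connectivity (illustrative profile, not the model's).
n = 100
idx = np.arange(n)
dist = np.abs(idx[:, None] - idx[None, :])
dist = np.minimum(dist, n - dist)              # distance along the ring
W = np.exp(-dist**2 / (2 * 5.0**2))
np.fill_diagonal(W, 0.0)                       # no self-connections

rng = np.random.default_rng(0)
W_shuf = shuffle_incoming_weights(W, rng)
```

      Each neuron keeps exactly the same set of incoming weights, so any change in alignment after training can be attributed to the loss of the ordered 1D structure rather than to a change in total recurrent drive.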

      (3) It seems conceivable that once trained the recurrent connections would no longer be required for alignment. Can this be evaluated by considering what happens if the recurrent connections are turned off after training (or slowly turned off during training)? Does the network continue to generate aligned grid fields?

      This point has elements in common with point 1. As we argued in that response, the attractor has two main effects on grid maps: it aligns them and it contracts them. If the attractor is turned off, feedforward Hebbian learning progressively drives maps toward the solution obtained for the ‘no attractor’ condition, characterized by maps with larger spacing, poorer gridness and lack of alignment.

      (4) After training what is the relative strength of the recurrent and feedforward inputs to each neuron?

      Both recurrent and feedforward synaptic-strength matrices are normalized throughout training, so that the overall incoming synaptic strength to each neuron is invariant. Because of this, although individual feed-forward and recurrent input fields vary dynamically, their average is constant, with the exception of the very first instances of the simulation, before a stable regime is reached in grid-cell activity levels. We have included Fig. S2d, showing the dynamics of feedforward and recurrent mean fields throughout learning as well as their ratio. In addition, Fig. S2a shows that the strength of recurrent relative to feedforward inputs is an important parameter, since alignment is only obtained in an intermediate range of ratios.
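
      As an illustration of the normalization and of the recurrent-to-feedforward ratio mentioned above, the following sketch rescales each neuron's incoming weights after a generic Hebbian-like step and then compares mean input fields. Every specific here is an assumption made for illustration: the norm (a plain sum over non-negative weights), the layer sizes, the learning step, the target totals and the random activity vectors are placeholders, and the model's actual learning rule is not reproduced.

```python
import numpy as np

def normalize_incoming(W, total=1.0):
    """Rescale each row so every neuron's summed incoming weight equals `total`."""
    return total * W / W.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
n_grid, n_place = 100, 200                      # hypothetical layer sizes

W_ff = normalize_incoming(rng.random((n_grid, n_place)))
W_rec = rng.random((n_grid, n_grid))
np.fill_diagonal(W_rec, 0.0)                    # no self-connections
W_rec = normalize_incoming(W_rec, total=0.5)    # hypothetical relative strength

place_activity = rng.random(n_place)            # placeholder input activity
grid_activity = rng.random(n_grid)              # placeholder grid-layer activity

# Generic Hebbian-like step (illustrative only), followed by renormalization.
W_ff = normalize_incoming(W_ff + 0.01 * np.outer(grid_activity, place_activity))

h_ff = W_ff @ place_activity                    # feedforward input field per neuron
h_rec = W_rec @ grid_activity                   # recurrent input field per neuron
ratio = h_rec.mean() / h_ff.mean()              # recurrent-to-feedforward ratio
```

      Because the rows are renormalized after every update, individual weights can change freely while the total incoming strength per neuron stays fixed, which is why the ratio of the two totals acts as a single control parameter.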

      (5) It would be helpful to also evaluate the low dimensional structure of the input to the network. Assuming it has a 2D structure, as it represents 2D space, can an explanation be provided for why it is surprising that the trained network also encodes activity with a 2D manifold? It strikes me that the more interesting finding might relate to alignment of the grids rather than claims about a 1D attractor encoding a 2D representation. Either way, stronger evidence and clearer discussion would be helpful.

      The reviewer is correct in assuming that the input has a 2D structure, that can be represented by a sheet embedded in a high dimensional space and thus has the Betti numbers [1,0,0]. The surprising element in our results is that we are showing for the first time that the population activity of an attractor network is constrained to a manifold that results from the negotiation between the architecture of the attractor and the inputs, and does not merely reflect the former as previously assumed. In this sense, the alignment of grid cells by a 1D attractor is an instance of the more general case that 1D attractors can encode 2D representations.

      It is certainly the case that the 2D input is a strong constraint pushing population activity toward a 2D manifold. However, the final form of the 2D manifold is strongly constrained by the attractor, as shown by the contrast with the no-attractor condition (a 2D sheet, as in the input, vs a torus when the attractor is present). The 1D attractor is able to flexibly adapt to the constraint posed by the inputs while doing its job (as demonstrated in previous points), which results in 2D grid maps aligned by a 1D attractor. Generally speaking, this work provides a proof of principle demonstrating that the topology of the attractor architecture and the manifold of the population activity space need not be identical, as previously widely assumed by the attractor community, and need not even have the same dimensionality. Instead, a single architecture can potentially be applied to many purposes. Hence, our work provides a valuable new perspective that applies to the study of attractors throughout the brain.

      (6) The introduction should be clearer about the different types of grid model and the computations they implement. E.g. The authors' previous model generates grid fields from spatial inputs, but if my understanding is correct it isn't able to path integrate. By contrast, while the many 2D models with continuous attractor dynamics also generate grid representations, they do so by path integration mechanisms that are computationally distinct from the spatial transformation implemented by feedforward models (see also general comments above).

      We agree with the reviewer and have made this point explicit in the introduction.

      (7) A prediction from continuous attractor models is that when place cells remap the low dimensional manifold of the grid activity is unaffected, except that the location of the activity bump is moved. It strikes me as important to test whether this is the case for the model presented here (my intuition is that it won't be, but it would be important to establish either way).

      We want to emphasize that our model is a continuous attractor model, so the question regarding the difference between what our model and continuous attractor network models predict is an ill-posed one. One of our main conclusions is precisely that attractors can work in a wider spectrum of ways than previously thought.

      For lack of a better definition, our multiple simulations could be thought of as training in different arenas. It is true that in our model maps take time to form, but this is also the case in novel environments (Barry et al, 2007), and continuous attractor models exclusively or strongly guided by self motion cues struggle to replicate this phenomenon. We show that the current version of our model accepts multiple solutions (in practice four but conceptually countably infinite), all of them resulting in a torus for the population activity (i.e. the same topology or low dimensional manifold). It is not clear to us how easy it would be to differentiate between most of these solutions in experimental data, with only incomplete information. This said, incorporating a symmetry-breaking ingredient to the model, for example related to head direction modulation, could perhaps lead to the prevalence of a single type of solution. We intend to explore this possibility in the future in order to add path-integration capabilities to the system, as described in the discussion.

      (8) The Discussion implies that 1D networks could perform path integration in a manner similar to 2D networks. This is a strong claim but isn't supported by evidence in the study. I suggest either providing evidence that this is the case for models of this kind or replacing it with a more careful discussion of the issue.

      The current version of our model has no path integration capabilities, as is now made explicit in the Introduction and Discussion. In addition, we have now made clear that the idea that path integration could perhaps be implemented using 1D networks is, although reasonable, purely speculative.

      Minor

      (1) Introduction. 'direct excitatory communication between them'. Suggest rewording to 'local synaptic interactions', as communication can also be purely inhibitory (e.g. Burak and Fiete, 2009) or indirect by excitation of local interneurons (e.g. Pastoll et al., Neuron, 2013).

      We agree and have adopted this phrasing.

      (2) The decision to focus the topology analysis on the 60 cm wide central square appears somewhat arbitrary. Are the irregularities referred to a property of the trained networks or would they also emerge with analysis of simulated ideal data? Can more justification be expanded and supplementary analyses be shown when the whole arena is used?

      In practical terms, a subsampling of the data to around half was needed because the persistent homology packages struggle to handle large amounts of data, especially in the calculation of H2. We decided to cut a portion of contiguous pixels in the open field at least larger than the hexagonal tile representing the whole grid population period (as represented in Figure 6). Leaving the borders aside was a logical choice since it is known that the solution at the borders is particularly influenced by the speed anisotropy of the virtual rat (see Si, Kropff & Treves, 2012), in a way that mimics how borders locally influence grid maps in actual rats (Krupic et al, 2015). The specific way in which our virtual rat handles borders is arbitrary and might not generalize. A second issue around borders is that maps are differently affected by incomplete smoothing, although this issue does not apply to our data because we did not smooth across neighboring pixels. In sum, considering the central 60 cm wide square was sufficient to contain the whole torus and a reasonable compromise that would allow us to perform all analyses in the part of the environment less influenced by boundaries.
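
      The restriction to the central square can be sketched as below. This is a hedged illustration: the array layout (neurons × y pixels × x pixels), the 100 cm arena, the 5 cm pixel size and the function name central_point_cloud are assumptions introduced for the example, not values taken from the manuscript.

```python
import numpy as np

def central_point_cloud(rate_maps, pixel_cm, side_cm=60.0):
    """Crop the central square of every map and return one point per pixel.

    rate_maps: array of shape (n_neurons, n_y, n_x), one spatial map per neuron.
    Returns an array of shape (n_pixels, n_neurons): each row is the population
    activity vector at one spatial position.
    """
    n_neurons, n_y, n_x = rate_maps.shape
    half = int(round(side_cm / pixel_cm / 2))   # half-width of the crop, in pixels
    cy, cx = n_y // 2, n_x // 2
    crop = rate_maps[:, cy - half:cy + half, cx - half:cx + half]
    return crop.reshape(n_neurons, -1).T

rng = np.random.default_rng(2)
rate_maps = rng.random((100, 20, 20))           # placeholder: 100 cm arena, 5 cm pixels
cloud = central_point_cloud(rate_maps, pixel_cm=5.0)
```

      The rows of cloud are the population vectors at each retained position; the transpose, cloud.T, gives instead one point per neuron, that is, one point per map.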

      (3) It could help the general reader to briefly explain what a persistence diagram is.

      This is developed in the Appendix, but we have now added a reference to it and a brief description in the main text.

      (4) For the analyses in Figure 3-4, and separately for Figure 5, it might help the reader to provide visualizations of the low dimensional point cloud.

      All these calculations take place in the original high-dimensional point cloud. Doing them in a reduced space would be incorrect because there is no dimensionality reduction technique that guarantees the preservation of topology. In Figure 7 we reduce the dimensionality of data but emphasize that it is only done for visualization purposes, not to characterize topology. We also point out in this Figure that the same non-linear dimensionality reduction technique applied to objects with identical topology yields a wide variety of visualizations, some of them clear and some less clear. This observation further exemplifies why one cannot assume that a dimensionality-reduction technique preserves topology, even for a low-dimensional object embedded in a high-dimensional space.

      (5) The detailed comparison of the dynamics of each model is limited by the number of data points. Why not address this by new simulations with more neurons?

      We are not sure we understand this comment. In Figure 2, the dynamics for each model are markedly different. These are averages over 100 simulations. We are not sure what benefit would be obtained from adding more neurons. Before starting this work we searched for the minimal number of neurons that would result in convergence to an aligned solution in 2D networks, which we found to be around 100. Optimizing this parameter in advance was important to reduce computational costs throughout our work.

      (6) Could the variability in Figure 7 also be addressed by increasing the number of data points?

      As we argued in a previous point, there is no reason to expect preservation of topology after applying Isomap. We believe this lack of topology preservation to be the main driver of variability.

      (7) Page/line numbers would be useful.

      We agree. However, the text is curated by bioRxiv which, to the best of our knowledge, does not include them.

      Reviewer 2:

      Reviewer #2 (Recommendations For The Authors):

      (1) I highly suggest that the author rewrite some parts of the Results. There are lots of details which should be put into the Methods part, for example, the implementation details of the network, the analysis details of the toroidal topology, etc. It will be better to focus on the results part first in each section, and then introduce some of the key details of achieving these results, to improve the readability of the work.

      This suggestion contrasts with that of Reviewer #1. As a compromise, we decided to include in the Results section only methodological details that are key to understanding the conclusions, and describe everything else in the Methods section.

      (2) 'Progressive increase in gridness and decrease in spacing across days have been observed in animals familiarizing with a novel environment...' From Fig.2c I didn't see much decrease. The authors may need to carry out some statistical test to prove this. Moreover, even if the changes are significant, this might not be the consequence of the excitatory collateral constraint. To prove this, the authors may need to offer some direct evidence.

      We agree that the decrease is not evident in this figure due to the scale, so we are adding the correlation in the figure caption as proof. In addition, several arguments, some related to new analyses, demonstrate that the attractor contracts grid maps. First, the ‘no attractor’ condition has a markedly larger spacing compared to all other conditions (Fig. 2a). We also now show that spacing monotonically decreases with the strength of recurrent relative to feedforward weights, in a way that is rather independent of gridness (Fig. S2a). Second, as we now show in Fig. S2b-d, simulations with a shuffled 1D attractor, such that the sum of input synapses to each neuron is the same as in the 1D condition but no structure is present, lead to a spacing that is mid-way between the ‘no attractor’ condition and the conditions with attractors. Third, as we now show in Fig. S3a, turning off both recurrent connections and feedforward learning in a trained network results in a small increase in spacing. Fourth, as we now show in Fig. S3b, turning off recurrent connections while feedforward learning is kept on increases grid spacing to levels comparable to those of the ‘no attractor’ condition. All these elements support a role of the attractor in contracting grid spacing.

      (3) Some of the items need to be introduced first before going into details in the paper, for instance, the stripe-like attractor network, the Betti number, etc.

      We have added in the Results section a brief description and references to full developments in the Appendix.

      Reviewer 3 (Public Review):

      (1) It is not clear to me that the proposal here is fundamentally new. In Si, Kropff and Treves (2012) recurrent connectivity was dependent on the head direction tuning and thus had a ring structure. Urdapilleta, Si, and Treves considered connectivity that depends on the distance on a 2d plane.

      In the work of Si et al connectivity is constructed ad-hoc for conjunctive cells to represent a torus, it depends on head-directionality but also on the distance in a 2D plane. The topology of this architecture has not been assessed, but it is close to the typical 2D ‘rigid’ constraint. In the work of Urdapilleta et al, the network is a simple 2D one. The difference with our work is that we focus on the topology of the recurrent network and do not use head-direction modulation. In this context, we prove that a 1D network is enough to align grid cells and, more generally, we provide a proof of principle that the topology of the architecture and the representation space of an attractor network do not need to be identical, as previously assumed by the attractor community. These two important points were neither argued, speculated nor self-evident from the cited works.

      (2) The paper refers to the connectivity within the grid cell layer as an attractor. However, would this connectivity, on its own, indeed sustain persistent attractor states? This is not examined in the paper. Furthermore, is this even necessary to obtain the results in the model? Perhaps weak connections that do not produce an attractor would be sufficient to align the spatial response patterns during the learning of feedforward weights, and reproduce the results? In general, there is no exploration of how the strength of collateral interactions affects the outcome.

      The reviewer makes several important points. Local excitation combined with global inhibition is the archetypical architecture for continuous attractors (see for example Knierim and Zhang, Annual review of neuroscience, 2012). Thus, in the absence of feedforward input, we observe a bump of activity. As in all continuous attractors, this bump is not necessarily ‘persistent’ and instead is free to move along the attractor.

      We cannot prove that there is not a simpler architecture that has the same effect as our 1D or 1DL conditions, and we think that there are some interesting candidates to investigate in the future. What we now prove in new Fig. S2b-d is that it is not the strength of recurrent connections themselves, but instead the continuous attractor structure that aligns grid cells in our model. To demonstrate this, we shuffle incoming recurrent connections to each neuron in the 1D condition (while avoiding self-connections for fairness), and show that training does not lead to grid alignment. We also show in Fig. S1 that an architecture represented by 20 overlapping 1DL attractors, each formed by concatenating 10 random cells, aligns grid cells to levels slightly lower but similar to the 1D or 1DL attractors. This architecture can perhaps be considered as simpler to build in biological terms than all the others, but it is still constituted by continuous attractors.

      The strength of recurrent collaterals, or more precisely the recurrent to feedforward ratio, is crucial in our model to achieve a negotiated outcome from constraints imposed by the attractor and the inputs. We now show explicit measures of this ratio in Fig. S2, as well as examples showing that an imbalance in this ratio impairs grid alignment. When the ratio is too high or too low, both individual and population gridness are low. Interestingly, grid spacing behaves differently, decreasing monotonically with the relative strength of recurrent connections.

      (3) I did not understand what is learned from the local topology analysis. Given that all the grid cells are driven by an input from place cells that spans a 2d manifold, and that the activity in the grid cell network settles on a steady state that depends only on the inputs, isn't it quite obvious that the manifold of activity in the grid cell layer would have, locally, a 2d structure?

      The dimensionality of the input is important, although not the only determinant of the topology of the activity. The recurrent collaterals are the other determinant, and their architecture is a crucial feature. For example, as we now show in Figure S2b-d, shuffled recurrent synaptic weights fail to align grid cells. In the 1D condition, if feedforward inputs were absent, the dynamics of the activity would be confined to a ring. The opposite condition is our ‘no attractor’ condition, in which activity in the grid cell layer mimics the topology of inputs, a 2D sheet (and not a torus). It is in the intermediate range, when both feedforward and recurrent inputs are important, that a negotiated solution (a torus) is achieved.

      The analyses of local dimensionality and local homology of Figure 3 are crucial steps to demonstrate toroidal topology. According to the theorem of classification of closed surfaces, global homology is not enough to univocally define the topology of a point cloud, and thus this step cannot be skipped. The step is aimed to prove that the point cloud is indeed a closed surface.

      (4) The modeling is all done in planar 2d environments, where the feedforward learning mechanism promotes the emergence of a hexagonal pattern in the single neuron tuning curve. This, combined with the fact that all neurons develop spatial patterns with the same spacing and orientation, implies even without any topological analysis that the emerging topology of the population activity is a torus.

      We cannot agree with this intuition. In the ‘no attractor’ condition, individual maps have hexagonal symmetry with standardized spacing, but given the lack of alignment the population activity is not a closed surface and thus not a torus. It can rather be described as a 2D sheet embedded in a high dimensional space, a description that also applies to the input space.

      While it is rather evident that an ad hoc toroidal architecture folds this 2D population activity into a torus, it is less evident and rather surprising that 1D architectures have the same capability. This is the main novelty in our work.

      (5) Moreover, the recent work of Gardner et al. demonstrated much more than the preservation of the topology in the different environments and in sleep: the toroidal tuning curves of individual neurons remained the same in different environments. Previous works, that analyzed pairwise correlations under hippocampal inactivation and various other manipulations, also pointed towards the same conclusion. Thus, the same population activity patterns are expressed in many different conditions. In the present model, the results of Figure 6 suggest that even across distinct rectangular environments, toroidal tuning curves will not be preserved, because there are multiple possible arrangements of the phases on the torus which emerge in different simulations.

      We agree with the reviewer on the main point, although the recently found ring activity in the absence of sensory feedback (Gonzalo Cogno et al, 2023) suggests that what is happening in the EC is more nuanced than a pre-wired torus. Solutions in Figure 6 are different ways of folding a 1D strip into a torus, with or without the condition of periodicity in the 1D strip. Whether or not these different solutions would be discernible from one another in a practical setup is not clear to us. For example, global homology, as addressed in the Gardner paper, is the same for all these solutions. Furthermore, while our solutions of up to order 3 are highly discernible, higher order solutions, potentially achievable with other network parameters, would be impossible to discern by eye in representations similar to the ones in Figure 6. In addition, while we chose to keep our model in the simplest possible form as a clear proof of principle, new elements introduced to the model such as head directionality could break the symmetry and lead to the prevalence of one preferred solution for all simulation replicates. We plan to investigate this possibility in the future when attempting to incorporate path-integration capabilities into the model.

      (6) In real grid cells, there is a dense and fairly uniform representation of all phases (see the toroidal tuning of grid cells measured by Gardner et al). Here the distribution of phases is not shown, but Figure 7 suggests that phases are non uniformly represented, with significant clustering around a few discrete phases. This, I believe, is also the origin for the difficulty in identifying the toroidal topology based on the transpose of the matrix M: vectors representing the spatial response patterns of individual neurons are localized near the clusters, and there are only a few of them that represent other phases. Therefore, there is no dense coverage of the toroidal manifold that would exist if all phases were represented equally. This is not just a technical issue, however: there appears to be a mismatch between the results of the model and the experimental reality, in terms of the phase coverage.

      As mentioned in the results section, Figure 7 is meant for visualization purposes only, and serves more as a cautionary tale regarding the unpredictable risks of non-linear dimensionality reduction than as a proof of the organization of activity in the network. Isomap is a non-linear transformation that deforms each of our solutions in a unique way so that, while all have the topology of a torus embedded in a high dimensional space, only a few of them exhibited one of two possible toroidal visualizations in a 3D Isomap reduction. Isomap, like all other popular dimensionality reduction techniques, provides no guarantee of topology invariance. A better argument to judge the homogeneous distribution of phases is persistent homology, which identifies relatively large holes (compared to the sampling spacing) in the original manifold embedded in a high dimensional space. In our case, persistent homology identified only two holes significantly larger than noise (the two cycles of a torus) and one cavity in all conditions that included attractors. Regarding the specific distribution of phases in different conditions, however, see our reply below.

      (7) The manuscript makes several strong claims that incorrectly represent the relation between experimental data and attractor models, on one hand, and the present model on the other hand. For the latter, see the comments above. For the former, I provide a detailed list in the recommendations to the authors, but in short: the paper claims that attractor models induce rigidness in the neural activity which is incompatible with distortions seen in the spatial response patterns of grid cells. However, this claim seems to confuse distortions in the spatial response pattern, which are fully compatible with the attractor model, with distortions in the population activity patterns, which would be incompatible with the attractor model. The attractor model has withstood numerous tests showing that the population activity manifold is rigidly preserved across conditions - a strong prediction (which is not made, as far as I can see, by feedforward models). I am not aware of any data set where distortions of the population activity manifold have been identified, and the preservation has been demonstrated in many examples where the spatial response pattern is disrupted. This is the main point of two papers cited in the present manuscript: by Yoon et al, and Gardner et al.

      First of all, we would like to note that our model is a continuous attractor model. Different attractor models have different outcomes, and one of the main conclusions of our manuscript is that attractors can do a wider range of operations than previously thought.

      We agree with the reviewer that distortions in spatial activity (which speak against a purely path-integration guided attractor) should not be confused with distortions in the topology of the population activity (which would instead speak against the attractor dynamics itself). We have rephrased these observations in the manuscript. In fact, we believe that the capacity of grid cells to present distorted maps without a distortion of the population activity topology, as shown for example by Gardner and colleagues, could result from a tension between feedforward and recurrent inputs, the potential equilibriums of which our manuscript aims to characterize.

      (8) There is also some weakness in the mathematical description of the dynamics. Mathematical equations are formulated in discrete time steps, without a clear interpretation in terms of biophysically relevant time scales. It appears that there are no terms in the dynamics associated with an intrinsic time scale of the neurons or the synapses, and this introduces a difficulty in interpreting synaptic weights as being weak or strong. As mentioned above, the nature of the recurrent dynamics within the grid cell network (whether it exhibits continuous attractor behavior) is not sufficiently clear.

      We agree with the reviewer that our model is rather simple, and we value the extent to which this simplicity allows for a deep characterization. All models are simplifications and the best model in any given setup is the one with the minimum amount of complexity necessary to describe the phenomenon under study. We believe that to understand whether or not a 1D continuous attractor architecture can result in a toroidal population activity, a biophysically detailed model, with prohibitive computational costs, would have been unnecessarily complex. This argument is not intended to discredit biophysically detailed models, which are capable of addressing a wider range of questions regarding, for example, the spiking dynamics of grid cells, which cannot be addressed by our simple model.

      Reviewer #3 (Recommendations For The Authors):

      The work points to an interesting scenario for the emergence of toroidal topology, but the interpretation of this idea should be more nuanced. I recommend reconsidering the claims about limitations of the attractor theory, and acknowledging the limitations of the present theory.

      I don't see the limitations mentioned above as a reason to reject the ideas proposed in this manuscript, for two main reasons: first, additional research might reveal a regime of parameters where some issues can be resolved (e.g. the clustering of phases). In addition, the mechanism described here might act at an early stage in development to set up initial dynamics along a toroidal manifold, while other mechanisms might be responsible for the rigidity of the toroidal manifold in an adult animal. But all this implies that the novelty in the present manuscript is weaker than implied, the ability to explain experimental observations is more limited than implied, and these limitations should be acknowledged and discussed.

      I recommend reporting on the distribution of grid cell phases and, if indeed clustered, this should be discussed. It will be helpful to explore whether this is the reason for the difficulty in identifying the toroidal topology based on the collection of spatial response patterns (using the transpose of the matrix M).

      Ideally, a more complete work would also explore in a more systematic and parametric way the influence of the recurrent connectivity's strength on the learning, and whether a toroidal manifold emerges also in non-planar, such as the wagon-wheel environment studied in Gardner et al.

      Part of these recommendations have been addressed in the previous points (public review). Regarding the reason why the transpose of M does not fully recapitulate architecture with our conservative classification criteria, we believe that there is no reason why it should in the first place. We view the fact that the transpose of M recapitulates some features of the architecture as a purely phenomenological observation, and we think it is important as a proof that M is not exactly the same for the different conditions. We imagined that if M matrices were exactly the same this could be due to poor spatial sampling by our bins. Knowing that they are intrinsically different is important even if the reason why they have these specific features is not fully clear to us.

      Although we do not think that the distribution of phases is related to the absence of a cavity in the transpose of M or to the four clusters found in Isomap projections, it remains an interesting question that we did not explore initially. We are now showing examples of the distribution of phases in Figure S1. We observed that in both 2D and 1D conditions phases are distributed following rather regular patterns. Whether or not these patterns are compatible with experimental observations of phase distribution is in our view debatable, given that so far state-of-the-art techniques have only allowed the simultaneous recording of a small fraction of the neurons belonging to a given module. This said, we think that it is important to note that ordered phase patterns are an anecdotal outcome of our simulations rather than a necessary outcome of flexible attractors or attractors in general. To prove this point, we simulated a condition with a new architecture represented by the overlap of 20 short 1DL attractors, each recruiting 10 random neurons from the pool of 100 available ones.

      The rest of the parameters of the simulations were identical to those in the other conditions.

      By definition, the topology of this architecture has Betti numbers [20,0,0]. We show in Figure S1 that this architecture aligns grid cells, with individual and population gridness reaching slightly lower levels compared to the 1D condition. However, the distribution of phases of these grid cells has no discernible pattern. This result is an arbitrary example that serves as a proof-of-principle to show that flexible attractors can align grid cells without exhibiting ordered phases, not a full characterization of the outcome of this type of architecture, which we leave for future work. For the rest of our work, we stick to the simplest versions of 1D architectures, which allow for a more in-depth characterization.
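
      As a rough illustration of the architecture described above, the following minimal sketch builds a connectivity matrix from 20 short chains, each drawn from 10 random neurons out of a pool of 100. It assumes that "1DL" denotes a linear (open-ended) 1D arrangement, and it uses binary nearest-neighbour weights for simplicity; the function name, the seed, and the coupling rule are hypothetical and are not taken from the authors' code.

```python
import random


def build_overlapping_chains(n_neurons=100, n_chains=20, chain_len=10, seed=0):
    """Return (weights, chains) for a set of overlapping linear 1D chains.

    weights is a symmetric 0/1 matrix (list of lists); chains is the list of
    neuron orderings. Binary nearest-neighbour coupling is an assumption made
    for illustration only.
    """
    rng = random.Random(seed)
    weights = [[0] * n_neurons for _ in range(n_neurons)]
    chains = []
    for _ in range(n_chains):
        # Draw distinct neurons; the same neuron may appear in several chains.
        chain = rng.sample(range(n_neurons), chain_len)
        chains.append(chain)
        # Connect consecutive neurons; the two ends are left unconnected,
        # so each chain is a line rather than a ring.
        for a, b in zip(chain, chain[1:]):
            weights[a][b] = 1
            weights[b][a] = 1
    return weights, chains


weights, chains = build_overlapping_chains()
```

      Because neurons are sampled independently for each chain, a neuron can belong to several chains at once, which is what makes the attractors overlap.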

      The wagon-wheel is an interesting case in which maps lose hexagonal symmetry although the population activity lies in a torus, perhaps evidencing the tension between feedforward and recurrent inputs and suggesting that grid cell response does not obey the single master of path integration. If we modeled it with a 1D attractor, we believe the outcome would strongly depend on virtual rat trajectory. If the trajectory was strictly linear, the population activity would be locally one-dimensional and potentially represented by a ring. Instead, if the trajectory allowed for turns, i.e. a 2D trajectory within a corridor-like maze, the population activity would be toroidal as in our open field simulations, while maps would not have perfect hexagonal symmetry, mimicking experimental results.

      More minor comments:

      Recurrent dynamics are modeled as if there is no intrinsic synaptic or membrane time constant. This may be acceptable for addressing the goals of this paper, but it is a bit unusual and it will be helpful to explain and justify this choice.

      As mentioned above, we believe that the best model in a given setup is the one with the lowest number of complexities that can still address the phenomenon under study. One does not use general relativity to build a bridge, although it provides a ‘more accurate’ description of the physics involved. All models are simplifications, and the more complex a model, the more it has to be taken as a black box.

      The Introduction mentions that in most models interaction between co-modular neurons occurs through direct excitatory communication, but in quite a few models the interaction is inhibitory. The crucial feature is that the interaction is strongly inhibitory between neurons that differ in their tuning, and either less inhibitory or excitatory between neurons with similar phases.

      We agree that directed inhibition has been shown to be as efficient as directed excitation, and we have modified the introduction to reflect this.

      The Discussion claims that the present work is the first one in which the topology of the recurrent architecture differs from the topology of the emergent state space. However, early works on attractor models of grid cells showed how neural connectivity which is arranged on a 2d plane, without any periodic boundary conditions, leads to a state space that exhibits the toroidal topology. Therefore, this claim should be revised.

      We agree, although the 2D sheet in this case acts as a piece of the torus, and locally the input space and architecture are identical objects. It could be argued that architectures that represent a 2D local slice of the torus, the whole torus, or several cycles around the torus form a continuous family parametrized by the extension of recurrent connections, and as a consequence it is not surprising that these works have not made claims about the incongruence between architecture and representation topologies. The 2D sheet connectivity is still constructed ad hoc to organize activity in a 2D bump, and there is no negotiation between disparate constraints because locally the constraints imposed by input and architecture are the same. We believe this situation is conceptually different from our flexible 1D attractors. We have adapted our claim to include this technical nuance.

      Why are neural responses in the perimeter of the environment excluded from the topological analysis? The whole point of the toroidal manifold analysis on real experimental data is that the toroidal manifold is preserved regardless of the animal's location and behavioral condition.

      We agree, although experimental data needs to go through extensive pre-processing such as dimensionality reduction before showing a toroidal topology. Such manipulations might smooth away the specific effects of boundaries on maps, together with other sources of noise. In our case, the original reason to downsample the dataset is related to the explosion in computational time that we experience with the ripser package when using more than ~1000 data points. For a proof-of-principle characterization we were much more interested in what happened in the center of the arena, where a 1D attractor could fold itself to confine population activity into a torus. The area we chose was sufficiently large to contain the whole torus. Borders do affect the way the attractor folds (they also affect grid maps in real rats). We feel that these imperfections could be interesting to study in relation to the parameters controlling how our virtual rat behaves at the borders, but not at this proof-of-principle stage.
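
      The downsampling step described above can be sketched as follows: population vectors recorded near the borders are discarded and the remainder is randomly subsampled to a fixed budget before the topological analysis. This is a minimal sketch under stated assumptions; the square arena, the `margin` value, and the function name are hypothetical, and only the budget of roughly 1000 points comes from the text.

```python
import random


def subsample_center(positions, activity, arena_size=1.0, margin=0.2,
                     max_points=1000, seed=0):
    """Keep population vectors from the arena center, capped at max_points.

    positions: list of (x, y) tuples, one per time bin or spatial bin.
    activity:  list of population vectors aligned with positions.
    margin:    fraction of the arena excluded on each side (assumed value).
    """
    rng = random.Random(seed)
    lo, hi = margin * arena_size, (1.0 - margin) * arena_size
    keep = [i for i, (x, y) in enumerate(positions)
            if lo <= x <= hi and lo <= y <= hi]
    if len(keep) > max_points:
        # Random subsample without replacement, kept in original order.
        keep = sorted(rng.sample(keep, max_points))
    return [activity[i] for i in keep]
```

      The returned list could then be passed to a persistent homology routine; that call is left out here because its exact configuration is not given in the response.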

      The periodic activity observed in Ref. 29 could in principle provide the basis for the ring arrangement of neurons. However, it is not yet clear whether grid cells participate in this periodic activity.

      We agree. So far it seems that entorhinal cells in general participate in the ring, which would imply that all kinds of cells are involved. However, it could well be that only some functional types participate in the ring and grid cells specifically do not, as future experiments will tell.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable work explores death coding data to understand the impact of COVID-19 on cancer mortality. The work provides solid evidence that deaths with cancer as a contributing cause were not above what would be expected during pandemic waves, suggesting that cancer did not strongly increase the risk of dying of COVID-19. These results are an interesting exploration into the coding of causes of death that can be used to make sense of how deaths are coded during a pandemic in the presence of other underlying diseases, such as cancer.

      We thank the editor and reviewers for the time they took to review our manuscript and for the thoughtful suggestions they provided. We have completed several revisions based on their feedback and we feel our paper is stronger as a result. However, none of these revisions change the overall conclusions of our study.

      Reviewer #1 (Public Review):

      Summary:

      In the paper "Disentangling the relationship between cancer mortality and COVID-19", the authors study whether the number of deaths in cancer patients in the USA went up or down during the first year (2020) of the COVID-19 pandemic. They found that the number of deaths with cancer mentioned on the death certificate went up, but only moderately. In fact, the excess with-cancer mortality was smaller than expected if cancer had no influence on the COVID mortality rate and all cancer patients got COVID with the same frequency as in the general population. The authors conclude that the data show no evidence of cancer being a risk factor for COVID and that the cancer patients were likely actively shielding themselves from COVID infections.

      Strengths:

      The paper studies an important topic and uses sound statistical and modeling methodology. It analyzes both, deaths with cancer listed as the primary cause of death, as well as deaths with cancer listed as one of the contributing causes. The authors argue, correctly, that the latter is a more important and reliable indicator to study relationships between cancer and COVID. The authors supplement their US-wide analysis by analysing three states separately.

      Weaknesses:

      The main findings of the paper can be summarized as six numbers. Nationally, in 2020, multiple-cause cancer deaths went up by 2%, Alzheimer's deaths by 31%, and diabetes deaths by 39%. At the same time, assuming no relationship between these diseases and either Covid infection risk or Covid mortality risk, the deaths should have gone up by 7%, 46%, and 28%. The authors focus on cancer deaths and as 2% < 7%, conclude that cancer is not a risk factor for COVID and that cancer patients must have "shielded" themselves against Covid infections.

      However, I did not find any discussion of the other two diseases. For diabetes, the observed excess was 39% instead of "predicted by the null model" 28%. I assume this should be interpreted as diabetes being a risk factor for Covid deaths. I think this should be spelled out, and also compared to existing estimates of increased Covid IFR associated with diabetes.

      And what about Alzheimer's? Why was the observed excess 31% vs the predicted 46%? Is this also a shielding effect? Does the spring wave in NY provide some evidence here? Why/how would Alzheimer's patients be shielded? In any case, this needs to be discussed and currently, it is not.

      We thank the reviewer for their positive feedback on the paper and for these suggestions. It is true that we have emphasized the impact on cancer deaths, as this was the primary aim of the paper. In the revised version, we have expanded the results and discussion sections to more fully describe the other chronic conditions we used as comparators (lines 267-284; 346-386).

      Note that we are somewhat reluctant to designate any of these conditions as risk factors based solely on comparing the time series model with the demographic model of our expectations. As we mention in the discussion, there is considerable uncertainty around estimates from the demographic model in terms of the size of the population-at-risk, the mean age of the population-at-risk, and the COVID-19 infection rates and infection fatality ratios. Our demographic model is primarily used to demonstrate the effects of competing risks across types of cancers and chronic conditions, since these findings are robust to model assumptions. In contrast, the demographic model should be used with caution if the goal is to titrate the level of these risk factors (as the level of imputed risk is dependent on model assumptions). In the updated version of the manuscript, we have included uncertainty intervals in Table 3, using the upper and lower bounds of the estimated infection rates and IFRs, to better represent this uncertainty. We have also discussed this uncertainty more explicitly in the text and have run sensitivity analyses with different infection rate assumptions in the discussion (lines 354-362; 367-370).

      We would like to note that rather than interpreting the absolute results, we used this demographic model as a tool to understand the relative differences between these conditions. From the demographic model we determined that we would expect to see much higher mortality in diabetes and Alzheimer’s deaths compared to cancer deaths due to three factors (1. Size of population-at-risk, 2. Mean age of the population-at-risk, 3. Baseline risk of mortality from the condition), that are separate from the COVID-19 associated IFR. And in general, this is what we observed.

      In comparing the results from the demographic model to the observed excess, diabetes does stand out as an outlier from cancer and Alzheimer’s disease in that the observed excess is consistently above the null hypothesis, which does lend support to the conclusion that diabetes is in fact a risk factor for COVID-19, a conclusion that is also supported by many other studies. Our findings for hematological cancers are also similar, in that we find consistent support for this condition being a risk factor. We have commented on this in the discussion and added a few references (lines 346-354; 395-403).

      Our hypothesis regarding non-hematological cancer deaths (lower than anticipated mortality due to shielding) could also apply to Alzheimer’s deaths. Furthermore, we used the COVID-19 attack rate for individuals >65 years (based on the data that is available), but we estimate that the mean age of Alzheimer’s patients is actually 80-81 years, so this attack rate may in fact be a bit too high, which would increase our expected excess. We have commented on this in the discussion (lines 363-377).

      Reviewer #2 (Public Review):

      The article is very well written, and the approach is quite novel. I have two major methodological comments, that if addressed will add to the robustness of the results.

      (1) Model for estimating expected mortality. There is a large literature using different models to predict expected mortality during the pandemic. Different models come with different caveats, see the example of the WHO estimates in Germany and the performance of splines (Msemburi et al Nature 2023 and Ferenci BMC Medical Research Methodology 2023). In addition, it is a common practice to include covariates to help the predictions (e.g., temperature and national holidays, see Kontis et al Nature Medicine 2020). Last, fitting the model independently for each region neglects potential correlation patterns in the neighbouring regions, see Blangiardo et al 2020 PlosONE.

      Thank you for these comments and suggestions. We agree there are a range of methods that can be used for this type of analysis, and they all come with their strengths, weaknesses, and caveats. Broadly, the approach we chose was to fit the data before the pandemic (2014-2019), and project forward into 2020. To our knowledge it is not a best practice to use an interpolating spline function to extrapolate to future years. This is demonstrated by the WHO estimates in Germany in the paper you mention. This was our motivation for using polynomial and harmonic terms.
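
      To make the choice of polynomial and harmonic terms concrete, the sketch below builds the regressors for a given week index: an intercept, polynomial trend terms, and sine and cosine terms for annual seasonality. Unlike an interpolating spline, every column is defined for any future week, so the same function can produce the regressors for the projection period. The period of 52.17 weeks, the single annual harmonic, and the function names are assumptions made for illustration; the response does not state the exact terms used.

```python
import math

WEEKS_PER_YEAR = 52.17  # assumed average number of weeks per year


def design_row(t, degree=2, n_harmonics=1):
    """Regressors for week index t: intercept, polynomial trend, harmonics."""
    row = [1.0]
    # Polynomial trend terms t, t^2, ..., t^degree.
    row += [float(t) ** d for d in range(1, degree + 1)]
    # One sine/cosine pair per harmonic of the annual cycle.
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * k * t / WEEKS_PER_YEAR
        row += [math.sin(angle), math.cos(angle)]
    return row


def design_matrix(weeks, degree=2, n_harmonics=1):
    """Stack design rows for a sequence of week indices."""
    return [design_row(t, degree, n_harmonics) for t in weeks]
```

      For example, `design_matrix(range(313, 365))` would give regressors for a hypothetical projection year that follows 313 weeks of training data, with no change to the function.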

      Based on the above:

      a. I believe that the authors need to run a cross-validation to justify model performance. I would suggest training the data leaving out the last year for which they have mortality and assessing how the model predicts forward. Important metrics for the prediction performance include mean square error and coverage probability, see Konstantinoudis et al Nature Communications 2023. The authors need to provide metrics for all regions and health outcomes.

      Thank you for this suggestion. We agree that our paper could be strengthened by including cross validation metrics to justify model performance. Based on this suggestion, and your observations regarding Alzheimer’s disease, we have done two things. First, for the full pre-pandemic period (2014-2019) for each chronic condition and location we tested three different models with different degree polynomials (1. linear only, 2. linear + second degree polynomial, 3. linear + second degree polynomial + third degree polynomial) and used AIC to select the best model for each condition and location. Next, also in response to your suggestion, we estimated coverage statistics. Using the best fit model from the previous step, we then fit the model to data from 2014-2018 only and used the model to predict the 2019 data. We calculated the coverage probability as the proportion of weekly observed data points that fell within the 95% prediction interval. For all causes of death and locations the coverage probability was 100% (with the exception of multiple cause kidney disease in California, which is only shown in the appendix). The methods and results have been updated to reflect this change and we have added a figure to the appendix showing the selected model and coverage probability for each cause of death and location (lines 504-519; 847-859; Appendix 1-Figure 11).
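
      The two checks described above can be written down compactly. The sketch below computes the coverage probability exactly as defined in the text (the share of observed weekly values inside the prediction interval) and picks the candidate model with the lowest AIC. The helper names and the candidate log-likelihoods in the usage example are hypothetical placeholders, not results from the study.

```python
def coverage_probability(observed, lower, upper):
    """Proportion of observations falling within [lower, upper]."""
    if not observed or not (len(observed) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    inside = sum(1 for y, lo, hi in zip(observed, lower, upper) if lo <= y <= hi)
    return inside / len(observed)


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(candidates):
    """Return the name of the candidate with the lowest AIC.

    candidates maps a model name to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


# Hypothetical log-likelihoods for the three trend specifications.
best = select_by_aic({
    "linear": (-120.0, 3),
    "quadratic": (-110.0, 4),
    "cubic": (-109.8, 5),
})
```

      In this made-up example the cubic term improves the fit too little to justify the extra parameter, so the quadratic model is selected.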

      b. In the context of validating the estimates, I think the authors need to carefully address the Alzheimer case, see Figure 2. It seems that the long-term trends pick an inverse U-shape relationship which could be an overfit. In general, polynomials tend to overfit (in this case the authors use a polynomial of second degree). It would be interesting to see how the results change if they also include a cubic term in a sensitivity analysis.

      Thank you for this observation. Based on the changes described above, the model for Alzheimer’s disease now includes a cubic term in the national data and in Texas and California. The model with the second-degree polynomial remained the best fit for New York (Appendix 1 – Figure 11).

      c. The authors can help with the predictions using temperature and national holidays, but if they show in the cross-validation that the model performs adequately, this would be fine.

      At the scale of the US, adding temperature or environmental covariates is difficult and few US-wide models do so (see Goldstein 2012 and Quandelacy 2014 for examples from influenza). Furthermore, because we are looking at chronic disease outcomes, it is unclear that viral covariates or national holidays would drive these outcomes in the same way as they would if we were looking at mortality outcomes more directly related to transmissible diseases (such as respiratory mortality). Our cross validation also indicates that our models fit well without these additional covariates.

      d. It would be nice to see a model across the US, accounting for geography and spatial correlation. If the authors don't want to fit conditional autoregressive models in the Bayesian framework, they could just use a random intercept per region.

      We think the reviewer is mistaken here about the scale of our national analysis. Our national analysis did not fit independent models for each state or region. Rather, we fit a single model to the weekly-level national mortality data where counts for the whole of the US have been aggregated. We have clarified in the text (lines 156, 464). As such, we do not feel a model accounting for spatial correlation would be appropriate nor would we be able to include a random intercept for each region. We did fit three states independently (NY, TX, CA), but these states are very geographically distant from each other and unlikely to be correlated. These states were chosen in part because of their large population sizes, yet even in these states, confidence intervals were very wide for certain causes of death. Fitting models to each of the 50 US states, most of which are smaller than those chosen here, would exacerbate this issue.

      (2) I think the demographic model needs further elaboration. It would be nice to show more details, the mathematical formula of this model in the supplement, and explain the assumptions.

      Thank you for this comment. We have added additional details on the demographic model to the methods. We have also extended this analysis to each state to further strengthen our conclusions (lines 548-590).

      Reviewing Editor Recommendations:

      I think that perhaps something that is missing is that the authors never make their underlying assumption explicit: they are assuming that if cancer increases the risk of dying of COVID-19, this would be reflected in the data on multiple causes of death where cancer would be listed as one of the multiple causes rather than as the underlying cause, and that their conclusions are predicated on this assumption. I would suggest explicitly stating this assumption, as opposed to other reasons why cancer mortality would increase (ex. if cancer care worsened during pandemic waves leading to poorer cancer survival).

      Response: Thank you for this suggestion. We have added a few sentences to the introduction to make this assumption clear (lines 106-112).

      Reviewer #1 (Recommendations For The Authors):

      - It could make sense to add "in the United States" into the title, as the paper only analyses US data.

      - It may make sense to reformulate the title from "disentangling the relationship..." into something that conveys the actual findings, e.g. "Lack of excess cancer mortality during Covid-19 pandemic" or something similar. Currently, the title tells nothing about the findings.

      Thank you for these suggestions. We have added “in the US” to the title. However, we feel that our findings are a bit more subtle than the suggested reformulation would imply, and we prefer to leave it in its current form.

      - Abstract, lines 42--45: This is the main finding of the paper, but I feel it is simplified too strongly in the abstract. Your simulations do *not* "largely explain" excess mortality with cancer; they give higher numbers! Which you interpret as "shielding" etc., but this is completely absent from the abstract. This sentence makes the impression that you got a good fit between simulated excess and real excess, which I would say is not the case.

      Thank you for this comment. We have rephrased the sentence in the abstract to better reflect our intentions for using the demographic model (lines 46-49). As stated above, the purpose of the demographic model was not to give a good fit with the observed excess mortality. Rather, we used the demographic model as a tool to understand the relative differences between these conditions in terms of expected excess mortality given the size, age-distribution, and underlying risk of death from the condition itself, assuming similar IFR and attack rates. And based on this, we conclude that it is not necessarily surprising that we see higher excess mortality for diabetes and Alzheimer’s compared to cancer.

      - Results line 237: you write that it's "more consistent with the null hypothesis", however clearly it is *not* consistent with the null hypothesis either (because 2% < 7%). You discuss in the Discussion that it may be due to shielding, but it would be good to have at least one sentence about it already here in the Results, and refer to the Discussion.

      We have mentioned this in the results and refer to the discussion (lines 277-278).

      - Results line 239: why was it closer to the assumption of relative risk 2? If I understand correctly, your model prediction for risk=1 was 7% and for risk=2 it was 13%. In NY you observed 8% (line 187). How is this closer to risk=2?

      Thank you for this observation. We have updated the demographic model with new data, extended the model to state-level data, and included confidence intervals on these estimates. We have also added additional discussion around the differences between our observations and expectations (lines 249-284).

      - Discussion line 275: "we did not expect to see large increases" -- why exactly? Please spell it out here. Was it due to the age distribution of the cancer patients? Was it due to the high cancer death risk?

      We demonstrate that it is the higher baseline risk of death for cancer that seems to be driving our low expectations for cancer excess mortality (lines 304-320). We have added this to the sentence to clarify our conclusions on this point and have added a figure to better illustrate this concept of competing risks (Figure 6).
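
      The intuition behind this point can be shown with a deliberately simplified calculation, which is not the authors' demographic model: if COVID-19 deaths in a group are approximated as attack rate times infection fatality ratio, then the relative excess is that product divided by the group's baseline risk of death. A group with a high baseline risk therefore shows a small percentage excess even with the same attack rate and IFR. The function name and all numeric inputs below are hypothetical and chosen only to show the direction of the effect; the sketch also ignores that people who die of the condition cannot later die of COVID-19.

```python
def expected_excess_fraction(attack_rate, ifr, baseline_death_risk,
                             relative_risk=1.0):
    """Expected COVID-19 deaths as a fraction of baseline deaths in a group.

    attack_rate:         probability of infection over the period
    ifr:                 infection fatality ratio for the group
    baseline_death_risk: probability of dying with the condition over the period
    relative_risk:       multiplier on the IFR (1.0 is the null hypothesis)
    """
    covid_death_risk = attack_rate * ifr * relative_risk
    return covid_death_risk / baseline_death_risk


# Same hypothetical attack rate and IFR, different baseline risks.
high_baseline = expected_excess_fraction(0.10, 0.02, baseline_death_risk=0.20)
low_baseline = expected_excess_fraction(0.10, 0.02, baseline_death_risk=0.02)
```

      With these placeholder inputs the high-baseline group has a tenfold smaller relative excess than the low-baseline group, although both face the same per-person COVID-19 risk.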

      - Methods, line 405: perhaps it makes sense to cite some other notable papers on Covid excess mortality such as Msemburi et al Nature 2023, Karlinsky & Kobak eLife 2021, Islam et al BMJ 2021, etc.

      Thank you for mentioning this oversight. We certainly should have cited these papers and have included them in the updated version.

      - Methods line 410: why did you use a 5-week moving average? Why not fit raw weekly death counts? NB regression should be able to deal with it.

      Smoothing time series data with a moving average prior to running regression models is a very common practice. We did a sensitivity analysis using the raw data. This produced excess estimates with slightly larger confidence intervals, but does not change the overall conclusions of the paper.
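
      For reference, a 5-week moving average of weekly counts can be computed as in the sketch below. It assumes a centered window and leaves the first and last two weeks undefined; the response does not say whether the window was centered or trailing, so both choices are assumptions.

```python
def centered_moving_average(counts, window=5):
    """Centered moving average; edge positions without a full window are None."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(counts)):
        if i < half or i >= len(counts) - half:
            smoothed.append(None)  # not enough neighbours at the edges
        else:
            smoothed.append(sum(counts[i - half:i + half + 1]) / window)
    return smoothed
```

      Smoothing of this kind reduces week-to-week noise but also induces correlation between adjacent points, which is one reason a sensitivity analysis on the raw counts is a useful check.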

      - Methods line 416: please indicate the software/library/package you used for fitting NB regression.

      We fit the NB regression using the MASS package in R version 4.3. We have added this to the methods (line 519).

      - Line 489: ORCHID -> ORCID

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      Summary:

      Codol et al. present a toolbox that allows simulating biomechanically realistic effectors and training Artificial Neural Networks (ANNs) to control them. The paper provides a detailed explanation of how the toolbox is structured and several examples that demonstrate its usefulness.

      Main comments:

      (1) The paper is well written and easy to follow. The schematics help in understanding how the toolbox works and the examples provide an idea of the results that the user can obtain.

      We thank the reviewer for this comment.

      (2) As I understand it, the main purpose of the paper should be to facilitate the usage of the toolbox. For this reason, I have missed a more explicit link to the actual code. As I see it, researchers will read this paper to figure out whether they can use MotorNet to simulate their experiments, and how they should proceed if they decide to use it. I'd say the paper provides an answer to the first question and assures that the toolbox is very easy to install and use. Maybe the authors could support this claim by adding "snippets" of code that show the key steps in building an actual example.

      This is an important point, which we also considered when writing this paper. We instead decided to focus on the first question, because it is easier to illustrate the scientific use of the toolbox using code or interactive (Jupyter) notebooks than in a publication format. We find the “how to proceed” aspect of the toolbox can more easily and comprehensively be covered using online, interactive tutorials. Additionally, this allows us to update these tutorials as the toolbox evolves over different versions, while it is more difficult to update a scientific article. Consequently, we explicitly avoided code snippets in the article itself. However, we appreciate that the paper would gain in clarity if this was more explicitly stated early. We have modified the paper to include a pointer to where to find tutorials online. We added this to the last paragraph of the introduction section:

      The interested reader may consult the full API documentation, including interactive tutorials on the toolbox website at https://motornet.org.

      (3) The results provided in Figures 1, 4, 5 and 6 are useful, because they provide examples of the type of things one can do with the toolbox. I have a few comments that might help improving them:

      a. The examples in Figures 1 and 5 seem a bit redundant (same effector, similar task). Maybe the authors could show an example with a different effector or task? (see point 4).

      The effectors from figures 1 and 5 are indeed very similar. However, the tasks in figures 1 and 5 present some important differences. The training procedure in figure 1 never includes any perturbations, while the one from figure 5 includes a wide range of perturbations of different magnitudes, timings and directions. The evaluation procedure of figure 1 includes center-out reaches with permanent viscous (proportional to velocity) external dynamics, while that of figure 5 includes fixed, transient, square-shaped perturbations orthogonal to the reach direction. Finally, the networks in figure 1 undergo a second training procedure after evaluation while the networks of figure 5 do not.

      While we agree that some variation of effectors would be beneficial, we do show examples of a point-mass effector in figure 6. Overall, figure 5 shows a task that is quite different from that of figure 1 with a similar effector, while the opposite is true for figure 6. We have modified the text to clarify this for the reader, by adding the following.

      End of 1st paragraph, section 2.4.

      Therefore, the training protocol used for this task largely differed from section 2.1 in that the networks are exposed to a wide range of mechanical perturbations with varying characteristics.

      1st paragraph of section 2.5

      […] this asymmetrical representation of PMDs during reaching movements did not occur when RNNs were trained to control an effector that lacked the geometrical properties of an arm such as illustrated in Figure 4c-e and section 2.1.

      b. I missed a discussion on the relevance of the results shown in Figure 4. The moment arms are barely mentioned outside section 2.3. Are these results new? How can they help with motor control research?

      We thank the reviewer for this comment. This relates to a point from reviewer 2 indicating that the purpose of each section was sometimes difficult to grasp as one reads. Section 2.3 explains the biomechanical properties that the toolbox implements to improve realism of the effector. They are not new results in the sense that other toolboxes implement these features (though not in differentiable formats) and these properties of biological muscles are empirically well-established. However, they are important to understand what the toolbox provides, and consequently what constraints networks must accommodate to learn efficient control policies. An example of this is the results in figure 6, where a simple effector versus a more biomechanically complex effector will yield different neural representations.

      Regarding the manuscript itself, we agree that more clarity on the goal of every paragraph may improve the reader’s experience. Consequently, we made sure to specify such goals at the start of each section. Particularly, we clarify the purpose of section 2.3 by adding several sentences on this at the end of the first paragraph in that section. We also now clearly state the purpose of section 2.3 with the results of figure 6 and reference figure 4 in that section.

      c. The results in Figure 6 are important, since one key asset of ANNs is that they provide access to the activity of the whole population of units that produces a given behavior. For this reason, I think it would be interesting to show the actual "empirical observations" that the results shown in Fig. 6 are replicating, hence allowing a direct comparison between the results obtained for biological and simulated neurons.

      These empirical observations are available from previous electrophysiological and modelling work. Particularly, polar histograms across reaching directions like panel C are displayed in figures 2 and 3 of Scott, Gribble, Graham, Cabel (2001, Nature). Colormaps of modelled unit activity across time and reaching directions like panel F are also displayed in figure 2 of Lillicrap, Scott (2013, Neuron). Electrophysiological recordings of M1 neurons during a similar task in non-human primates can also be seen in figure 2B of “Preserved neural population dynamics across animals performing similar behaviour” (https://doi.org/10.1101/2022.09.26.509498) and in figure 2B of “Nonlinear manifolds underlie neural population activity during behaviour” (https://doi.org/10.1101/2023.07.18.549575). Note that these two pre-prints use the same dataset.

      We have added these citations to the text and made it explicit that they contain visualizations of similar modelling and empirical data for comparison:

      This heterogeneous set of responses matches empirical observations in non-human primate primary motor cortex recordings (Churchland & Shenoy, 2007; Michaels et al., 2016) and replicate similar visualizations from previously published work (Fortunato et al., 2023; Lillicrap & Scott, 2013; Safaie et al., 2023).

      (4) All examples in the paper use the arm26 plant as effector. Although the authors say that "users can easily declare their own custom-made effector and task objects if desired by subclassing the base Plant and Task class, respectively", this does not sound straightforward. Table 1 does not really clarify how to do it. Maybe an example that shows the actual code (see point 2) that creates a new plant (e.g. the 3-joint arm in Figure 7) would be useful.

      Subclassing is a Python process more than a MotorNet process, as Python is an object-oriented language. Therefore, there are many Python tutorials on subclassing in the general sense that would be beneficial for that purpose. We have amended the main text to ensure that this is clearer to the reader.

      Subclassing a MotorNet object, in a more specific sense, requires overwriting some methods from the base MotorNet classes (e.g., Effector or Environment classes, which correspond to the original Plant and Task object, respectively). Since we made the decision (mentioned above) to not include code in the main text, we added tutorials to the online documentation, which include dedicated tutorials for MotorNet class subclassing. For instance, this tutorial showcases how to subclass Environment classes:

      https://colab.research.google.com/github/OlivierCodol/MotorNet/blob/master/examples/3-environments.ipynb
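
      The general Python pattern referred to above can be sketched as follows. The class and method names used here (`BaseEnvironment`, `get_target`, `describe`, `CenterOutEnvironment`) are hypothetical stand-ins invented for illustration and are not the MotorNet API; the linked tutorial covers the actual classes and the methods to override.

```python
import math


class BaseEnvironment:
    """Hypothetical base class (not a MotorNet class)."""

    def __init__(self, n_targets=8, radius=0.1):
        self.n_targets = n_targets
        self.radius = radius

    def get_target(self, trial):
        # Default behaviour: the target always sits at the origin.
        return (0.0, 0.0)

    def describe(self, trial):
        # Inherited unchanged; it calls whichever get_target the subclass defines.
        x, y = self.get_target(trial)
        return f"trial {trial}: target at ({x:.2f}, {y:.2f})"


class CenterOutEnvironment(BaseEnvironment):
    """Subclass that overrides a single method of the base class."""

    def get_target(self, trial):
        # Place targets evenly on a circle, cycling through them by trial.
        angle = 2 * math.pi * (trial % self.n_targets) / self.n_targets
        return (self.radius * math.cos(angle), self.radius * math.sin(angle))


env = CenterOutEnvironment(n_targets=4, radius=0.1)
summary = env.describe(1)
```

      Only `get_target` is redefined; the constructor and `describe` are inherited, which is the sense in which subclassing requires overwriting only some methods of a base class.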

      (5) One potential limitation of the toolbox is that it is based on Tensorflow, when the field of Computational Neuroscience seems to be, or at least that's my impression, transitioning to pyTorch. How easy would it be to translate MotorNet to pyTorch? Maybe the authors could comment on this in the discussion.

      We have received a significant amount of feedback asking for a PyTorch implementation of the toolbox. Consequently, we decided to enact this, and the next version of the toolbox will be exclusively in PyTorch. We will maintain the Application Programming Interface (API) and tutorial documentation for the TensorFlow version of the toolbox on the online website. However, going forward we will focus exclusively on bug-fixing and expanding from the latest version of MotorNet, which will be in PyTorch. We now believe that the greater popularity of PyTorch in the academic community makes that choice more sustainable while helping a greater proportion of research projects.

      These changes led to a significant alteration of the MotorNet structure, which is reflected by changes made throughout the manuscript, notably in Figure 3 and Table 1.

      (6) Supervised learning (SL) is widely used in Systems Neuroscience, especially because it is faster than reinforcement learning (RL). Thus providing the possibility of training the ANNs with SL is an important asset of the toolbox. However, SL is not always ideal, especially when the optimal strategy is not known or when there are different alternative strategies and we want to know which is the one preferred by the subject. For instance, would it be possible to implement a setup in which the ANN has to choose between 2 different paths to reach a target? (e.g. Kaufman et al. 2015 eLife). In such a scenario, RL seems to be a more natural option. Would it be easy to extend MotorNet so it allows training with RL? Maybe the authors could comment on this in the discussion.

      The new implementation of MotorNet that relies on PyTorch is already standardized to use an API that is compatible with Gymnasium. Gymnasium is a standard and popular interfacing toolbox used to link RL agents to environments. It is very well-documented and widely used, which will ensure that users who wish to employ RL to control MotorNet environments will be able to do so relatively effortlessly. We have added this point to accurately reflect the updated implementation, so users are aware that it is now a feature of the toolbox (new section 3.2.4.).
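
      To make the interface convention concrete, here is a minimal sketch of an environment that follows the Gymnasium calling convention, in which `reset` returns `(observation, info)` and `step` returns `(observation, reward, terminated, truncated, info)`. The class `PointMassReach`, its dynamics, and the controller gains are hypothetical and written in plain Python for illustration; the sketch does not import Gymnasium and is not a MotorNet environment.

```python
class PointMassReach:
    """Toy 1-D point mass that must reach a target position (hypothetical)."""

    def __init__(self, target=1.0, dt=0.1, max_steps=50, tol=0.05):
        self.target = target
        self.dt = dt
        self.max_steps = max_steps
        self.tol = tol
        self.pos = 0.0
        self.vel = 0.0
        self.t = 0

    def reset(self, seed=None, options=None):
        # Gymnasium convention: return (observation, info).
        self.pos, self.vel, self.t = 0.0, 0.0, 0
        return (self.pos, self.vel), {}

    def step(self, action):
        # `action` is a force applied to a unit mass (semi-implicit Euler step).
        self.vel += action * self.dt
        self.pos += self.vel * self.dt
        self.t += 1
        error = abs(self.target - self.pos)
        reward = -error                       # closer to the target is better
        terminated = error < self.tol         # task solved
        truncated = self.t >= self.max_steps  # time limit reached
        # Gymnasium convention: (observation, reward, terminated, truncated, info).
        return (self.pos, self.vel), reward, terminated, truncated, {"error": error}


env = PointMassReach()
obs, info = env.reset()
done = False
while not done:
    pos, vel = obs
    # A hand-written proportional-derivative controller stands in for an RL agent.
    action = 20.0 * (env.target - pos) - 8.0 * vel
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
```

      Any agent that only relies on these two methods and their return signatures can be swapped in for the hand-written controller, which is what makes a shared interface useful for RL training.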

      Impact:

      MotorNet aims at simplifying the process of simulating complex experimental setups to rapidly test hypotheses about how the brain produces a specific movement. By providing an end-to-end pipeline to train ANNs on the simulated setup, it can greatly help guide experimenters to decide where to focus their experimental efforts.

      Additional context:

      Being the main result a toolbox, the paper is complemented by a GitHub repository and a documentation webpage. Both the repository and the webpage are well organized and easy to navigate. The webpage walks the user through the installation of the toolbox and the building of the effectors and the ANNs.

      Reviewer #2 (Public Review):

      MotorNet aims to provide a unified interface where the trained RNN controller exists within the same TensorFlow environment as the end effectors being controlled. This architecture provides a much simpler interface for the researcher to develop and iterate through computational hypotheses. In addition, the authors have built a set of biomechanically realistic end effectors (e.g., a 2-joint arm model with realistic muscles) within TensorFlow that are fully differentiable.

      MotorNet will prove a highly useful starting point for researchers interested in exploring the challenges of controlling movement with realistic muscle and joint dynamics. The architecture features a conveniently modular design and the inclusion of simpler arm models provides an approachable learning curve. Other state-of-the-art simulation engines offer realistic models of muscles and multi-joint arms and afford more complex object manipulation and contact dynamics than MotorNet. However, MotorNet's approach allows for direct optimization of the controller network via gradient descent rather than reinforcement learning, which is a compromise currently required when using other simulation engines (as these engines' code cannot be differentiated through).

      The paper could be reorganized to provide clearer signposts as to what role each section plays (e.g., that the explanation of the moment arms of different joint models serves to illustrate the complexity of realistic biomechanics, rather than a novel discovery/exposition of this manuscript). Also, if possible, it would be valuable if the authors could provide more insight into whether gradient descent finds qualitatively different solutions to RL or other non gradient-based methods. This would strengthen the argument that a fully differentiable plant is useful beyond improving training time / computational power required (although this is a sufficiently important rationale per se).

      We thank the reviewer for these comments. We agree that more clarity on the section goals may improve the reader’s experience and ensured this is the case throughout the manuscript. Particularly, we added the following to the first paragraph of section 2.3, for which an explicit goal was most missing:

      In this section we illustrate some of these biomechanical properties displayed by MotorNet effectors using specific examples. These properties are well-characterised in the biology and are often implemented in realistic biomechanical simulation software.

      Regarding the potential difference in solutions obtained from reinforcement or supervised learning, this would represent a non-trivial amount of work to do so conclusively and so may not be within the scope of the current article. We do appreciate however that in some situations RL may be a more fitting approach to a given task design. In relation to this point we now specify in the discussion that the new API can accommodate interfacing with reinforcement learning toolboxes for those who may want to pursue this type of policy training approach when appropriate (new section 3.2.4.).

      Reviewer #3 (Public Review):

      Artificial neural networks have developed into a new research tool across various disciplines of neuroscience. However, specifically for studying neural control of movement it was extremely difficult to train those models, as they require not only simulating the neural network, but also the body parts one is interested in studying. The authors provide a solution to this problem which is built upon one of the main software packages used for deep learning (Tensorflow). This allows them to make use of state-of-the-art tools for training neural networks.

      They show that their toolbox is able to (re-)produce several commonly studied experiments e.g., planar reaching with and without loads. The toolbox is described in sufficient detail to get an overview of the functionality and the current state of what can be done with it. Although the authors state that only a few lines of code can reproduce such an experiment, they unfortunately don't provide any source code to reproduce their results (nor is it given in the respective repository).

      The possibility of adding code snippets to the article is something we originally considered, and which aligns with comment two from reviewer one (see above). Hopefully this provides a good overview of the motivation behind our choice not to add code to the article.

      The modularity of the presented toolbox makes it easy to exchange or modify single parts of an experiment e.g., the task or the neural network used as a controller. Together with the open-source nature of the toolbox, this will facilitate sharing and reproducibility across research labs.

      I can see how this paper can enable a whole set of new studies on neural control of movement and accelerate the turnover time for new ideas or hypotheses, as stated in the first paragraph of the Discussion section. Having such a low effort to run computational experiments will be definitely beneficial for the field of neural control of movement.

      We thank the reviewer for these comments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The main goal of the authors was to study the testis-specific role of the protein FBXO24 in the formation and function of the ribonucleoprotein granules (membraneless electron-dense structures rich in RNAs and proteins).

      We appreciate the summary comment of reviewer #1.

      Strengths:

      The wide variety of methods used to support their conclusions (including transgenic models)

      We appreciate the positive comment of reviewer #1.

      Weaknesses:

      The lack of specific antibodies against FBXO24. Some of the experiments showing a specific phenotype are descriptive and lack a logical explanation of the possible mechanism (i.e. AR or the tail structure).

      Because we could not obtain specific antibodies against FBXO24, we generated Fbxo24-FLAG transgenic mice, which can be used to show the interaction between FBXO24 and IPO5. For the mechanism of impaired acrosome reaction, we added some results and discussion as described in our response to question (1) of reviewer #1 (public review). For the mechanism of abnormal flagellar structure, we added new results and revised the manuscript as described in our response to the major comments of reviewer #3 (recommendations for the authors).

      Questions:

      The paper is excellent and employs a wide variety of methods to substantiate the conclusions. I have very few questions to ask:

      (1) KO mice cannot undergo acrosome reaction (AR) even spontaneously. How do you account for this, given that no visible defects were observed in the acrosome?

      One possibility is that Fbxo24 KO spermatozoa cannot undergo capacitation; however, it is difficult to analyze the capacitation status such as tyrosine phosphorylation because most Fbxo24 KO spermatozoa are not alive (Figure S3A). Another possibility is that AR-related proteins are affected in Fbxo24 KO spermatozoa. Therefore, we analyzed the amounts of AR-related proteins with mass spectrometry (Figure S3C). Although previous studies indicate that the assembly of the SNARE complex is a key event prior to AR [Hutt et al., 2005 (PMID: 15774481); Katafuchi et al., 2000 (PMID: 11066067); Schulz et al., 1997 (PMID: 9356173); Tomes et al., 2002 (PMID: 11884041)], no clear differences were detected for SNARE proteins (Figure S3C and D). PLCD4, which is important for AR [Fukami et al., 2001 (PMID: 11340203)], was also detected in Fbxo24 KO spermatozoa (Figure S3C). Although we could not find differences in the amounts of AR-related proteins, it is still possible that FER1L5, another AR-related protein [Morohoshi et al., 2023 (PMID: 36696506)] not detected in the mass spectrometry analyses, or AR-related proteins not yet identified are affected in Fbxo24 KO spermatozoa. We added these results and discussion (line 160-166 and 305-312).

      (2) KO sperm are unable to migrate in the female tract, and, more intriguingly, they do not pass through the utero-tubal junction (UTJ). The levels of ADAM3 are normal, suggesting that the phenotype is influenced by other factors. The authors should investigate the levels of Ly6K since mice also exhibit the same phenotype but with normal levels of ADAM3.

      We detected LY6K in Fbxo24 KO spermatozoa with immunoblotting, but no difference was found. We added the results (Figure S3E and line 172–175).

      (3) In Figure 4A, the authors assert that "RBGS Tg mice revealed that mitochondria were abnormally segmented in Fbxo24 KO spermatozoa." I am unable to discern this from the picture shown in that panel. Could you please provide a more detailed explanation or display the information more explicitly?

      We are sorry for the ambiguous explanation of the morphology of the sperm mitochondrial sheath. Fbxo24 KO cauda epididymal spermatozoa show a disorganized mitochondrial sheath rather than a “segmented” one. We fixed the sentence (line 190-192) and added white arrowheads that indicate the disorganized regions (Figure 4A).

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Kaneda et al "FBXO24 ensures male fertility by preventing abnormal accumulation of membraneless granules in sperm flagella" is a significant paper on the role of FBXO24 in murine male germ cell development and sperm ultrastructure and function. The body of experimental evidence that the authors present is extraordinarily strong in both breadth and depth. The authors investigate the protein's functions in male germ cells and sperm using a wide variety of approaches but focusing predominantly on their novel mouse model featuring deletion of the Fbxo24 gene and its product. Using this mouse, and a cross of it with another model that expresses reporters in the head and midpiece, they logically build from one experiment to the next. Together, their data show that this protein is involved in the regulation of membraneless electron-dense structures; loss of FBXO24 led to an accumulation of these materials and defects in the sperm flagellum and fertilizing ability. Interestingly, the authors found that several of the best-known components of electron-dense ribonucleoprotein granules that are found in the intermitochondrial cement and chromatoid body were not disrupted in the Fbxo24 knockout, suggesting that the electron-dense material and these structures are not all the same, and the biology is more complicated than some might have thought. They found evidence for the most changes in IPO5 and KPNB1, and biochemical evidence that FBXO24 and IPO5 could interact.

      We appreciate the summary comment of reviewer #2.

      Strengths:

      The authors are to be commended for the thoroughness of their experimental approaches and the extent to which they investigated impacts on sperm function and potential biochemical mechanisms. Very briefly, they start by showing that the Fbxo24 message is present in spermatids and that the protein can interact with SKP1, in a way that is dependent on its F-box domain. This points toward a potential function in protein degradation. To test this, they next made the knockout mouse, validated it, and found the males to be sterile, although capable of plugging a female. Looking at the sperm, they identified a number of ultrastructural and morphological abnormalities, which they looked at in high resolution using TEM. They also cross their model with RBGS mice so that they have reporters in both the acrosome and mitochondria. The authors test a variety of sperm functions, including motility parameters, ability to fertilize by IVF, cumulus-free IVF, zona-free-IVF, and ICSI. They found that ICSI could rescue the knockout but not other assisted reproductive technologies. Defects in male fertility likely resulted from motility disruption and failure to get through the utero-tubal junction but defects in acrosome exocytosis also were noted. The authors performed thorough investigations including both targeted and unbiased approaches such as mass spectrometry. These enabled them to show that although the loss of the FBXO24 protein led to more RNA and elevated levels of some proteins, it did not change others that were previously identified in the electron-dense RNP material.

      The manuscript will be highly significant in the field because the exact functions of the electron-dense RNP materials have remained somewhat elusive for decades. Much progress has been made in the past 15 years but this work shows that the situation is more complex than previously recognized. The results show critical impacts of protein degradation in the differentiation process that enables sperm to change from non-descript round cells into highly polarized and compartmentalized mature sperm, with an equally highly compartmentalized flagellum. This manuscript also sets a high bar for the field in terms of how thorough it is, which reveals wide-ranging impacts on processes such as mitochondrial compaction and arrangement in the midpiece, the correct building of the major cytoskeletal elements in the flagellum, etc.

      We appreciate the positive comment of reviewer #2.

      Weaknesses:

      There are no real weaknesses in the manuscript that result from anything in the control of the authors. They attempted to rescue the knockout by expressing a FLAG-tagged Fbxo24 transgene, but that did not rescue the phenotype, either because of inappropriate levels/timing/location of expression, or because of interference by the tag. They also could not make anti-FBXO24 that worked for coimmunoprecipitation experiments, so relied on the FLAG epitope, an approach that successfully showed co-IP with IPO5 and SKP1.

      We could not rescue the phenotype with Fbxo24-FLAG transgene, but different Fbxo24 mutant mice show the same phenotypes (Figure S6G). Further, another group showed that Fbxo24 KO mice exhibited abnormal mitochondrial coiling [Li et al., 2024 (PMID: 38470475)], confirming that FBXO24 is involved in the mitochondrial sheath formation.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, the authors found that FBXO24, a testis-enriched F-box protein, is indispensable for male fertility. Fbxo24 KO mice exhibited malformed sperm flagella and compromised sperm motility.

      We appreciate the summary comment of reviewer #3.

      Strengths:

      The phenotype of Fbxo24 KO spermatozoa was well analyzed.

      We appreciate the positive comment of reviewer #3.

      Weaknesses:

      The authors observed numerous membraneless electron-dense granules in the Fbxo24 KO spermatozoa. They also showed abnormal accumulation of two importins, IPO5 and KPNB1, in the Fbxo24 KO spermatozoa. However, the data presented in the manuscript do not support the conclusion that FBXO24 ensures male fertility by preventing the abnormal accumulation of membraneless granules in sperm flagella, as indicated in the manuscript title.

      Fbxo24 KO mice showed abnormal accumulation of membraneless granules in sperm flagella and male infertility, suggesting that FBXO24 is involved in these processes, but there are no results that show the direct relationship as reviewer #3 mentioned. Therefore, we fixed the title.

      Recommendations For The Authors:

      Reviewer #2 (Recommendations For The Authors):

      On page 4, lines 152-154, the authors introduce the RBGS mouse model and use it in their experiments. However, they left out an obvious but helpful sentence that tells the reader that they crossed the Fbxo24-null mouse with the RBGS. As one continues reading it is clear, but best to avoid even slight confusion.

      We revised the explanation in the result section (line 150-153).

      Reviewer #3 (Recommendations For The Authors):

      In this manuscript, the authors found that FBXO24, a testis-enriched F-box protein, is indispensable for male fertility. Fbxo24 KO mice exhibited malformed sperm flagella and compromised sperm motility. The phenotype of Fbxo24 KO spermatozoa was well analyzed.

      The authors observed numerous membraneless electron-dense granules in the Fbxo24 KO spermatozoa. They also showed abnormal accumulation of two importins, IPO5 and KPNB1, in the Fbxo24 KO spermatozoa. However, the data presented in the manuscript do not support the conclusion that FBXO24 ensures male fertility by preventing the abnormal accumulation of membraneless granules in sperm flagella, as indicated in the manuscript title.

      Fbxo24 KO mice showed abnormal accumulation of membraneless granules in sperm flagella and male infertility, suggesting that FBXO24 is involved in these processes, but there are no results that show the direct relationship as reviewer #3 mentioned. Therefore, we fixed the title.

      Major comments:

      In the title, abstract, introduction, and some sections such as lines 275-276, the authors conclude that FBXO24 prevents the accumulation of importins and RNP granules during spermiogenesis. However, the provided data do not substantiate this claim. To provide conclusive evidence to support the current title, the authors need to present evidence supporting: 1) direct degradation of IPO5 and KPNB1 by FBXO24; 2) the direct requirement of IPO5 for the formation of the membraneless granules, and 3) infertility resulting from the presence of membraneless granules, rather than other issues such as abnormal ODF and AX.

      (1) direct degradation of IPO5 and KPNB1 by FBXO24.

      To examine if IPO5 can be degraded by FBXO24, we performed a ubiquitination assay using HEK293T cells. Ubiquitination of IPO5 was upregulated in the presence of WT FBXO24 but not with the mutant ΔF-box FBXO24, suggesting that IPO5 can be ubiquitinated by FBXO24. We did not examine the ubiquitination of KPNB1 because we failed to construct a plasmid vector expressing mouse KPNB1. We think that KPNB1 is not the substrate because we did not detect the interaction between FBXO24 and KPNB1 (Figure 5E). We added the results of the ubiquitination assay (Figure 5F and line 261-265) and mentioned it in the abstract (line 35).

      (2) the direct requirement of IPO5 for the formation of the membraneless granules.

      (3) infertility resulting from the presence of membraneless granules, rather than other issues such as abnormal ODF and AX.

      We revealed that IPO5 aggregates under stress conditions in COS7 cells (Figure 6C and D); however, we did not examine whether IPO5 is required for the formation of the membraneless granules. We consider that protein degradation systems such as PROTAC or Trim-Away to knock down IPO5 at the protein level in Fbxo24 KO mice could be a good way to see if the membraneless granules are diminished and male fertility is rescued. However, it takes time to apply the degradation systems in vivo. Therefore, we would like to leave this rescue experiment for future studies. We fixed the title and abstract (line 37-38), and removed the last sentence of the introduction.

      Also, another group reported analyses of Fbxo24 KO mice [Li et al., 2024 (PMID: 38470475)] right after we submitted our manuscript to eLife. They reported not only disorganized flagellar structures but also abnormal head morphology, which may lead to male infertility. The differences from our study may be due to different mouse genetic backgrounds. We mentioned it in the discussion section (line 348-353).

      Minor comments:

      (1) The authors claimed a significant increase in the total amount of RNAs in Fbxo24 KO spermatozoa (lines 259-261), suggesting that the ...contain RNAs. More direct evidence supporting this claim should be provided.

      We show that the amounts of IPO5 and KPNB1 increased in Fbxo24 KO spermatozoa (Figure 5A and B), both of which could be incorporated into RNP granules in COS7 cells (Figure 6C and D), supporting the idea that membraneless electron-dense structures may be RNP granules. However, because we did not show direct evidence that electron-dense structures contain RNAs, we removed the sentences (line 259-261 of the 1st submission manuscript).

      (2) The author should provide an explanation for the absence of a FLAG band in the input Tg in Figure 5D and the larger size of the IPO5 band in the FLAG-IP group compared to the input. Similar observations are also noted in Figure 5E.

      The FLAG band is weak because the protein amount is low. When we increase the contrast, we can see the FLAG band. We added a high-contrast image (Figure 5D). Proteins sometimes run differently on SDS-PAGE after immunoprecipitation, likely due to varying protein composition in the sample. We explained this in the figure legend (lines 868-869).

      (3) In Line 526, clarify the procedure for sperm purification, and determine the potential for contamination from somatic cells.

      We did not perform sperm purification, but when we observed spermatozoa obtained from the cauda epididymis, we rarely observed either somatic cells or immature spermatogenic cells. We added pictures in Figure S7. Further, we added a detailed explanation of how spermatozoa were collected from the epididymis (lines 549-550).

      (4) Define the Y-axis in Figure 2E, F, and G.

      We have revised the figures.

    1. Author response:

      Reviewer #1 (Public Review):

      Using the UK Biobank, this study assessed the value of nuclear magnetic resonance measured metabolites as predictors of progression to diabetes. The authors identified a panel of 9 circulating metabolites that improved the ability in risk prediction of progression from prediabetes to diabetes. In general, this is a well-performed study, and the findings may provide a new approach to identifying those at high risk of developing diabetes. I have some comments that may improve the importance of this study.

      We deeply appreciate the reviewer's invaluable time dedicated to the review of this manuscript and the insightful comments to enhance its overall quality.

      (1) It is unclear why the authors only considered the top 20 variables in the metabolite selection and why they did not set a wider threshold.

      Thank you for the comment. We restricted the metabolite selection to the top 20 variables to balance the performance of the final diabetes risk prediction model against its clinical applicability, given the measurement costs. We have added this explanation in the “Methods” section.

      “We chose the intersection set of the top 20 most important variables selected by the three machine learning models, after balancing the performance of the final diabetes risk prediction model and the clinical applicability associated with measurement costs of metabolites.”
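
      The selection rule quoted above (keep only the variables that rank in the top 20 for all three machine learning models) can be illustrated with a minimal sketch. The model names, variable names, and importance scores below are hypothetical toy values, not study data, and the response does not state how importance was ranked within each model, so a simple descending sort is assumed.

```python
def top_k(importance, k=20):
    """Return the names of the k variables with the largest importance scores."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    return set(ranked[:k])


def consensus_variables(importances_by_model, k=20):
    """Intersect the top-k variable sets of every model."""
    top_sets = [top_k(scores, k) for scores in importances_by_model.values()]
    return set.intersection(*top_sets)


# Toy importance scores for three hypothetical models (not study data).
toy_importances = {
    "model_a": {"m1": 0.90, "m2": 0.80, "m3": 0.10, "m4": 0.05},
    "model_b": {"m1": 0.70, "m2": 0.20, "m3": 0.60, "m4": 0.10},
    "model_c": {"m1": 0.50, "m2": 0.40, "m3": 0.30, "m4": 0.45},
}

# With k=2, only a variable ranked in the top two by every model survives.
selected = consensus_variables(toy_importances, k=2)
print(sorted(selected))
```

      A smaller k shrinks the intersection, which is the trade-off between model performance and measurement cost described in the reply.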

      (2) The methods section would benefit from a more detailed exposition of how parameter tuning was conducted and the range of parameters explored during the training of the RSF model.

      Following the reviewer’s suggestion, we have added a more detailed description of the parameter tuning and the range of parameters explored during the training of the RSF model in the “Method S2” section of the Supplementary material.

      “The RSF model was fitted using the “randomForestSRC” package and the grid search method was used for hyperparameter tuning. Specifically, the grid search method was used to tune the hyperparameters of the RSF model by minimizing the out-of-sample or out-of-bag error (1). Each tree in the RSF is constructed from a random sample of the data, typically a bootstrap sample or 63.2% of the sample size (as in the present study). Consequently, not all observations are used to construct each tree. The observations that are not used in the construction of a tree are referred to as out-of-bag observations. In an RSF model, each tree is built from a different sample of the original data, so each observation is “out-of-bag” for some of the trees. The prediction for an observation can then be obtained using only those trees for which the observation was not used for the construction. A classification for each observation is obtained in this way and the error rate can be estimated from these predictions. The resulting error rate is referred to as the out-of-bag error. By calculating the out-of-bag error in each iteration, the best hyperparameters were determined.

      The hyperparameters to be tuned and the ranges of the grid search in the present study were as follows: number of trees (50-1000, by 50), number of variables to possibly split at each node (3-6, by 1), and minimum size of terminal node (1-20, by 1) (2).”
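
      To make the size of the search space concrete, the grid described above can be enumerated as in the following sketch. The study fitted the model with the R package “randomForestSRC”; this Python snippet only enumerates the grid and is not the authors' code. The `toy_oob_error` function is a hypothetical stand-in for the out-of-bag error that a fitted forest would return. The last helper shows where the 63.2% figure comes from: a bootstrap sample of size n contains on average a fraction 1 - (1 - 1/n)^n of the distinct observations, which approaches 1 - 1/e.

```python
import itertools
import math

# Ranges quoted in the response.
n_trees = range(50, 1001, 50)    # 50, 100, ..., 1000
n_split_vars = range(3, 7)       # 3, 4, 5, 6
min_node_size = range(1, 21)     # 1, 2, ..., 20

grid = list(itertools.product(n_trees, n_split_vars, min_node_size))


def pick_best(candidates, oob_error):
    """Return the hyperparameter tuple with the smallest out-of-bag error."""
    return min(candidates, key=oob_error)


def toy_oob_error(params):
    """Hypothetical error surface; a real run would fit a forest here."""
    trees, split_vars, node_size = params
    return (abs(trees - 500) / 1000
            + abs(split_vars - 4) / 10
            + abs(node_size - 15) / 100)


def bootstrap_unique_fraction(n):
    """Expected share of distinct observations in a bootstrap sample of size n."""
    return 1 - (1 - 1 / n) ** n


best = pick_best(grid, toy_oob_error)
print(len(grid), best, round(bootstrap_unique_fraction(10**6), 3))
```

      The full grid therefore requires one model fit per combination, which is why the out-of-bag error is a convenient criterion: it needs no separate validation split.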

      (3) It is hard to understand the meaning of the decision curve analysis and the clinical implications behind the net benefit, which are required to clarify the application values of models.

      Thank you for the comment. We have added more description and discussion about the decision curve analysis in the “Methods” and “Discussion” sections.

      “Furthermore, we used decision curve analysis (DCA) to assess the clinical usefulness of prediction model-based guidance for prediabetes management, which calculates a clinical “net benefit” for one or more prediction models in comparison to default strategies of treating all or no patients (3).”

      “Most importantly, a model with good discrimination does not necessarily have high clinical value. Hence, DCA was used to compare the clinical utility of the model before and after adding the metabolites, and this showed a higher net benefit for the latter than the basic model, suggesting the addition of the metabolites increased the clinical value of prediction, i.e., the potential benefit of guiding management in individuals with prediabetes (3, 4). These results provided novel evidence supporting the value of metabolic biomarkers in risk prediction and stratification for the progression from prediabetes to diabetes.”
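
      For readers unfamiliar with the quantity plotted in a decision curve, the sketch below computes the standard net benefit at a single threshold probability p_t: the true-positive rate among all patients minus the false-positive rate weighted by the odds p_t / (1 - p_t). It assumes a binary outcome for simplicity, whereas the study analysed time-to-event data, and the outcomes and predicted risks are toy values rather than study data.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the default strategy of treating every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data: 10 individuals, 3 of whom progress (1 = event).
y = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
predicted_risk = [0.80, 0.60, 0.30, 0.10, 0.20, 0.40, 0.05, 0.15, 0.25, 0.10]

p_t = 0.2
nb_model = net_benefit(y, predicted_risk, p_t)
nb_all = net_benefit_treat_all(y, p_t)
nb_none = 0.0  # treating no one yields neither benefit nor harm

print(nb_model, nb_all, nb_none)
```

      A model is clinically useful at a given threshold only if its net benefit exceeds both default strategies; a decision curve repeats this comparison over a range of thresholds.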

      (4) Notably, the NMR platform utilized within the UK Biobank primarily focused on lipid species. This limitation should be discussed in the manuscript to provide context for interpreting the results and acknowledge the potential bias from the measuring platform.

      Thank you for the comment. We acknowledge the limitation that the NMR platform within the UK Biobank primarily focused on lipid species, as well as the potential bias from the measuring platform, and have added this to the “Discussion” section.

      “Third, the Nightingale metabolomics platform primarily focused on lipids and lipoprotein sub-fractions, and thus the predictive value of other metabolites in the progression from prediabetes to diabetes warranted further research using an untargeted metabolomics approach.”

      (5) The manuscript should explain the potential influence of non-fasting status on the findings, particularly concerning lipoprotein particles and composition. There should be a detailed discussion of how non-fasting status may impact the measurement and the findings.

      According to the reviewer’s suggestion, we have added more details to explain the potential influence of non-fasting status on our findings in the “Discussion” section.

      “Additionally, the use of non-fasting blood samples might increase inter-individual variation in metabolic biomarker concentrations; however, fasting duration has been reported to account for only a small proportion of variation in plasma metabolic biomarker concentrations (5). Therefore, we believe the impact of non-fasting samples on our findings would be minor.”

      (6) Cross-platform standardization is an issue in metabolism, and further descriptions of quality control are recommended.

      Thank you for the comment. We have added more description of quality control in the “Method S1” section in the Supplementary material.

      “Metabolic biomarker profiling by Nightingale Health’s NMR platform provides consistent results over time and across spectrometers. Furthermore, the sample preparation is minimal in the Nightingale Health’s metabolic biomarker platform, circumventing all extraction steps. These aspects result in highly repeatable biomarker measurements. Pre-specified quality metrics were agreed between UK Biobank and Nightingale Health to ensure consistent results across the samples, and pilot measurements were conducted. Nightingale Health performed real-time monitoring of the measurement consistency within and between spectrometers throughout the UK Biobank samples. Two control samples provided by Nightingale Health were included in each 96-well plate for tracking the consistency across multiple spectrometers. Furthermore, two blind duplicate samples provided by the UK Biobank were included in each well plate, with the position information unlocked only after results delivery. Coefficient of variation (CV) targets across the metabolic biomarker profile were pre-specified for both Nightingale Health’s internal control samples and UK Biobank’s blind duplicates. The targets were met for each consecutively measured batch of ~25,000 samples. For the majority of the metabolic biomarkers, the CVs were below 5% (https://biobank.ndph.ox.ac.uk/showcase/refer.cgi?id=3000). Further, the distributions of measured biomarkers from 5 sample batches indicated absence of batch effects (https://biobank.ctsu.ox.ac.uk/ukb/ukb/docs/nmrm_app1).”
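
      The coefficient of variation (CV) used as the quality target above is the standard deviation of repeated measurements divided by their mean. The sketch below computes it for one biomarker from repeated control-sample readings. The readings are toy values, and the use of the sample standard deviation is an assumption; the exact computation used by Nightingale Health is not described in this response.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100 * statistics.stdev(values) / statistics.mean(values)


# Toy repeated readings of one biomarker in a control sample (arbitrary units).
control_readings = [1.00, 1.02, 0.98, 1.01, 0.99]

cv_percent = coefficient_of_variation(control_readings)
meets_target = cv_percent < 5  # 5% is the level quoted for most biomarkers

print(round(cv_percent, 2), meets_target)
```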

      Reviewer #2 (Public Review):

      Deciphering the metabolic alterations characterizing the prediabetes-diabetes spectrum could provide early time windows for targeted preventive measures to extend precision medicine while avoiding disproportionate healthcare costs. The authors identified a panel of 9 circulating metabolites combined with basic clinical variables that significantly improved the prediction from prediabetes to diabetes. These findings provided insights into the integration of these metabolites into clinical and public health practice. However, the interpretation of these findings should take account of the following limitations.

      We appreciate the reviewer’s positive comments and encouragement.

      (1) First, the causal relationship between identified metabolites and diabetes or prediabetes deserves to be further examined particularly when the prediabetic status was partially defined. Some metabolites might be the results of prediabetes rather than the causal factors for progression to diabetes.

      Thank you for your insightful comments. We agree with you that the panel of metabolites in this study might not be the causal factor for progression from prediabetes to diabetes, which needs further validation in experimental studies. We have added this limitation in the “Discussion” section.

      “Fifth, we could not draw any conclusion about the causality between the identified metabolites and the risk for progression to diabetes due to the observational nature, which remained to be validated in further experimental studies.”

      (2) The blood samples were taken at random (not all in a non-fasting state) and so the findings were subjected to greater variability. This should be discussed in the limitations.

      According to the reviewer’s suggestion, we have added more details to explain the potential influence of non-fasting status on our findings in the “Discussion” section.

      “Additionally, the use of non-fasting blood samples might increase inter-individual variation in metabolic biomarker concentrations; however, fasting duration has been reported to account for only a small proportion of variation in plasma metabolic biomarker concentrations (5). Therefore, we believe the impact of non-fasting samples on our findings would be minor.”

      (3) The strength of NMR in metabolic profiling compared to other techniques (i.e., mass spectrometry [MS], another commonly used metabolic profiling method) could be added in the Discussion section.

      According to the reviewer’s suggestion, we have added the strength of NMR in metabolic profiling compared to other techniques in the “Discussion” section.

      “Circulating metabolites were quantified via NMR-based metabolome profiling within the UK Biobank, which offers metabolite quantification with relatively lower costs and better reproducibility (6).”

      (4) Fourth, the applied platform focuses mostly on lipid species which may be a limitation as well.

      Thank you for the comment. We acknowledge the limitation that the NMR platform within the UK Biobank primarily focused on lipid species, as well as the potential bias from the measuring platform, and have added this to the “Discussion” section.

      “Third, the Nightingale metabolomics platform primarily focused on lipids and lipoprotein sub-fractions, and thus the predictive value of other metabolites in the progression from prediabetes to diabetes warranted further research using an untargeted metabolomics approach.”

      (5) It is a very large group with prediabetes, but the results only apply to prediabetes and not to the general population. This should be clear, although the authors have also validated the predictive value of these metabolites in the general population.

      Thank you for the comment. We agree with you that the results only apply to prediabetes and not to the general population, though they also showed potential predictive value among participants with normoglycemia. We have accordingly modified the relevant expressions in the “Conclusion” section to restrict these findings to participants with prediabetes.

      “In this large prospective study among individuals with prediabetes, we detected a panel of circulating metabolites that were associated with an increased risk of progressing to diabetes.”

      References

      (1) Janitza S, Hornung R. On the overestimation of random forest's out-of-bag error. PLoS One. 2018;13(8):e0201904.

      (2) Tian D, Yan HJ, Huang H, et al. Machine Learning-Based Prognostic Model for Patients After Lung Transplantation. JAMA Netw Open. 2023;6(5):e2312022.

      (3) Vickers AJ, van Calster B, Steyerberg EW. A simple, step-by-step guide to interpreting decision curve analysis. Diagn Progn Res. 2019;3:18.

      (4) Li J, Xi F, Yu W, Sun C, Wang X. Real-Time Prediction of Sepsis in Critical Trauma Patients: Machine Learning-Based Modeling Study. JMIR Form Res. 2023;7:e42452.

      (5) Li-Gao R, Hughes DA, le Cessie S, et al. Assessment of reproducibility and biological variability of fasting and postprandial plasma metabolite concentrations using 1H NMR spectroscopy. PLoS One. 2019;14(6):e0218549.

      (6) Geng T-T, Chen J-X, Lu Q, et al. Nuclear Magnetic Resonance–Based Metabolomics and Risk of CKD. American Journal of Kidney Diseases. 2023.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this paper, the authors investigate the impact of fecal microbiota transfer (FMT) on intestinal recovery from enterotoxigenic E. coli infection following antibiotic treatment. Using a piglet model of intestinal infection, the authors demonstrate that FMT reduces weight loss and diarrhea and enhances the expression of tight junction proteins. Sequencing analysis of the intestinal microbiota following FMT showed significant increases in Akkermansia muciniphila and Bacteroides fragilis. Using additional mouse and organoid models, the authors examine the impact of these microbes on intestinal recovery and modulation of the Wnt signaling pathway. Overall, the data support the notion that FMT following ETEC infection is beneficial, however, additional investigation is required to fully elucidate the mechanisms involved.

      Strengths:

      Initial experiments used a piglet model of infection to test the value of FMT on recovery from E. coli. The FMT treatment was beneficial and the authors provide solid evidence that the treatment increased the diversity of the microbiota and enhanced the recovery of the intestinal epithelium. Sequencing data highlighted an increase in Akkermansia muciniphila and Bacteroides fragilis after FMT.

      The mouse data are consistent with the observations in pigs, and reveal that daily gavage with A. muciniphila or B. fragilis enhances intestinal recovery based on histological analysis, expression of tight junction proteins, and analysis of intestinal barrier function.

      The authors demonstrate the benefit of probiotic treatment following infection using a range of model systems.

      Weaknesses:

      Without sequencing the pre-infection pig microbiota or the FMT input material itself, it's challenging to firmly say that the observed bloom in Akkermansia muciniphila and Bacteroides fragilis stemmed from the FMT.

      Response: We have determined the relative abundance of each bacterium in the fecal bacterial suspension, following Hu et al. (2018). The absolute abundances of Akkermansia muciniphila and Bacteroides fragilis in the FMT were 1.3 × 10^3 ± 2.6 × 10^3 and 4.5 × 10^3 ± 6.1 × 10^3, respectively.

      Reference:

      Hu LS, Geng SJ, Li Y, et al. Exogenous Fecal Microbiota Transplantation from Local Adult Pigs to Crossbred Newborn Piglets. Front. Microbiol. 2018, 8.

      The lack of details for the murine infection model, such as weight loss and quantification of bacterial loads over time, make it challenging for a reader to fully appreciate how treatment with Akkermansia muciniphila and Bacteroides fragilis is altering the course of infection. Bacterial loads of E. coli were only quantified at one time point, and the mice that received A. muciniphila and B. fragilis had very low levels of E. coli. Therefore, it is not clear if all mice were subjected to the same level of infection in the first place. The reduced translocation of E. coli to the organs and enhanced barrier function may just reflect the low level of infection in these mice. Further, the authors' conclusion that the effect is specific to A. muciniphila or B. fragilis would be more convincing if the experiments included an inert control bacterium, to demonstrate that gavage with any commensal microbe would not elicit a similar effect.

      The weight loss was added in Figure S2A. All mice were subjected to the same level of infection in the first place.

      Many of the conclusions in the study are drawn from the microscopy results. However, the methods describing both light microscopy and electron microscopy lack sufficient detail. For example, it is not clear how many sections and fields of view were imaged or how the SEM samples were prepared and dehydrated. The mucus layer does not appear to be well preserved, which would make it challenging to accurately measure the thickness of the mucus layer.

      For light microscopy, 3-4 fields were selected from each mouse to count about 30 crypts. The electron microscopy methods were supplemented on lines 263-270. We have removed the mucus layer data.

      Gene expression data appears to vary across the different models, for example, Wnt3 expression in mice versus organoids. Additional experiments may be required to clarify the mechanisms involved. Considering that both of the bacteria tested elicited similar changes in Wnt signaling, this pathway might be broadly modulated by the microbiota.

      The difference in Wnt3 expression pattern between mice and porcine intestinal organoids may be due to the different ETEC infection periods in vivo and in vitro. Furthermore, in vivo, the niche of intestinal stem cells is not only regulated by intestinal epithelial cells but also affected by mesenchymal cells in connective tissues (Luo et al., 2022). In in vitro models, however, the stem cell niche is regulated only by epithelial secretory factors, which may also account for the differences between the in vitro and in vivo results.

      It has been reported that B. fragilis pretreatment significantly increased the relative abundance of A. muciniphila in the intestine of CDI mice, and the growth and maintenance of A. muciniphila were involved in the restoration of intestinal barrier integrity after CDI infection, indicating that there might exist a bacterial metabolic symbiosis between A. muciniphila and B. fragilis (Deng et al., 2018).

      References:

      Luo HM, Li MX, Wang F, et al. The role of intestinal stem cell within gut homeostasis: Focusing on its interplay with gut microbiota and the regulating pathways. Int. J. Biol. Sci. 2022, 18(13): 5185-5206.

      Deng HM, Yang SQ, Zhang YC, et al. Bacteroides fragilis Prevents Clostridium difficile Infection in a Mouse Model by Restoring Gut Barrier and Microbiome Regulation. Front. Microbiol. 2018, 9.

      The unconventional choice to not include references in the results section makes it challenging for the reader to put the results in context with what is known in the field. Similarly, there is a lack of discussion acknowledging that B. fragilis is a potential pathogen, associated with intestinal inflammation and cancer (Haghi et al. BMC Cancer 19, 879 (2019) ), and how this would impact its utility as a potential probiotic.

      Bacteroides fragilis is one of the symbiotic anaerobes within the mammalian gut and is also an opportunistic pathogen that is often isolated from clinical specimens. Bacteroides fragilis was first isolated from sites of infection and was considered a pathogenic bacterium. However, as research has progressed, it has gradually been recognized that, over the course of long-term evolution, Bacteroides fragilis colonizing the gut has established a friendly relationship with the host and is an essential component for maintaining host health, especially with respect to obesity, diabetes and immune deficiency diseases. We have supplemented the discussion on lines 598-603.

      Reviewer #2 (Public Review):

      Ma X. et al proposed that A. muciniphila was a key strain that promotes the proliferation and differentiation of intestinal stem cells by acting on the Wnt/β-catenin signaling pathway. They used various models, such as the piglet model, mouse model, and intestinal organoids to address how A. muciniphila and B. fragilis offer protection against ETEC infection. They showed that FMT with fecal samples, A. muciniphila or B. fragilis protected piglets and/or mice from ETEC infection, and this protection is manifested as reduced intestinal inflammation/bacterial colonization, increased tight junction/Muc2 proteins, as well as proper Treg/Th17 cells. Additionally, they demonstrated that A. muciniphila protected basal-out and/or apical-out intestinal organoids against ETEC infection via Wnt signaling. While a large body of work has been performed in this study, there are quite a few questions to be addressed.

      Major comments:

      - The similar protective effect of FMT with fecal samples, A. muciniphila or B. fragilis is perhaps not that surprising, considering that FMT likely restores microbiota-mediated colonization resistance against ETEC infection. While FMT with fecal samples increases SCFAs, it is unclear whether/how FMT with A. muciniphila or B. fragilis alter the microbiota composition/abundance as well as metabolites in the current models in a way that offers protection.

      We examined changes in the gut microbiota of mice treated with A. muciniphila and B. fragilis through 16S rRNA sequencing, and the results showed that both A. muciniphila and B. fragilis improved the alpha and beta diversities of the microbiota, although these results were not included in this manuscript.

      - Does ETEC infection in piglets/mice cause histological damage in the intestines? These data should be shown.

      The results of scanning electron microscopy (Figure 3A) showed the intestinal damage of piglets after ETEC infection. H&E staining and transmission electron microscopy (Figure 5A and 5B) showed the intestinal damage of mice after ETEC infection.

      - Line 447, "ETEC adheres to intestinal epithelial cells". However, there is no data showing the adherence (or invasion) of ETEC to intestinal epithelial cells, irrespective of piglets/mouse/organoids.

      Scanning electron microscopy (Figure 3A, bottom) showed obvious adhesion of rod-shaped bacteria on the surface of microvilli in ETEC K88-infected piglets. Figure 2C shows the colonization of ETEC K88 in the jejunum and colon of piglets. Figure S2A shows E. coli colonization in the intestines and other tissues of mice.

      - In both basal-out and apical-out intestinal organoid models, A. muciniphila protects organoids against ETEC infection. Did ETEC enter into intestinal epithelial cells at all after only one hour of infection? Is the protection through certain A. muciniphila metabolites?

      It has been reported that the duration of the co-culture for studying the host-microbiota cross-talk by apical-out organoids model is 1 hour (Poletti et al., 2021). In addition, Co et al. (2019) used apical-out organoids model to study host-pathogen interactions, with Salmonella enterica serovar Typhimurium or Listeria monocytogenes invading organoids for an hour.

      References:

      Poletti M, Arnauts K, Ferrante M, et al. Organoid-based Models to Study the Role of Host-microbiota Interactions in IBD. J. Crohns Colitis. 2021, 15(7): 1222-1235.

      Co JY, Margalef-Catala M, Li XN, et al. Controlling Epithelial Polarity: A Human Enteroid Model for Host-Pathogen Interactions. Cell Reports. 2019, 26(9): 2509-2520.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Ma et al. describes a multi-model (pig, mouse, organoid) investigation into how fecal transplants protect against E. coli infection. The authors identify A. muciniphila and B. fragilis as two important strains and characterize how these organisms impact the epithelium by modulating host signaling pathways, namely the Wnt pathway in lgr5 intestinal stem cells.

      Strengths:

      The strengths of this manuscript include the use of multiple model systems and follow-up mechanistic investigations to understand how A. muciniphila and B. fragilis interacted with the host to impact epithelial physiology.

      Weaknesses:

      The major weakness is that, as presented, the manuscript is quite difficult to follow, even for someone familiar with the field. The lack of detail in figure legends, organization of the text, and frequent use of non-intuitive abbreviated group names without a clear key (ex. EP/EF, or C E A B) make comprehension challenging. The results section is perhaps too succinct and does not provide sufficient information to understand experimental design and interpretation without reading the methods section first or skipping to the discussion (as an example: WNT-c59 treatment). Extensive revisions could be encouraged to aid in communicating the potentially exciting findings.

      The abbreviations of experimental groups are first defined in the Methods and Materials, and we have supplemented the experimental design in the results section on lines 397-399, 439-442 and 516-520.

      The bioinformatics section of the methods requires revision and may indicate issues in the pipeline. Merging the forward and reverse reads may represent a problem for denoising. Also since these were sequenced on a NovaSeq, the error learning would have to be modified or the diversity estimates would be inappropriately multiplied. "Alpha diversity and beta diversity were calculated by normalized to the same sequence randomly." Not sure what this means, does this mean subsampled? "Blast was used for sequence alignment", does this mean the taxonomic alignment? This would need to be elaborated on and database versions should be included. The methods, including if any form of multiple testing was included, for LEFSE was also not included.

      Denoising was conducted using UNOISE3 to correct for sequencing errors. Subsequent analyses of alpha diversity and beta diversity were all performed on the normalized output data. Multiple sequence alignment was performed using MUSCLE (v3.8.31) software to obtain the phylogenetic relationships of all OTU sequences. We have supplemented the method of multiple testing on lines 323-328.

      Reviewer #1 (Recommendations For The Authors):

      At some points, the rationale for using both porcine and murine models was unclear, and it would be helpful for the reader to elaborate on the benefits of these models and why they were used in the introduction. Similarly, it would be helpful to describe the benefits of basal-in organoids versus injecting standard organoids with bacteria.

      The main subject of this study was piglets, supplemented by a mouse model for validation. Interpretation of measurements from organoid microinjection experiments must account for multiple confounding variables such as heterogeneous exposure concentrations and durations, as well as impacts of disrupting the organoid wall. We have added the description in the introduction on lines 88-90.

      Line 165 -- The number of piglets used seems high, is it correct approximately 100 pigs were used?

      Nine litters were selected for processing, while only 18 piglets were finally slaughtered.

      There is very little discussion of the preliminary experiment that the authors used to determine how much bacteria to use. I recommend either discussing the data and how the doses were chosen or omitting it. It was not clear if the authors used pasteurized or live bacteria in the experiments. It would also be interesting to include a discussion of the observation that relatively low levels of Akkermansia (10^6 CFU) appeared more beneficial than the higher doses, typically used in these types of experiments.

      We removed these results. The experiments used live bacteria.

      Microscopy methods for both light microscopy and EM would be stronger with added details including how many sections and fields of view were imaged and how the numbers of goblet cells normalized across samples. Without having a clear cross-section of a crypt, it is not clear to me how the images can be used to accurately quantify the number of cells per crypt. Additional details in the methods on how many total crypts were counted should also be included.

      For light microscopy, 3-4 fields were selected from each mouse to count about 30 crypts. We have removed the data of the mucus layer and goblet cells.

      Line 236 -- missing which gene was used.

      The GenBank accession number was added on lines 232-233.

      Line 310 -- OTU nomenclature.

      We have supplemented the OTU nomenclature on line 314.

      Line 413 -- This line seems inconsistent with the data analysis described in the methods section. The authors may need to expand their description of the 16S data analysis to be clear and reproducible.

      We have redescribed the 16S data analysis on lines 312-328.

      Line 413 -- it is not surprising that 16s analysis did not capture species, it will have limited resolution beyond the genus level.

      We deleted this sentence.

      Methods are missing some details on the data analysis, eg. methods/programs and statistical analysis of PCoA and NMDS, LefSe.

      The methods and statistical analysis of PCoA, NMDS and LEfSe were supplemented on lines 323-328.

      Fig 4C -- The images do not clearly capture the mucus layer or how it was analyzed. The sections appear to be cut at a slight angle, with multiple partial sections of crypts. I think this might make it challenging to count goblet cells, especially if the counts are normalized over the number of crypts or villi. The mucus layer does not appear well preserved. For example, I would expect to see an intact mucus layer lining the colon in the PBS control group. Re-cutting sections with a clean cross-section through the tissue will make data analysis easier.

      We have removed data of the mucus layer.

      Fig 4D -- The images appear to be of the mouse proximal colon, whereas the mucus layer and most muc2 will be in the distal colon. If the authors have tissue sections of the distal colon, this may give a clearer image of the mucus layer and might be more consistent with the TEM images in Fig. 4B.

      We apologize for the absence of the distal colon sections.

      To fully preserve the mucus layer, in addition to fixing in Carnoy's solution, the embedding process must be run without the standard washes in 70% ethanol (see: Johansson and Hansson. Methods Mol Biol. (2012) 229; doi: 10.1007/978-1-61779-513-8_13). The mucus will wash away during standard paraffin embedding if the tissue is washed with 70% ethanol, and I wonder if that has occurred in these samples.

      The tissue wasn’t washed with 70% ethanol.

      Fig 6A and 6B -- Although the legend indicates that the data is representative of two independent experiments, it is not clear how many fields of view or cells were imaged. In the bar graphs, it is not clear how many crypts were analyzed and from how many fields of view.

      3-4 fields were selected from each mouse to count about 30 crypts.

      For all of the bar graphs, this could be addressed by displaying all of the data points, rather than just the mean, to give the reader a sense of how many cells were counted (as was done in Fig 7B).

      We have changed the bar graphs to display the individual data points.

      498-501 -- The text says that the gene expression patterns in the organoids are consistent with the in vivo data, but the data patterns of gene expression appear to be different. For example, patterns for Wnt3 and β-catenin expression in mice appear to be the opposite of what was observed in the organoid?

      Lines 509-512 state that the gene expression patterns in mouse organoids and in vivo are consistent. Figure 7C was incorrectly cited as Figure 8C; we have corrected it.

      Since Akkermansia does not grow under aerobic conditions, it should be made clear that the organoid co-culture treatment does not involve actively growing bacterial cultures.

      Reunanen et al. found that Akkermansia can tolerate oxygen: more than 90% of Akkermansia cells remained viable for 1 h under oxic, 5% CO2 conditions.

      Reference:

      Reunanen J, Kainulainen V, Huuskonen L, et al. Akkermansia muciniphila Adheres to Enterocytes and Strengthens the Integrity of the Epithelial Cell Layer. Appl. Environ. Microbiol. 2015, 81(11): 3655-3662.

      Minor points

      Line 50 -"evidence".

      We have changed to “evidence” on line 49.

      Line 64, 422 - italicize, check italics throughout.

      We have checked italics throughout the manuscript.

      Line 64 - may need to be reworded.

      We have changed to “Clostridioides difficile” on line 66.

      Line 77 - pathogen.

      We have changed to “pathogen” on line 77.

      Line 161 - the.

      We have removed “the” on line 161.

      Line 178 - mouse.

      We have changed to “mouse” on line 179.

      Line 313 -- wording is confusing.

      We have changed the description on lines 319-320.

      Line 318 -- Silva version #.

      The version is Silva 132. We have added it on line 316.

      Line 334 - Manufacturer for Live/Dead cell stain?

      The Live/Dead cell stain used was FVS510 (BD Biosciences). We have added it on line 345.

      Line 433 -- FD4 not defined until here.

      We have defined FD4 on lines 218-219.

      Line 512 -- but did not promote.

      We have changed to “but did not promote” on line 526.

      Line 517 -- Looks like this should be "basal-in organoids" instead of basal-out?

      We have changed the "basal-out" to "apical-to" on line 531.

      Line 546 -- induced neonatal should be protected?

      They are in separate pens.

      Jumps from Fig 7B to Fig 8C in the text.

      We apologize for the error and have corrected it.

      Reviewer #2 (Recommendations for The Authors):

      The title itself is a bit misleading. Please consider changing it. The authors meant that A. muciniphila prevents pathogen invasion, but does not function in pathogen invasion.

      We have changed the title.

      Major comments:

      - Figures 4A, 4D, and 6B should include presentation of cross-section pictures.

      We provided cross-section pictures to the journal.

      - Figures 7, 8, and 9 should indicate clearly whether mouse or piglet organoids are used. For instance, in the main text, line 490, it indicates piglet organoids, but in Figure 7A legend, it indicates mouse tissue.

      We apologize for the error and have changed it to “mice” on lines 501-502.

      - In Figure 7A, the 3rd row, 2nd panel, crypts formed into spherical organoids; whereas in Figure 8, ETEC infection of basal-out organoids formed budding organoids. This needs to be better explained.

      Mouse intestinal organoids were cultured ex vivo from crypts isolated from mice infected with ETEC, while porcine intestinal organoids were co-cultured with ETEC in vitro.

      Minor comments:

      - In the results section, the numbering of Figures or supplementary Figures is problematic, i.e., it should start with Figure 1..., Figure S1, but not directly go to Figure S2A etc.

      Figure 1 appears in the Materials and Methods.

      - Line 458, please add the gating strategy used in the flow cytometry study.

      The gating strategy was added on lines 351-356.

      - The effect of A. muciniphila on the proliferation of intestinal epithelium through the Wnt/β-catenin signaling pathway is well known (such as PMID: 32138776). The authors should discuss this in detail.

      We have expanded the discussion on lines 637-639.

      Reviewer #3 (Recommendations For The Authors):

      It is somewhat unusual that the results from the piglets are in the supplement as this is a major strength of the manuscript (Fig S2).

      We have put these results into Figure 2 of the manuscript.

      "Collectively, our results may provide theoretical basis that FMT is a promising mitigation method for pathogenic bacteria infection and a new strategy for precise application of FMT in clinical and livestock production"- This is somewhat of an odd statement as the introduction of the manuscript completely skips over most of what is known about FMTs in the context of C. difficile. Also if anything, does the authors' own data not point mostly at using A. muciniphila on its own? Clinical trials are well underway in humans.

      We have changed the sentences to “Collectively, our results may provide theoretical basis that A. muciniphila is a promising method to repair intestinal barrier damage and a new strategy for the precise application of A. muciniphila in livestock production.” on lines 98-100.

      Line 26: I am not sure probiotic is the right word here given its strict scientific definition. Perhaps beneficial or protective would be more appropriate.

      We have changed “probiotic” to “beneficial” on line 25.

      Line 27: I believe AIMD is antibiotic-induced microbiome-depletion in most usages which may be more accurate and informative than dysregulated.

      The antibiotic type, dose, and treatment duration we used were chosen to induce microbiota disorder.

      It would appear that there are issues in the reference formatting where a number of journal names are missing.

      We have re-edited the reference formatting.

      Line 64- I believe eLife requires the standard practice of italicizing genus and species names. Also Clostridium difficile should now be referred to as Clostridioides difficile.

      We have changed to “Clostridioides difficile” and italicized it on lines 66 and 569. The italicization of genus and species names was checked throughout the manuscript.

      Figure S2C: it is not clear why the melt curve was included here, but the legend should make it more clear what is being shown. I assume this is to provide evidence of specificity?

      The melting curve was used to demonstrate that only ETEC K88 could be amplified by the primers we used. We have added an explanation to the figure legend.

      Figure 2D: there should be a quantitative analysis done on the staining of Muc2.

      We have quantified the staining of MUC2 in Figure 3D.

      Figure 3: The legends are not sufficient. For example: it is not clear what Figure 3A actually shows as the y-axis is not labelled and it is not clear what the relationship is between this and the anosim which is a function for permanova.

      ANOSIM was performed in R using the anosim function, based on the rank order of Bray-Curtis distance values, to test the significance of differences between groups. The y-axis is the rank of the distance between samples.
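
      To make the rank-based logic of this test concrete, the following is a minimal sketch in plain Python, not the R workflow used in the manuscript. It assumes the standard ANOSIM statistic R = (mean between-group rank - mean within-group rank) / (N(N-1)/4), where N is the number of samples, and a label-permutation p-value. The function names and the abundance values are hypothetical and serve only as an illustration.

```python
import random
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim(samples, groups, permutations=999, seed=0):
    """Return (R, p) for a one-way ANOSIM on Bray-Curtis distance ranks."""
    n = len(samples)
    pairs = list(combinations(range(n), 2))
    ranks = average_ranks([bray_curtis(samples[i], samples[j]) for i, j in pairs])

    def r_stat(labels):
        within = [r for r, (i, j) in zip(ranks, pairs) if labels[i] == labels[j]]
        between = [r for r, (i, j) in zip(ranks, pairs) if labels[i] != labels[j]]
        mean_within = sum(within) / len(within)
        mean_between = sum(between) / len(between)
        return (mean_between - mean_within) / (n * (n - 1) / 4)

    observed = r_stat(groups)
    rng = random.Random(seed)
    labels = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(labels)  # break the sample-to-group assignment
        if r_stat(labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical OTU counts for two clearly separated groups (illustration only).
samples = [[10, 0, 0], [9, 1, 0], [10, 1, 0], [0, 10, 1], [0, 9, 0], [1, 10, 0]]
groups = ["A", "A", "A", "B", "B", "B"]
r_value, p_value = anosim(samples, groups)
print(f"R = {r_value:.3f}, p = {p_value:.3f}")
```

      In this toy example every between-group distance outranks every within-group distance, so R reaches its maximum of 1. With only three samples per group, the permutation p-value cannot become very small, which is one reason group sizes matter when interpreting such a plot.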

      Line 416- OTU not OUT.

      We have changed to “OTU” on line 428.

      Figure 4- the naming key needs to be included in the figure legend. C, E, A, and B are not immediately obvious.

      The naming key was included in the figure legend.

      Methods: additional information on the flow cytometry gating strategy/controls should be included.

      The gating strategy was added on lines 351-356.

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      Recent studies have used optical or electrophysiological techniques to chronically measure receptive field properties of sensory cortical neurons over long time periods, i.e. days to weeks, to ask whether sensory receptive fields are stable properties. Akritas et al expand on prior studies by investigating whether nonlinear contextual sensitivity, a property not previously investigated in the context of so-called 'representational drift,' remains stable over days or weeks of recording. They performed chronic tetrode recordings of auditory cortical neurons over at least five recording days while also performing daily measurements of both the linear spectro-temporal receptive field (principal receptive field, PRF) and non-linear 'contextual gain field' (CGF), which captures the neuron's sensitivity to acoustic context. They found that spike waveforms could be reliably matched even when recorded weeks apart. In well-matched units, by comparing the correlation between tuning within one day's session to sessions across days, both PRFs and CGFs showed remarkable stability over time. This was the case even when recordings were performed over weeks. Meanwhile, behavioral and brain state, measured with locomotion and pupil diameter, respectively, resulted in small but significant shifts in the ability of the PRF/CGF model to predict fluctuations in the neuronal response over time.
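
      The within-day versus across-day comparison summarized above can be sketched as follows. This is only an illustration of the general idea, assuming that each receptive-field estimate is flattened to a vector of weights and that similarity is measured with a Pearson correlation; the function names and numbers are hypothetical and are not taken from the authors' analysis code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def within_vs_across(day1_first_half, day1_second_half, day2):
    """Compare same-day similarity with similarity to a later day.

    The within-day value acts as a ceiling set by estimation noise;
    an across-day value close to it suggests a stable receptive field.
    """
    within = pearson(day1_first_half, day1_second_half)
    across = pearson(day1_first_half, day2)
    return within, across


# Made-up flattened receptive-field weights (illustration only).
day1_a = [0.0, 1.0, 2.0, 1.0, 0.0]
day1_b = [0.1, 0.9, 2.1, 1.1, -0.1]
day2 = [0.0, 1.1, 1.9, 0.9, 0.1]

within, across = within_vs_across(day1_a, day1_b, day2)
print(f"within-day r = {within:.3f}, across-day r = {across:.3f}")
```

      Using the within-day correlation as the reference matters because any receptive-field estimate is noisy; a drop in across-day correlation is only evidence of drift if it exceeds the drop expected from that noise alone.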

      Strengths:

      The study addresses a fundamental question, which is whether the neural underpinnings of sensory perception, which encompasses both sensory events and their context, are stable across relevant timescales over which our experiences must be stable, despite biological turnover. Although two-photon calcium imaging is ideal for identifying neurons stably regardless of their activity levels and tuning, it lacks temporal precision and is therefore limited in its ability to capture the complexity of sensory responses. Akritas et al performed painstaking chronic extracellular recordings in the auditory cortex with the temporal resolution to investigate complex receptive field properties, such as neural sensitivities to acoustic context. Prior studies, particularly in the auditory cortex, focused on basic tuning properties or sensory responsivity, but Akritas et al expand on this work by showing that even the nonlinear, contextual elements of sensory neurons' responses can remain stable, providing a mechanism for the stability of our complex perception. This work is both novel and broadly applicable to those investigating cortical stability across sensory modalities.

      Weaknesses:

      Apart from some aspects such as single-unit versus multi-unit, the study largely treats their dataset as a monolith rather than showing how factors such as firing rate, depth, and cell type could define more or less stable subpopulations. It is likely that their methodology did not enable an even sampling over these qualities, and the authors should discuss these biases to put their findings more in context with related studies.

      We did, in fact, investigate whether firing rate and other physiological response properties of units might differentiate subpopulations with different stability. This analysis is shown in Figure 7B-D. There was no apparent relationship between stability of nonlinear contextual gain fields and physiological properties such as mean evoked firing rate, signal-to-noise ratio for evoked firing, or predictive power of the context model (a measure of model goodness-of-fit).

      The reviewer is correct, however, that we did not address possible differences between units recorded at different cortical depths or of different cell types, due to limitations of our methodology and sampling.

      Reviewer #2 (Public Review):

      Summary:

      This study explores the fundamental neuroscience question of the stability of neuronal representation. The concept of 'representational-drift' has been put forward after observations made using 2-photon imaging of neuronal activity over many days revealed that neurons contribute in a time-limited manner to population representation of stimuli or experiences. The authors contribute to the still contested concept of 'drifts' by measuring representation across days using electrophysiology and thus with sufficient temporal resolution to characterize the receptive fields of neurons in timescales relevant to the stimuli used. The data obtained from chronic recordings over days combined with nonlinear stimulus-response estimation allows the authors to conclude that both the spectrotemporal receptive fields as well as contextual gain fields dependent on combination sensitivity to complex stimuli were stable over time. This suggests that when a neuron is responsive to experimental parameters across long periods of time (days), its sensory receptive field is stable.

      Strengths:

      The strength of this study lies in the capacity to draw novel conclusions on auditory cortex representation based on the experimentally difficult combination of stable recordings of neuronal activity, behavior, and pupil over days and state-of-the-art analysis of receptive fields.

      Weaknesses:

      It would have been desirable, but too ambitious in the current setting, to be able to assess what proportion if any of the neurons drop out or in to draw a closer parallel with the 2-photon studies.

      We certainly agree that this comparison would have been desirable in principle. In practice, however, it was technically infeasible and would have been likely to produce misleading results. Our criteria for spike waveform matching across days were extremely conservative, to minimise the potential for a false positive match (which could artifactually decrease apparent stability of unit responses). Therefore, we were likely to have missed some neurons that did in fact remain active over days, due to small changes in extracellular waveform or just noise (which could artifactually decrease apparent stability of population representations). Two-photon imaging is more appropriate for analysing population stability, because cell identity is determined by spatial location. However, as we mention in the paper, electrophysiology is more appropriate for analysing receptive-field stability, because the temporal resolution is sufficient to resolve structure at the millisecond timescales relevant to auditory perception.

      Reviewer #3 (Public Review):

      Summary:

      In their study on "Nonlinear sensitivity to acoustic context is a stable feature of neuronal responses to complex sounds in auditory cortex of awake mice", Akritas et al. investigate the stability of the response properties of neurons in the auditory cortex of mice. They estimate a model with restricted non-linearities for individual neurons and compare the model properties between recordings on the same day and subsequent days. They find that both the linear and nonlinear components of the model stay rather constant over this period and conclude that on the level of the tuning properties, there is no evidence for representational drift on this time scale.

      Strengths:

      - The study has a clear analytical approach that goes beyond linear models and investigates this in a rigorous way, in particular comparing across-day variability to within-day variability.

      - The use of tetrodes is a rather reliable way in electrophysiological recordings to assess neuron identity over multiple days.

      - The comparison with pupil and motion activity was useful and insightful.

      - The presentation of the study is very logical and pretty much flawless on the writing level.

      Weaknesses:

      - The stability results across cells show a good amount of variability, which is only partially addressed.

      - In particular, no attempt is made to localize the cells in space, in order to check whether these differences could be layer or area-dependent.

      - The full context model also includes the possibility to estimate the input non-linearity, which was not done here, but could have been insightful.

      We agree with these comments and acknowledge these limitations, which arise from technological constraints. In particular, the tangential trajectory of our chronic tetrode implant, used to maximise stability of chronic recordings, limited our ability to sample cells from different cortical layers/areas and to explore how these factors might relate to variability in stability across units. Estimating input nonlinearities would have been valuable but also would have increased the number of parameters in the model and the data required to obtain reliable, predictive model fits.

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors explored how galanin affects whole-brain activity in larval zebrafish using wide-field Ca2+ imaging, genetic modifications, and drugs that increase brain activity. The authors conclude that galanin has a sedative effect on the brain under normal conditions and during seizures, mainly through the galanin receptor 1a (galr1a). However, acute "stressors(?)" like pentylenetetrazole (PTZ) reduce galanin's effects, leading to increased brain activity and more seizures. The authors claim that galanin can reduce seizure severity while increasing seizure occurrence, speculated to occur through different receptor subtypes. This study confirms galanin's complex role in brain activity, supporting its potential impact on epilepsy.

      Strengths:

      The overall strength of the study lies primarily in its methodological approach using whole-brain Calcium imaging facilitated by the transparency of zebrafish larvae. Additionally, the use of transgenic zebrafish models is an advantage, as it enables genetic manipulations to investigate specific aspects of galanin signaling. This combination of advanced imaging and genetic tools allows for addressing galanin's role in regulating brain activity.

      Weaknesses:

      The weaknesses of the study also stem from the methodological approach, particularly the use of whole-brain Calcium imaging as a measure of brain activity. While epilepsy and seizures involve network interactions, they typically do not originate across the entire brain simultaneously. Seizures often begin in specific regions or even within specific populations of neurons within those regions. Therefore, a whole-brain approach, especially with Calcium imaging and its inherent limitations, may not fully capture the localized nature of seizure initiation and propagation, potentially limiting the understanding of Galanin's role in epilepsy.

      Furthermore, Galanin's effects may vary across different brain areas, likely influenced by the predominant receptor types expressed in those regions. Additionally, the use of PTZ as a "stressor" is questionable since PTZ induces seizures rather than conventional stress. Referring to seizures induced by PTZ as "stress" might be a misinterpretation intended to fit the proposed model of stress regulation by receptors other than Galanin receptor 1 (GalR1).

      The description of the EAAT2 mutants is missing crucial details. EAAT2 plays a significant role in the uptake of glutamate from the synaptic cleft, thereby regulating excitatory neurotransmission and preventing excitotoxicity. Authors suggest that in EAAT2 knockout (KO) mice galanin expression is upregulated 15-fold compared to wild-type (WT) mice, which could be interpreted as galanin playing a role in the hypoactivity observed in these animals.

      Indeed, our observation of the unexpected hypoactivity in EAAT2a mutants, reported in our characterization of this mutant (Hotz et al., 2022), prompted us to initiate this study and to formulate the hypothesis that the observed upregulation of galanin is a neuroprotective response to epilepsy.

      However, the study does not explore the misregulation of other genes that could be contributing to the observed phenotype. For instance, if AMPA receptors are significantly downregulated, or if there are alterations in other genes critical for brain activity, these changes could be more important than the upregulation of galanin. The lack of wider gene expression analysis leaves open the possibility that the observed hypoactivity could be due to factors other than, or in addition to, galanin upregulation.

      We have performed a transcriptome analysis that we are still evaluating. We can already state that AMPA receptor genes are not significantly altered in the mutant.

      Moreover, the observation that in double KO mice for both EAAT2 and galanin, there was little difference in seizure susceptibility compared to EAAT2 KO mice alone further supports the idea that galanin upregulation might not be the reason for the observed phenotype. This indicates that other regulatory mechanisms or gene expressions might be playing a more pivotal role in the manifestation of hypoactivity in EAAT2 mutants.

      We agree that upregulation of galanin transcripts is at best one of a suite of regulatory mechanisms that lead to hypoactivity in EAAT2 zebrafish mutants.

      These methodological shortcomings and conceptual inconsistencies undermine the perceived strengths of the study and hinder understanding of Galanin's role in epilepsy and stress regulation.

      Reviewer #2 (Public Review):

      Summary:

      This study is an investigation of galanin and galanin receptor signaling on whole-brain activity in the context of recurrent seizure activity or under homeostatic basal conditions. The authors primarily use calcium imaging to observe whole-brain neuronal activity accompanied by galanin qPCR to determine how manipulations of galanin or the galr1a receptor affect the activity of the whole-brain under non-ictal or seizure event conditions. The authors' Eaat2a-/- model (introduced in their Glia 2022 paper, PMID 34716961) that shows recurrent seizure activity alongside suppression of neuronal activity and locomotion in the time periods lacking seizures is used in this paper in comparison to the well-known pentylenetetrazole (PTZ) pharmacological model of epilepsy in zebrafish. Given the literature cited in their Introduction, the authors reasonably hypothesize that galanin will exert a net inhibitory effect on brain activity in models of epilepsy and at homeostatic baseline, but were surprised to find that this hypothesis was only moderately supported in their Eaat2a-/- model. In contrast, under PTZ challenge, fish with galanin overexpression showed increased seizure number and reduced duration while fish with galanin KO showed reduced seizure number and increased duration. These results would have been greatly enriched by the inclusion of behavioral analyses of seizure activity and locomotion (similar to the authors' 2022 Glia paper and/or PMIDs 15730879, 24002024). In addition, the authors have not accounted for sex as a biological variable, though they did note that sex sorting zebrafish larvae precludes sex selection at the younger ages used. It would be helpful to include smaller experiments taken from pilot experiments in older, sex-balanced groups of the relevant zebrafish to increase confidence in the findings' robustness across sexes. 
A possible major caveat is that all of the various genetic manipulations are non-conditional as performed, meaning that developmental impacts of galanin overexpression or galanin or galr1a knockout on the observed results have not been controlled for and may have had a confounding influence on the authors' findings. Overall, this study is important and solid (yet limited), and carries clear value for understanding the multifaceted functions that neuronal galanin can have under homeostatic and disease conditions.

      Strengths:

      - The authors convincingly show that galanin is upregulated across multiple contexts that feature seizure activity or hyperexcitability in zebrafish, and appears to reduce neuronal activity overall, with key identified exceptions (PTZ model).

      - The authors use both genetic and pharmacological models to answer their question, and through this diverse approach, find serendipitous results that suggest novel underexplored functions of galanin and its receptors in basal and disease conditions. Their question is well-informed by the cited literature, though the authors should cite and consider their findings in the context of Mazarati et al., 1998 (PMID:982276). The authors' Discussion places their findings in context, allowing for multiple interpretations and suggesting some convincing explanations.

      - Sample sizes are robust and the methods used are well-characterized, with a few exceptions (as the paper is currently written).

      - Use of a glutamatergic signaling-based genetic model of epilepsy (Eaat2a-/-) is likely the most appropriate selection to test how galanin signaling can alter seizure activity, as galanin is known to reduce glutamatergic release as an inhibitory mechanism in rodent hippocampal neurons via GalR1a (alongside GIRK activation effects). Given that PTZ instead acts through GABAergic signaling pathways, it is reasonable and useful to note that their glutamate-based genetic model showed different effects than did their GABAergic-based model of seizure activity.

      Weaknesses:

      - The authors do not include behavioral assessments of seizure or locomotor activity that would be expected in this paper given their characterizations of their Eaat2a-/- model in the Glia 2022 paper that showed these behavioral data for this zebrafish model. These data would inform the reader of the behavioral phenotypes to expect under the various conditions and would likely further support the authors' findings if obtained and reported.

      We agree that a thorough behavioral assessment would have strengthened the study, but we deemed it outside of the scope of this study.

      - No assessment of sex as a biological variable is included, though it is understood that these specific studied ages of the larvae may preclude sex sorting for experimental balancing as stated by the authors.

      The study was done on larval zebrafish (5 days post fertilization). The first signs of sexual differentiation become apparent at about 17 days post fertilization (reviewed in Ye and Chen, 2020). Hence, sex is not a biological variable at the stage studied.

      - The reported results may have been influenced by the loss or overexpression of galanin or loss of galr1a during developmental stages. The authors did attempt to use the hsp70l system to overexpress galanin, but noted that the heat shock induction step led to reduced brain activity on its own (Supplementary Figure 1). Their hsp70l:gal model shows galanin overexpression anyway (8-fold) regardless of heat induction, so this model is still useful as a way to overexpress galanin, but it should be noted that this galanin overexpression is not restricted to post-developmental timepoints and is present during development.

      The developmental perspective is an important point to consider. Due to the rapid development of the zebrafish, it is not trivial to untangle this. In the zebrafish we first observe epileptic seizures as early as 3 days post fertilization (dpf), when the brain is clearly not well developed yet (e.g. behavioral responses to light are still minimal). Even the 5 dpf stage, at which most of our experiments were conducted, can by no means be considered post-developmental.

      Reviewer #3 (Public Review):

      Summary:

      The neuropeptide galanin is primarily expressed in the hypothalamus and has been shown to play critical roles in homeostatic functions such as arousal, sleep, stress, and brain disorders such as epilepsy. Previous work in rodents using galanin analogs and receptor-specific knockout has provided convincing evidence for the anti-convulsant effects of galanin.

      In the present study, the authors sought to determine the relationship between galanin expression and whole-brain activity. The authors took advantage of the transparent nature of larval zebrafish to perform whole-brain neural activity measurements via widefield calcium imaging. Two models of seizures were used (eaat2a-/- and pentylenetetrazol; PTZ). In the eaat2a-/- model, spontaneous seizures occur and the authors found that galanin transcript levels were significantly increased and associated with a reduced frequency of calcium events. Similarly, two hours after PTZ, galanin transcript levels roughly doubled and the frequency and amplitude of calcium events were reduced. The authors also used a heat shock protein line (hsp70l:gal) where galanin transcript levels are induced by activation of heat shock protein, but this line also shows higher basal transcript levels of galanin. Again, the higher level of galanin in hsp70l:gal larval zebrafish resulted in a reduction of calcium events and a reduction in the amplitude of events. In contrast, galanin knockout (gal-/-) increased calcium activity, indicated by an increased number of calcium events, but a reduction in amplitude and duration. Knockout of the galanin receptor subtype galr1a via crispants also increased the frequency of calcium events.

      In subsequent experiments, eaat2a-/- mutants were crossed with hsp70l:gal or gal-/- to increase or decrease galanin expression, respectively. These experiments showed modest effects, with eaat2a-/- x gal-/- knockouts showing an increased normalized area under the curve and seizure amplitude.

      Lastly, the authors attempted to study the relationship between galanin and brain activity during a PTZ challenge. The hsp70l:gal larvae showed an increased number of seizures and reduced seizure duration during PTZ. In contrast, gal-/- mutants showed an increased normalized area under the curve and a stark reduction in the number of detected seizures, a reduction in seizure amplitude, but an increase in seizure duration. The authors then ruled out the role of Galr1a in modulating this effect during PTZ, since the number of seizures was unaffected, whereas the amplitude and duration of seizures were increased.

      Strengths:

      (1) The gain- and loss-of function galanin manipulations provided convincing evidence that galanin influences brain activity (via calcium imaging) during interictal and/or seizure-free periods. In particular, the relationship between galanin transcript levels and brain activity in Figures 1 & 2 was convincing.

      (2) The authors use two models of epilepsy (eaat2a-/- and PTZ).

      (3) Focus on the galanin receptor subtype galr1a provided good evidence for the important role of this receptor in controlling brain activity during interictal and/or seizure-free periods.

      Weaknesses:

      (1) Although the relationship between galanin and brain activity during interictal or seizure-free periods was clear, the manuscript currently lacks mechanistic insight into the role of galanin during seizure-like activity induced by PTZ.

      We completely agree and concede that this study constitutes only a first attempt to understand the (at least for us) perplexing complexity of galanin function on the brain.

      (2) Calcium imaging is the primary data for the paper, but there are no representative time-series images or movies of GCaMP signal in the various mutants used.

      We are in the process of preparing some time series images and will include them in the next revision.

      (3) For Figure 3, the authors suggest that hsp70I:gal x eaat2a-/- mutants would further increase galanin transcript levels, which were hypothesized to further reduce brain activity. However, the authors failed to measure galanin transcript levels in this cross to show that galanin is actually increased more than in the eaat2a-/- mutant or the hsp70I:gal mutant alone.

      This is an excellent suggestion. We will perform the necessary qPCR experiments and will include the data in the next revision.

      (4) Similarly, transcript levels of galanin are not provided in Figure 2 for Gal-/- mutants and galr1a KOs. Transcript levels would help validate the knockout and any potential compensatory effects of subtype-specific knockout.

      (5) The authors very heavily rely on calcium imaging of different mutant lines. Additional methods could strengthen the data, translational relevance, and interpretation (e.g., acute pharmacology using galanin agonists or antagonists, brain or cell recordings, biochemistry, etc).

      Again, we agree and concede that a number of additional approaches are needed to gain more insight into the complex role of galanin in regulating overall brain activity. These include, among others, behavioral assays, multiple single-cell recordings, and pharmacological interventions.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The manuscript addresses a fundamental question about how different types of communication signals differentially affect brain state and neurochemistry. In addition, their manuscript highlights the various processes that modulate brain responses to communication signals, including prior experience, sex, and hormonal status. Overall, the manuscript is well-written and the research is appropriately contextualized.

      That being said, it remains important for the authors to think more about their analytical approaches, in particular the effect of normalization and the explicit outlining and interpretation of statistical models. As mentioned in the original review, the normalization of neurochemical data seems unnecessary given the repeated-measures design of their analysis, and by normalizing all data to the baseline data and including this baseline data in the repeated-measures analysis, one artificially creates a baseline period with minimal variation that dramatically differs in variance from other periods (akin to heteroscedasticity). If the authors want to analyze how a stimulus changes neurochemical concentrations, they could analyze the raw data but depict normalized data in their figures (similar to other papers). Or they could analyze group differences in the normalized data of the two stimulus periods (i.e., excluding the baseline period used for normalization).

      We appreciate the reviewer’s point on the difference in variance caused by including the 100% baseline values in the analysis. After consulting with our statistician, we chose the latter of the two approaches suggested by the reviewer. Specifically, we reran the analysis to exclude the baseline and focus only on the playback windows and the group differences. The text in the results, the significance signs in the figures, and the discussion are corrected accordingly. Despite these changes, our major conclusions remain as before.

      We also followed this reviewer’s suggestions to clarify the statistical model in studying the experience effect. After further consultation with our statistician, we reran the analysis on experience effect, including all the groups of EXP and INEXP animals together. We have corrected text in the figure captions, results, discussion, and data analysis sections of the manuscript related to the effect of experience and its interactions. This has not changed the conclusion made related to the experience effect in the dataset.

      It would also be useful for the authors to provide further discussion of the potential contributions of different types of experiences (mating vs. restraint) to the change in behavior and neurochemical responses to the vocalization playbacks and to try to disentangle sensory and motor contributions to neurochemical changes.

      We have acknowledged in the Discussion that previous studies suggest that the effect of experience involving stress could be generalized. We believe that this is an important area of future research. Our Discussion acknowledges that the relationship between sensory and motor contributions to neurochemical changes remains an area of interest. We further point out that the time resolution of microdialysis data renders the suggested discussion highly speculative. We plan to use other methods to assess this in future experiments.

      Reviewer #3 (Public Review):

      The work by Ghasemahmad et al. has the potential to significantly advance our understanding of how neuromodulators provide internal-state signals to the basolateral amygdala (BLA) while an animal listens to social vocalizations.

      Ghasemahmad et al. made changes to the manuscript that have significantly improved the work. In particular, the transparency in showing the underlying levels of Ach, DA, and 5HIAA is excellent. My previous concerns have been adequately addressed.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I appreciate the authors' responses to my previous queries (and to the comments by other reviewers). The introduction does a better job contextualizing the data, and the additional details in the results and Methods sections help readers digest the material. I continue to think the topic is interesting and the manuscript is potentially impactful. However, I continue to be concerned about their analytical approaches and other aspects of the revised manuscript.

      (a) Normalization

      In my original review I wrote: "The normalization of neurochemical data seems unnecessary given the repeated-measures design of their analysis and could be problematic; by normalizing all data to the baseline data (p. 24), one artificially creates a baseline period with minimal variation (all are "0"; Figures 2, 3 & 5) that could inflate statistical power." I continue to feel that an analysis of normalized data that includes the baseline data is inappropriate because of the minimal variation in the normalized data for the baseline period. When the normalized data for the baseline period is included in the analysis, there is clearly variation in the extent of variability within each of the time periods (no variability at baseline, variability during periods 1 & 2; analogous to heteroscedasticity). For example, when analyzing the RAW DATA about the change in ACh release in experienced males listening to restraint vocalizations (thank you for releasing the raw data), there was a non-significant effect of time (baseline, period 1, and period 2; linear mixed effects model; F(2,12)=3.2, p=0.0793). However, when the normalized data for this dataset was analyzed (with baseline values being set at 100% for each mouse), there was a statistically significant effect (F(2,12)=4.5, p=0.0352). This example is just to illustrate how normalization can affect (e.g., inflate) statistical power.

      That being said, I do think that it is reasonable to analyze normalized data if the period used for normalization is NOT included in the analysis (see Figure 3 of one of the papers the authors listed in their response to reviewers: Galvez-Marquez et al., 2022). However, from the reading of this manuscript, it does seem like normalized baseline data are analyzed to assess how stimuli affect neurochemical concentrations.
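
      The variance argument above can be illustrated with a minimal sketch. The concentrations below are invented for illustration and are not the authors' data; the function names are likewise hypothetical. The sketch only shows that dividing each animal's values by its own baseline forces the between-animal variance of the baseline period to zero while the playback periods keep non-zero variance.

```python
from statistics import pvariance

# Hypothetical raw concentrations (arbitrary units) for four animals.
# Columns: baseline, playback period 1, playback period 2.
raw = [
    [10.0, 12.0, 13.0],
    [20.0, 22.0, 27.0],
    [5.0, 7.0, 6.0],
    [40.0, 44.0, 50.0],
]


def normalize_to_baseline(rows):
    """Express every value as a percentage of that animal's own baseline."""
    return [[100.0 * value / row[0] for value in row] for row in rows]


def variance_per_period(rows):
    """Between-animal (population) variance for each time period."""
    return [pvariance(column) for column in zip(*rows)]


normalized = normalize_to_baseline(raw)
raw_var = variance_per_period(raw)
norm_var = variance_per_period(normalized)

print("raw variances:       ", raw_var)
print("normalized variances:", norm_var)
```

      In this toy example the raw baseline varies widely between animals, whereas the normalized baseline has no variance at all, which is the unequal-variance pattern the comment describes. Restricting the analysis to the two playback periods avoids comparing against that degenerate period.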

      We appreciate the reviewer’s point on the difference in variance caused by including the 100% baseline values in the analysis. After consulting with our statistician, we chose one of the two approaches suggested by the reviewer. Specifically, we reran the analysis to exclude the baseline and focus only on the playback windows and the group differences. The text in the results, the significance signs in the figures, and the discussion are corrected accordingly. Despite these changes, our major conclusions remain as before. We have included some descriptive statistics in the text because we think these are informative.

      We decided to take this approach because the inter-individual variability in the raw data levels, caused by non-experimental factors, is too great to be useful. As we have stated before, these values are affected by probe placement, collection process, or differences in the HPLC or LC/MS runs. These effects are widely recognized in the field.

      It is worth pointing out a few things about the papers listed by the authors. Li et al. (2023) does depict normalized microdialysis data but it isn't clear that any analysis of the normalized data is conducted. The same can be said about Holly et al. (2016). Further, in Bagley et al (2011), the authors depict normalized data in the figures but conduct analyses on the raw data ("After chronic morphine treatment, systemic naloxone injection increased GABA outflow in PAG by 41% (from 24.6 ± 2.9 nM to a peak of 34.8 ± 3.8 nM, n = 6, P = 0.016), but did not alter GABA levels after vehicle treatment (39.8 ± 8.3 to 38.6 ± 7.4 nM with naloxone at matched peak time, n = 4; Fig. 3a)"). This latter approach (analyzing raw data in a repeated-measures manner and depicting normalized data) seems reasonable for the authors of the current study.

      (b) Clarification and modification of statistical models

      When analyzing the effect of experience on neuromodulator release, the authors analyze the experienced and inexperienced mice independently (e.g., figure 3 vs. 6). The ideal way to assess the effects of experience is to create a factorial model. For example, one could analyze a full factorial model with experience (exp vs. inexp), stimulus type (mating vs. restraint) and time (baseline, period 1 vs period 2, assuming raw data are used). If one wanted to exclude the baseline period because group differences in baseline are not informative, conducting a factorial analysis of normalized data with just the data from period 1 and 2 seems fine. I believe an analysis like this will help increase the legitimacy of the analysis. For example, when analyzing the normalized data (periods 1 and 2) of experienced and inexperienced males in response to mating or restraint vocalizations, you find a significant interaction between experience and stimulus type. Finding an effect of experience in an analysis that includes both experienced and inexperienced mice is ideal from an analytical framework.

      In Figure 6, it is not clear what the statistical model is and what the interactions mean. For example, in the figure legend for figure 6, the authors report time*context and time*sex interactions. However, in this analysis there are two groups of inexperienced males (males that are listening to restraint vocalizations, males that are listening to mating vocalizations) and one group of females (females that are listening to mating vocalizations); in other words, this is an unbalanced analysis. So, when the authors indicate a time*context interaction, does that mean they are comparing the male-restraint group to the combination of males and females listening to mating vocalizations? And when they talk about a time*sex interaction, are they analyzing how males listening to either mating or restraint vocalizations differ from females listening to a mating vocalization? This all seems peculiar to me.

      - A similar set of questions could be raised about interaction effects depicted in Figure 4.

      Overall, I would like this manuscript to be reviewed by a statistician to provide additional input on how best to analyze the data.

      We followed the reviewer’s suggestions to clarify the statistical model in studying the experience effect. After further consultation with the statistician, we reran the analysis on experience effect, including all the groups of EXP and INEXP animals together.

      Design: Intercept + Sex + Context + Experience + Sex*Experience + Context*Experience.

      The model is not full factorial as recommended by the statistician, because we don’t have females in the restraint group and that would make an unbalanced design. Therefore, running GLM based on the above model and included factors, as advised by the statistician, is the best way of approaching the analysis for the current dataset.
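
      To make the structure of this design concrete, the following is a minimal sketch under stated assumptions: dummy (0/1) coding, the level labels shown, and both experience levels being present for females are assumptions for illustration, not details taken from the manuscript. It lists the cells missing from a full factorial layout and shows why a Sex*Context term carries no information when no females are in the restraint group.

```python
from itertools import product

# Factor levels; the first level of each factor is the reference (coded 0).
LEVELS = {
    "sex": ("male", "female"),
    "context": ("mating", "restraint"),
    "experience": ("INEXP", "EXP"),
}

# Every cell of a full factorial design.
full_factorial = set(product(*LEVELS.values()))

# Observed cells: no females heard restraint vocalizations.
observed = {
    cell for cell in full_factorial
    if not (cell[0] == "female" and cell[1] == "restraint")
}
missing = sorted(full_factorial - observed)


def design_row(sex, context, experience):
    """Columns: Intercept, Sex, Context, Experience,
    Sex*Experience, Context*Experience (dummy coded)."""
    s = int(sex == "female")
    c = int(context == "restraint")
    e = int(experience == "EXP")
    return [1, s, c, e, s * e, c * e]


rows = {cell: design_row(*cell) for cell in sorted(observed)}

# A Sex*Context column would be zero in every observed cell,
# so that interaction cannot be estimated from these groups.
sex_context_column = [
    int(cell[0] == "female") * int(cell[1] == "restraint")
    for cell in sorted(observed)
]

print("missing cells:", missing)
for cell, row in rows.items():
    print(cell, row)
print("Sex*Context column:", sex_context_column)
```

      Under this coding the two female-restraint cells are absent and the Sex*Context column is all zeros, which is consistent with leaving that interaction out of the model described above.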

      We have corrected text in the figure captions, results, discussion, and data analysis sections of the manuscript related to the effect of experience and its interactions. The GLM models are clarified for all the figures in the “data analysis” section of the manuscript. We have clarified that the major effect of experience on neuromodulators was seen in the ACh data.

      (c) Analysis of post-stimulus period

      I agree with Reviewer 3 that analyzing the post-stimulus period would be useful. As mentioned in the original review, these data could serve as an opportunity to show that the neurochemical levels returned to baseline and add further support for the model described in Figure 6. In addition, these data could help reveal the link between neurochemical release, auditory responses, and behavior. If neurochemical changes reflect auditory responses, then these should return to baseline during the post-stimulus period. In addition, if behavioral variation (e.g., between mice hearing mating vs. restraint stimuli) persists following the termination of playback, then one could similarly assess whether neurochemical variation persists following playback. If the latter is the case, then the neurochemical release could be more related to the behavior than to the playback stimulus itself.

      We did not change this analysis. Our response to Reviewer 3’s comment is shown below.

      “We decided not to include analyses of the post-stimulus period because this period is subject to wider individual and neuromodulator-specific effects and because it weakens statistical power in addressing the core question—the change in neuromodulator release DURING vocal playback. We agree that the general question is of interest to the field, but we don’t think our study is best designed to answer that question.”

      This was accepted by Reviewer 3. We also note that release patterns have multiple time courses (e.g., Aitta-aho et al., 2018 for ACh), and thus may not support an assumption that levels should return to baseline shortly after playback offset.

      Minor comments:

      Page 7, line 15: I suggest changing "vocalization-dependent" to "stimulus-dependent" because the former could connote patterns of release related to the animal itself vocalizing.

      Changed to: “There were also distinct patterns of ACh and DA release into the BLA depending on the type of vocalization playback (Fig 3C,D).”

      Discussion section: The authors should point out a few caveats with their experiments in the Discussion section. First, experienced animals received both mating (social) and restraint experiences, and it is not clear to what degree each type of experience affected neural and behavioral responses (i.e., specificity of experience effects). For example, mating experience can lead to a wide range of physiological changes, including a resilience to stress (e.g., Leuner et al., PLoS One, 2010; Arnold et al., Hormones and Behavior, 2019), so it is possible that mating experiences by themselves could have induced these changes. Or it could be that experiencing restraint stress affects responses to mating stimuli. This could be added to the first paragraph in page 16. (The authors could also discuss which aspects of the sexual encounters might be most important for the behavioral and neural plasticity.)

      We have added text to raise this issue, stating that it is unknown whether the experience effects are specific and citing the above references concerning the generalized effects of certain experiences.

      Discussion section: It would also be useful for the authors to discuss the extent to which behavior might be driving the neurochemical changes. Some of the analyses suggest that the release is independent of the behavior (e.g., reflects a sensory response) but this could be emphasized more in the Discussion.

      We believe that we have addressed this issue sufficiently in our previous response to related issues raised by this reviewer. As we note, there are limitations in the time resolution of microdialysis data that render the suggested discussion highly speculative. We plan to use other methods to assess this in future experiments.

      Figure 2, legend: Please note that the text above the images describes the stimulus played back to these animals and their hormonal state, and not the type of experience they underwent (i.e., clarify the titles).

      Changed as requested.

      I also agree with Reviewer 3 that "mating experience" is a misnomer for this manuscript. "Social experience with a female" is a more accurate descriptor. If they wanted to specifically provide mating experience, males should have only been tested with estrus (receptive) females. I don't think this wording change detracts from their findings.

      We have not changed this term. As noted in our previous response to Reviewer #3, we stated: “In the mating experience, mounting or attempted mounting was required for the animal to be included in subsequent testing.” Due to this requirement, the term “mating behavior” is informative and appropriate. In our view, “Social experience with a female” does not adequately describe our inclusion criterion or the experience.

      Reviewer #3 (Recommendations For The Authors):

      The work by Ghasemahmad et al. has the potential to significantly advance our understanding of how neuromodulators provide internal-state signals to the basolateral amygdala (BLA) while an animal listens to social vocalizations.

      Ghasemahmad et al. made changes to the manuscript that have significantly improved the work. In particular, the transparency in showing the underlying levels of Ach, DA, and 5HIAA is excellent. My previous concerns have been adequately addressed. I only have a few minor suggestions for the text and one figure.

      Minor suggestions:

      Page 2, Ln 9: add adult before male and female mice

      Changed as requested

      Page 4, Ln 10: add a period after Tsukano et al., 2019)

      Changed as requested

      Page 6, Ln 9: what did you mean by "their interaction"? Being more specific, but concise, would help the readers.

      We revised the wording to clarify that the neuromodulatory systems interact in the emission of positive and negative vocalizations.

      Page 6, Ln 17: You mention Stim 1 and Stim 2, but the stimuli are not defined at this point. The clear explanation is provided in the following paragraph. Maybe consider switching the order and defining the stimuli before you describe the liquid chromatography/mass spectrometry technique.

      We have revised and merged these paragraphs so that Stim 1 and Stim 2 are defined on first use. We also revised our description of the depiction and analysis of neurochemical data.

      Page 11, Ln 12: replace well-proven with well-documented

      Changed as requested

      Figure 2: There are two arrows pointing towards a single track. I assume one of the arrows is a duplicate. If so, delete one of the arrows. If not, please explain what the second arrow represents.

      Arrow removed

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The authors have studied the effects of platelets in OPC biology and remyelination. For this, they used mutant mice with lower levels of platelets as a demyelinating/remyelinating scenario, as well as in a model with large numbers of circulating platelets.

      Strengths:

      -The work is very focused, with defined objectives.

      -The work is properly done.

      Weaknesses:

      -There is no clear effect on a single cell type and/or mechanism involved.

      We appreciate the reviewer’s feedback. We understand that from our in vivo studies we are unable to distinguish whether the effects of platelets are directly exerted on OPCs or indirectly through a different cell type. However, data obtained from the platelet depleted model as well as the new data provided in this revised version in CALRHet mice indicate that, at least, macrophages / microglia do not contribute to the observed effects in OPCs. In addition to this, in vitro data support the direct effects of platelets on OPC function.

      Reviewer #2 (Public Review):

      Summary:

      This paper examined whether circulating platelets regulate oligodendrocyte progenitor cell (OPC) differentiation for the link with multiple sclerosis (MS). They identified that the interaction with platelets enhances OPC differentiation although persistent contact inhibits the process in the long term. The mouse model with increased platelet levels in the blood reduced mature oligodendrocytes, while how platelets might regulate OPC differentiation is not clear yet.

      Strengths:

      The use of both partial platelet depletion and thrombocytosis mouse models gives in vivo evidence. The presentation of platelet accumulation in a time-course manner is rigorous. The in vitro co-culture model tested the role of platelets in OPC differentiation, which was supportive of in vivo observations.

      Weaknesses:

      How platelets regulate OPC differentiation is not clear. What the significance of platelets is in MS progression is not clear.

      We thank the reviewer for their view and assessment of our manuscript. We understand both of the reviewer’s concerns. Firstly, we performed additional in vitro studies and we have confirmed that platelet-contained factors are, at least in part, responsible for modulating OPC differentiation and, thus, direct cell-cell contact is not essential. Secondly, in this revised version, we added references arguing that the plasma levels of platelet microparticles and platelet-specific factors correlate with MS progression and severity.

      Reviewer #1 (Recommendations For the Authors):

      To improve the quality of their work and make it suitable for publication in eLife, I strongly suggest that the authors:

      (1) In vitro co-culture platelets and OPCs to check the effects on this latter cell type biology. 

      Response: We have performed in vitro studies, in which OPCs were co-cultured with washed platelets (WP). We observed that OPC differentiation was boosted after a short exposure to WP, however, prolonged exposure to WP suppressed this effect (revised Figure 3A and B). Also, our new data using platelet lysate (PL) indicate that platelet-contained molecules are responsible for this effect (revised Figure 3C and D). Finally, we showed that by removing PL after sustained exposure (6 DIV) the ability of platelets to promote OPC differentiation is rescued (revised Figure 3E and F).

      (2) In the CALR model, can the authors check the effect of chronic exposure to large numbers of platelets? Is this affecting macrophages (including their polarization)?

      Response: Yes, compared to wild-type mice, in the CALRHET model we confirmed the presence of a larger number of platelets within demyelinated lesions (Figure 4A and C). Also, in this revised version we added data showing that thrombocytosis in the CALRHET model does not affect macrophage / microglia numbers and polarization (revised Supplementary Figure 2).

      (3) Some aspects of the Introduction section seem too old-fashioned (ex.: the use of bFGF instead of FGF2 to refer to Fibroblast Growth Factor 2), and it would be convenient to include more recent references on the role of FGF2 and PDGFa in OPC biology.

      Response: We agree with the reviewer. In this revised version we have changed bFGF for FGF2 and we added more recent references addressing the role of FGF2 and PDGFa in OPC biology.

      (4) There are some constructions and typos that could be corrected. 

      Response: We have checked language constructions as well as typos, and these have been corrected.

      (5) Please revise spelling of some names in the bibliography list (ex.: the correct surname is ffrench-Constant, not Ffrench-Constant).

      We have revised the spelling of names within the bibliography, and we have corrected them accordingly.

      Reviewer #2 (Recommendations For the Authors):

      Mechanisms of platelet-OPC interactions 

      -  transwell co-culture assay will examine if the OPC phenotype is through direct or indirect interactions with platelets; 

      We have performed additional in vitro studies, in which OPCs were exposed to platelet lysate (PL). New results indicate that a short exposure to PL can promote OPC differentiation (revised Figure 3C and D), while a sustained exposure suppresses this effect (revised Figure 3E and F). These findings indicate that platelet-contained factors are, at least in part, responsible for modulating OPC differentiation and, thus, direct cell-cell contact is not essential for such an effect.

      -  can you revert the phenotype of OPCs co-cultured long with platelets (maturation blocked) by removing platelet (then OPC differentiate again?) to see if the phenotype is reversible or not? 

      We would like to thank the reviewer for bringing up this interesting question. We have performed additional in vitro studies to address this issue. We found that removing PL after 6 days of sustained exposure rescues the ability of platelets to promote OPC differentiation (revised Figure 3E and F). These findings indicate that the suppressing effect of prolonged exposure to platelets on OPC differentiation is reversible.

      Clinical correlation 

      -  How many MS patients have an abnormal number of or exposure to platelets?

      We have added new information in the introduction section. Indeed, previous studies have shown that MS patients display higher levels of circulating platelet microparticles (PMPs) (Marcos-Ramiro et al., 2014) as well as increased plasma levels of platelet-specific factors such as P-selectin and PF4 (Cananzi et al., 1987; Kuenz et al., 2005).

      -  Does platelet amount correlate with disease severity?

      We have added new information in the introduction section. Indeed, PMPs are indicative of the clinical status of the disease (Saenz-Cuesta et al., 2014). Also, plasma levels of P-selectin and PF4 correlate with disease course and severity, respectively (Cananzi et al., 1987; Kuenz et al., 2005).

      Image quantification 

      -  Please state how many sections were counted and how many animals were used per condition. Were blinded observers used for each dataset?

      We added this information to the figure legends and the methods section. We counted 3 to 5 sections per lesion. We used 3 to 6 animals per experimental group, and data were analysed by blinded observers.

    1. Author response:

      Reviewer #1 (Public Review): 

      Summary: 

      In this paper, Behruznia and colleagues use long-read sequencing data for 335 strains of the Mycobacterium tuberculosis complex to study genome evolution in this clonal bacterial pathogen. They use both a "classical" pangenome approach that looks at the presence and absence of genes, and a more general pangenome graph approach to investigate structural variants also in non-coding regions. The two main results of the study are that (1) the MTBC has a small pangenome with few accessory genes, and that (2) pangenome evolution is driven by deletions in sublineage-specific regions of difference. Combining the gene-based approach with a pangenome graph is innovative, and the former analysis is largely sound apart from a lack of information about the data set used. The graph part, however, requires more work and currently fails to support the second main result. Problems include the omission of important information and the confusing analysis of structural variants in terms of "regions of difference", which unnecessarily introduces reference bias. Overall, I very much like the direction taken in this article, but think that it needs more work: on the one hand by simply telling the reader what exactly was done, on the other by taking advantage of the information contained in the pangenome graph. 

      Thank you for your constructive feedback. We have hopefully positively addressed all your concerns. Please see our detailed responses below.

      Strengths: 

      The authors put together a large data set of long-read assemblies representing most lineages of the Mycobacterium tuberculosis complex, covering a large geographic area. State-of-the-art methods are used to analyze gene presence-absence polymorphisms (Panaroo) and to construct a pangenome graph (PanGraph). Additional analysis steps are performed to address known problems with misannotated or misassembled genes in pangenome analysis.

      Thank you for your positive feedback. We are pleased that you found these aspects of our work noteworthy and valuable.

      Weaknesses: 

      The study does not quite live up to the expectations raised in the introduction. Firstly, while the importance of using a curated data set is emphasized, little information is given about the data set apart from the geographic origin of the samples (Figure 1). A BUSCO analysis is conducted to filter for assembly quality, but no results are reported. It is also not clear whether the authors assembled genomes themselves in the cases where, according to Supplementary Table 1, only the reads were published but not the assemblies. In the end, we simply have to trust that single-contig assemblies based on long-reads are reliable. 

      The BUSCO results are present for all the genomes in Supplementary Table S1. Genome assemblies were obtained from public databases and other studies that performed the assemblies. We did not perform assemblies for any of the public datasets except the 11 genomes sequenced by ourselves, for which we included the assembly statistics. The public genomes from NCBI were marked as closed based on the NCBI pipelines, so additional quality checks were undertaken there before we included them in our analysis. Marin et al (2024; BioRxiv) also performed additional checks on the vast majority of the genomes before they were included here. We are confident that these genomes represent the highest quality M. tuberculosis dataset possible, but we will check that all genomes are present in the GTDB list, which performs additional tests including CheckM, to add another layer of confidence. Some of the accessions to the final genomes were not included as these papers were not released yet but will be in the next version. Supplementary Table S1 will be updated to include the assembly information for each genome.

      One issue with long read assemblies could be that high rates of sequencing errors result in artificial indels when coverage is low, which in turn could affect gene annotation and pangenome inference (e.g. Watson & Warr 2019, https://doi.org/10.1038/s41587-018-0004-z). Some of the older long-read data used by the authors could well be problematic (PacBio RSII), but also their own Nanopore assemblies, six of which have a mean coverage below 50 (Wick et al. 2023 recommend 200x for ONT, https://doi.org/10.1371/journal.pcbi.1010905). Could the results be affected by such assembly errors? Are there lineages, for example, for which there is an increased proportion of RSII data? Given the large heterogeneity in data quality on the NCBI, I think more information about the reads and the assemblies should be provided.

      We have shown elsewhere (Marin et al (2024; BioRxiv)) that short read sequencing is significantly worse for these types of problems. For this reason, we have included only closed genomes which we believe will reduce the potential for such errors. However, we agree that older sequencing technologies, such as PacBio RSII, can introduce errors in the assemblies and subsequent downstream analyses. We will look for correlation between platform and accessory genome presence/absence to see if the type of sequencing influences the results.

      Wick et al. (2023) recommend a coverage of 200x for ONT sequencing; however, newer analyses from Wick have shown that with modern basecalling and sequencing very low error rates can be achieved with much lower coverage (see https://rrwick.github.io/2023/10/24/ont-only-accuracy-update.html). We are quite confident that gene presence/absence patterns should be robust to this in our analysis but will confirm with some additional analysis on our sequenced genomes.

      The part of the paper I struggled most with is the pangenome graph analysis and the interpretation of structural variants in terms of "regions of difference". To start with, the method section states that "multiple whole genomes were aligned into a graph using PanGraph" (l.159/160), without stating which genomes were for what reason. From Figure 5 I understand that you included all genomes, and that Figure 6 summarizes the information at the sublineage level. This should be stated clearly, at present the reader has to figure out what was done.

      All genomes were included in the pangenome graph construction and in the search for regions of difference. We then grouped genomes into sub-lineages for the additional analyses, as there are not enough genomes per sub-sub-lineage (or lower level) for robust analyses. We will make this clearer in the next version, likely with a flowchart of analyses.

      It was also not clear to me why the authors focus on the sublineage level: a minority of accessory genes (107 of 506) are "specific to certain lineages or sublineages" (l. 240), so why conclude that the pangenome is "driven by sublineage-specific regions of difference", as the title states? What does "driven by" mean? Instead of cutting the phylogeny arbitrarily at the sublineage level, polymorphisms could be described more generally by their frequencies. 

      We acknowledge the importance of polymorphisms, but our study primarily aimed to investigate the presence and absence of genes/genomic regions, as highlighted in our focus on structural differences rather than SNPs (L67-69). We attempted to clarify our goal of exploring gene content variation both between and within lineages (L69) to avoid confusion.

      Our focus on the sub-lineage level addresses the gap in understanding gene content distribution beyond the broad lineage level, where previous pangenome studies have concentrated. Because too few genomes are available to represent all sub-sub-lineages and lower levels of classification, the sub-lineage level is the finest resolution at which we could robustly investigate gene content differences within the MTBC.

      I fully agree that pangenome graphs are the way to go and that the non-coding part of the genome deserves as much attention as the coding part, as stated in the introduction. Here, however, the analysis of the pangenome graph consists of extracting variants from the graph and blasting them against the reference genome H37Rv in order to identify genes and "regions of difference" (RDs) that are variable. It is not clear what the authors do with structural variants that yield no blast hit against H37Rv. Are they ignored? Are they included as new "regions of difference"? How many of them are there? etc. The key advantage of pangenome graphs is that they allow a reference-free, full representation of genetic variation in a sample. Here reference bias is reintroduced in the first analysis step. 

      Genomic analysis of Mycobacterium tuberculosis is H37Rv reference-centric, meaning that RDs are typically defined based on their presence or absence relative to the reference strain. Our approach comparing variants to the H37Rv reference was primarily to identify and name the known regions of differences (RDs). For structural variants that did not yield a BLAST hit against H37Rv, we assigned them as new RDs in Supplementary Table S4 to provide a reference-free approach for investigating gene content differences. Further clarifications on the definition and identification of RDs will be added.

      Along similar lines, I find the interpretation of structural variants in terms of "regions of difference" confusing, and probably many people outside the TB field will do so. For one thing, it is not clear where these RDs and their names come from. Did the authors use an annotation of RDs in the reference genome H37Rv from previously published work (e.g. Bespiatykh et al. 2021)? This is important basic information, its lack makes it difficult to judge the validity of the results. The Bespiatykh et al. study uses a large short-read data (721 strains) set to characterize diversity in RDs and specifically focuses on the sublineage-specific variants. While the authors cite the paper, it would be relevant to compare the results of the two studies in more detail. 

      Indeed, the term regions of difference (RDs) is somewhat M. tuberculosis specific. These are large polymorphisms which are differentially present in clades (primarily lineages) of M. tuberculosis. Annotation and naming of these is based on Bespiatykh et al. (2021) and the RDscan tool, which identifies RD regions based on the H37Rv genomic coordinates. We obtained the corresponding Rv locus for RD regions by matching their genomic coordinates on the H37Rv genome and confirmed the RDs using the bed file from RDscan. We have used their names where our findings overlap, and any new RDs we report are not found in their data. We will ensure this is clearer in the next version.

      As far as I understand, "regions of difference" have been used in the tuberculosis field to describe structural variants relative to the reference genome H37Rv. Colloquially, regions present in H37Rv but absent in another strain have been called "deletions". Whether these polymorphisms have indeed originated through deletion or through insertion in H37Rv or its ancestors requires a comparison with additional strains. While the pangenome graph does contain this information, the authors do not attempt to categorize structural variants into insertions and deletions but simply seem to assume that "regions of difference" are deletions. This, as well as the neglect of paralogs in the "classical" pangenome analysis, puts a question mark behind their conclusion that deletion drives pangenome evolution in the MTBC. 

      The term regions of difference or RDs has traditionally been used to describe structural variants relative to the H37Rv genome, often interpreted as deletions. Consistent with our study, Bespiatykh et al. (2021) observed two types of deletions: those associated with repeat sequences or mobile genetic elements, and conserved RDs that are phylogenetically informative deletions inherited by all descendants of a strain.

      In our study, we employed a phylogenetic approach to identify deletions. If RDs are present in genomes both upstream and downstream of a phylogenetic branch but are absent in one specific branch, we interpret this as evidence of gene deletion (Figure 5B). This method was systematically applied to all RDs identified as deletions in our study; we will clarify this better in the next version.
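
      The rule described above can be illustrated with a minimal sketch. It assumes a toy rooted tree written as nested tuples and a made-up presence/absence table, and it reads "upstream and downstream" as "present in the sister clade and in the rest of the tree"; that reading, the function names, and the data are illustrative assumptions, not the pipeline used in the study.

```python
def leaves(node):
    """Return the leaf names under a node (leaf = str, internal node = tuple)."""
    if isinstance(node, str):
        return [node]
    out = []
    for child in node:
        out.extend(leaves(child))
    return out


def find_deletion_clades(tree, present):
    """Return maximal clades in which a region is absent from every genome
    while it is present both in the sister clade(s) and elsewhere in the tree."""
    all_leaves = set(leaves(tree))
    hits = []

    def visit(node, parent):
        clade = set(leaves(node))
        absent_everywhere = not any(present[name] for name in clade)
        if absent_everywhere and parent is not None:
            sisters = set(leaves(parent)) - clade
            outside = all_leaves - set(leaves(parent))
            if any(present[n] for n in sisters) and any(present[n] for n in outside):
                hits.append(sorted(clade))
            return  # nothing below an all-absent clade can be called separately
        if not isinstance(node, str):
            for child in node:
                visit(child, node)

    visit(tree, None)
    return hits


# Hypothetical example: the region is missing only from genomes C and D.
toy_tree = (("A", "B"), (("C", "D"), "E"))
toy_presence = {"A": True, "B": True, "C": False, "D": False, "E": True}
print(find_deletion_clades(toy_tree, toy_presence))
```

      In this sketch, a region missing from one entire side of the root is deliberately not called, because without an outgroup it cannot be distinguished from an insertion on the other side.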

      We acknowledge the importance of considering paralogs in pangenome analysis. While the evolution of genomes is driven by duplication, loss and transfer, we know that transfer is not a mechanism in modern MTBC evolution and we have focussed here on loss. Duplication (paralog) analysis from annotations continues to be difficult to quantify due to the difficulty of reliably confirming paralogy. We have addressed the effect of different Panaroo options, including merge paralogs, on the genomic diversity and pangenome estimation of MTBC in our associated paper (Marin et al. 2024). This study showed that most structural variation in Mycobacterium tuberculosis is attributed to rearrangements of existing sequences rather than novel sequence content. For example, the transposable element IS6110 accounts for a significant portion of sequence variation. This hints that paralogs are not very important in terms of gene content differences in MTBC.

      However, we will attempt to build on this by looking at Panaroo outputs without merged paralogs and looking for potentially duplicated genomic stretches in the Pangraph analyses. This will hopefully show more robustly that the MTBC diversity is primarily deletion driven.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors attempted to investigate the pangenome of MTBC by using a selection of state-of-the-art bioinformatic tools to analyse 324 complete and 11 new genomes representing all known lineages and sublineages. The aim of their work was to describe the total diversity of the MTBC and to investigate the driving evolutionary force. By using long read and hybrid approaches for genome assembly, an important attempt was made to understand why previous reports gave different estimates of the MTBC pangenome size. 

      Strengths: 

      A stand-out feature of this work is the inclusion of non-coding regions as opposed to only coding regions, which were the focus of previous papers and analyses that investigated the MTBC pangenome. A unique feature of this work is that it highlights sublineage-specific regions of difference (RDs) that were previously unknown. Another major strength is the utilisation of long-read whole-genome sequences, in combination with short-read sequences when available. It is known that using only short reads for genome assembly has several pitfalls. The parallel approach of utilizing both Panaroo and Pangraph for pangenomic reconstruction illuminated the limitations of both tools while highlighting genomic features identified by both. This is important for any future work and perhaps alludes to the need for more MTBC-specific tools to be developed. 

      Thank you for recognising the strengths of our work.

      Weaknesses: 

      The only major weakness was the limited number of isolates from certain lineages and the over-representation of others, which was also acknowledged by the authors. However, since the case is made that the MTBC has a closed pangenome, the inclusion of additional genomes would not result in the identification of any new genes. This is a strong statement without an illustration/statistical analysis to support this. 

      The language around open and closed pangenomes is difficult to convey and indeed we will improve this for the next version. We aimed to show that with a set of highly curated genomes that span the breadth of known diversity within the MTBC, we see no evidence for a large, open pangenome as has been previously suggested. We instead suggest that adding new genomes is unlikely to bring large additions to the accessory genome, therefore showing that the MTBC pangenome tends towards being closed. We will add additional visualisations such as gene accumulation plots to better support this argument.
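
      As a rough illustration of the gene accumulation plots mentioned above, the sketch below computes mean pangenome and core-genome sizes as genomes are added in random order. The input (a dictionary mapping each genome to its set of gene families), the function name, and the toy data are assumptions for illustration only; in practice the input would be derived from a gene presence/absence matrix.

```python
import random


def accumulation_curves(genomes, n_permutations=100, seed=0):
    """genomes: dict mapping genome name -> set of gene families.
    Returns (pan_mean, core_mean): mean pangenome and core-genome sizes after
    adding 1..N genomes, averaged over random genome orderings."""
    rng = random.Random(seed)
    names = sorted(genomes)
    n = len(names)
    pan_totals = [0] * n
    core_totals = [0] * n
    for _ in range(n_permutations):
        order = names[:]
        rng.shuffle(order)
        pan = set()
        core = None
        for i, name in enumerate(order):
            genes = genomes[name]
            pan |= genes                                   # union grows the pangenome
            core = set(genes) if core is None else core & genes  # intersection shrinks the core
            pan_totals[i] += len(pan)
            core_totals[i] += len(core)
    pan_mean = [t / n_permutations for t in pan_totals]
    core_mean = [t / n_permutations for t in core_totals]
    return pan_mean, core_mean


# Made-up toy data: three genomes sharing families "a" and "b".
toy = {"g1": {"a", "b", "c"}, "g2": {"a", "b", "d"}, "g3": {"a", "b"}}
pan_curve, core_curve = accumulation_curves(toy, n_permutations=50)
print(pan_curve, core_curve)
```

      A pangenome curve that flattens as genomes are added is consistent with a pangenome that tends towards being closed, whereas a curve that keeps rising suggests an open one; the shape should be interpreted alongside sampling depth per lineage.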

    1. Author response:

      The following is the authors’ response to the current reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors show that upon treatment with Doxorubicin (Doxo), there is an increase in senescence and inflammatory markers in the muscles. They also show these genes get upregulated in C2C12 myoblasts when treated with conditioned media or 15d-PGJ2. 15d-PGJ2 induces cell death in the myoblasts, decreases proliferation (measured by cell numbers), and decreases differentiation and fusion. 15d-PGJ2 modified Cys184 of HRas, which is required for its activation as indicated by the FRET analysis with RAF RBD. They also showed that 15d-PGJ2 activates ERK signaling, but not Akt signaling, through the electrophilic center. 15d-PGJ2 inhibits Golgi localization of HRAS (only WT, not C181 or C184 mutant). They also showed that expressing the WT HRas followed by 15d-PGJ2 treatment led to a decrease in the levels of MHC mRNA and protein, and this defect is dependent on C184. This is a well-written manuscript with interesting insights into the mechanism of action of 15d-PGJ2. However, some clarification and experiments will help the paper advance the field significantly.

      Strengths:

      The data clearly shows that 15d-PGJ2 has a negative role in the myoblast cells and that it leads to modification of HRas protein. Moreover, the induction of biosynthetic enzymes in the PGD2 pathway also supports the induction of 15d-PGJ2 in Doxorubicin-treated cells. Both conditioned media experiments and the 15d-PGJ2 experiments show that 15d-PGJ2 could be the active component secreted by the senescent myoblasts.

      Weaknesses:

      The genes that are upregulated in the muscles upon injection with Doxo are also markers for inflammation. Since Doxo is also known to induce systemic inflammation, it is important to delineate these two effects (Inflammatory cells vs senescent cells). The expression of beta Gal and other markers of senescence in the tissue sections will help to delineate these.

      As pointed out Doxo induces systemic inflammation along with inducing DNA damage-mediated senescence. Therefore, along with the inflammatory markers of the SASP (CXCL1/2, TNF1α, IL6, PTGS1/2, PTGDS) we also observed an increase in the mRNA levels of canonical markers of DNA damage-mediated senescence. We observed an increase in the mRNA levels of cell cycle and senescence associated proteins p16 and p21 (Fig. 1C). We also observed an increased nuclear accumulation of p21 (Fig. 1A) and increased levels of phosphorylated H2A.X in the nucleus (Fig. 1B).

      In Figure 2, where the defect in the differentiation of myoblasts upon treatment with 15d-PGJ2 is shown, most of the cells die within 48 hours at higher concentrations, making it difficult to perform the experiments. This also shows that 15d-PGJ2 was toxic to these cells. Lower concentrations show a decrease in the differentiation based on the lower number of nuclei in fibers and low expression of MyoD, MyoG, and MHC. However, it is unclear if this is due to increased cell death or defective differentiation. It would be a lot more informative if the cell count, cell division, and cell death could be plotted for these concentrations of the drug during the experiment.

      We measured the viability of C2C12 cells after 24 hours of treatment with 15d-PGJ2 using the MTT assay and observed that the viability of cells was decreased after treatment with 15d-PGJ2 (10 µM) but not with 15d-PGJ2 (1 µM, 2 µM, 4 µM, or 5 µM) (see Fig. S2A of the updated manuscript). The results and figures of the manuscript have been updated accordingly.

      Also, in the myoblast experiments, are the effects of treatment with Dox reversible?

      The treatment with Doxorubicin is irreversible as the senescent phenotype was not reversed after withdrawal of Doxorubicin, even after 20 days.

      In Figure 3, most of the experiments are done at a high concentration, which induces almost complete cell death within 48 hours.

      Figure 3 is an acute experiment for only 1 hour, at which time no cell death was observed. Specifically, we measured the phosphorylation of Erk and Akt proteins after 1 hour of treatment with 15d-PGJ2 (10 µM) during which we did not observe any cell death.

      Even at such a high concentration of 15d-PGJ2, the increase in ERK phosphorylation is minimal.

      We observe a ~30% increase in the phosphorylation of Erk proteins after treatment with 15d-PGJ2 in 0.2% serum medium compared to treatment with vehicle (DMSO). This is reproducible and significant.

      The experiment in Figure 4C shows that the C181 and C184 mutants of HRas show higher levels in the Golgi compared with WT. However, this could very well be due to the defect in palmitoylation rather than the modification with 15d-PGJ2.

      Our data does not suggest higher levels of C184S mutant in the Golgi compared with WT (Fig. S4A). We observed that the ratio of HRas levels in the Golgi to the HRas levels in the plasma membrane were similar in C2C12 cells expressing HRas C184S and HRas WT (Fig. S4A graph columns 1 and 5).

      Though the authors allude to the possibility that intracellular redistribution of HRas by 15d-PGJ2 requires C181 palmitoylation, the direct influence of C184 modification on C181 palmitoylation is not shown. To have a meaningful conclusion, the authors need to compare the palmitoylation and modification with 15d-PGJ2.

      Palmitoylation of HRas at C181 is required for the localization of HRas at the plasma membrane. The inhibition of palmitoylation of C181, either by mutation (C181S) or by treatment with the protein palmitoyl transferase inhibitor 2-Bromopalmitate, results in the accumulation of HRas at the Golgi (Rocks et al., 2005) (Fig. S4A). Modification of HRas at C184 by 15d-PGJ2 (Fig. 3A) could inhibit the palmitoylation of HRas at C181. However, our data do not support this hypothesis, as modification of HRas WT by 15d-PGJ2 does not increase the level of HRas at the Golgi, unlike the inhibition of cysteine palmitoylation caused by the C181S mutation.

      To test if the inhibition of myoblast differentiation depends on HRas, they overexpressed the HRas and mutants in the C2C12 lines. However, this experiment does not take the endogenous HRas into consideration, especially when interpreting the C184 mutant. An appropriate experiment to test this would be to knock down or knock out HRas (or make knock-in mutations of C184) and show that the effect of 15d-PGJ2 disappears.

      Endogenous HRas (wild type) is present in the C2C12 cells overexpressing the EGFP-tagged HRas constructs. Therefore, we only observe a partial rescue in the differentiation after 15d-PGJ2 treatment in C2C12 cells expressing the C184S mutant (Fig. 4D and E). However, since HRas is expressed under the high-expression CMV promoter and in the absence of other regulatory elements, the overexpressed constructs do show a dominant effect over the endogenous HRas, showing cysteine-mutant-dependent inhibition of differentiation of myoblasts after treatment with 15d-PGJ2 (Fig. 4D and E).

      Moreover, in this specific experiment, it is difficult to interpret without a control with no HRas construct and another without the 15d-PGJ2 treatment.

      The mRNA levels of MyoD, MyoG, and MHC in C2C12 cells expressing HRas constructs after treatment with 15d-PGJ2 were normalized to the mRNA levels in C2C12 cells expressing the corresponding constructs and treated with vehicle (DMSO). mRNA levels in C2C12 cells treated with vehicle were not shown as they were normalized to 1. MHC protein levels in C2C12 cells expressing HRas constructs after 15d-PGJ2 treatment were normalized to those in C2C12 cells treated with vehicle (DMSO). Since the hypothesis was to study the effect of HRas cysteine mutations on the differentiation of myoblasts after treatment with 15d-PGJ2, C2C12 cells expressing HRas WT serve as an adequate control. Fig. 2 shows the effect of 15d-PGJ2 on muscle differentiation when HRas was not overexpressed.

      Moreover, the overall study does not delineate the toxic effects of 15d-PGJ2 from its effect on the differentiation.

      The inhibition of differentiation in C2C12 cells after treatment with 15d-PGJ2 cannot be attributed to the general toxicity of 15d-PGJ2 in cells. We show that the inhibition of differentiation of myoblasts after 15d-PGJ2 treatment depends on modification of HRas at C184, i.e. failure to modify HRas at C184 (Fig. 3A) and the resultant lack of activation (Fig. 3B) by 15d-PGJ2 rescues this inhibition of differentiation of C2C12 cells (Fig. 4D and E), dissecting the inhibition of differentiation of myoblasts by 15d-PGJ2 from general toxic effects of 15d-PGJ2 on cell physiology.

      Please note that the effect of 15d-PGJ2 on cell physiology is context-specific. On the one hand, 15d-PGJ2 has been shown to exert tumor-suppressor effects by inhibiting the proliferation of ovarian cancer cells and lung adenocarcinoma cells (de Jong et al., 2011; Slanovc et al., 2024). On the other hand, 15d-PGJ2 exerts pro-carcinogenic effects by induction of epithelial to mesenchymal transition in breast cancer cells MCF7 and inhibition of tumor-suppressor protein p53 in MCF7 and PC-3 cells (Choi et al., 2020; Kim et al., 2010).

      Reviewer #2 (Public Review):

      Summary:

      In this study, Swarang and colleagues identified the lipid metabolite 15d-PGJ2 as a potential component of senescent myoblasts. They proposed that 15d-PGJ2 inhibits myoblast proliferation and differentiation by binding and regulating HRas, suggesting its potential as a target for restoring muscle homeostasis post-chemotherapy.

      Strengths:

      The regulation of HRas by 15d-PGJ2 is well controlled.

      Weaknesses:

      (1) I still think the novelty is limited by previously published findings. The authors themselves noted that the accumulation of 15d-PGJ2 in senescent cells has been reported in various cell types, including human fibroblasts, HEPG2 hepatocellular carcinoma cells, and HUVEC endothelial cells (PMCID: PMC8501892). Although the current study observed similar activation of 15d-PGJ2 in myoblasts, it appears to be additive rather than fundamentally novel. The covalent adduct of 15d-PGJ2 with Cys-184 of H-Ras was reported over 20 years ago (PMID: 12684535), and the biochemical principles of this interaction are likely universal across different cell types. The regulation of myogenesis by both HRas and 15d-PGJ2 has also been extensively reported previously (PMID: 2654809, 1714463, 17412879, 20109525, 11477074). The main conceptual novelty may lie in the connection between these points in myoblasts. But as discussed in another comment, the use of C2C12 cells as a model for senescence study is questionable due to the lack of the key regulator p16. The findings in C2C12 cells may not accurately represent physiologically relevant myoblasts. It is recommended that these findings be validated in primary myoblasts to strengthen the study's conclusions.

      This is the first study to show a molecular mechanism where activation of HRas signaling in skeletal myoblasts due to covalent modification by 15d-PGJ2 at C184 of HRas inhibits the differentiation of skeletal myoblasts.

      (2) The C2C12 cell line is not an ideal model for senescence study.

      C2C12 cells are a well-established model for studying myogenesis. However, their suitability as a model for senescence studies is questionable. C2C12 cells are immortalized and do not undergo normal senescence like primary cells as C2C12 cells are known to have a deleted p16/p19 locus, a crucial regulator of senescence (PMID: 20682446). The use of C2C12 cells in published studies does not inherently validate them as a suitable senescence model. These studies may have limitations, and the appropriateness of the C2C12 model depends on the specific research goals.

      Several reports have shown that cells undergo senescence independent of p16 expression. MCF7 human breast adenocarcinoma cells have been shown to undergo DNA damage-mediated and oncogene-induced senescence, as seen after treatment with Doxorubicin (PMCID: PMC7025418) and expression of constitutively active HRas (PMID: 17135242), despite the homozygous deletion of the p16 locus (ISBN 9780124375512 Chapter 17 Table 2), through upregulation of the cell cycle inhibitor protein p21. In this study, we observe an increase in the senescence markers in C2C12 cells after treatment with Doxo (Fig. 1). We also observed an increase in the markers of DNA damage-mediated senescence in MCF7 after treatment with Doxo (data will be included in the revised manuscript). Based on these observations, we have concluded that C2C12 cells undergo senescence despite lacking the p16/p19 locus.

      In the study by Moustogiannis et al. (PMID: 33918414), they claimed to have aged C2C12 cells through multiple population doublings. However, the SA-β-gal staining in their data, which is often used to confirm senescence, showed almost fully confluent "aged" C2C12 cells. This confluent state could artificially increase SA-β-gal positivity, suggesting that these cells may not truly represent senescence. Moreover, the "aged" C2C12 cells exhibited normal proliferation, which contradicts the definition of senescence. Similar findings were reported in another study of C2C12 cells subjected to 58 population doublings (PMID: 21826704), where even at this late stage, the cells were still dividing every 2 or 3 days, similar to younger cells at early passages. More importantly, I do not know how p16 was detected in that paper since the locus was already mutated. In terms of p21, there was no difference in the proliferative C2C12 cells at day 0.

      In the study by Moiseeva et al. in 2023 (PMID: 36544018), C2C12 cells were used for senescence modeling for siRNA transfection. However, the most significant findings were obtained using primary satellite cells or confirmed with complementary data.

      In conclusion, while molecular changes observed in studies using C2C12 cells may be valid, the use of primary myoblasts is highly recommended for senescence studies due to the limitations and questionable senescence characteristics of the C2C12 cell line.

      (3) Regarding the source of increased PGD in the conditioned medium, I want to emphasize that it's unclear whether the PGD or its metabolites increase in response to DNA damage or the senescence state. Thus, using a different senescent model to exclude the possibility of DNA damage-induced increase will be crucial.

      Though senescence can be induced by several stress stimuli such as DNA damage, oncogene expression, ROS, and mitochondrial dysfunction, DNA damage remains critical for the induction of the SASP (reviewed in PMID: 20078217). Also, other models of senescence, like Oncogene Induced Senescence (reviewed in PMID: 17671427), ROS Induced Senescence (PMID: 24934860), and Mitochondrial Dysfunction Associated Senescence (MiDAS) (PMID: 26686024), have shown upregulation of DNA damage-associated signaling pathways. In this study, we have explored the SASP of cells undergoing senescence upon DNA damage mediated by the chemotherapy drug Doxorubicin.

      (4) Similarly for the in vivo Doxorubicin (Doxo) injection, both reviewers have raised concerns about the potential side effects of Doxo, including inflammation, DNA damage, and ROS generation. These effects could potentially confound the results of the study. The physiological significance of this study will heavily rely on the in vivo data. However, the in vivo senescence component is confounded by the side effects of Doxo.

      We concur that this is a limitation of this study and the subsequent work will demonstrate the origin of prostaglandin biosynthesis after treatment with Doxo in vivo.

      (5) Figure 2A lacks an important control from non-senescent cells during the measurement of C2C12 differentiation in the presence of conditioned medium. The author took it for granted that the conditioned medium from senescent cells would inhibit myogenesis, relying on previous publications (PMID: 37468473). However, that study was conducted in the context of myotonic dystrophy type 1. To support the inhibitory effect in the current experimental settings, direct evidence is required. It would be necessary to include another control with conditioned medium from normal, proliferative C2C12 cells.

      Conditioned medium of senescent cells of several types, like senescent myoblasts in case of DM1 (PMID: 37468473), adipocytes undergoing senescence due to H2O2 treatment, Insulin Resistance, and Replicative senescence (PMID: 37321332), has been shown to inhibit the differentiation of myoblasts. Therefore, in this study, we measured the effect of prostaglandin PGD2 and its metabolites on the differentiation of myoblasts by inhibiting the biosynthesis of PGD2 in senescent myoblasts by treatment with AT-56. We inhibited the synthesis of PGD2 in senescent cells by treatment with AT-56, and then collected the conditioned medium. Conditioned medium collected from senescent C2C12 cells treated with vehicle (DMSO) served as a control for the experiment.

      (6) Statistical analyses problems.

      Only the t-test was used throughout the study, even when there are more than two groups. Please have a statistician evaluate the replicates and statistical analyses used.

      In experiments with more than two groups, the t-test was used for column-wise comparison of the experiment samples to the control sample. Multiple sample comparisons using one-way or two-way ANOVA were avoided as experimental samples were individually compared to the control sample.

      For the 15d-PGJ2/cell concentration measurements in Figure 1F, there were only two replicates, which were provided in the supplementary table after being requested. Was that experiment repeated with more biological replicates?

      Additional replicates of the experiment will be included in the revised manuscript.

      For Figures 1C, 1F, 1G, 1J, 2C, 2E, 3A, 3E, 3F, 4D, and 4E, please include each data point in the bar graphs, as done in Fig 1D, or at least state how many biological replicates were used for each experiment.

      Appropriate revisions will be made in the figure legends of the revised manuscript.

      There is no error bar in a lot of control groups (Fig 2C, 2E, 3EF, 4E, S4B).

      There are no error bars for the control groups in Figures 2C, 2E, 3E, 3F, 4E, and S4B because the experimental samples of each replicate were normalized to the corresponding control sample, which sets the value of the control sample in every replicate to 1.
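
      The consequence of this per-replicate normalization can be shown with a short sketch. The measurements below are invented purely for illustration and the function name is hypothetical; the point is only that dividing every replicate by its own control fixes the control at exactly 1, so its spread is zero by construction.

```python
from statistics import mean, stdev


def normalize_to_control(replicates):
    """replicates: list of (control_value, treated_value) pairs, one per replicate.
    Returns per-replicate fold changes for the control and treated samples."""
    control_fc = [control / control for control, _ in replicates]
    treated_fc = [treated / control for control, treated in replicates]
    return control_fc, treated_fc


# Invented raw values for three replicates: (control, treated).
raw = [(2.0, 1.0), (4.0, 3.0), (8.0, 4.0)]
control_fc, treated_fc = normalize_to_control(raw)

print(control_fc, stdev(control_fc))   # control is 1 in every replicate, spread is zero
print(treated_fc, mean(treated_fc))    # variability survives only in the treated group
```

      Note that the raw control values in this toy example differ fourfold between replicates, yet none of that variability is visible after normalization, which is why reporting individual data points or raw values alongside the normalized plot can be informative.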

      For the qPCR data in Figure 1C, the authors responded that the data were plotted using 2^-ΔCT instead of 2^-ΔΔCT to show the variability in the expression of mRNAs isolated from animals treated with Saline. This statement does not align with the method section. Please revise.

      Appropriate revisions will be made to the method sections of the revised manuscript.
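
      For readers less familiar with the two quantities discussed in this comment, the sketch below contrasts them using the standard comparative CT definitions: ΔCT is the target CT minus the reference-gene CT, and ΔΔCT is the sample ΔCT minus the ΔCT of a calibrator (for example, the mean ΔCT of the control group). The CT values and function names are hypothetical and are not taken from the manuscript.

```python
def delta_ct(ct_target, ct_reference):
    """ΔCT: target gene CT minus reference (housekeeping) gene CT."""
    return ct_target - ct_reference


def relative_expression(ct_target, ct_reference):
    """2^-ΔCT: expression relative to the reference gene only.
    Control samples keep their own sample-to-sample variability."""
    return 2 ** -delta_ct(ct_target, ct_reference)


def fold_change(ct_target, ct_reference, calibrator_delta_ct):
    """2^-ΔΔCT: expression relative to the reference gene AND to a calibrator
    ΔCT, so the calibrator itself is 1 by definition."""
    return 2 ** -(delta_ct(ct_target, ct_reference) - calibrator_delta_ct)


# Hypothetical CT values for one treated sample and a control-group calibrator.
print(relative_expression(24.0, 18.0))                   # 2^-6
print(fold_change(24.0, 18.0, calibrator_delta_ct=8.0))  # 2^-(6 - 8)
```

      Plotting 2^-ΔCT keeps the spread of the control animals visible, whereas 2^-ΔΔCT expresses every sample as a fold change over the calibrator; whichever is plotted, the method section should describe the same calculation.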

      (7) For Figure 1, the title may not be appropriate as there is insufficient data to support the inhibition of myoblast differentiation.

      Appropriate revisions will be made to the revised manuscript.

      Recommendations for the authors:

      After careful review, the editors advise you to carefully address the following concerns.

      (1) There were concerns that in the revised manuscript, the DMSO and Doxo experiments depicted in Figure 1H appeared quite homogenous despite the author's description to the contrary. This leads to concerns about the type of statistics employed and the possible low number of replicates of experiments shown in Fig. 1.

      (2) Experiments in Figures 1F, 1I, and 1J had as few as n=2 replicates. For Figures 1C, 1D, 1F, 1G, and 1J, the statistics used a two-tailed Student's t-test; for all other experiments, they marked N/A for statistics. Using a t-test for multi-group comparisons (as indicated in the figure legend) and relying on only 2 replicates for many experiments are not appropriate.

      Additional replicates for the experiments shown in figures 1F, 1I, and 1J have been done and the data will be revised along with updated statistical tests during the revision of the manuscript.

      (3) In several experiments, the difference between technical replicates is too high.

      Reviewer #1 (Recommendations For The Authors):

      Most of my concerns were addressed in the revised manuscript.

      We thank the reviewer for their time in reviewing the manuscript and for considering the authors’ response to their comments during the previous round of review.

      Reviewer #2 (Recommendations For The Authors):

      Validating the findings in a primary myoblast is highly recommended for senescence studies due to the limitations and questionable senescence characteristics of the C2C12 cell line.

      We have explained the statistical tests used in the manuscript in the general comment section of the reviewer’s comments.

      Validate the finding in a different senescent model to exclude the possibility of DNA damage-response.

      We have explained the statistical tests used in the manuscript in the general comment section of the reviewer’s comments.

      For Fig 2A, add another control with a conditioned medium from normal, proliferative C2C12 cells.

      We have explained the statistical tests used in the manuscript in the general comment section of the reviewer’s comments.

      Please have a statistician evaluate the replicates and statistical analyses used.

      We have explained the statistical tests used in the manuscript in the general comment section of the reviewer’s comments.

      For the barplots (figure 1C, Fig 1F, 1G, 1J, 2C, 2E, 3A, 3E, 3F, 4D, 4E), please include each data point, or at least state how many biological replicates were used for each experiment.

      Appropriate revisions will be made in the figure legends of the revised manuscript.

      For Figure 1, the title may not be appropriate as there is insufficient data to support the inhibition of myoblast differentiation.

      Appropriate revisions will be made to the revised manuscript.


      The following is the authors’ response to the original reviews.

      eLife assessment

      This manuscript provides useful information about the lipid metabolite 15d-PGJ2 as a potential regulator of myoblast senescence. The authors provide experimental evidence that 15d-PGJ2 inhibits myoblast proliferation and differentiation by binding and regulating HRas. However, the manuscript is incomplete in its current form, as it lacks robust support from the data regarding the main conclusions related to senescence and technical concerns related to the senescence models used in this study.

      We are grateful to the editors and the reviewers for their time and comments in sharpening the science and the writing of the manuscript. We have attached a detailed response to emphasize that the manuscript does include robust evidence regarding the claims, which could have been missed during the review process. We have provided a better context for these points now.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors show that upon treatment with Doxorubicin (Doxo), there is an increase in senescence and inflammatory markers in the muscles. They also show these genes get upregulated in C2C12 myoblasts when treated with conditioned media or 15d-PGJ2. 15d-PGJ2 induces cell death in the myoblasts, decreases proliferation (measured by cell numbers), and decreases differentiation and fusion. 15d-PGJ2 modified Cys184 of HRas, which is required for its activation as indicated by the FRET analysis with RAF RBD. They also showed that 15d-PGJ2 activates ERK signaling, but not Akt signaling, through the electrophilic center. 15d-PGJ2 inhibits Golgi localization of HRas (only WT, not C181 or C184 mutant). They also showed that expressing the WT HRas followed by 15d-PGJ2 treatment led to a decrease in the levels of MHC mRNA and protein, and this defect is dependent on C184. This is a well-written manuscript with interesting insights into the mechanism of action of 15d-PGJ2. However, some clarification and experiments will help the paper advance the field significantly.

      Strengths:

      The data clearly shows that 15d-PGJ2 has a negative role in the myoblast cells and that it leads to modification of HRas protein. Moreover, the induction of biosynthetic enzymes in the PGD2 pathway also supports the induction of 15d-PGJ2 in Doxorubicin-treated cells. Both conditioned media experiments and the 15d-PGJ2 experiments show that 15d-PGJ2 could be the active component secreted by the senescent myoblasts.

      Weaknesses:

      The genes that are upregulated in the muscles upon injection with Doxo are also markers for inflammation. Since Doxo is also known to induce systemic inflammation, it is important to delineate these two effects (inflammatory cells vs senescent cells). The expression of beta Gal and other markers of senescence in the tissue sections will help to delineate these.

      As pointed out, Doxo induces systemic inflammation along with inducing DNA damage-mediated senescence. Therefore, along with the inflammatory markers of the SASP (CXCL1/2, TNF1α, IL6, PTGS1/2, PTGDS), we also observed an increase in the mRNA levels of canonical markers of DNA damage-mediated senescence. We observed an increase in the mRNA levels of cell cycle and senescence-associated proteins p16 and p21 (Fig. 1C). We also observed an increased nuclear accumulation of p21 (Fig. 1A) and increased levels of phosphorylated H2A.X in the nucleus (Fig. 1B).

      In Figure 2, where the defect in the differentiation of myoblasts upon treatment with 15d-PGJ2 is shown, most of the cells die within 48 hours at higher concentrations, making it difficult to perform the experiments. This also shows that 15d-PGJ2 was toxic to these cells. Lower concentrations show a decrease in the differentiation based on the lower number of nuclei in fibers and low expression of MyoD, MyoG, and MHC. However, it is unclear if this is due to increased cell death or defective differentiation. It would be a lot more informative if the cell count, cell division, and cell death could be plotted for these concentrations of the drug during the experiment.

      We measured the viability of C2C12 cells after 24 hours of treatment with 15d-PGJ2 using the MTT assay and observed that the viability of cells was decreased after treatment with 15d-PGJ2 (10 µM) but not with 15d-PGJ2 (1 µM, 2 µM, 4 µM, or 5 µM) (see Fig. S2A of the updated manuscript). The results and figures of the manuscript have been updated accordingly.

      Also, in the myoblast experiments, are the effects of treatment with Dox reversible?

      The treatment with Doxorubicin is irreversible as the senescent phenotype was not reversed after withdrawal of Doxorubicin, even after 20 days.

      In Figure 3, most of the experiments are done at a high concentration, which induces almost complete cell death within 48 hours.

      Figure 3 is an acute experiment for only 1 hour, at which time no cell death was observed. Specifically, we measured the phosphorylation of Erk and Akt proteins after 1 hour of treatment with 15d-PGJ2 (10 µM) during which we did not observe any cell death. 

      Even at such a high concentration of 15d-PGJ2, the increase in ERK phosphorylation is minimal.

      We observe a ~30% increase in the phosphorylation of Erk proteins after treatment with 15d-PGJ2 in 0.2% serum medium compared to treatment with vehicle (DMSO). This is reproducible and significant.

      The experiment in Figure 4C shows that C181 and C184 mutants of the HRas show higher levels in Golgi compared with WT. However, this could very well be due to the defect in palmitoylation rather than the modification with 15d-PGJ2.

      Our data does not suggest higher levels of C184S mutant in the Golgi compared with WT (Fig. S4A). We observed that the ratio of HRas levels in the Golgi to the HRas levels in the plasma membrane were similar in C2C12 cells expressing HRas C184S and HRas WT (Fig. S4A graph columns 1 and 5).

      Though the authors allude to the possibility that intracellular redistribution of HRas by 15d-PGJ2 requires C181 palmitoylation, the direct influence of C184 modification on C181 palmitoylation is not shown. To have a meaningful conclusion, the authors need to compare the palmitoylation and modification with 15d-PGJ2.

      Palmitoylation of HRas at C181 is required for the localization of HRas at the plasma membrane. The inhibition of palmitoylation of C181, either by mutation (C181S) or treatment with protein palmitoyl transferase inhibitor (2-Bromopalmitate), results in the accumulation of HRas at the Golgi (Rocks et al., 2005) (Fig. S4A). Modification of HRas at C184 by 15d-PGJ2 (Fig. 3A) could inhibit the palmitoylation of HRas at C181. However, our data does not support this hypothesis, as modification of HRas WT by 15d-PGJ2 does not increase the level of HRas at the Golgi, unlike inhibition of cysteine palmitoylation by the C181S mutation.

      To test if the inhibition of myoblast differentiation depends on HRas, they overexpressed the HRas and mutants in the C2C12 lines. However, this experiment does not take the endogenous HRas into consideration, especially when interpreting the C184 mutant. An appropriate experiment to test this would be to knock down or knock out HRas (or make knock-in mutations of C184) and show that the effect of 15d-PGJ2 disappears.

      Endogenous HRas (wild type) is present in the C2C12 cells overexpressing the EGFP-tagged HRas constructs. Therefore, we only observe a partial rescue in the differentiation after 15d-PGJ2 treatment in C2C12 cells expressing the C184S mutant (Fig. 4D and E). However, since HRas is expressed under the high-expression CMV promoter and in the absence of other regulatory elements, the overexpressed constructs do show a dominant effect over the endogenous HRas, showing cysteine-mutant-dependent inhibition of differentiation of myoblasts after treatment with 15d-PGJ2 (Fig. 4D and E).

      Moreover, in this specific experiment, it is difficult to interpret without a control with no HRas construct and another without the 15d-PGJ2 treatment.

      The mRNA levels of MyoD, MyoG, and MHC in C2C12 cells expressing HRas constructs after treatment with 15d-PGJ2 were normalized to the mRNA levels in C2C12 cells expressing the corresponding constructs and treated with vehicle (DMSO). mRNA levels in C2C12 cells treated with vehicle were not shown as they were normalized to 1. MHC protein levels in C2C12 cells expressing HRas constructs after 15d-PGJ2 treatment were normalized to that in C2C12 cells treated with vehicle (DMSO). Since the hypothesis concerns the effect of HRas cysteine mutations on the differentiation of myoblasts after treatment with 15d-PGJ2, C2C12 cells expressing HRas WT serve as an adequate control. Fig. 2 shows the effect of 15d-PGJ2 on muscle differentiation when HRas was not overexpressed.

      Moreover, the overall study does not delineate the toxic effects of 15d-PGJ2 from its effect on the differentiation.

      The inhibition of differentiation in C2C12 cells after treatment with 15d-PGJ2 cannot be attributed to the general toxicity of 15d-PGJ2 in cells. We show that the inhibition of differentiation of myoblasts after 15d-PGJ2 depends on modification of HRas at C184, i.e. failure to modify HRas at C184 (Fig. 3A) and the resultant activation (Fig. 3B) by 15d-PGJ2 rescues this inhibition of differentiation of C2C12 cells (Fig. 4D and E), dissecting the inhibition of differentiation of myoblasts by 15d-PGJ2 from general toxic effects of 15d-PGJ2 on cell physiology.

      Please note that the effect of 15d-PGJ2 on cell physiology is context-specific. On the one hand, 15d-PGJ2 has been shown to exert tumor-suppressor effects by inhibiting the proliferation of ovarian cancer cells and lung adenocarcinoma cells (de Jong et al., 2011; Slanovc et al., 2024). On the other hand, 15d-PGJ2 also exerts pro-carcinogenic effects by induction of epithelial to mesenchymal transition in breast cancer cells MCF7 and inhibition of tumor-suppressor protein p53 in MCF7 and PC-3 cells (Choi et al., 2020; Kim et al., 2010).

      Reviewer #2 (Public Review):

      Summary:

      In this study, Swarang and colleagues identified the lipid metabolite 15d-PGJ2 as a potential component of senescent myoblasts. They proposed that 15d-PGJ2 inhibits myoblast proliferation and differentiation by binding and regulating HRas, suggesting its potential as a target for restoring muscle homeostasis post-chemotherapy.

      Strengths:

      The regulation of HRas by 15d-PGJ2 is well controlled.

      Weaknesses:

      The novelty of the study is compromised as the activation of PGD and 15d-PGJ2, as well as the regulation of HRas and cell proliferation, have been previously reported. 

      Literature does not support this statement, and it is important to clarify this misimpression for the field as a whole. 

      Let us clarify:

      Covalent modification of HRas by 15d-PGJ2 has been reported only twice in the literature (Luis Oliva et al., 2003; Yamamoto et al., 2011), in fibroblasts and neurons respectively.

      Interaction between HRas and 15d-PGJ2 in skeletal muscles has not been shown before, even though both HRas and 15d-PGJ2 are shown to be key regulators of muscle homeostasis.

      Activation of HRas by 15d-PGJ2 was reported first by Luis Oliva et al (Luis Oliva et al., 2003). However, this study does not comment on the functional implications of activation of HRas signaling.

      Recently, our lab contributed to a study where the functional implication of activation of HRas signaling due to covalent modification by 15d-PGJ2 was shown in the maintenance of senescence phenotype (Wiley et al., 2021).

      15d-PGJ2 was shown to inhibit the differentiation of myoblasts by Hunter et al (Hunter et al., 2001). Although this study hypothesized that the inhibition of myoblast differentiation is via 15d-PGJ2-mediated activation of PPARγ signaling, it also showed inhibition of myoblast differentiation independent of PPARγ activity, suggesting the presence of other mechanisms.

      This is the first study to show a molecular mechanism where activation of HRas signaling in skeletal myoblasts due to covalent modification by 15d-PGJ2 at C184 of HRas inhibits the differentiation of skeletal myoblasts.

      Additionally, there are major technical concerns related to the senescence models, limiting data interpretation regarding the relevance to senescent cells.

      Major concerns:

      (1) The C2C12 cell line is not an ideal model for senescence study due to its immortalized nature and lack of normal p16 expression. A more suitable myoblasts model is recommended, with a more comprehensive characterization of senescence features.

      C2C12 is a good model for DNA damage-based senescence that is used in this manuscript. Several reports in the literature have shown the induction of senescence in C2C12 cells. Moiseeva et al 2023 show induction of senescence in C2C12 cells after etoposide-mediated DNA damage. Moustogiannis et al 2021 show the induction of replicative senescence in C2C12 cells. In this study, we show that C2C12 cells undergo DNA damage-mediated senescence after treatment with Doxo. We measured the induction of senescence in C2C12 cells upon DNA damage using several physiological (Nuclear Size, Cell Size, and SA β-gal) and molecular markers (mRNA levels of p21 and SASP factors (IL6 and TGFβ), protein levels of p21) of senescence (see Fig. 1 of the updated manuscript). The results and the figures in the manuscript have been updated accordingly.

      (2) The source of increased PGD or its metabolites in the conditioned medium is unclear. Including other senescence models, such as replicative or oncogene-induced senescence, would strengthen the study.

      Fig. 1E shows a time-dependent increase in the expression of PGD2 biosynthetic enzymes in senescent C2C12 cells. Fig. 1F shows an increase in the levels of 15d-PGJ2 secreted by senescent C2C12 cells in the conditioned medium. This data shows that senescent C2C12 cells are the source of PGD and its metabolites in the conditioned medium.

      Again, C2C12 is not suitable for replicative senescence due to its immortalized status.

      We and others have shown that C2C12 cells undergo senescence, and this manuscript only used DNA damage induced senescence.

      (3) In the in vivo part, it is unclear whether the increased expression of PTGS1, PTGS2, and PTGDS is due to senescence or other side effects of DOXO.

      We concur that this is a limitation of this study and the subsequent work will demonstrate the origin of prostaglandin biosynthesis after treatment with Doxo in vivo.

      (4) Figure 2A lacks an important control from non-senescent cells during the measurement of C2C12 differentiation in the presence of a conditioned medium.

      Figure 2A tests the effect of prostaglandin PGD2 and its metabolites secreted by the senescent cells on the differentiation of myoblasts. Therefore, we inhibited the synthesis of PGD2 in senescent cells by treatment with AT-56, and then collected the conditioned medium. Conditioned medium collected from senescent C2C12 cells treated with vehicle (DMSO) served as a control for the experiment, whereas differentiation of C2C12 cells without any treatment serves as a positive control.

      There is no explanation of how differentiation was quantified or how the fusion index was calculated.

      The fusion index was calculated using a published myotube analyzer software (Noë et al., 2022). Appropriate information has been added to the materials and methods section of the manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Line 3: Expand SA in "SA β-gal".

      The manuscript has been updated accordingly (See line 3).

      Line 68: HRas is highly regulated by lipid modifications.

      The manuscript has been updated accordingly (See line 67).

      Figures

      Figure S1A seemed incomplete (maybe some processing issue).

      The Figure has been updated in the revised manuscript (See Fig. S1A).

      Figure S1B-H are mislabeled.

      The figure has been updated in the revised manuscript (See Fig. S1C, D, E, and F).

      Figures S1E-H are not mentioned in the manuscript.

      The manuscript has been updated accordingly (See line 120).

      Many supplementary figures are not cited in the article.

      The manuscript has been updated accordingly. (See lines 85, 120, 123, 166, 225, 356, 364, 412, and 413)

      Reviewer #2 (Recommendations For The Authors):

      (1) Clarify the injection method for Doxorubicin in B6J mice on line 83 (IP or IM).

      Mice were injected intraperitoneally with Doxorubicin (as mentioned in the materials and methods, see lines 83 and 794)

      (2) Address missing information in figures or figure legends.

      There is missing piece in Sup Fig 1A.

      The figure has been updated in the revised manuscript (See Fig. S1A).

      Correct labels in Sup Fig 1C and 1D.

      The figure has been updated in the revised manuscript (See Fig. S1C, D, E, and F).

      How would the authors explain the dramatic differences in the morphology of C2C12 cells treated with DOXO between bright field and SA-beta-gal staining images in Sup Fig 1B and 1C.

      The SA β-gal image after treatment with Doxo does show a flattened cell morphology. Another field of view from the same experiment has been added in the figure to show the difference in the cell morphology more prominently in the revised manuscript (See Fig. 1H).

      Provide explanations for Sup Fig 1E-1G, including the meaning of the y-axis and the blue dots and red lines.

      We have provided an explanation for the multiple reaction monitoring mass spectrometry used to measure the concentration of 15d-PGJ2 in the conditioned medium in the revised manuscript (see lines 119-130 and the legends of Fig. S1C, D, and E)

      (3) Please review the calculation of qPCR data in Figure 1C for correctness, ensuring reference samples with an average expression level of 1.

      The data in Fig. 1C was plotted using 2^-ΔCT instead of 2^-ΔΔCT to show the variability in the expression of mRNAs isolated from animals treated with Saline.
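
      The distinction between the two qPCR quantities mentioned above can be illustrated with a short sketch. This is not the authors' analysis code: the function names and the CT values are hypothetical, and the control mean ΔCT is assumed to be the arithmetic mean over the saline-treated animals.

```python
def delta_ct(ct_target, ct_reference):
    """Delta-CT: target-gene CT minus reference (housekeeping) gene CT."""
    return ct_target - ct_reference

def rel_expr_dct(ct_target, ct_reference):
    """2^-dCT: expression relative to the reference gene, per sample."""
    return 2.0 ** (-delta_ct(ct_target, ct_reference))

def rel_expr_ddct(ct_target, ct_reference, mean_control_dct):
    """2^-ddCT: fold change relative to the mean dCT of the control group."""
    return 2.0 ** (-(delta_ct(ct_target, ct_reference) - mean_control_dct))

# Hypothetical (target CT, reference CT) pairs for three saline-treated animals.
saline = [(26.0, 18.0), (27.0, 18.0), (25.0, 18.0)]
saline_dct = [delta_ct(t, r) for t, r in saline]
mean_saline_dct = sum(saline_dct) / len(saline_dct)

# 2^-dCT: one value per animal, expressed relative to the reference gene.
dct_values = [rel_expr_dct(t, r) for t, r in saline]
# 2^-ddCT: the same samples expressed as fold change over the control mean dCT.
ddct_values = [rel_expr_ddct(t, r, mean_saline_dct) for t, r in saline]
print(dct_values)
print(ddct_values)
```

      In this sketch the two sets of values differ only by a constant factor, so the control-group 2^-ΔΔCT values are centered on 1 (geometric mean of 1) while the per-animal spread is the same on both scales.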

      (4) Please explain the calculation of 15d-PGJ2/cell concentration in Figure 1F and provide raw data for review, considering the substantial changes and small error bars. The method or result section lacks an explanation of how this calculation was performed. Additionally, there is no mention of the cell number count.

      All the raw values (concentration of 15d-PGJ2 measured using mass spec and cell numbers counted at the time of collection of conditioned medium) are provided in the supplementary table 1. The standard curve to calculate the concentration of 15d-PGJ2 in the conditioned medium is shown in Fig. S1F. The cell number was counted after trypsinization using a hemocytometer on the day of collection of the conditioned medium.

      (5) Please clarify how cell number normalization and doubling time calculation were done in Fig 2B. Consider replacing the figure with a growth curve showing confluence on the y-axis for easier interpretation.

      Cells were counted every 24 hours and the normalization was done to the number of cells counted on day 0 of the treatment (to consider attaching efficiency and other cell culture parameters). Doubling time was calculated as the reciprocal of the slope of the graph of log2(normalized cell number) vs time.
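
      The doubling-time calculation described above can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: the function name, the use of an ordinary least-squares fit for the slope, and the example counts are all assumptions.

```python
import math

def doubling_time(times_h, counts):
    """Estimate doubling time as 1 / slope of log2(normalized count) vs time."""
    # Normalize to the count at the first time point (day 0 of treatment).
    log2_norm = [math.log2(c / counts[0]) for c in counts]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(log2_norm) / n
    # Ordinary least-squares slope, in doublings per hour.
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, log2_norm))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    slope = sxy / sxx
    return 1.0 / slope

# Hypothetical counts that double exactly every 24 hours.
times_h = [0, 24, 48, 72]
counts = [1.0e4, 2.0e4, 4.0e4, 8.0e4]
print(doubling_time(times_h, counts))
```

      Because the counts are divided by the day-0 value before taking log2, the fitted line starts near zero and its slope is in doublings per unit time, so the reciprocal has the units of the time axis (hours here).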

    1. Author response:

      Please find below our provisional author response, outlining the revisions we plan to undertake to address the Recommendations received:

      Reviewer #1 (Recommendations For The Authors):

      (1) A set of recent advances have shown that embeddings of unsupervised/self-supervised speech models align with auditory responses to speech in the temporal cortex (e.g. Wav2Vec2: Millet et al NeurIPS 2022; HuBERT: Li et al. Nat Neurosci 2023; Whisper: Goldstein et al. bioRxiv 2023). These models are known to preserve a variety of speech information (phonetics, linguistic information, emotions, speaker identity, etc) and perform well in a variety of downstream tasks. These other models should be evaluated or at least discussed in the study.

      We plan to evaluate two of these other models, Wav2Vec2 and HuBERT, in the brain encoding and RSA parts.

      (2) The test statistics of the results in Fig 1c-e need to be revised. Given that logistic regression is a convex optimization problem typically converging to a global optimum, these multiple initializations of the classifier were likely not entirely independent. Consequently, the reported degrees of freedom and the effect size estimates might not accurately reflect the true variability and independence of the classifier outcomes. A more careful evaluation of these aspects is necessary to ensure the statistical robustness of the results.

      We plan to address this point to ensure the statistical robustness of our results.
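
      The reviewer's point about convexity can be illustrated with a small sketch: when the objective has a unique optimum, fits started from different random initializations end at essentially the same weights, so they do not behave like independent samples. This is only an illustration under stated assumptions, not the classifier used in the study: the toy data, the plain gradient-descent solver, and the L2 penalty (which makes the optimum unique) are all hypothetical choices.

```python
import math
import random

def fit_logreg(X, y, w_init, l2=0.1, lr=0.5, steps=5000):
    """Gradient descent on mean log-loss + (l2/2)*||w||^2; returns the weights."""
    w = list(w_init)
    n = len(X)
    for _ in range(steps):
        grad = [l2 * wj for wj in w]            # gradient of the L2 penalty
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))      # predicted probability
            for j, xj in enumerate(xi):
                grad[j] += (p - yi) * xj / n    # gradient of the mean log-loss
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Toy data: the first column is a constant bias term.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, -0.5], [1.0, 0.5], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 0, 1, 1, 1]

# Five fits from different random starting weights.
rng = random.Random(0)
fits = [fit_logreg(X, y, [rng.uniform(-5, 5), rng.uniform(-5, 5)]) for _ in range(5)]

# Largest difference in any weight between any fit and the first one.
spread = max(abs(a - b) for w in fits for a, b in zip(w, fits[0]))
print(spread)
```

      In a setting like this one, variability across initializations reflects solver tolerance rather than sampling variability, which is why resampling the data (for example, different train/test splits) is a more natural source of replicates for a test statistic.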

      (3) In Line 198, the authors discuss the number of dimensions used in their models. To provide a comprehensive comparison, it would be informative to include direct decoding results from the original spectrograms alongside those from the VLS and LIN models. Given the vast diversity in vocal speech characteristics, it is plausible that the speaker identities might correlate with specific speech-related features also represented in both the auditory cortex and the VLS. Therefore, a clearer understanding of the original distribution of voice identities in the untransformed auditory space would be beneficial. This addition would help ascertain the extent to which transformations applied by the VLS or LIN models might be capturing or obscuring relevant auditory information.

      We plan to include direct decoding results from the original spectrograms in addition to those from the VLS and LIN models.

      Reviewer #2 (Recommendations For The Authors):

      We plan to address the following points raised by Reviewer #2:

      (1) English mistakes, rewordings:

      a. L31: 'in voice' > consider rewording (from a voice?).

      b. L33: consider splitting sentence (after interactions).

      c. L39: 'brain' after parentheses.

      d. L45-: certainly DNNs 'as a powerful tool' extend to audio (not just image and video) beyond their use in brain models.

      e. L52: listened to / heard.

      f. L63: use second/s consistently.

      g. L64: the reference to Figure 5D is maybe a bit confusing here in the introduction.

      h. L79-88: this section is formulated in a way that is too detailed for the introduction text (confusing to read). Consider a more general introduction to the VLS concept here and the details of this study later.

      i. L99-: again, I think the experimental details are best saved for later. It's good to provide a feel for the analysis pipeline here, but some of the details provided (number of averages, denoising, preprocessing), are anyway too unspecific to allow the reader to fully follow the analysis.

      We will correct the mistakes, apply the suggested rewordings, and clarify the points raised.

      (2) Clarification.

      • L159: what was the motivation for classifying age as a 2-class classification problem? Rather than more classes or continuous prediction? How did you choose the age split?

      • L263: Is the test of RDM correlation>0 corrected for multiple comparisons across ROIs, subjects, and models?

      • L379: 'these stimuli' - weren't the experimental stimuli different from those used to train the V/AE?

      • L443: what are 'technical issues' that prevented subject 3 from participating in 48 runs??

      • L444: participants were instructed to 'stay in the scanner'!? Do you mean 'stay still', or something?

      • L463: Hearing thresholds of 15 dB: do you mean that all had thresholds lower than 15 dB at all frequencies and at all repeated audiogram measurements?

      • L472: were the 4 category levels balanced across the dataset (in number of occurrences of each category combination)?

      • L482: the test stimuli were selected as having high energy by the amplitude envelope. It is unclear what this means (how is the envelope extracted, what feature of it is used to measure 'high energy'?)

      • L500 was the audio filtered to account for the transfer function of the Sensimetrics headphones?

      • L500: what does 'comfortable level' correspond to and was it set per session (i.e. did it vary across sessions)?

      • L526- does the normalization imply that the reconstructed spectrograms are normalized? Were the reconstructions then scaled to undo the normalization before inversion?

      • L606: does the identity GLM model the denoised betas from the first GLM or simply the BOLD data? The text indicates the latter, but I suspect the former.

      • L704: could you unpack this a bit more? It is not easy to see why you specify the summing in the objective. Shouldn't this just be the ridge objective for a given voxel/ROI? Then you could just state it in matrix notation.

      • L716: you used robust scaling for the classifications in latent space but haven't mentioned scaling here. Are we to assume that the same applies?

      • L720: Pearson correlation as a performance metric and its variance will depend on the choice of test/train split sizes. Can you show that the results generalize beyond your specific choices? Maybe report the explained variance as well to get a better idea of performance.

      • Could you specify (somewhere) the stimulus timing in a run? ISI and stimulus duration are mentioned in different places, but it would be nice to have a summary of the temporal structure of runs.

      We will clarify the points raised.

      Reviewer #3 (Recommendations For The Authors):

      We plan to address the following points raised by Reviewer #3:

      Comments:

      • Code and data are not currently available.

      • In the supplementary material, it would be beneficial to present the different analyses as boxplots, as in the main text, but with the ROIs in the left and right hemispheres separated, to better show potential hemispheric effect. Although this information is available in the Supplementary Tables, it is currently quite tedious to access it.

      • In Figure 3a, it might be beneficial to order the identities by age for each gender in order to more clearly illustrate the structure of the RDMs.

      • In Figure 3b, the variance for the correlations for the aTVA is higher than in other regions, why?

      • Please make sure that all acronyms are defined, and that they are redefined in the figure legends.

      • Gender and age are primarily encoded by different brain regions (Figure 5, pTVA vs aTVA). How does this finding compare with existing literature?

      We will upload the code and the preprocessed data, improve the supplementary material figures, fix Figure 3 according to the Reviewer’s suggestion, and clarify the points raised.

    1. Author response:

      We thank the reviewers for their comments and will revise the manuscript to provide more comprehensive clarifications to aid readers’ understanding of behaviorMate. Additionally, we intend to take several steps which could provide further insights and improve the ease of use for new behaviorMate users: (1) release an expanded and annotated library of existing settings and VR scene files, (2) improve the online documentation of context lists and decorators, which allow behaviorMate to run custom experimental paradigms without writing code, and (3) release online API details of the JSON messaging protocol that is used between behaviorMate, the Arduinos, and the VRMate program, which could be especially helpful to developers interested in expanding or modifying the system. Here we provide a few brief points of clarification to some of the concerns raised by the reviewers.

      Firstly, we clarify the system’s focus on modularity and flexibility. behaviorMate leverages the “Intranet of Things” framework to provide a low-cost platform that relies on asynchronous message passing between independent networked devices. While our current VR implementation typically involves a PC, 2 Arduinos, and an Android device per VR display, the behaviorMate GUI can be configured without editing any source code to listen on additional ports for UDP messages, which will be automatically timestamped and logged. Since the current implementation of the behaviorMate GUI can be configured through the settings file to send and receive JSON-formatted messages on arbitrary ports, third-party devices could also be configured to listen and respond to these messages without editing the UI source code. More specialized responsibilities or tasks that require higher temporal precision (such as position tracking) are handled by dedicated circuits so as not to overload the general-purpose one. This provides a level of encapsulation/separation of concerns, since components can be optimized for performance of a single task, a feature that is especially desirable given resource limitations on the most common commercially available microcontrollers.
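
      The message-passing pattern described above (JSON-formatted UDP messages that are timestamped and logged on receipt) can be sketched in a few lines. This is a generic illustration, not behaviorMate code: the function name, the log record layout, and the example message are hypothetical, and the actual behaviorMate message schema is not reproduced here.

```python
import json
import socket
import time

def log_udp_messages(sock, n_messages, clock=time.time):
    """Receive n_messages JSON datagrams and return timestamped log records."""
    records = []
    for _ in range(n_messages):
        data, _addr = sock.recvfrom(4096)
        records.append({
            "time": clock(),                # timestamp assigned on receipt
            "port": sock.getsockname()[1],  # local port the message arrived on
            "message": json.loads(data.decode("utf-8")),
        })
    return records

# Loopback demo: a stand-in "device" sends one JSON message to the listener.
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
listener.settimeout(2.0)
port = listener.getsockname()[1]

device = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Hypothetical message content, for illustration only.
device.sendto(json.dumps({"lick": {"action": "start"}}).encode("utf-8"),
              ("127.0.0.1", port))

log = log_udp_messages(listener, 1)
print(json.dumps(log[0]))
device.close()
listener.close()
```

      Because the listener only needs a port number and a JSON payload, a third-party device can take part in an exchange like this without any change to the receiving program, which is the design property the paragraph above emphasizes.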

      A number of methods exist for synchronizing recording devices like microscopes or electrophysiology recordings with behaviorMate’s time-stamped logs of actuators and sensors. For example, the GPIO circuit can be configured to send sync triggers or receive timing signals as input. Alternatively, a dedicated circuit could record frame start signals and relay them to the PC to be logged independently of the GPIO (enabling a high-resolution post-hoc alignment of the time stamps). The optimal method to use varies based on the needs of the experiment. For example, if very high temporal precision is needed, such as during electrophysiology experiments, a high-speed data acquisition (DAQ) circuit to capture a fixed interval readout might be beneficial. behaviorMate could still be set up as normal to provide closed and open-loop task control at behaviorally relevant timescales alongside a DAQ circuit recording events at a consistent temporal resolution. While this would increase the relative cost of the recording setup, identical rigs for training animals could still be configured without the DAQ circuit, avoiding the additional cost and complexity.
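
      One common way to perform a post-hoc alignment of this kind is to fit a linear map between the two clocks from sync pulses that both systems recorded. The sketch below assumes the clocks differ only by a constant offset and a constant drift; the function names and the pulse times are hypothetical, and this is not a description of behaviorMate's own tooling.

```python
def fit_clock_map(t_device, t_pc):
    """Least-squares fit of t_pc = gain * t_device + offset from paired sync pulses."""
    n = len(t_device)
    mean_d = sum(t_device) / n
    mean_p = sum(t_pc) / n
    sxx = sum((d - mean_d) ** 2 for d in t_device)
    sxy = sum((d - mean_d) * (p - mean_p) for d, p in zip(t_device, t_pc))
    gain = sxy / sxx                    # relative clock rate (drift)
    offset = mean_p - gain * mean_d     # constant lag between the clocks
    return gain, offset

def to_pc_time(t, gain, offset):
    """Convert a device timestamp to the PC (behavior log) time base."""
    return gain * t + offset

# Hypothetical sync pulses seen by both clocks: the device clock starts
# 5 s after the PC clock, and the PC clock advances 0.1% faster.
pulses_device = [0.0, 10.0, 20.0, 30.0]
pulses_pc = [5.0, 15.01, 25.02, 35.03]

gain, offset = fit_clock_map(pulses_device, pulses_pc)
# Express a frame acquired at 12.5 s (device clock) in PC time.
frame_pc_time = to_pc_time(12.5, gain, offset)
print(gain, offset, frame_pc_time)
```

      With more sync pulses the fit averages out jitter in individual pulse times; if the drift is not constant over a session, interpolating between neighboring pulses is an alternative to a single linear fit.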

      VRMate provides the interface between Unity and behaviorMate; therefore, using the two systems together means that no Unity or C# programming is necessary. VRMate provides a prespecified set of visual cues that can be scaled in 3 dimensions and have textures applied to them, permitting a wide variety of different scenes to be displayed. All VRMate scene details are additionally logged by behaviorMate to allow for consistency checks across experiments. The VRMate project also includes “editor scripts” that provide a drag-and-drop utility in Unity Editor for developing new scenes. Since the details pertaining to specific scenes and view angle are loaded at runtime via JSON-formatted UDP messages, it is not necessary to recompile VRMate in order to use this feature. Since we send individual position updates to VRMate from the PC, any issues with clock drift would be limited to the refresh rate of the Unity program, which is fast enough to be perceived as instantaneous; we have also thoroughly tested the timing differences between displays using high-speed cameras and found them to be negligible. While we find using 5 separate Android computers to render scenes as described an optimal solution to maximize flexibility, it would also be possible to render all scenes on a single PC to further mitigate this concern, depending on experimental demands. Finally, our treadmill implementations of behaviorMate use no monitor displays; however, due to the modular design of behaviorMate, virtual cues could be seamlessly added to any such setup by adding a VR context to the settings files.

      One last point to mention is that our project is not affected by the recent changes in the pricing structure of Unity, since the compiled software does not need to be regenerated to update VR scenes or to implement new task logic; both are handled by the behaviorMate GUI. This means the current state of the VRMate program is robust to any future pricing changes or other restructuring of Unity and does not rely on continued support of Unity. Additionally, although the solution presented in VRMate has many benefits, a developer could easily adapt any open-source VR maze project to receive the UDP-based position updates from behaviorMate, or develop their own novel VR solution. We intend to update the VR section of the manuscript to make all of this information clearer, as well as to provide additional online documentation in the materials linked in the supplemental information.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The present paper introduces Oscillation Component Analysis (OCA), in analogy to ICA, where source separation is underpinned by a biophysically inspired generative model. It puts the emphasis on oscillations, which is a prominent characteristic of neurophysiological data.

      Strengths:

      Overall, I find the idea of disambiguating data-driven decompositions by adding biophysical constraints useful, interesting, and worth pursuing. The model incorporates both a component modelling of oscillatory responses that is agnostic about the frequency content (e.g., doesn’t need bandpass filtering or predefinition of bands) and a component to map between sensor and latent space. I feel these elements can be useful in practice.

      Thank you for the positive evaluation!

      Weaknesses:

      Lack of empirical support: I am missing empirical justification of the advantages that are theoretically claimed in the paper. I feel the method needs to be compared to existing alternatives.

      Thank you for bringing up this important issue.  We agree that a direct performance comparison would be important to demonstrate.  We performed additional analyses to compare OCA with ICA and one easy frequency domain exploratory technique in both simulated and real human data (see Section How does OCA compare to conventional approaches? and Supporting Text: Comparison of OCA to traditional approaches in experimental EEG data).  The results of the simulated data are shown in the revised Figure 3.  Although the slow and alpha oscillations in this simulation are statistically independent under the generative model, ICA identifies components that mix these independent signals, as one would expect based on the above discussion (i.e., all components are Gaussian).  Meanwhile, OCA is able to recover distinct slow and alpha components.  We repeated this analysis in real human EEG during propofol-induced unconsciousness and found a similar result where ICA produced components that mixed slow and alpha band signals whereas OCA identified distinct oscillatory components (see Figure S4.1).

      Reviewer #1 (Recommendations For The Authors):

      Major

      Theoretical justification. About the limitation of ICA in M/EEG, lines 24-28 seem to suggest that, almost by necessity (if Gaussianity approximately holds as argued), ICA doesn’t work on these modalities. But a body of work indicates that it does work to a reasonable extent, and that it is useful in practice; see https://www.pnas.org/doi/pdf/10.1073/pnas.1112685108?download=true. How then can this theoretical claim be reconciled with the empirical evidence suggesting otherwise? I am putting this as a major comment because the limitations of ICA are one of the main motivations for this work, so it needs to be well-justified.

      Thanks for bringing forward this important point and for suggesting the reference Brookes, et al. Their work actually supports our claim. In the fifth paragraph of the discussion section, Brookes, et al. state “ICA has been used previously and extensively for artifact rejection in MEG; however, its use in identification of oscillatory signals has remained limited. This limitation is likely due to its susceptibility to interference and the fact that amplitude-modulated oscillatory signals exhibit a largely Gaussian statistical distribution (and ICA relies on non-Gaussianity in recovered sources).” For this reason, they use the Hilbert envelope as the input to the ICA procedure rather than the original time-series. These Hilbert envelopes represent the instantaneous amplitude of neural oscillatory activity, i.e., they follow the amplitude modulation of the oscillatory activity. The method does not extract any oscillatory activity or disambiguate different oscillatory sources, but only assesses the connectivity pattern within pre-defined bands, i.e., how different areas of the brain are harmonized through modulation of the oscillations or vice-versa inside those pre-defined bands. The paper did not show extracted independent time signals (tICs), focusing instead on the spatial pattern that these tICs activated. In that way, their use of ICA was totally justified. Overall, our assessment of the limitations of ICA is very well aligned with Brookes, et al. We have added the reference alongside our claim in the introduction (see page 3 line 23) and revised the discussion section to refer to this paper (see page 21 lines 426-432).

      Empirical justification. The synthetic example is good, but I’m not quite sure what to make of the real data examples. One can see reasonable spectra in the different bands and not-so-easy to interpret spatial topologies. But the main question is how OCA compares to more standard, easier approaches. Could the authors show explicitly how the benefits that were spelled out in the introduction/discussion manifest in practice, when compared to other methods?

      Thank you for bringing up this important issue.  We agree that a direct performance comparison would be important to demonstrate. We performed additional analyses to compare OCA with ICA and one easy frequency domain exploratory technique in both simulated and real human data (see Section How does OCA compare to conventional approaches? and Supporting Text: Comparison of OCA to traditional approaches in experimental EEG data).  The results of the simulated data are shown in the revised Figure 3 in page 12. Although the slow and alpha oscillations in this simulation are statistically independent under the generative model, ICA identifies components that mix these independent signals, as one would expect based on the above discussion (i.e., all components are Gaussian).  Meanwhile, OCA is able to recover distinct slow and alpha components. We repeated this analysis in real human EEG during propofol-induced unconsciousness and found a similar result where ICA produced components that mixed slow and alpha band signals whereas OCA identified distinct oscillatory components (see Figure S4.1 in Supporting Text: Comparison of OCA to traditional approaches in experimental EEG data).

      Minor

      "a recently-described class of state-space models" -> of the three references, one is from the sixties, another from the eighties, and the last one is 21 years old. Is this really a recent idea?

      Maybe rephrase "recently-described", or else think of more recent references that bring something new?

      We have amended the wording as suggested. (See page 4, line 53)

      Lines 72-74. It might be useful to unwrap in *intuitive* terms why the elements of this vector are closely related to the real and imaginary parts of the analytic signal.

      Thanks for the helpful comment. The sentence now reads:

      “These elements of this state vector traces out two time-series that maintains an approximate π/ 2 radian phase difference and therefore are closely related to the real and imaginary parts of an analytic signal…”. (See page 5, lines 72-75)
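
      To make the quadrature relationship concrete, here is a minimal noise-free sketch in Python of a two-dimensional oscillator state that is rotated and damped at each time step. It is an illustration under stated assumptions rather than the OCA implementation: the rotation angle per sample is taken to be 2πf/f_s, the names `simulate_oscillator`, `freq_hz`, `fs_hz` and `damping` are ours, and the state noise of the full model is omitted (which is why the π/2 phase difference is exact here but only approximate in the model).

```python
import math


def simulate_oscillator(freq_hz, fs_hz, damping, n_steps):
    """Iterate x_t = damping * R(omega) x_{t-1} from x_0 = (1, 0), without noise."""
    omega = 2.0 * math.pi * freq_hz / fs_hz  # assumed rotation angle per sample
    c, s = math.cos(omega), math.sin(omega)
    x1, x2 = 1.0, 0.0
    states = [(x1, x2)]
    for _ in range(n_steps):
        # Apply the 2x2 rotation matrix, then shrink by the damping factor.
        x1, x2 = damping * (c * x1 - s * x2), damping * (s * x1 + c * x2)
        states.append((x1, x2))
    return states


def amplitude_and_phase(state):
    """Read the pair as real and imaginary parts of a complex number."""
    x1, x2 = state
    return math.hypot(x1, x2), math.atan2(x2, x1)
```

      Starting from (1, 0), the state at step t is damping^t · (cos ωt, sin ωt): the second element is the first delayed by a quarter cycle, so the pair behaves like the real and imaginary parts of an analytic signal whose modulus is the instantaneous amplitude.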

      Also, relatedly, I don’t seem to have access to the SI which is supposed to explain this. It doesn’t show up in the bioRxiv preprint either.

      We are sorry to hear that. bioRxiv merges all the supporting information and posts it under the Supplementary Material.

      In Eq(1) should it be R(f) instead of R(2 \pi f / f_s) ?

      Thank you for catching this typo.

      As I understand from lines 182-195, the input for the method is not channels but PCA components. Since R is learned, presumably the variance of the lower-order PCs (i.e. the latest elements of the diagonal of R) will be estimated to be small. This, in turn, would make the likelihood heavily weighted towards these components (because one basically divides their contribution by their variance). Would this potentially bias the estimation towards these lower-order PCs, at the expense of higher-order PCs? In a different context, this is shown here: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008580 Maybe it would be worth commenting on this?

      We agree with the reviewer’s initial observations but disagree with the assessment. Our log-likelihood calculation reweights the components appropriately to counter the weighting introduced by spatial whitening, thus negating the above-mentioned bias. The main contribution of the spatial whitening and PCA is to make the learning numerically stable, i.e., it does not encounter underflow or overflow in the iterative steps. We also note that the spatial whitening and the PCA are reverted at the end to obtain the spatial components and estimated noise covariance. So, as long as we use all the components with strictly positive variances, we will not bias the log-likelihood one way or the other.
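
      The general principle behind this argument can be checked numerically: for a Gaussian, the log-likelihood evaluated after an invertible linear transform (such as PCA plus whitening with all components retained) equals the original log-likelihood once the log-determinant of the transform is added back. The Python sketch below verifies this identity for a 2×2 case. It is our own illustration of the change-of-variables rule; whether the reweighting in the OCA code takes exactly this form is an assumption, and all function names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def matvec(a, v):
    return [a[i][0] * v[0] + a[i][1] * v[1] for i in range(2)]


def gauss_logpdf(x, cov):
    """Log density of a zero-mean 2-D Gaussian with covariance cov."""
    d = det(cov)
    inv = [[cov[1][1] / d, -cov[0][1] / d],
           [-cov[1][0] / d, cov[0][0] / d]]
    ix = matvec(inv, x)
    quad = x[0] * ix[0] + x[1] * ix[1]
    return -0.5 * (2.0 * math.log(2.0 * math.pi) + math.log(d) + quad)


def logpdf_via_transform(x, cov, w):
    """Evaluate in transformed coordinates y = w x, then add log|det w|."""
    y = matvec(w, x)
    cov_y = matmul(matmul(w, cov), transpose(w))
    return gauss_logpdf(y, cov_y) + math.log(abs(det(w)))
```

      For any invertible `w`, `logpdf_via_transform(x, cov, w)` agrees with `gauss_logpdf(x, cov)` up to floating-point error, so rescaling low-variance components does not change which parameters the likelihood favours. The identity fails only if components are discarded, which matches the stated condition of keeping all components with strictly positive variance.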

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This study identifies new types of interactions between Drosophila gustatory receptor neurons (GRNs) and shows that these interactions influence sensory responses and behavior. The authors find that HCN, a hyperpolarization-activated cation channel, suppresses the activity of GRNs in which it is expressed, preventing those GRNs from depleting the sensillum potential, and thereby promoting the activity of neighboring GRNs in the same sensilla. HCN is expressed in sugar GRNs, so HCN dampens the excitation of sugar GRNs and promotes the excitation of bitter GRNs. Impairing HCN expression in sugar GRNs depletes the sensillum potential and decreases bitter responses, especially when flies are fed on a sugar-rich diet, and this leads to decreased bitter aversion in a feeding assay. The authors' conclusions are supported by genetic manipulations, electrophysiological recordings, and behavioral assays.

      Strengths:

      (1) Non-synaptic interactions between neurons that share an extracellular environment (sometimes called "ephaptic" interactions) have not been well-studied, and certainly not in the insect taste system. A major strength of this study is the new insight it provides into how these interactions can impact sensory coding and behavior.

      We appreciate the reviewer’s view that our findings may allow researchers to better understand sensory coding and behavior. However, we respectfully disagree that the SP homeostasis in Drosophila gustation we describe here pertains to ephaptic interaction. Although SP reduction was proposed as the basis of post-ephaptic hyperpolarization in Drosophila olfaction, we find that SP changes are too slow to mediate the fast action of ephaptic inhibition in gustation reported in ref#17. We observed a slow, sweet-dependent SP depletion (Fig. 5B, revised), which takes more than one hour. The real-time change of SP was also slow even upon contact with 200-mM sucrose; this result was set aside for another manuscript in preparation. Therefore, we believe the main findings in this paper concern the homeostatic preservation of SP for the maintenance of gustatory function, not ephaptic interaction.

      (2) The authors use many different types of genetic manipulations to dissect the role of HCN in GRN function, including mutants, RNAi, overexpression, ectopic expression, and neuronal silencing. Their results convincingly show that HCN impacts the sensillum potential and has both cell-autonomous and nonautonomous effects that go in opposite directions. There are a couple of conflicting or counterintuitive results, but the authors discuss potential explanations.

      (3) Experiments comparing flies raised on different food sources suggest an explanation for why the system may have evolved the way that it did: when flies live in a sugar-rich environment, their bitter sensitivity decreases, and HCN expression in sugar GRNs helps to counteract this decrease.

      Weaknesses/Limitations:

      (1) The genetic manipulations were constitutive (e.g. Ih mutations, RNAi, or misexpression), and depleting Ih from birth could lead to compensatory effects that change the function of the neurons or sensillum. Using tools to temporally control Ih expression could help to confirm the results of this study.

      We attempted to address this point by using the tub-Gal80ts system. The result is now included as Fig. 1-figure supplement 2. At 29°C, a non-permissive temperature for GAL80ts which allows GAL4-dependent expression of Ih-RNAi, we observed that bGRN responses were decreased and sGRN responses were increased compared to the control maintained at 18°C, consistent with the result in Fig. 1C,D. For this experiment, we inserted “To exclude the possibility that Ih is required for normal gustatory development, we temporally controlled Ih RNAi knockdown to occur only in adulthood, which produced similar results (Fig. 1-figure supplement 2).” (~line 113).

      (2) The behavioral experiment shows a striking loss of bitter sensitivity, but it was only conducted for one bitter compound at one concentration. It is not clear how general this effect is. The same is true for some of the bitter GRN electrophysiological experiments that only tested one compound and concentration.

      We conducted additional behavioral experiments with other bitters such as lobeline and theophylline (Fig. 5-figure supplement 1), which showed sensitivity losses in Ih mutants similar to caffeine. For these results, the following is inserted at ~line 274: “These results were recapitulated with other bitters, lobeline and theophylline (Fig. 5-figure supplement 1).”

      We also added single sensillum recording data with bitters, berberine, lobeline, theophylline and umbelliferone, which yielded results similar to those obtained with caffeine (Fig. 1-figure supplement 1). This is described with the sentence at ~line 105 “Other bitter chemical compounds, berberine, lobeline, theophylline, and umbelliferone, also required Ih for normal bGRN responses (Fig. 1-figure supplement 1).”

      (3) Several experiments using the Gal4/UAS system only show the Gal4/+ control and not the UAS/+ control (or occasionally neither control). Since some of the measurements in control flies seem to vary (e.g., spiking rate), it is important to compare the experimental flies to both controls to ensure that any observed effects are in fact due to the transgene expression.

      We thank the reviewers for raising this point. Indeed, there was a small logical flaw with the controls. We have now included all the necessary controls for Fig. 1C-F, Fig. 2I,J, Fig. 4E, and Fig. 5D, as the reviewers suggested. These experiments remained statistically significant after including the new control groups.

      (4) I was surprised that manipulations of sugar GRNs (e.g. Ih knockdown, Gr64a-f deletion, or Kir silencing) can impact the sensillum potential and bitter GRN responses even in experiments where no sugar was presented.

      We are afraid there is a misunderstanding of the early part of the paper. We suspected that the manipulations impacted bGRNs and SP due to the sweetness in the regular cornmeal food, as stated in lines 214-220: “Typically, we performed extracellular recordings on flies 4-5 days after eclosion, during which they were kept in a vial with fresh regular cornmeal food containing ~400 mM D-glucose. The presence of sweetness in the food would impose long-term stimulation of sGRNs, potentially requiring the delimitation of sGRN excitability for the homeostatic maintenance of gustatory functions. To investigate this possibility, we fed WT and Ihf03355 flies overnight with either non-sweet sorbitol alone (200 mM) or a sweet mixture of sorbitol (200 mM) + sucrose (100 mM).”

      I believe the authors are suggesting that the effects of sugar GRN activity (e.g., from consuming sugar in the fly food prior to the experiment) can have long-lasting effects, but it wasn't entirely clear if this is their primary explanation or on what timescale those long-lasting effects would occur. How much / how long of a sugar exposure do the flies need for these effects to be triggered, and how long do those effects last once sugar is removed?

      We attempted to address this point with additional experiments (Fig. 5A,B). A reduction of SP could be observed to similar degrees in WT and HCN-deficient mutants 1 hr after the flies were transferred from non-sweet sorbitol-containing vials to sweet sucrose-containing ones. Moreover, the mutants, but not WT, showed further depression of SP when the sweetness persisted in the media for 4 hrs and overnight. This exposure to sweetness for longer than 1 hr may simulate feeding on the regular sweet cornmeal food. The recovery of SP was also tested by removing flies from the sweet media after overnight sweet exposure and placing them on sorbitol food. SPs of WT and the mutants recovered to similar levels 1 hr after the animals were separated from sweetness, although the HCN-lacking mutants showed much lower SP right after overnight sweetness exposure. The unimpaired recovery of the mutants suggests that HCN is not involved in generating the transepithelial potential itself. Therefore, regardless of HCN, SP changes are not fast even in the presence of strong sweetness, and SP is much better guarded when sGRNs express HCN in a sweet environment.

      We inserted the following at ~line 260 to describe the newly added recovery experiment: “Following overnight sweet exposure, SPs of WT and Ihf03355 were recovered to similar levels after 1-hr incubation with sorbitol only food. However, it was after 4 hrs on the sorbitol food that the two lines exhibited SP levels similar to those achieved by overnight incubation with sorbitol only food (Fig. 5B). These results indicate that SP depletion by sweetness is a slow process, and that the dysregulated reduction and recovery of SPs in Ihf03355 manifest only after long-term conditioning with and without sweetness, respectively.”.

      (5) The authors mention that HCN may impact the resting potential in addition to changing the excitability of the cell through various mechanisms. It would be informative to record the resting potential and other neuronal properties, but this is very difficult for GRNs, so the current study is not able to determine exactly how HCN affects GRN activity.

      On this point, we cannot but rely on previous studies of biophysical and electrophysiological characterization on mammalian HCN channels and a heterologous expression study that revealed a robust hyperpolarization-activated cation current from Drosophila HCN channels (PMID: 15804582).

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors start by showing that HCN loss-of-function mutation causes a decrease in spiking in bitter GRNs (bGRN) while leaving sweet GRN (sGRN) response in the same sensillum intact. They show that a perturbation of HCN channels in sweet-sensing neurons causes a similar decrease while increasing the response of sugar neurons. They were also able to rescue the response by exogenous expression. Ectopic expression of HCN in bitter neurons had no effect. Next, they measure the sensillum potential and find that sensillum potential is also affected by HCN channel perturbation. These findings lead them to speculate that HCN in sGRN increases sGRN spiking which in turn affects bGRNs. To test this idea, they carried out multiple perturbations aimed at decreasing sGRN activity. They found that decreasing sGRN activity by either using receptor mutant or by expressing Kir (a K+ channel) in sGRN increased bGRN responses. These responses also increase the sensillum potential. Finally, they show that these changes are behaviorally relevant as conditions that increase sGRN activity decrease avoidance of bitter substances.

      Strengths:

      There is solid evidence that perturbation of sweet GRNs affects bitter GRN in the same sensillum. The measurement of transsynaptic potential and how it changes is also interesting and supports the authors' conclusion.

      Weaknesses:

      The ionic basis of how perturbation in GRN affects the transepithelial potential which in turn affects the second neuron is not clear.

      We speculate that HCN-dependent membrane potential regulation, rather than ionic composition change, is responsible for the observed SP preservation, as further discussed in our response in the “Recommendations for the authors” section. The transepithelial potential can be dissipated by increased conductance through receptor-linked ion channels following gustatory receptor activation in GRNs. The volume of the sensillum lymph is very small according to electron micrographs of horizontally sliced bristles (PMID: 11456419). Therefore, robust excitation of a gustatory neuron may easily deplete the extracellular potential built up in the form of polarized ion concentrations across the tight junction. When the consumption is too strong and extended, the neighboring neuron, which shares the TEP with the activated GRN, can be negatively affected. We propose that HCN suppresses overexcitation of sGRNs by means of membrane potential stabilization. This stabilization prevents sGRNs from excessively reducing the TEP, thereby protecting the activity of neighboring bGRNs.

      Reviewer #3 (Public Review):

      Ephaptic inhibition between neurons housed in the same sensilla has been long discovered in flies, but the molecular basis underlying this inhibition is underexplored. Specifically, it remains poorly understood which receptors or channels are important for maintaining the transepithelial potential between the sensillum lymph and the hemolymph (known as the sensillum potential), and how this affects the excitability of neurons housed in the same sensilla.

      Although a reduction of sensillum potential was proposed to underlie membrane hyperpolarization of post-ephaptic olfactory neurons in Drosophila, our preliminary data (not shown due to a manuscript in preparation) and the results included in the paper (Fig. 5B) strongly suggest that SP reduction is not a requisite for ephaptic inhibition at least in GRNs. Ephaptic inhibition is expected to be instantaneous, whereas we find that SP reduction in gustation is very slow. Therefore, we would like to indicate that the findings we report in this manuscript are not directly related to ephaptic inhibition.

      Lee et al. used single-sensillum recordings (SSR) of the labellar taste sensilla to demonstrate that the HCN channel, Ih, is critical for maintaining sensillum potential in flies. Ih is expressed in sugar-sensing GRNs (sGRNs) but affects the excitability of both the sGRNs and the bitter-sensing GRNs (bGRNs) in the same sensilla. Ih mutant flies have decreased sensillum potential, and bGRNs of Ih mutant flies have a decreased response to the bitter compound caffeine. Interestingly, ectopic expression of Ih in bGRNs also increases sGRN response to sucrose, suggesting that Ih-dependent increase in sensillum potential is not specific to Ih expressed in sGRNs. The authors further demonstrated, using both SSR and behavior assays, that exposure to sugars in the food substrate is important for the Ih-dependent sensitization of bGRNs. The experiments conducted in this paper are of interest to the chemosensory field. The observation that Ih is important for the activity in bGRNs albeit expressed in sGRNs is especially fascinating and highlights the importance of non-synaptic interactions in the taste system.

      Despite the interesting results, this paper is not written in a clear and easily understandable manner. It uses poorly defined terms without much elaboration, contains sentences that are borderline unreadable even for those in the narrower chemosensory field, and many figures can clearly benefit from more labeling and explanation. It certainly needs a bit of work.

      We would like to revise the language aspect of the manuscript after finalizing the scientific revision.

      Below are the major points:

      (1) Throughout the paper, it is assumed that Ih channels are expressed in sugar-sensing GRNs but not bitter-sensing GRNs. However, both this paper and citation #17, another paper from the same lab, contain only circumstantial evidence for the expression of Ih channels in sGRNs. A simple co-expression analysis, using the Ih-T2A-GAL4 line and Gr5a-LexA/Gr66a-LexA line, all of which are available, could easily demonstrate the co-expression. Including such a figure would significantly strengthen the conclusion of this paper.

      We did conduct confocal imaging with Ih-T2A-Gal4 in combination with GRN Gal4s (ref#17 version2). The expression is very broad, including both neurons and non-neuronal cells. We observed much stronger sGRN expression than bGRN expression, but the promiscuous expression of the reporter in many cells hindered us from clearly demonstrating the absence of the reporter in bGRNs. However, the functional and physiological examination of Ih-T2A-Gal4 with neuronal modifiers such as TRPA1 and Kir2.1 in ref#17 indicates strong expression of Ih in sGRNs and little expression in bGRNs. Furthermore, the RNAi knockdown results present another line of evidence that HCN expressed in sGRNs regulates SP and bGRN activity (Fig. 1C,D, Fig. 1-figure supplement 2). Ih-RNAi expression in bGRNs did not result in any statistically significant changes in the activities of sGRNs and bGRNs compared to controls (Fig. 1C,D, revised), supporting our claim that Ih acts in sGRNs for the functional homeostasis of SP and GRNs.

      (2) Throughout this paper, it is often unclear which class of labellar taste sensilla is being recorded. S-a, S-b, I-a, and I-b sensilla all have different sensitivities to bitters and sugars. Each figure should clearly indicate which sensilla is being recorded. Justification should be provided if recordings from different classes of sensilla are being pooled together for statistics.

      We mainly performed SSR (single sensillum recording) on i-type bristles as they have the simplest composition of GRNs compared to s- and L-type bristles. As each s-type bristle also contains both an sGRN and a bGRN, we measured SP for s-types as well (Figs. 2, 3F and 4D). In the case of Fig. 3-figure supplement 1, L-types were tested for the relationship between water cell activity and SP. Now all the panels are labelled with the tested bristle types.

      (3) In many figures, there is a lack of critical control experiments. Examples include Figures 1C-F (lacking UAS control), Figure 2I-J (lacking UAS control), Figure 4E (lacking the UAS and GAL4 control, and it is also strange to compare Gr64f > RNAi with Gr66a > RNAi, instead of with parental GAL4 and UAS controls.), and Figure 5D (lacking UAS control). Without these critical control experiments, it is difficult to evaluate the quality of the work.

      Thank you for pointing this out. We appreciate the feedback and have addressed these concerns by including all the requested controls in the figures. Specifically, we have added the UAS controls for Figs 1C-F and 2I-J, as well as the UAS and GAL4 controls for Fig. 4E. We have also included the UAS control for Fig. 5D.

      (4) Figure 2A could benefit from more clarification about what exactly is being recorded here. The text is confusing: a considerable amount of text is spent on explaining the technical details of how SP is recorded, but very little text about what SP represents, which is critical for the readers. The authors should clarify in the text that SP is measuring the potential between the sensillar lymph, where the dendrites of GRNs are immersed, and the hemolymph. Adding a schematic figure to show that SP represents the potential between the sensillar lymph and hemolymph would be beneficial.

      SP was defined at lines 55-56 in the first paragraph of the introduction, which also contains the background information for SP as a transepithelial potential. As the reviewer suggested, we have now also included a sentence describing SP (“SP is known as a transepithelial potential between the sensillum lymph and the hemolymph, generated by active ion transport through support cells”, line 126) and a drawing to illustrate the concept of SP (Fig. 2A), and revised the legend.

      (5) The sGRN spiking rate in Figure 4B deviates significantly from previous literature (Wang, Carlson, eLife 2022; Jiao, Montell PNAS 2007, as examples), and the response to sucrose in the control flies is not dosage-dependent, which raises questions about the quality of the data. Why are the responses to sucrose not dosage-dependent? The responses are clearly not saturated at these (10 mM to 100 mM) concentrations.

      Our recordings show different spiking frequencies from others’ work because the frequencies are computed over 5-sec bins rather than over only the first 0.5 sec. This lowers the frequencies, as spikes are relatively more frequent at the beginning of the recording (Fig. 4-figure supplement 1).
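
      The effect of the analysis window can be illustrated with a short Python sketch that computes the mean spike rate of the same spike train over two different windows. The spike train below is synthetic and chosen only to mimic adaptation (dense early spikes, sparse later ones); it is not data from this study, and `rate_in_window` is a hypothetical helper.

```python
def rate_in_window(spike_times, start, stop):
    """Mean firing rate (spikes/s) for spikes with start <= t < stop."""
    if stop <= start:
        raise ValueError("stop must be greater than start")
    count = sum(1 for t in spike_times if start <= t < stop)
    return count / (stop - start)


# Synthetic adapting train: 20 spikes in the first 0.5 s,
# then 30 spikes spread over the remaining 4.5 s.
early = [0.025 * i for i in range(20)]
late = [0.5 + 0.15 * i for i in range(30)]
spikes = early + late

rate_first_half_second = rate_in_window(spikes, 0.0, 0.5)
rate_first_five_seconds = rate_in_window(spikes, 0.0, 5.0)
```

      For this made-up train, the first 0.5 s yields 40 spikes/s whereas the full 5-s window yields 10 spikes/s, showing how a longer window lowers the reported frequency when firing adapts.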

      Why are the responses to sucrose not dosage-dependent? The responses are clearly not saturated at these (10 mM to 100 mM) concentrations.

      We were also puzzled by the flat dose dependence to sucrose. This result may suggest the existence of another mechanism moderating sucrose responses of sGRNs. This flat curve reappeared with other genotypes over a similar concentration range (5-50 mM) in Fig. 4E. However, 1-mM sucrose produced much lower spiking frequencies (Fig. 4E), suggesting that sGRN responses are saturated at 5 mM sucrose under our recording/analysis conditions.

      (6) In Figure 4C, instead of showing the average spike rate of the first five seconds and the next 5 seconds, why not show a peristimulus time histogram? It would help the readers tremendously, and it would also show how quickly the spike rate adapts to overexpression and control flies. Also, since taste responses adapt rather quickly, a 500 ms or 1 s bin would be more appropriate than a 5-second bin.

      Taste single sensillum recording starts by contacting stimulants, which bars us from recording pre-stimulus responses of GRNs. Therefore, we showed post-stimulus graphs with 1-sec bins (Fig. 4-figure supplement 1) as the reviewer suggested.

      (7) Lines 215 - 220. The authors state that the presence of sugars in the culture media would expose the GRNs to sugar constantly, without providing much evidence. What is the evidence that the GRNs are being activated constantly in flies raised with culture media containing sugars? The sensilla are not always in contact with the food.

      We agree with the reviewer. We replaced “long-term stimulation of sGRNs” with “strong and frequent stimulation of sGRNs for extended period”. The word “long-term” may be interpreted to mean constant.

      (8) Line 223. To show that bGRN spike rates in Ih mutant flies "decreased even more than WT", you need to compare the difference in spike rates between the sorbitol group and the sorbitol + sucrose group, which is not what is currently shown.

      The data were examined by ANOVA and a multiple comparison test (Dunn’s) between all the groups regardless of genotypes and conditions in the panel (all the groups sharing the y axis). Therefore, the differences were statistically examined. However, the cited expression read as though it were about the slope or extent of the decrease. We intended to indicate the difference in the absolute values of spiking frequencies after overnight sweet exposure between the genotypes, while bGRN activities were not significantly different between WT and Ih mutants when they were kept only on sorbitol food. We revised it to “decreased to the level significantly lower than WT”. We also changed the graph style to effectively present the trend of changes in bGRN sensitivity with comparison between genotypes. Again, the groups were statistically examined together regardless of the genotypes and conditions.

      (9) To help readers better understand the proposed mechanisms here, including a schematic figure would be helpful. This should show where Ih is expressed, how Ih in sGRNs impacts the sensillum potential, how elevated sensillum potential increases the electrical driving force for the receptor current, and affects the excitability of the bGRNs in the same sensilla, and how exposure to sugar is proposed to affect ion homeostasis in the sensillum lymph.

      As the reviewer suggested, we included two panels to show a working model for gustatory homeostasis via SP maintenance by HCN (Fig. 5E,F).

      Reviewer #1 (Recommendations For The Authors):

      (1) The relationship between this paper and the authors' bioRxiv preprint posted last year is not clear. In the introduction they made it seem like this paper is a follow-up that builds on the preprint, but most or all of the experiments in this paper were already performed in the preprint. I guess the authors are planning to divide the original paper into two papers. I would suggest updating the preprint to avoid confusion.

      Thank you for the comment. We updated the preprint to remove part of Fig. 6 and all of Fig. 7, along with the associated text. As the reviewer pointed out, our eLife paper was spun off from part of the preprint, because we feel that the two stories could confuse readers when presented together.

      (2) Have the authors considered testing responses of water GRNs? They reside in the same sensilla as sugar neurons, so are they also affected by Ih mutation or RNAi in sugar neurons? This would strengthen the evidence that the indirect (non-cell autonomous) effects of Ih are due to the sensillum potential and not some specific interaction between sweet and bitter cells.

      As the reviewer proposed, we assessed water GRN activity in the L-type bristles of WT, Ihf03355 and a genomic rescue line for Ihf03355. Spiking responses in water GRNs were evoked by the hypo-osmolarity of the electrolyte (0.1 mM tricholine citrate, TCC). Interestingly, the Ih mutant showed reduced 0.1 mM TCC-provoked spiking frequencies compared to WT. This impairment was rescued by the genomic fragment containing an intact Ih locus (Figure 3-figure supplement 1A).

      Additionally, SPs in L-type bristles were reduced by Ih deficiencies but increased in Gr64af, suggesting that HCN regulates sGRNs in L-type bristles as well (Figure 3-figure supplement 1B). Again, the bristles of animals with both mutations together exhibited SPs similar to those of WT.

      Furthermore, when we conducted cDNA rescue experiments in L bristles, introduction of Ih-RF cDNA in sGRNs restored SPs, while expressing it in bGRNs did not, unlike the results from the i- and s-bristles (Fig. 2K,L), likely because L-bristles lack bGRNs. These cDNA rescue and genetic interaction experiments were conducted using flies fed on fresh cornmeal food with strong sweetness, suggesting that the sweetness in the media is the likely key factor producing the genetic interaction and necessitating HCN, consistent with other results in the manuscript. Therefore, SP regulation by HCN is also observed in the L-type bristles.

      Minor comments:

      Line 52: typo, "Many of"

      Thank you. Corrected

      Line 95: typo, "sensilla do an sGRN"

      Corrected

      Line 98: typo, "we observed reduced the spiking responses"

      Corrected

      Line 206: typo, "a relatively low sucrose concentrations"

      Corrected

      Line 260: "inverse relationship between the two GRNs in excitability" - I am not exactly sure what data you are referring to.

      Although alleles did not show increased sGRN activities, knockdown of Ih decreased bGRN activity but increased sGRN activity (Fig. 1C,D, Fig.1-figure supplement 2B), while suppression of sGRNs increased bGRN activity (Fig. 3). To clarify this point, we revised the phrase to “the inverse relationship between the two GRNs in excitability observed in Fig. 1C,D, Fig. 1-figure supplement 2B, and Fig. 3”.

      Methods: typo, "twenty of 3-5 days with 10 males and 10 females"

      Corrected to “Twenty flies, aged 3-5 days and consisting of 10 males and 10 females,”

      Methods: typo, "Kim's wipes" should be "Kimwipes"

      Corrected

      Reviewer #2 (Recommendations For The Authors):

      (1) More clarification is necessary on Transepithelial potential (TEP). TEP is typically created by having pumps and tight junctions between the sensillar lymph and the hemolymph.

      We have an introduction to TEP or SP in the context of sensory functions (lines 40-57) with relevant references. The involvement of pumps and tight junction was mentioned in the same paragraph; “Glia-like support cells exhibit close physical association with sensory receptor neurons, and conduct active transcellular ion transport, which is important for the operation of sensory systems” (line 40) and “Tight junctions between support cells separate the externally facing sensillar lymph from the internal body fluid known as hemolymph” (line 53).

      It is not clear how HCN channels in one of the neurons might change the composition of the sensillum lymph. An explanation of their model of how TEP depends on HCN is necessary.

      Although the ionic composition of the sensillum lymph is a contributing factor to the sensillum potential, it is more conceptually relevant to describe our findings with the perspective of membrane potential regulation given the role of HCN in membrane potential stabilization as discussed in our manuscript.

      We speculate that HCN controls the membrane potential at rest and/or in motion to modulate sGRN activity towards saving SP despite the sweetness in the niche. We positioned our results in relation to SP in the discussion; “Our results provide multiple lines of evidence that HCN suppresses HCN-expressing GRNs, thereby sustaining the activity of neighboring GRNs within the same sensilla. We propose that this modulation occurs by restricting SP consumption through HCN-dependent neuronal suppression rather than via chemical and electrical synaptic transmission.” (lines 252-255). Moreover, it is unclear whether HCN is localized to the dendrite bathed in the sensillum lymph to influence the ionic composition of the lymph. It would be very interesting to study in the future whether the ionic flow through HCN channels itself is critical for the function of HCN in this context, and whether HCN is exclusively present in the dendrite to support the postulation. However, we would like to remind the reviewer that Kir2.1 and HCN channels in sGRNs showed similar effects on SP and bGRNs, while they differ in Na+ conductance.

      In the initially submitted manuscript (lines 325-343), we discussed the potential mechanism by which Kir2.1 and HCN channels commonly increase SP in terms of how the membrane potential regulation in the soma can control the SP consumption in the dendrite of sGRNs.

      Another point about the TEP that needs some explanation is that these sensilla are open to the environment as tastants must flow in and are different from mechanical sensilla in that sense.

      This is a very important question regarding the general physiology of the taste sensilla, as the sensillum lymph is in contact with the external environment through the pore of the sensillum. It is indeed interesting to consider how the composition and potential of the lymph are maintained despite the relatively vast volume of food the sensilla encounter during gustation and the continuous evaporation to air between episodes of gustation. However, we believe that this question, while important, is distinct from the primary focus of our manuscript.

      Are the TEP measurements in Figure 2 under control conditions where there are no tastants?

      There is no tastant in the SP-measuring glass electrode other than the electrolyte. We apologize that we did not specify the recording electrode condition. We inserted a clause in the method; “For SP recordings, the recording electrode contained 2 mM TCC as the electrolyte, and…”

      Does the TEP change dynamically as sGRN is activated?

      SP does shift in response to sweets. Please see Fig. 5B. Also, we showed SP changes by mechanical stimuli, which depended on the mechanoreceptor, NompC (Fig. 2D-F). Mechanoreceptor neurons share the sensillum lymph with GRNs.

      (2) More clarification on the potential transduction mechanism and how TEP affects one neuron differentially. Essentially, sGRN perturbation affects sGRN activity and it affects the TEP. More explanation is needed for the potential ionic mechanism of each.

      Our results strongly suggest that HCN lowers the activity of HCN-expressing GRNs, mitigating SP consumption. This modulation is crucial because the SP serves as a driving force for neuronal activation within the sensillum. HCN is particularly necessary in sGRNs because of the flies’ sweet feeding niche, which is expected to result in frequent and strong activation of sGRNs. The SP saved by HCN-dependent delimitation of sGRNs can be used to raise the responsiveness of bGRNs.

      (3) The authors refer to their own unreviewed paper (Reference 17). This paper is on a similar topic and there seems to be some overlap. Clarification on this point would be important.

      We revised the bioRxiv preprint, so that version 2 of the preprint does not contain the parts overlapping with this eLife paper. This eLife paper was originally part of the preprint paper, but it was separated to clarify the messages of the two stories. As we explained in the Discussion (lines 276-297), HCN provides resistance to both hyperpolarization and depolarization of the membrane potential. Simply put, one paper focuses on the role of HCN in resisting hyperpolarization, while the other (this paper in eLife) focuses on resisting depolarization.

      (4) Methods are sparse. Many details on the method are necessary. For example, Sensilla recordings are being done by the tip-dip method (I assume). What does "number of experiments" mean in Figure 1? Is it the number of animals or the number of sensilla? How many trials/sensilla?

      We indicated the extracellular recording was performed by the tip-dip method; “In vivo extracellular recordings were performed by the tip-dip method as detailed previously”. We also added a statement on the number of experiments; “The number of experiments indicated in figures are the number of naïve bristles tested. The naïve bristles were from at least three different animals.”

      (5) Figure 1: I understand the author's interpretation. But if one compares WT in Figure 1A to Gr64a-IhRNAi in 1C, we can come to the conclusion that there is no change. In other words, the control in Figure 1C (grey) has a much higher response than WT. Similar conclusions can be made for other experiments. Is the WT response stable enough to make the conclusions made here?

      The genetic background of each genotype may influence GRN activity to some extent. RNAi knockdown experiments are well-known for their hypomorphic nature, and their effects should be evaluated by comparison with their parental controls such as Gal4 and UAS lines. As all reviewers pointed out, we added the results from the UAS control. This effort confirms that Gr89a>Ih RNAi is statistically indistinguishable from the UAS control as well as the Gr64f-Gal4 control in bGRN spiking evoked by 2-mM caffeine, while Gr64f>Ih RNAi showed reduced bGRN responses to 2 mM caffeine compared to all the controls.

      (6) Figure 3: Why is bGRN spiking not plotted against sensillum potential to observe the dependence more directly?

      This is a very interesting suggestion. We are not, however, equipped to measure spiking and sensillum potential simultaneously. Therefore, they are independent experiments, and we treated them accordingly.

      (7) Figure 4: Why bGRN response is only affected at high caffeine concentrations is not clear.

      We were also surprised by the differences in the dose dependence results of b- and sGRNs, genetically manipulated to mis-express and over-express HCN in Fig. 4A and 4E, respectively. Each gustatory neuron likely has distinct sets of players and parameters that set its own membrane potential and excitability.

      One possibility is that there is a range of membrane potentials within which HCN does not engage. In bGRNs, the resting membrane potential may lie low within this range, so that some degree of membrane depolarization by low concentrations of caffeine does not significantly close HCN channels, thus preventing their hyperpolarizing effects. On the other hand, the membrane potential of sGRNs may be high within this range, showing suppressive effects at all tested sucrose concentrations. However, we find this explanation too speculative to include in the main text, while we stated in the original manuscript, “implying a complex cell-specific regulation of GRN excitability.” (line 210).

      (8) Minor:

      L98 - there is a small typo

      Corrected

      L274: "funny" !?

      “Funny” currents, denoted If, were initially observed by electrophysiologists and later attributed to HCN channels, now indicated by Ih (thus the gene name Ih in Drosophila). These currents were termed "funny" due to their unusual properties compared to other currents. For more detailed information, please refer to the cited references.

      L257: Neuropeptide seemed to be abrupt

      We attempted to discuss possible mechanisms that mediate excitability changes across GRNs beyond the mechanism by SP shifts. Neuropeptides, which are chemical neurotransmitters along with small neurotransmitters, were mentioned following the discussion on synaptic transmission to suggest alternative pathways for excitability regulation. This inclusion is meant to provide a comprehensive overview of potential mechanisms influencing GRN activity.

      Reviewer #3 (Recommendations For The Authors):

      Congratulations on your fascinating research! The results are certainly of interest to the chemosensory field. However, I suggest using academic editing services to enhance the clarity of your text and ensure that the terminology and jargon align with standard usage in the field. The current choice of words may not be consistent with commonly used terms. As it is now, the writing might not fully showcase the compelling story and the effort behind your study, and is underselling your interesting results. Proper refinement could make sure your valuable findings are appropriately recognized.

      We appreciate your comments and apologize for any difficulties reviewers faced during the review process. We are currently prioritizing the review of scientific content and plan to address language issues in a subsequent revision. It would be very helpful for future revisions if the problematic sentences or expressions could be indicated in detail after this revision. This will allow us to ensure that our terminology and expression align with standard usage in the field, and that our findings are clearly and effectively communicated.

      Minor points:

      (1) Line 110: what is Ih-RF?

      We apologize that we relied on a reference in describing the cDNA. The following clause was inserted with an additional reference and the Flybase id: “(Flybase id: FBtr0290109), which previously rescued Ih deficiency in other contexts17,26,”

      (2) Line 158: Gr64af mutant flies still have Gr5a and a residual response to fructose and sucrose (Slone, Amrein 2007).

      We revised the line to “is severely impaired in sucrose and glucose sensing”, since there is a substantial loss of sucrose and glucose sensing in both Gr64af from Kim et al 2018 and DGr64 from Slone et al 2007, when they were examined by the proboscis extension reflex assay. This was also confirmed in the study by Jiao et al 2009. We also deleted “sugar-ageusic” and instead describe the mutant “impaired in sucrose and glucose sensing” in Fig. 3 legend.

      (3) Lines 264-273 seem unnecessary. This paper is not about the function of HCN in mammals, and these discussions seem largely irrelevant.

      We feel that it is important to position our results within a broader context by discussing the potential implications of our findings for sensory systems of other animals. As we stated, HCN channels have been localized in mammalian sensory systems, but their roles are often not well understood. By including this discussion, we aim to highlight the relevance of our findings beyond the model organism used in our study and suggest possible areas for future research in mammalian systems.

    1. Author response:

      We would like to 1) respond to one comment from the public review, which is also related to the eLife assessment, and 2) give provisional author responses.

      (1) Regarding the definition of the colonization-extinction rate, the first reviewer may have misunderstood it: “However, there does not need to be a temporal trend! Any warm-adapted species that colonizes a site has a positive net effect on CTI; similarly, any cold-adapted species that goes extinct contributes to thermophilization.” We clarify the definition here:

      In a single iteration of our MSOM (multi-species occupancy model), the occupancy rate of species[n] in transect[i] from year[t-1] to year[t] is related to the colonization rate and extinction rate, and is defined as:
      muz[n,i,t] = z[n,i,t-1]*(1-eps[n,i,t-1]) + (1-z[n,i,t-1])*gam[n,i,t-1]
      (also shown at line 411 of our manuscript).

      If the colonization rate (gam) and extinction rate (eps) remain constant, the occupancy rate (muz) will be a constant number that depends only on the true occupancy state in the previous year (0 or 1). The occupancy rate will only increase if the colonization rate increases (or the extinction rate decreases). That is why we consider the temporal trend in the colonization/extinction rates.
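
      The argument above can be checked numerically with a minimal sketch of the occupancy equation. The function name `occupancy_prob` and all rate values below are hypothetical, chosen only for illustration; they are not estimates from our model.

```python
def occupancy_prob(z_prev, eps, gam):
    """muz = z_prev*(1-eps) + (1-z_prev)*gam for one species, transect and year.

    z_prev: true occupancy state in the previous year (0 or 1)
    eps:    extinction rate, gam: colonization rate
    """
    return z_prev * (1 - eps) + (1 - z_prev) * gam

# Hypothetical constant rates: muz takes only two values, set by z_prev.
eps, gam = 0.2, 0.1
muz_if_occupied = occupancy_prob(1, eps, gam)    # equals 1 - eps
muz_if_unoccupied = occupancy_prob(0, eps, gam)  # equals gam

# Hypothetical rising colonization rate across three years:
# muz for a previously unoccupied transect rises with gam.
gam_by_year = [0.10, 0.15, 0.20]
muz_trend = [occupancy_prob(0, eps, g) for g in gam_by_year]

print(muz_if_occupied, muz_if_unoccupied, muz_trend)
```

      Under these assumed values, constant gam and eps give the same muz every year for a given previous state, whereas muz for an unoccupied transect increases only when gam increases over time.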

      (2) Provisional author responses:

      We will revise and improve the manuscript according to the public reviews and mainly focus on:

      (1) clarify the general definition of habitat fragmentation in the Introduction.

      (2) provide a wider perspective about how our results can be applied to conservation biology in the Discussion.

      (3) discuss the diversity of isolation metrics for future research and provide more evidence about the link between larger areas and higher habitat diversity or heterogeneity.

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      The authors isolated and cultured pulmonary artery smooth muscle cells (PASMC) and pulmonary artery adventitial fibroblasts (PAAF) of the lung samples derived from the patients with idiopathic pulmonary arterial hypertension (PAH) and the healthy volunteers. They performed RNA-seq and proteomics analyses to detail the cellular communication between PASMC and PAAF, which are the main target cells of pulmonary vascular remodeling during the pathogenesis of PAH. The authors revealed that PASMC and PAAF retained their original cellular identity and acquired different states associated with the pathogenesis of PAH, respectively.

      Strengths:

      Although previous studies have shown that PASMC and PAAF cells each have an important role in the pathogenesis of PAH, there have been scarce reports focusing on the interactions between PASMC and PAAF. These findings may provide valuable information for elucidating the pathogenesis of pulmonary arterial hypertension.

      We appreciate the reviewer’s positive view of our study.

      Weaknesses:

      The results of proteome analysis using primary culture cells in this paper seem a bit insufficient to draw conclusions. In particular, the authors described "We elucidated the involvement of cellular crosstalk in regulating cell state dynamics and identified pentraxin-3 and hepatocyte growth factor as modulators of PASMC phenotypic transition orchestrated by PAAF." However, the presented data are considered limited and insufficient.

      We thank the reviewer for drawing our attention to this point and we will modify our statements and conclusions accordingly, in order to avoid making too general and broad claims.

      Reviewer #2 (Public Review):

      Summary:

      Utilizing a combination of transcriptomic and proteomic profiling as well as cellular phenotyping from source-matched PASMC and PAAFs in IPAH, this study sought to explore a molecular comparison of these cells in order to track distinct cell fate trajectories and acquisition of their IPAH-associated cellular states. The authors also aimed to identify cell-cell communication axes in order to infer mechanisms by which these two cells interact and depend upon external cues. This study will be of interest to the scientific and clinical communities of those interested in pulmonary vascular biology and disease. It also will appeal to those interested in lung and vascular development as well as multi-omic analytic procedures.

      We thank the reviewer for the very positive assessment of our study.

      Strengths:

      (1) This is one of the first studies using orthogonal sequencing and phenotyping for the characterization of source-matched neighboring mesenchymal PASMC and PAAF cells in healthy and diseased IPAH patients. This is a major strength that allows for direct comparison of neighboring cell types and the ability to address an unanswered question regarding the nature of these mesenchymal "mural" cells at a precise molecular level.

      We value the reviewer’s kind and objective summary of our study.

      (2) Unlike a number of multi-omic sequencing papers that read more as an atlas of findings without structure, the inherent comparative organization of the study and presentation of the data were valuable in aiding the reader in understanding how to discern the distinct IPAH-associated cell states. As a result, the reader not only gleans greater insight into these two interacting cell types in disease but also now can leverage these datasets more easily for future research questions in this space.

      We thank the reviewer for this highly positive comment.

      (3) There are interesting and surprising findings in the cellular characterizations, including the low proliferative state of IPAH-PASMCs as compared to the hyperproliferative state in IPAH-PAAFs. Furthermore, the cell-cell communication axes involving ECM components and soluble ligands provided by PAAFs that direct cell state dynamics of PASMCs offer some of the first and foundational descriptions of what are likely complex cellular interactions that await discovery.

      We agree with the reviewer’s assessment that some of the novel data in our study help to formulate testable hypotheses that can be followed up in future research.

      (4) Technical rigor is quite high in the -omics methodology and in vitro phenotyping tools used.

      We are grateful for the reviewer’s recognition and positive assessment of our work.

      Weaknesses:

      There are some weaknesses in the methodology that should temper the conclusions:

      (1) The number of donors sampled for PAAF/PASMCs was small for both healthy controls and IPAH patients. Thus, while the level of detail of -omics profiling was quite deep, the generalizability of their findings to all IPAH patients or Group 1 PAH patients is limited.

      We share the reviewer’s concerns regarding the generalizability of the findings. Indeed, the initial number of samples used for the omics study (n=4 in each group) was limited due to the unique setup of using source-matched cells from the same pulmonary artery. While we included additional samples in our phenotypic assays (n=6), which further confirmed our findings, we will acknowledge the small number of samples in the revised manuscript as a limiting factor in drawing definite conclusions for all PAH patients.

      (2) While the study utilized early passage cells, these cells nonetheless were still cultured outside the in vivo milieu prior to analysis. Thus, while there is an assumption that these cells do not change fundamental behavior outside the body, that is not entirely proven for all transcriptional and proteomic signatures. As such, the major alterations that are noted would be more compelling if validated from tissue or cells derived directly from in vivo sources. Without such validation, the major limitation of the impact and conclusions of the paper is that the full extent of the relevance of these findings to human disease is not known.

      We thank the reviewer for this constructive and excellent suggestion. Changes induced by ex vivo culturing are a common challenge when working with primary human cells. We agree with the reviewer that the proposed comparison with the publicly available sequencing datasets utilizing fresh samples will provide the readers with sufficient information to more objectively put the findings of our study into perspective.

      (3) While the presentation of most of the manuscript was quite clear and convincing, the terminology and conclusions regarding "cell fate trajectories" throughout the manuscript did not seem to be fully justified. That is, all of the analyses were derived from cells originating from end-stage IPAH, and otherwise, the authors were not lineage tracing across disease initiation or development (which would be impossible currently in humans). So, while the description of distinct "IPAH-associated states" makes sense, any true cell fate trajectory was not clearly defined.

      In accordance with the reviewer’s comment, we will choose the wording more carefully in order to better reflect our findings.

    1. Author response:

      Reviewer #1 (Public Review):

      Weaknesses:

      With the exception of the PCR analysis and the reporter assays, the manuscript does not contain any experiments or attempts to analyze current expression from any of the identified proviruses. No long-read RNASeq or other RNA analysis on cytoplasmic RNA was performed, nor any experiments to show that proteins are indeed expressed.

      We agree that an investigation of RNA and protein expression from these proviruses would be very interesting, and we hope to do such work in the future to test whether this clade is still actively infecting any primate species. However, we believe that such an investigation is out of the scope of this manuscript, which is focused on the past evolutionary history of these viruses. That said, it is worth noting that we do show evidence for proviral expression at the RNA level in Fig. 6 supplement 1, showing alignment of publicly available rhesus macaque iPSC RNAseq data to the SERV-K1 provirus, including both spliced and full-length viral RNA. Interestingly, there appear to be reads derived from multiple proviruses, as some reads originate from proviruses with large internal deletions, while others derive from full-length proviruses.

      The findings of a potential CTE are interesting, but the sequences that were appended to the reporter construct are much longer than previously identified CTEs. No data were presented to indicate whether this sequence shows similarity to previously identified CTEs and no experiments to show whether this sequence functionally interacts with Nxf1, the protein shown to interact with previously identified bona fide CTEs. Also, since nucleo-cytoplasmic export was not directly analyzed, it remains possible that the sequences that were inserted into the reporter contained splice sites that would allow the RNA to be spliced "downstream" of the GFP gene, allowing the export of a "spliced" GFP mRNA.

      While it is true that the HML8-derived sequences we have tested are much longer than the canonical MPMV CTE and many other known CTEs, there are other reports of elements with CTE-like activity that are much longer and more complex than the MPMV CTE, including one, the MLV PTE, which is ~1400 nt long, even longer than the HML8-derived sequence we have identified. We have compared the MER11 sequence to known CTEs from MPMV, IAP, MusD, MLV, and RSV, as well as the woodchuck hepatitis virus WPRE, which is not a canonical CTE but has been shown to promote nuclear export of RNA; none of these sequences showed any clear sequence similarity to our sequences of interest. We have added a section discussing these questions in some detail (l. 535-547).

      Although the question of what pathway or pathways these elements co-opt is obviously of great interest, we believe it is outside the scope of this manuscript. It is worth noting that a number of cis-acting RNA transport elements do not bind NXF1, either indirectly recruiting NXF1 (IAP RTE), using CRM1 (MLV, WPRE, foamy viruses), or have an unknown mechanism (MusD). We agree that there are potential pitfalls of the reporter system used, and thus have added experiments to directly test the CTE activity of these elements, detailed above.

    1. Author response:

      Reviewer #1 (Public Review):

      This manuscript by Negi et al. investigates the effects of different ubiquitin and ubiquitin-like modifications on the stability of substrate proteins, seeking to provide mechanistic insights into known effects of these modifications on cellular protein abundance. The authors focus on comparative studies of two modifications, ubiquitin and FAT10 (a protein with two ubiquitin-like domains), on a panel of substrate proteins; prior work had established that FAT10-conjugated proteins had lower stability to proteasomal degradation than Ub-modified counterparts.

      Strengths of the work include its integration of data across diverse approaches, including molecular dynamics simulations, solution NMR spectroscopy, and in vitro and cellular stability assays. From these, the authors provide provocative mechanistic insight into the lower stability of FAT10 on its own, and in FAT10-mediated destabilization of substrate proteins in computational and experimental findings. Notably, such destabilization impacts both the tag and tagged proteins, raising some provocative questions about mechanism. The data here are generally compelling, albeit with minor concerns on presentation in parts. Conclusions from this work will be interesting to scientists in several fields, particularly those interested in cellular proteostasis and in vitro protein design / long-range communication.

      The most substantial weakness of this work from my perspective is the specificity of these destabilization effects. In particular, technical challenges of producing bona fide Ub- or FAT10-conjugated substrates with native linkages limit the ability to conduct in vitro studies on exactly the same molecules as being studied in cellular environments. Given some discussion in the manuscript about the importance of linkage location on the specificity of certain tag/substrate interactions, this raises an understandable but unfortunate caveat that needs to be considered more fully both in general and in light of data from other fields (e.g. single molecule pulling) showing site-dependence of comparable effects. I note that these concerns do not impact the caliber of the conclusions themselves, but perhaps suggest an area for caution as to their potential impact at this time.

      We thank the reviewer for the positive assessment. The reviewer has pointed out the caveats regarding producing Ub- and FAT10-conjugated substrates, which we have now mentioned in the discussion on page 35, line 15.

      Reviewer #2 (Public Review):

      "Plasticity of the proteasome-targeting signal Fat10 enhances substrate degradation" is a nice study where the authors have shown the differences between two protein degradation tags namely, FAT10 and ubiquitin. Even though these tags are closely related in terms of folds, they have differential efficiency in degrading the substrates covalently attached to them. The authors have utilised extensive MD simulations combined with biophysics and cell biology to show the structural dynamics these tags provide for proteasomal degradation.

      We thank the reviewer for the positive assessment and suggestions to improve the manuscript quality.

    1. Author response:

      Reviewer #2 (Public Review):

      I have two significant concerns that I believe can be resolved on the timescale of review.

      1) The work identifies substantial thinning in one leaflet. Lipids expand as they thin. Given this, are there too few lipids in this leaflet (which would also indicate thinning)? I would expect their deformations depend strongly on the number-balance of lipids in each leaflet. The authors should check if thinning, and the boundary, is sensitive to inter-leaflet-lipid imbalance.

      We thank Reviewer #2 for this insight, as it led us to evaluate the leaflet tensions in our restrained 2L0J simulation. We found there was an imbalance in the leaflet packing, which we addressed with an extensive set of new simulations and new analysis aimed at generating balanced leaflets.

      See Page 6-8, Appendix Section 1, Appendix – figures 1, 2. We discuss these findings in the new Results section “Protein footprint asymmetry can lead to differential leaflet stresses” and accompanying appendix. Many of the bilayer features in the repacked simulations are consistent with our original submission, but not all. For instance, while we continue to see large tilt immediately around the amphipathic helices in the lower leaflet and little in the upper leaflet, tilts in both leaflets decay to similar values at the box edge (Appendix - figure 2). The degree of membrane pinch along the membrane-protein contact boundaries are less sensitive to the leaflet packing, as demonstrated by the surface heights (Appendix - figure 1).

      Determining the proper change in leaflet count is quite difficult. We are actively extending our continuum model to address questions of differential leaflet strain and coupled lipid tilt, which may allow us to estimate changes in leaflet-count, but this is a significant undertaking beyond the scope of this resubmission.

      2) By constraining the pore to have 2-fold symmetry, the authors remove a large entropic penalty disfavoring such a conformation, and thus presumably disfavoring the negative-Gaussian-curvature it induces. For example, if the free energy surface for the fluctuations were rather flat, and only 1% of the conformations were consistent with 2-fold symmetry, the coupling to NGC may be reduced by -kT log(1%), neglecting enhancement by coupling to NGC. Therefore, I predict that the coupling to NGC would be reduced further were the constraint removed.

      We agree with the reviewer that if the 2-fold states are highly disfavored for entropic or enthalpic reasons, it would directly reduce the coupling to NGC. However, we don’t know the free energy difference between these states, and calculating it from all-atom simulations is difficult and beyond our current scope. While our unrestrained simulations are not converged, they demonstrate that there is a wide range of orientations for the amphipathic helices that are energetically accessible (see Figure 2, Appendix Section 1, and Appendix - figure 4). Still, the DEER data from the Howard lab (Kim et al., 2015) would be better described by further symmetry-broken states with greater inter-AH distances, suggesting that such conformations are not well represented in our equilibrium ensemble.
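
      The size of the entropic term in the reviewer's example can be checked with a few lines of arithmetic. The sketch below is only an illustration of the expression -kT log(fraction); the function name is hypothetical, and the conversion to kcal/mol assumes a temperature of 310 K, which is not stated in the response.

```python
import math

def restraint_penalty_kT(fraction_allowed: float) -> float:
    """Free-energy cost, in units of kT, of restricting a flat
    conformational ensemble to a given fraction of its states."""
    if not 0.0 < fraction_allowed <= 1.0:
        raise ValueError("fraction_allowed must lie in (0, 1]")
    return -math.log(fraction_allowed)

# Reviewer's example: only 1% of conformations are 2-fold symmetric.
penalty_kT = restraint_penalty_kT(0.01)

# Assumption: T = 310 K, used only to express the result in kcal/mol.
K_B_KCAL_PER_MOL_K = 0.0019872041
TEMPERATURE_K = 310.0
penalty_kcal = penalty_kT * K_B_KCAL_PER_MOL_K * TEMPERATURE_K

print(f"{penalty_kT:.2f} kT, about {penalty_kcal:.1f} kcal/mol at {TEMPERATURE_K:.0f} K")
```

      Under the flat-landscape assumption this gives a penalty of roughly 4.6 kT. As the response notes, the real free energy difference between the symmetric and symmetry-broken states is not known.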

      Reviewer #3 (Public Review):

      Helsell et al. use atomistic molecular dynamics simulations to characterize the structural dynamics of the M2 protein together with continuum elastic models to evaluate the energetic cost of the protein-induced bilayer deformations. Using unbiased simulations (without constraints on the protein) they show that the M2 structure is dynamic and that the AH helices are mobile (though they tend to retain their secondary structure), in agreement with experimental observations. Then, using simulations in which the peptide backbone was restrained to the starting structure, they were able to quantitatively characterize the protein-induced bilayer deformations as well as the acyl chain dynamics.

      Both the atomistic simulations and the continuum-based determinations of the bilayer deformation energies are of high quality. The authors are careful to note that their unbiased simulations do not reach equilibrium, and the authors' conclusions are well supported by their results, though some issues need to be clarified.

      1) P. 7: Choice of lipid composition: POPC:POPG:Cholesterol 0.56:0.14:0.3. This lipid composition (or POPC:POPG 0.8:0.2) has been used in a number of experimental studies that the authors use as reference. It differs, however, substantially from the lipid composition of the influenza membrane (Gerl et al., J Cell Biol, 2012; Ivanova et al., ACS Infect Dis, 2015), which is enriched in cholesterol, has a 2:1 ratio of phosphatidylethanolamine to phosphatidylcholine, and almost no PG. The choice of lipid composition is unlikely to impact the authors' major conclusions, but it should be discussed briefly. As noted by Ivanova et al., the lipids of the influenza membrane are enriched in fusogenic lipids. How will that impact the authors' results?

      As noted by the Reviewer, the lipid composition we explored was based on DEER studies from Kathleen Howard. While there is a lot of cholesterol in our simulations, it is lower than the lipidomics papers suggest for the viral membrane (Gerl et al., 2012; Ivanova et al., 2015). We hypothesize that further increasing cholesterol would stiffen the membrane even more and cause the energy differences we report here to become even larger, accentuating our finding. We employ 14% POPG and the Simons lab finds about 14% PS. Chemically these headgroups are similar, but the size and spontaneous curvature difference could be a concern. The same is true of the different intrinsic curvatures of PE versus PC. However, we have not considered spontaneous curvature in our continuum calculations, so we cannot predict how this will influence our results.

      See Appendix - figure 6. We added a new panel to this figure with continuum parameters intended to mimic a high 50 % cholesterol membrane reported for viral coats, and we show that the curvature sensing of symmetry-broken states increases as the cholesterol content increases.

      See Page 25. We added text in the Discussion concerning the difference in lipids found in the virus versus those compositions employed in experiment and here.

      2) The definition of the lipid tilt needs to be revisited. On P. 13 (in the Pdf received for review, the authors do not provide page numbers), the tilt is defined/approximated as "the angle between the presumed membrane normal (aligned with the Z axis of the box) and the vector pointing from each phospholipid's phosphate to the midpoint between the last carbon atoms of the lipid tails." This (equating the normal to the interface with the Z axis of the simulation box) may be an acceptable approximation for the lower leaflet, which is approximately flat, but probably not for the upper leaflet where the interface is curved in the vicinity of the protein. The authors should, at least, discuss the implications of their approximation in terms of their conclusion that there is little lipid tilt in the upper leaflet.

      We agree that our lipid tilt calculations are approximate since we assume the membrane normal points along the z direction. We have now restated this assumption in the Results when we start to discuss tilt. Different models define lipid tilt in different ways, but the work of Deserno defines it with respect to the bilayer mid-plane which is a shared surface for the upper and lower leaflets. Thus, tilt would be moderately impacted in both leaflets. Examining the snapshots at the top of Figure 7, we surmise that the calculated tilts in both leaflets adjacent to the protein would be slightly reduced, leaving the values at the boundary unaffected. Thus, the upper leaflet likely experiences even less tilt than calculated.

      See Page 16. We have added the discussion above to the section on lipid tilt. Also, we have added page numbers to the resubmission.
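
      The tilt definition quoted by the reviewer (the angle between the box Z axis and the vector from each phosphate to the midpoint of the two terminal tail carbons) can be written out as a short calculation. This is a minimal sketch under that stated approximation, not the authors' analysis code; the function name, the argument layout, and the convention of passing a leaflet-specific normal are assumptions made for illustration.

```python
import numpy as np

def lipid_tilt_degrees(phosphate, tail_end_1, tail_end_2, normal=(0.0, 0.0, 1.0)):
    """Tilt angle (degrees) between a fixed membrane normal and the lipid
    director, taken as the vector from the phosphate to the midpoint of the
    two terminal tail carbons. Inputs have shape (3,) or (N, 3).

    `normal` should point from the headgroups into the bilayer core:
    (0, 0, 1) for the lower leaflet, (0, 0, -1) for the upper leaflet.
    """
    phosphate = np.asarray(phosphate, dtype=float)
    midpoint = 0.5 * (np.asarray(tail_end_1, dtype=float)
                      + np.asarray(tail_end_2, dtype=float))
    director = midpoint - phosphate
    n = np.asarray(normal, dtype=float)
    cos_angle = np.sum(director * n, axis=-1) / (
        np.linalg.norm(director, axis=-1) * np.linalg.norm(n)
    )
    # Clip guards against round-off pushing the cosine just outside [-1, 1].
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

# Hypothetical lower-leaflet lipids (coordinates in angstroms):
# one upright, one with tails displaced sideways by the same 18 A.
phosphates = np.array([[0.0, 0.0, -20.0], [0.0, 0.0, -20.0]])
tails_a = np.array([[1.0, 0.0, -2.0], [18.0, 1.0, -2.0]])
tails_b = np.array([[-1.0, 0.0, -2.0], [18.0, -1.0, -2.0]])
tilts = lipid_tilt_degrees(phosphates, tails_a, tails_b)
print(tilts)
```

      Because the normal is fixed along Z, any local curvature of the interface is folded into the reported tilt, which is the approximation discussed in the reviewer's comment and the response.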

      3) P. 14, last paragraph, Figure 5 and 6: The snapshots in Figure 5 are too small to see what the authors refer to when they write "tilt their lipid tails to wrap around the helices." The authors should consider citing the work of H W. Huang, e.g., Huang et al. (PRL, 2004), who introduced the notion of curvature stress induced by antimicrobial peptides, a concept similar to what the present authors propose.

      See Page 17. We have now drawn the connection between what our simulations are showing and the earlier work by Huey Huang on antimicrobial peptides.

      See Figure 7. To make the lipid deformations easier to see, we are attaching the full-size versions of each snapshot to the figure as supplemental data.

      4) P. 17-18, Figure 7: The authors introduce the bilayer midplane, which becomes important for the determination of the deformation energy in the (unnumbered) equation on P. 17, but do not specify how it is determined. This is a non-trivial undertaking, but critical for the evaluation of the deformation energy; please add the necessary details.

      See Pages 15 and 20. In the continuum model, we define CM (the compression surface) following the work of May and colleagues (and other groups) as the areal compression weighted mean of the upper and lower surface. In the MD simulation results in Figure 6, we define leaflet thickness as the absolute difference between the interpolated leaflet hydrophobic surface (calculated using the first carbon atoms of each POPC and POPG lipid tail) and the interpolated bilayer midplane surface (calculated as the average of the upper and lower leaflet tail surfaces, each interpolated based on the last carbon atoms of each POPC and POPG lipid tail for each leaflet, respectively). These two leaflet-based definitions are different, and a more sophisticated continuum model of the upper and lower leaflet coupling would require the incorporation of lipid tilt, which we do not currently have.

      5) P. 18-19, Figure 8: The comparison of the MD and continuum membrane deformations is very informative, but the authors should discuss the implications of the increased symmetry further in terms of the estimated deformation energies. (I do not believe the authors really mean that they predicted the energies, they estimated/approximated them.)

      The Reviewer is correct, we are not predicting the energies of the actual MD generated bilayers, but rather we are estimating the energies of these shapes using a continuum-based approximation. The good agreement between the MD generated surfaces and the continuum predicted surfaces suggested that the model is capturing the underlying physics. We argued that the increased symmetry of the continuum surfaces compared to the MD surfaces was due to incomplete sampling in the MD. We were right about that. Please see revised Figure 10 with new data and some longer simulations, where the symmetry in the MD is now apparent and the match between continuum and MD is even better. Frankly, we are very pleased with these new results.

      See Page 18 and Figure 10. We have changed language throughout moving away from “predicting” to “estimating”. The new MD generated data shows much greater symmetry reflected in the starting structures, and better agreement with model predictions.

      References

      Argudo, D., Bethel, N. P., Marcoline, F. V., Wolgemuth, C. W., & Grabe, M. (2017). New Continuum Approaches for Determining Protein-Induced Membrane Deformations. Biophys J, 112(10), 2159-2172. https://doi.org/10.1016/j.bpj.2017.03.040

      Bethel, N. P., & Grabe, M. (2016). Atomistic insight into lipid translocation by a TMEM16 scramblase. Proc Natl Acad Sci U S A, 113(49), 14049-14054. https://doi.org/10.1073/pnas.1607574113

      Drabik, D., Chodaczek, G., Kraszewski, S., & Langner, M. (2020). Mechanical Properties Determination of DMPC, DPPC, DSPC, and HSPC Solid-Ordered Bilayers. Langmuir, 36(14), 3826-3835. https://doi.org/10.1021/acs.langmuir.0c00475

      Ferreira, T. M., Coreta-Gomes, F., Ollila, O. H., Moreno, M. J., Vaz, W. L., & Topgaard, D. (2013). Cholesterol and POPC segmental order parameters in lipid membranes: solid state 1H-13C NMR and MD simulation studies. Phys Chem Chem Phys, 15(6), 1976-1989. https://doi.org/10.1039/c2cp42738a

      Gerl, M. J., Sampaio, J. L., Urban, S., Kalvodova, L., Verbavatz, J. M., Binnington, B., Lindemann, D., Lingwood, C. A., Shevchenko, A., Schroeder, C., & Simons, K. (2012). Quantitative analysis of the lipidomes of the influenza virus envelope and MDCK cell apical membrane. J Cell Biol, 196(2), 213-221. https://doi.org/10.1083/jcb.201108175

      Henriksen, J., Rowat, A. C., Brief, E., Hsueh, Y. W., Thewalt, J. L., Zuckermann, M. J., & Ipsen, J. H. (2006). Universal behavior of membranes with sterols. Biophys J, 90(5), 1639-1649. https://doi.org/10.1529/biophysj.105.067652

      Hossein, A., & Sodt, A. J. (2023). MembraneAnalysis.jl: A Julia package for analyzing molecular dynamics simulations of lipid membranes. Journal of Open Source Software, 8(87), 5380.

      Hu, M., Briguglio, J. J., & Deserno, M. (2012). Determining the Gaussian curvature modulus of lipid membranes in simulations. Biophys J, 102(6), 1403-1410. https://doi.org/10.1016/j.bpj.2012.02.013

      Ivanova, P. T., Myers, D. S., Milne, S. B., McClaren, J. L., Thomas, P. G., & Brown, H. A. (2015). Lipid composition of viral envelope of three strains of influenza virus - not all viruses are created equal. ACS Infect Dis, 1(9), 399-452. https://doi.org/10.1021/acsinfecdis.5b00040

      Kim, S. S., Upshur, M. A., Saotome, K., Sahu, I. D., McCarrick, R. M., Feix, J. B., Lorigan, G. A., & Howard, K. P. (2015). Cholesterol-Dependent Conformational Exchange of the C-Terminal Domain of the Influenza A M2 Protein. Biochemistry, 54(49), 7157-7167. https://doi.org/10.1021/acs.biochem.5b01065

      Kučerka, N., Tristram-Nagle, S., & Nagle, J. F. (2006). Structure of fully hydrated fluid phase lipid bilayers with monounsaturated chains. J Membr Biol, 208(3), 193-202.

      Latorraca, N. R., Callenberg, K. M., Boyle, J. P., & Grabe, M. (2014). Continuum approaches to understanding ion and peptide interactions with the membrane. J Membr Biol, 247(5), 395-408. https://doi.org/10.1007/s00232-014-9646-z

      Liu, J., Kaksonen, M., Drubin, D. G., & Oster, G. (2006). Endocytic vesicle scission by lipid phase boundary forces. Proc Natl Acad Sci U S A, 103(27), 10277-10282. https://doi.org/10.1073/pnas.0601045103

      Pan, J., Tristram-Nagle, S., & Nagle, J. F. (2009). Effect of cholesterol on structural and mechanical properties of membranes depends on lipid chain saturation. Phys Rev E Stat Nonlin Soft Matter Phys, 80(2 Pt 1), 021931. https://doi.org/10.1103/PhysRevE.80.021931

      Rawicz, W., Olbrich, K. C., McIntosh, T., Needham, D., & Evans, E. (2000). Effect of chain length and unsaturation on elasticity of lipid bilayers. Biophys J, 79(1), 328-339. https://doi.org/10.1016/S0006-3495(00)76295-3

      Sun, D., Peyear, T. A., Bennett, W. F. D., Andersen, O. S., Lightstone, F. C., & Ingolfsson, H. I. (2019). Molecular Mechanism for Gramicidin Dimerization and Dissociation in Bilayers of Different Thickness. Biophys J, 117(10), 1831-1844. https://doi.org/10.1016/j.bpj.2019.09.044

      Tzlil, S., Deserno, M., Gelbart, W. M., & Ben-Shaul, A. (2004). A statistical-thermodynamic model of viral budding. Biophys J, 86(4), 2037-2048. https://doi.org/10.1016/S0006-3495(04)74265-4

      Ursell, T. S., Klug, W. S., & Phillips, R. (2009). Morphology and interaction between lipid domains. Proc Natl Acad Sci U S A, 106(32), 13301-13306. https://doi.org/10.1073/pnas.0903825106

      Veatch, S. L., & Keller, S. L. (2003). Separation of liquid phases in giant vesicles of ternary mixtures of phospholipids and cholesterol. Biophys J, 85(5), 3074-3083. https://doi.org/10.1016/S0006-3495(03)74726-2

      Venable, R. M., Brown, F. L. H., & Pastor, R. W. (2015). Mechanical properties of lipid bilayers from molecular dynamics simulation. Chem Phys Lipids, 192, 60-74. https://doi.org/10.1016/j.chemphyslip.2015.07.014

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Recommendations For The Authors): 

      The author has addressed all the concerns I have raised.

      I have only one minor suggestion. 

      We would argue both a gray screen and a grating are visual stimuli. ... We concur, our data only address one of many possible transitions, but it is a switch between distinct visual stimuli that is sped up by ACh. 

      Thank you for clarifying this. 

      Following my comment in the previous review, the author has revised the abstract as follows:

      (Before) "Our results suggest that acetylcholine augments the responsiveness of layer 5 neurons to inputs from outside of the local network, enabling faster switching between internal representations during locomotion." 

      (After) "Based on this we speculate that acetylcholine augments the responsiveness of layer 5 neurons to inputs from outside of the local network, possibly enabling faster switching between internal representations during locomotion." 

      My previous comment concerned specifically the latter part, "enabling faster switching between internal representations during locomotion", and, in fact, their data fully support the first part, "acetylcholine augments the responsiveness of layer 5 neurons to inputs from outside of the local network". Thus, I suggest the following sentence: 

      "Our results suggest that acetylcholine augments the responsiveness of layer 5 neurons to inputs from outside of the local network, possibly enabling faster switching between internal representations during locomotion." 

      Thank you for clarifying. We have changed as suggested.

      Reviewer #2 (Recommendations For The Authors): 

      I thank the authors for the clarification regarding the distribution of running speeds in the study. I do agree that 30 cm/s is indeed fast for head-fixed locomotion. My concern is that while all mice contribute to the low locomotion velocity bin, the high locomotion velocity bin is dominated by a subset of animals, since not all mice reached high locomotion speeds. Therefore, the comparison between low, intermediate and high locomotion velocities includes data from different cohorts of animals and variability across animals may confound the analysis of cholinergic axon activity. However, the manuscript is carefully worded to emphasize lack of evidence (e.g. "we found no evidence of an increase in calcium activity between low and high locomotion velocities") and I have revised my summary in the public review to reflect this. 

      I thank the authors for including the scatterplots of single neuron responses to locomotion and optogenetic stimulation, which illustrate their heterogeneity. I am surprised that the axes are limited to 20% deltaF/F as visual responses recorded using GCaMP6f often exceed 100% deltaF/F.

      There are definitely neurons with responses larger than 20% dF/F0, but it is a small fraction. There are two considerations relevant to assessing dF/F amplitudes. First, in our hands trial averaged dF/F0 responses tend to be below 30% even for the most responsive neurons (trial averaging convolves response amplitude and response reliability). The reviewer is probably thinking of single trial responses often shown as raw data that can exceed 100s of %. Second, different published variants for calculating dF/F0 can result in a spectrum of values that varies by up to a factor of 10. This is largely a consequence of the choice of F0 and preprocessing related to correcting slow drifts in signal strength (originally motivated by photobleaching). Attempting to compare dF/F0 across labs is unfortunately a futile effort in the absence of a standardized way of calculating it.
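
      The dependence of dF/F0 on the choice of F0 can be illustrated on a toy trace. The sketch below is not the authors' pipeline: the trace is synthetic, the numbers are hypothetical, and the three baseline definitions are simply common options chosen for illustration. It applies each definition to the same fluorescence response riding on a slowly decaying baseline.

```python
import numpy as np

def peak_dff(trace, f0, response_window):
    """Mean dF/F0 inside response_window (a slice) for a given baseline F0."""
    dff = (trace - f0) / f0
    return float(dff[response_window].mean())

# Synthetic trace (hypothetical numbers): a slowly decaying baseline, as with
# photobleaching, plus one 20-frame response of fixed raw amplitude.
frames = np.arange(300)
trace = 100.0 * np.exp(-frames / 600.0) + 50.0
response = slice(200, 220)
trace[response] += 60.0

# Three different, commonly used baseline definitions.
f0_choices = {
    "pre-stimulus mean": trace[190:200].mean(),
    "8th percentile of trace": np.percentile(trace, 8),
    "mean of whole trace": trace.mean(),
}
estimates = {name: peak_dff(trace, f0, response) for name, f0 in f0_choices.items()}
for name, value in estimates.items():
    print(f"{name}: {100 * value:.0f}% dF/F0")
```

      The same raw response yields a different dF/F0 under each definition, even in this drift-only toy case; background subtraction and drift correction, which are not modelled here, would widen the spread further.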

      Allow me to clarify how evaluating the effects of optogenetic stimulation and locomotion without analyzing them at the level of individual neurons could result in misleading conclusions. I will use the effects of cholinergic responses on grating responses as an example but this concern applies equally to the other analyses. The manuscript reports that "in layer 2/3, optogenetic activation of cholinergic axons did not result in a detectable increase in grating onset responses (Figure 4C), while the responses of layer 5 neurons to the same stimulus increased with concurrent optogenetic activation of cholinergic axons." As the Figure R2C-D illustrates, only a minority of L2/3 neurons are excited by the grating in baseline conditions, while the vast majority are either suppressed or non-responsive. This is expected, as it is well established that visual responses in layer 2/3 are sparse. If responses of the small subset of L2/3 neurons that are activated by the grating were enhanced, it may not be apparent in the population average presented in the manuscript. In contrast, since a larger fraction of L5 neurons is excited by the grating, enhancement of grating responses may be easier to detect. In other words, the effects of optogenetic stimulation may be to boost the responses of those neurons that are activated by the grating and the difference between L2/3 and L5 lies simply in the proportion of activated neurons. I do not mean to argue in favour of this specific scenario but simply present it so as to illustrate the way in which considering population averages alone may be misleading. 

      While the authors state in their response that "all relevant and clear conclusions are already captured by the mean differences shown in Figure 4", the evidence supporting this statement is not presented in the manuscript. Most importantly, it is essential to determine whether the neurons that show significant activation in response to gratings (Figure 4C-D), mismatch (Figure 4E-F) or locomotion (Figure 4G-H), are affected by optogenetic stimulation in the same way as the population average. 

      We have added the analysis suggested as Figure S6. Consistent with the population averages, even within the subset of layer 2/3 neurons most responsive to specific inputs, we found no detectable increase in responsiveness upon optogenetic stimulation of cholinergic axons.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Overall, the manuscript is very well written, the approaches used are clever, and the data were thoroughly analyzed. The study conveyed important information for understanding the circuit mechanism that shapes grid cell activity. It is important not only for the field of MEC and grid cells, but also for broader fields of continuous attractor networks and neural circuits.

      We appreciate the positive comments.

      (1) The study largely relies on the fact that ramp-like wide-field optogenetic stimulation and focal optogenetic activation both drove asynchronous action potentials in SCs, and therefore, if a pair of PV+ INs exhibited correlated activity, they should receive common inputs. However, it is unclear what criteria/thresholds were used to determine the level of activity asynchronization, and under these criteria, what percentage of cells actually showed synchronized or less asynchronized activity. A notable percentage of synchronized or less asynchronized SCs could complicate the results, i.e., PV+ INs with correlated activity could receive inputs from different SCs (different inputs), which had synchronized activity. More detailed information/statistics about the asynchronization of SC activity is necessary for interpreting the results.

      The short answer here is that spiking responses from the pairs of SCs that we sampled appear asynchronous. We now show this in the form of cross-correlograms for all recorded pairs of SCs (Figure 2, Figure Supplement 1). The correlograms lack peaks that would indicate synchronous activation. Thus, while our dataset is not large enough to rule out occasional direct synchronisation of SCs, this appears unlikely to account for synchronised input to PV+INs.

      This conclusion is consistent with consideration of mechanisms that could in principle synchronise SCs:

      First, if responses to ramping light inputs were fully deterministic, then this could lead to fixed relative timing of spikes fired by different SCs. This is unlikely given the influence of stochastic channel gating on SC spiking (Dudman and Nolan 2009) and is inconsistent with trial to trial variability in spike timing (Figure 2, Figure Supplement 2).

      Second, as SCs are glutamatergic they could excite one another. However, excitatory connections between stellate cells are rare (Pastoll et al. 2013; Couey et al. 2013; Fuchs et al. 2016) and when detected they have low amplitude (mean < 0.25 mV; (Winterer et al. 2017)). Our finding that spiking by pairs of SCs is not correlated is consistent with this.

      Third, strong interaction between stellate cells mediated by local inhibitory pathways (Pastoll et al. 2013; Couey et al. 2013) could coordinate their activity. The lack of correlation between spiking of pairs of SCs suggests that such coordination is rarely recruited by our ramping protocols. Nevertheless, recruitment of inhibition may happen to some extent as experiments in Figure 4 show that correlated input from SCs to more distant, but not nearby PV+INs, is reduced by blocking inhibitory synapses. Given that we don't find evidence for synchronised spiking of SCs, this additional common input to widely separated PV+INs is instead best explained by recruitment of interneurons that act directly on the target SCs. We have modified Figure 8 to make this clear.

      Thus, for experiments with ramping light stimuli, synchronous activation of SCs is unlikely to explain common input to PV+INs. Input from the same SC best explains correlated responses of nearby PV+IN inhibitory populations, while recruitment of an additional inhibitory pathway may contribute to correlated responses of more distant PV+INs.

      For experiments using focal stimulation, substantial trial-to-trial variation in SC spike timing argues strongly against deterministic coordination. Indirect coordination of presynaptic neurons is also extremely unlikely given that focal activation is sparse and brief, while inputs from many presynaptic SCs are required to drive a postsynaptic interneuron to spike (e.g. Pastoll et al. 2013; Couey et al. 2013). Results from these experiments thus corroborate results from experiments using ramping light stimulation.

      In revising the manuscript we have tried to ensure these arguments are clear (e.g. p 5, para 3; p 6, para 2; p 10, para 1).
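
      The cross-correlogram check described in this response can be sketched as follows. This is an illustrative implementation under assumed conventions (spike times in milliseconds, a bin centred on zero lag, hypothetical function and variable names), not the authors' analysis code. A peak in the central bins would indicate near-synchronous firing of the two cells; a flat correlogram is what asynchronous firing would produce.

```python
import numpy as np

def cross_correlogram(spikes_a, spikes_b, max_lag, bin_width):
    """Histogram of spike-time differences (b - a) for all spike pairs.

    Bins are arranged so that one bin is centred on zero lag.
    Returns (bin_centers, counts).
    """
    a = np.asarray(spikes_a, dtype=float)
    b = np.asarray(spikes_b, dtype=float)
    lags = (b[None, :] - a[:, None]).ravel()   # every pairwise difference
    n = int(round(max_lag / bin_width))         # bins on each side of zero
    edges = (np.arange(-n, n + 2) - 0.5) * bin_width
    counts, _ = np.histogram(lags, bins=edges)  # lags outside edges are ignored
    centers = np.arange(-n, n + 1) * bin_width
    return centers, counts

# Hypothetical example (times in ms): cell B fires 1 ms after each spike of A,
# so all in-range lags fall in the bin centred on zero.
cell_a = [10.0, 110.0, 210.0]
cell_b = [11.0, 111.0, 211.0]
centers, counts = cross_correlogram(cell_a, cell_b, max_lag=50.0, bin_width=5.0)
print(dict(zip(centers.tolist(), counts.tolist())))
```

      In practice the raw counts would be compared against a shuffled or jittered control before concluding that a peak is absent; that step is omitted here for brevity.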

      (2) The hypothesis about the "direct excitatory-inhibitory" synaptic interactions is made based on the GABAzine experiments in Figure 4. In the Figure 8 diagram, the direct interaction is illustrated between PV+ INs and SCs. However, the evidence supporting this "direct interaction" between these two cell types is missing. Is it possible that pyramidal cells are also involved in this interaction? Some pieces of evidence or discussions are necessary to further support the "direct interaction".

      Indirect connections between stellate cells mediated via fast spiking inhibitory interneurons are well established by previous studies (e.g. Pastoll et al. 2013; Couey et al. 2013; Fuchs et al. 2016), and so were not addressed here. Previous work also establishes that connections from stellate cells to pyramidal cells are extremely rare (Winterer et al. 2017). Because the Sim1:Cre mouse line is specific to stellate cells and does not drive transgene expression in pyramidal cells (Sürmeli et al. 2015), it's therefore unlikely that pyramidal cells play a role.

      To make these points clearer we have modified the text in the discussion (p 5, para 3; p 10, paras 1 & 2). We have also modified Figure 8 to highlight that the indirect interaction may be best accounted for by inhibitory pathways onto PV+INs rather than via SCs (which our new cross-correlation analyses indicate is unlikely).

      Reviewer #2 (Public Review):

      In this study, Huang et al. employed optogenetic stimulation alongside paired whole-cell recordings in genetically defined neuron populations of the medial entorhinal cortex to examine the spatial distribution of synaptic inputs and the functional-anatomical structure of the MEC. They specifically studied the spatial distribution of synaptic inputs from parvalbumin-expressing interneurons to pairs of excitatory stellate cells. Additionally, they explored the spatial distribution of synaptic inputs to pairs of PV INs. Their results indicate that both pairs of SCs and PV INs generally receive common input when their relative somata are within 200-300 µm of each other. The research is intriguing, with controlled and systematic methodologies. There are interesting takeaways based on the implications of this work to grid cell network organization in MEC.

      We appreciate the positive comments.

      (1) Results indicate that in brain slices, nearby cells typically share a higher degree of common input. However, some proximate cells lack this shared input. The authors interpret these findings as: "Many cells in close proximity don't seem to share common input, as illustrated in Figures 3, 5, and 7. This implies that these cells might belong to separate networks or exist in distinct regions of the connectivity space within the same network." Every slice orientation could have potentially shared inputs from an orthogonal direction that are unavoidably eliminated. For instance, in a horizontal section, shared inputs to two SCs might be situated either dorsally or ventrally from the horizontal cut, and thus removed during slicing. Given the synaptic connection distributions observed within each intact orientation, and considering these distributions appear symmetrically in both horizontal and sagittal sections, the authors should be equipped to estimate the potential number of inputs absent due to sectioning in the orthogonal direction. How might this estimate influence the findings, especially those indicating that many close neurons don't have shared inputs?

      Given we find high probabilities of correlated inputs to nearby cells in both planes, our conclusion that nearby cells are likely to receive common inputs appears to be independent of the slice plane. For cells further apart, where the degree of correlated input becomes more variable, it is possible that cell pairs that have low input correlations measured in one slice plane would have high input correlations if measured in a different plane. An argument against this is that as the cell pairs are further apart, it is less likely that an orthogonal axon would intersect dendritic trees of both cells. Nevertheless, we can't rule this out given the data here. We have amended the discussion to highlight this possibility (p 10, para 1). We agree it would be interesting to address this point further with quantitative analyses but this will be difficult without detailed reconstructions of the circuit.

      (2) The study examines correlations during various light-intensity phases of the ramp stimuli. One wonders if the spatial distribution of shared (or correlated) versus independent inputs differs when juxtaposing the initial light stimulation phase, which begins to trigger spiking, against subsequent phases. This differentiation might be particularly pertinent to the PV to SC measurements. Here, the initial phase of stimulation, as depicted in Figure 7, reveals a relatively sparse temporal frequency of IPSCs. This might not represent the physiological conditions under which high-firing INs function. While the authors seem to have addressed parts of this concern in their focal stim experiments by examining correlations during both high and low light intensities, they could potentially extract this metric from data acquired in their ramp conditions. This would be especially valuable for PV to SC measurements, given the absence of corresponding focal stimulation experiments.

      We understand the gist of the question to be whether differences in correlation scores between the initial and later phases of responses to ramping light inputs can be used to infer spatial organisation. These differences are likely to reflect heterogeneity in the spiking of the input neurons, for example through differences in spike threshold, spike frequency adaptation and saturation of spiking (e.g. Figure 2, Figure Supplement 1A, and also see (Pastoll et al. 2020)). We don't expect these differences to have any spatial organisation along the mediolateral axis, and while spike threshold follows a dorsoventral organisation there is nevertheless substantial local variation between neurons (Pastoll et al. 2020). It's therefore unlikely we can use differences in early versus late correlations to make the inferences proposed by the reviewer.

      With respect to PV to SC measurements, similar heterogeneity is likely. We note that we were unable to carry out focal stimulation experiments for PV to SC connections as PV neurons did not spike in response to focal optogenetic stimulation.

      With respect to physiological conditions, our aim here is simply to assess connectivity in well controlled conditions, e.g. voltage-clamp, minimal spontaneous activity, known neuronal locations, etc. It's not clear that physiological activation patterns would improve on these tests and quite likely data would be noisier and harder to interpret.

      (3) Re results from Figure 2: Please fully describe the model in the methods section. Generally, I like using a modeling approach to explore the impact of convergent synaptic input to PVs from SCs that could effectively validate the experimental approach and enhance the interpretability of the experimental stim/recording outcomes. However, as currently detailed in the manuscript, the model description is inadequate for assessing the robustness of the simulation outcomes. If the IN model is simply integrate-and-fire with minimal biophysical attributes, then the results shown in Fig 2F might be trivial. Conversely, if the model offers a more biophysically accurate representation (e.g., with conductance-based synaptic inputs, synapses appropriately dispersed across the model IN dendritic tree, and standard PV IN voltage-gated membrane conductances), then the model's results could serve as a meaningful method to both validate and interpret the experiments.

      We appreciate the simulation descriptions were insufficient and have modified the manuscript to include additional details and clarification (p 14, paras 1-3).

      We're not sure we follow the logic here with respect to model types. The experiments were carried out in the voltage-clamp recording configuration with the goal of identifying correlated inputs independently from how they are integrated by the postsynaptic neuron. Given that membrane potential doesn't change (and so the CdVm/dt term of the membrane equation = 0), integrate and fire and point conductance-based models both simplify down to summing of input currents. We achieve this by convolving spike times with experimentally measured synaptic current waveforms. An assumption of our approach is that we achieve a reasonable space clamp. We believe this is justified given that stellate cells and PV interneurons are reasonably electrotonically compact, and that our analysis relies on consistent correlations rather than absolute amplitudes or time constants of the postsynaptic response and so should tolerate moderate space clamp errors.
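
      The summation described above can be illustrated with a minimal sketch. It is not the simulation code used in the manuscript: the bi-exponential waveform, its time constants, the amplitude, the Poisson firing rate and all function names are assumptions chosen for illustration, whereas the manuscript uses experimentally measured waveforms. Each presynaptic spike train is binned, the binned trains are summed, and the sum is convolved with a synaptic current waveform. Two postsynaptic cells that share some of their presynaptic trains then show correlated currents, while cells with only private inputs do not.

```python
import numpy as np

def synaptic_kernel(dt, tau_rise=0.5e-3, tau_decay=3e-3, duration=30e-3):
    """Bi-exponential current waveform, peak-normalised to 1.
    The time constants are illustrative assumptions, not measured values."""
    t = np.arange(0.0, duration, dt)
    k = np.exp(-t / tau_decay) - np.exp(-t / tau_rise)
    return k / k.max()

def clamp_current(spike_trains, dt, n_samples, amplitude=-20e-12):
    """Voltage-clamp current as the sum of the currents evoked by every
    presynaptic spike: bin the spike times, then convolve with the kernel."""
    drive = np.zeros(n_samples)
    for spikes in spike_trains:
        idx = np.round(np.asarray(spikes) / dt).astype(int)
        idx = idx[(idx >= 0) & (idx < n_samples)]
        np.add.at(drive, idx, 1.0)  # handles several spikes in one bin
    return amplitude * np.convolve(drive, synaptic_kernel(dt))[:n_samples]

def simulate_pair(n_shared, n_private, rate=20.0, t_stop=2.0, dt=1e-4, seed=0):
    """Currents in two cells that receive n_shared common presynaptic trains
    plus n_private independent trains each (hypothetical Poisson inputs)."""
    rng = np.random.default_rng(seed)
    n_samples = int(round(t_stop / dt))

    def poisson_train():
        n = rng.poisson(rate * t_stop)
        return np.sort(rng.uniform(0.0, t_stop, n))

    shared = [poisson_train() for _ in range(n_shared)]
    private_a = [poisson_train() for _ in range(n_private)]
    private_b = [poisson_train() for _ in range(n_private)]
    i_a = clamp_current(shared + private_a, dt, n_samples)
    i_b = clamp_current(shared + private_b, dt, n_samples)
    return i_a, i_b

if __name__ == "__main__":
    for n_shared, n_private in [(4, 4), (0, 8)]:
        i_a, i_b = simulate_pair(n_shared, n_private, t_stop=5.0)
        r = np.corrcoef(i_a, i_b)[0, 1]
        print(f"{n_shared} shared of 8 inputs: correlation = {r:.2f}")
```

      Because the membrane potential is held fixed, no membrane equation is integrated in this sketch; the postsynaptic cell appears only through the shape of the current waveform, which is the simplification the response argues for.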

      Reviewer #3 (Public Review):

      This paper presents convincing data from technically demanding dual whole-cell patch recordings of stellate cells in medial entorhinal cortex slice preparations during optogenetic stimulation of PV+ interneurons. The authors show that the patterns of postsynaptic activation are consistent with dual recorded cells close to each other receiving shared inhibitory input and sending excitatory connections back to the same PV neurons, supporting a circuitry in which clusters of stellate cells and PV+IN interact with each other with much weaker interactions between clusters. These data are important to our understanding of the dynamics of functional cell responses in the entorhinal cortex. The experiments and analysis are quite complex and would benefit from some revisions to enhance clarity.

      These are technically demanding experiments, but the authors show quite convincing differences in the correlated response of cell pairs that are close to each other in contrast to an absence of correlation in other cell pairs at a range of relative distances. This supports their main point of demonstrating anatomical clusters of cells receiving shared inhibitory input.

      We appreciate the positive comments.

      The overall technique is complex and the presentation could be more clear about the techniques and analysis. In addition, due to this being a slice preparation they cannot directly relate the inhibitory interactions to the functional properties of grid cells which was possible in the 2-photon in vivo imaging experiment by Heys and Dombeck, 2014.

      We have modified the manuscript to try to improve the presentation (specific changes are detailed below). We agree that an important future challenge is to relate our findings to in vivo observations (p 11, para 2).

      Reviewer #1 (Recommendations For The Authors):

      Major points

      (1) The study largely relies on the fact that ramp-like wide-field optogenetic stimulation and focal optogenetic activation both drove asynchronous action potentials in SCs, and therefore, if a pair of PV+ INs exhibited correlated activity, they should receive common inputs. In Figure 2 and its supplementary figures, the authors also showed examples of asynchronized activity. However, it is unclear to me what criteria/thresholds were used to determine the level of activity asynchronization, and under these criteria, what percentage of cells actually showed synchronized or less asynchronized activity. A notable percentage of synchronized or less asynchronized SCs could complicate the results, i.e., PV+ INs with correlated activity could receive inputs from different SCs (different inputs), which had synchronized activity. Related to this concern, it would also be important to simulate what level of activity asynchronization in SCs could still lead to correlated PV+ IN activity above shuffle, and among the recorded SCs, what percentage of cells belong to this synchronized/less asynchronized category.

      We address this point in our response to the public review. In brief, we have added additional cross-correlograms showing that ramp activation of SC pairs does not cause detectable synchronous activation. We also clarify that sensitivity of correlations of some widely separated pairs to GABA-blockers is suggestive of SCs activating common inhibitory inputs to cell pairs.

      (2) The above concern is more relevant to the focal stimulation experiments, in which the authors tried to claim that a pair of PV+ INs with correlated activity could receive inputs from the same SCs. The authors also showed that the stimulation patterns leading to the activation of PV+ INs were more similar if PV+ INs had correlated activity (Figure 5D). However, if nearby SCs were more synchronized than distal SCs within this stimulation scale, even though a pair of PV+ INs showed correlated activity, they could still receive inputs from different but nearby SCs. In this case, it would be helpful to quantify the relationship between the level of activity synchronization of SCs and their distances. In Figure 5 Supplementary Figure 1, the data were only provided for 8 cells. If feasible, collecting data from more cells would be needed for the proposed analysis.

      We explain in our responses to point 1 above and in the public review that direct synchronisation of SCs is unlikely. This is particularly unlikely for focal stimulation experiments as the timing of responses of individual SCs is extremely variable between trials. Thus, even if there were strong synaptic connections between SCs, which the evidence suggests there are not (Pastoll et al. 2013; Couey et al. 2013; Fuchs et al. 2016), this would be unlikely to result in reliably timed coordinated firing.

      (3) It is unclear what the definition of "common inputs" is. Do they refer to inputs from the same group of cells? If different groups of cells provide synchronized inputs, will the inputs be considered "common inputs" or "different inputs"?

      We used "common" in an attempt to be consistent with classic work by Yoshimura et al. and in an attempt to be succinct. Thus, by common input we are referring to cell pairs for which a proportion of their input is from the same presynaptic neuron(s), as opposed to cell pairs for which their input is from different neurons and therefore have no common input. We have attempted to make sure this is clear in the revised manuscript (e.g. description of simulations on p 4, para 2).

      (4) In the introduction and abstract, it was mentioned that "dense, but specific, direct excitatory-inhibitory synaptic interactions may operate at the scale of grid cell clusters". It is unclear to me how "dense" was demonstrated in the data. Can the authors clarify?

      Thanks for flagging this, we were insufficiently clear. We have revised the text to refer to cell pairs for which a proportion of their input is from the same presynaptic neurons (e.g. p 3, para 1), and separately about indirect coordination, by which we mean inputs to cell pairs that appear correlated because of coordination between upstream neurons.

      (5) The hypothesis about the "direct excitatory-inhibitory" synaptic interactions is made based on the GABAzine experiments in Figure 4. In the Figure 8 diagram, the direct interaction is illustrated between PV+ INs and SCs. Is there any evidence supporting this "direct interaction"?

      The direct interactions from SCs to PV+INs and from PV+INs to SCs were previously demonstrated by experiments with recordings from pairs of neurons (e.g. Pastoll et al. 2013; Couey et al. 2013; Fuchs et al. 2016; Winterer et al. 2017). Our results in Figures 3-5, which show that exciting SCs by light activation of ChR2 leads to excitation of PV+INs, and in Figure 7, which show that light activation of PV+INs expressing ChR2 leads to inhibition of SCs, are consistent with these previous conclusions. We have modified the manuscript to make sure this is clear (p 2, para 3).

      Is it possible that pyramidal cells are also involved in this interaction? If this is unlikely, the author may provide some pieces of evidence (e.g., timing of responses after optogenetic stimulation) or some discussions.

      This is unlikely given that previous studies indicate that connections from stellate to pyramidal cells are weak or absent (Winterer et al. 2017). We now clarify this in the Discussion (p 10, para 1).

      Minor points

      (1) Page 4: the last paragraph: the author claimed that CCpeakmean was reduced and CClagvar increased with cell separation. Although the trends are visible in the figures, the author may provide appropriate statistics to support this statement, such as a correlation between cell separation and CCpeakmean or CClagvar.

      We have inserted summaries of linear model fits into the legends for Figure 3E-F, Figure 5F-H and Figure 7D.
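
      As a rough illustration of what such a summary involves, the sketch below fits a straight line to a correlation measure as a function of cell separation and reports the slope, intercept and Pearson r. The function name is hypothetical and the numbers are made-up placeholders, not data from the study; the actual fits reported in the figure legends may use a different model specification.

```python
from math import sqrt

def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least three paired observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Placeholder values for illustration only; NOT measurements from the study.
separation_um = [50, 100, 150, 200, 300, 400]
cc_peak_mean = [0.60, 0.52, 0.45, 0.33, 0.20, 0.12]

slope, intercept, r = linear_fit(separation_um, cc_peak_mean)
print(f"slope = {slope:.5f} per um, intercept = {intercept:.3f}, r = {r:.3f}")
```

      A negative slope with r close to -1 would correspond to the reported trend of CCpeakmean decreasing with cell separation; a significance test would still be needed for a complete summary.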

      (2)  If I understood correctly, in the second last paragraph on page 6, "pairs of SCs" should be changed to "pairs of PV+ INs".

      Thanks. Corrected.

      (3)  Page 9: the 7th line to the end: where is Figure S4?

      Corrected to 'Figure 3, Figure Supplement 2'.

      (4)  Page 27: at the end of figure caption B: two ".

      Corrected.

      (5)  Figures 3A and B: what are the red vertical rectangles?

      These are the regions shown on an expanded time base in C and D. This is now clarified in the legend.

      (6)  Page 28 Figure caption of D and E: (C) and (D) should be (D) and (E).

      Corrected.

      (7)  The first sentence of the third paragraph in INTRODUCTION: 'later' should be 'layer'.

      Corrected.

      Reviewer #2 (Recommendations For The Authors):

      - Some related work has been done by Beed et al. 2013 to map the spatial distribution of inputs to neurons in MEC. Certainly, there are differences in the approaches and the key questions, but the contribution of this study would benefit from a more detailed comparison of the results from Beed vs the current study and should be included in the discussion.

      It's hard to include a detailed comparison of results, at least without losing focus, as the two studies address different questions with different approaches. We already noted that 'Local optical activation of unidentified neurons has also been used to infer connectivity principles but with a focus on responses of single postsynaptic neurons (Beed et al., 2013, 2010)'. In addition, we now note that 'Our focal optogenetic stimulation approach also offers insight into the spatial organization of presynaptic neuronal populations, with the advantage, compared to focal glutamate uncaging previously used to investigate connectivity in the MEC (Beed et al., 2013, 2010), that the identity of the presynaptic cell population is genetically defined'.

      - There are a few places where the language is ambiguous or needs a more detailed description for clarity.

      • 3rd paragraph under "Focal activation of SCs generates common input to nearby PV+Ins". The correlation probability description in this paragraph and a similar sentence in the methods are very hard to understand. I had to look up the analysis in Yoshimura et al. 2005 to understand what was done here. It's a nice analysis, but the manuscript could benefit from a more detailed description of this measure in the methods.

      We agree, it is a somewhat complex metric and is challenging to explain. In the interests of keeping the main text succinct, we have left the bare bones explanation as it was in the Results, but have expanded the explanation in the Methods. We hope this is now clear.

      - "Alternatively, if there is no clear spatial organization of SC to PV+INs connections, then the similarity between stimulus locations for pairs of SCs should have a random distribution." This sentence is hard to understand. I think the use of the phrase "similarity of stimulus location" is a strange phrasing and is driving the confusion in this sentence.

      We have replaced this with 'correspondence between active stimulus locations'.

      - In the discussion under "Spatial extent and functional organization of L2 circuits" there is a grammatical mistake (seems to be 2x phrasing of "leads to common synaptic input").

      Corrected.

      - Citation in the introduction/discussion. Introduction: in addition to Gu et al. 2018, Heys et al 2014 also showed there are non-random correlations among putative grid cells as a function of their somatic distance. In the discussion section, in addition to Gu et al. 2018, Heys et al. 2014 showed there is anatomical clustering of grid cells in MEC. This earlier work investigating functional correlations among neurons in the superficial aspect of MEC in vivo should be cited and is particularly relevant in these two sections of the manuscript.

      Thanks, we apologise for the oversight. We're well aware of this important study and have now cited it.

      -Typo - Paragraph 3 of the intro; "later" should be layer.

      Corrected.

      -Figure 5 (D-E) there is a typo high correlation probability is D and low correlation is E (text says C/D).

      Corrected.

      Reviewer #3 (Recommendations For The Authors):

      The paper is missing the bibliography section. This makes the review somewhat difficult as some cited papers are not immediately familiar based on the citation.

      Thanks, and our apologies for making extra work by omitting this. It is now included.

      Page 2 - "cell clusters" - they should also cite the paper by Heys and Dombeck, 2014 that shows a spatial scale of inhibitory interactions computed based on correlations of grid cells recorded using 2-photon calcium imaging.

      Added (see above).

      Page 2 - "later 2 of the MEC" - layer.

      Corrected.

      Page 2 - "synaptic interactions" - again they should mention the work by Heys and Dombeck, 2014 that indirectly measured the spatial scale of inhibition.

      Now cited in this paragraph.

      Page 4 "we simulated responses" and Figure 2E - in each simulation - did they fit the magnitude and time constant of the simulated EPSCs to individual EPSCs in the data? Or did they randomly vary these to find the best fit?

      The parameters for the simulations are given in the Methods and were chosen to correspond to the experimental values. We have rewritten this section to make the simulation methods clearer. Simulations using different time constants within a physiological range support similar conclusions.

      Page 4 - "we identified 35/71" - Are these the cells that appear in yellow as correlated in Figures 3E-F? If so, the text should indicate that these cells are shown in yellow.

      We have added this and have also updated the legends for additional clarification.

      Figure 2, Figure Supplement 1 - B,C - the following phrase is not clear: "when the 4 / 8 of each neurons inputs from SCs also project to the other neuron (B)," Should the "the" be removed? Also, by 4/8 do they mean 50%, or do they mean 4 to 8?

      Thanks, we've reworded to improve the clarity.

      E - "receiving presynaptic inputs consisted of 4 overlapping SCs" - should it say "consisting"?

      Corrected.

      Figure 3, Figure Supplement 1 part E - "the same data as (C )" - should this be the same data as (D)?? I do not see how doing clustering on the shuffled data in (C ) would give two groups, but it makes sense if it is from (D).

      That's right, now corrected.

      Page 5 - "used action potentials" - this is confusing. Is the word "used" supposed to be there?

      Corrected.

      Page 5 - "widefield activation experiments" - they should cite the experiments that they are referring to here.

      Added.

      Page 5 - "effect of blocking" - "Figure 4" - I find it very odd that the agent GABAzine in Figure 4 is not explicitly mentioned in the main text (though it is mentioned in the methods). The main text should indicate that blocking was performed using GABAzine.

      Added.

      Page and page 14 and Figure 5 - "shifted" - do they mean shuffled?

      We do. The classic papers by Yoshimura et al. used shifted so we keep this here so it's clear we've used their approach. We've added additional explanation to try to make sure the meaning is clear.

      Figure 5 A, B, D, and E would benefit from a more detailed description. They should state whether the labels "1a" and "1b" and "2a" and "2b" refer to different recorded neurons in each pair. They should indicate that 2a and 2b are a different pair? Are the x, y axes of the images corresponding to anatomical position? Does "B" indicate the location of recordings shown in Figure 5B? The authors probably think this is all obvious, but it is not immediately obvious to the reader.

      We have added additional clarification.

      Page 8 - "Beed et al." - These papers by Beed ought to be cited in the introduction as well as they are highly relevant.

      We now cite Beed et al. 2013 in the Introduction when we discuss local inhibitory input to SCs. While the Beed et al. 2010 paper is an important contribution to understanding about pathways from deep to superficial layers, the introduction focuses on communication between identified pre- and postsynaptic populations within layer 2 and therefore we haven't found a way to cite it without losing focus. We do cite this paper multiple times elsewhere.

      Page 10 - "Excitatory-inhibitory interactions" - this summary of attractor models ought to cite the paper by Burak and Fiete as well.

      The discussion focuses on models with excitatory-inhibitory connectivity and cites an important paper from the Fiete group. The model by Burak and Fiete, while also important, is purely inhibitory and so is not well constrained by the known circuitry, and therefore could not be correctly cited here.

      Page 10 - "be consistent with models…or that focus on pyramidal neurons have also been proposed" - this seems ungrammatical as if two different sentences were merged.

      Corrected.

      References

      Couey, Jonathan J, Aree Witoelar, Sheng-Jia Zhang, Kang Zheng, Jing Ye, Benjamin Dunn, Rafal Czajkowski, et al. 2013. “Recurrent Inhibitory Circuitry as a Mechanism for Grid Formation.” Nat. Neurosci. 16 (3): 318–24. https://doi.org/10.1038/nn.3310.

      Dudman, Joshua T, and Matthew F Nolan. 2009. “Stochastically Gating Ion Channels Enable Patterned Spike Firing through Activity-Dependent Modulation of Spike Probability.” PLoS Comput. Biol. 5 (2): e1000290. https://doi.org/10.1371/journal.pcbi.1000290.

      Fuchs, Elke C, Angela Neitz, Roberta Pinna, Sarah Melzer, Antonio Caputi, and Hannah Monyer. 2016. “Local and Distant Input Controlling Excitation in Layer II of the Medial Entorhinal Cortex.” Neuron 89 (1): 194–208. https://doi.org/10.1016/j.neuron.2015.11.029.

      Pastoll, Hugh, Derek L Garden, Ioannis Papastathopoulos, Gülşen Sürmeli, and Matthew F Nolan. 2020. “Inter- and Intra-Animal Variation in the Integrative Properties of Stellate Cells in the Medial Entorhinal Cortex.” eLife 9 (February). https://doi.org/10.7554/eLife.52258.

      Pastoll, Hugh, Lukas Solanka, Mark C W van Rossum, and Matthew F Nolan. 2013. “Feedback Inhibition Enables Theta-Nested Gamma Oscillations and Grid Firing Fields.” Neuron 77 (1): 141–54. https://doi.org/10.1016/j.neuron.2012.11.032.

      Sürmeli, Gülşen, Daniel Cosmin Marcu, Christina McClure, Derek L F Garden, Hugh Pastoll, and Matthew F Nolan. 2015. “Molecularly Defined Circuitry Reveals Input-Output Segregation in Deep Layers of the Medial Entorhinal Cortex.” Neuron 88 (5): 1040–53. https://doi.org/10.1016/j.neuron.2015.10.041.

      Winterer, Jochen, Nikolaus Maier, Christian Wozny, Prateep Beed, Jörg Breustedt, Roberta Evangelista, Yangfan Peng, Tiziano D’Albis, Richard Kempter, and Dietmar Schmitz. 2017. “Excitatory Microcircuits within Superficial Layers of the Medial Entorhinal Cortex.” Cell Rep. 19 (6): 1110–16. https://doi.org/10.1016/j.celrep.2017.04.041.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      The manuscript by Agha et al. provides a fundamental understanding regarding the participation of V2a interneurons in generating and patterning the locomotor rhythm. The authors provide convincing and solid evidence regarding the heterogeneity of V2a neurons in their intrinsic and synaptic properties and how these shape their outputs. The manuscript could be much improved by the inclusion of statistical analysis of some of the key data currently presented qualitatively. 

      We are extremely grateful for the positive and thorough comments provided by the three reviewers and have now had the opportunity to address all their concerns, as detailed below in our point-by-point response. Specifically, we have provided statistical analysis and major revisions to the text to help with rigor, clarity and interpretation, and we have also included new perturbation experiments that provide a more definitive test of one of our predictions, namely that reciprocal inhibition plays speed-specific roles in rhythm generation and pattern formation. The revisions greatly improve the manuscript and help bolster our conclusions.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary:

      In this very interesting study, Agha and colleagues show that two types of Chx10-positive neurons (V2a neurons) have different anatomical and electrophysiological properties and receive distinct patterns of excitatory and inhibitory inputs as a function of speed during fictive swimming in the larval zebrafish. Using single-cell fills they show that one cell type has a descending axon ("descending V2as"), while the other cell type has both a descending axon and an ascending axon ("bifurcating V2as"). In the Chx10:GFP line, descending V2as display strong GFP labeling, while bifurcating V2as display weak GFP labeling. The bifurcating V2as are located more laterally in the spinal cord. These two cell types have different electrophysiological properties as revealed by patch-clamp recordings. Positive current steps indicated that descending V2as comprise tonic spiking or bursting neurons. Bifurcating V2as comprise chattering or bursting neurons. The two types of V2a neurons display different recruitment patterns as a function of speed. Descending tonic and bifurcating chattering neurons are recruited at the beginning of the swimming bout, at fast speeds (swimming frequency above 30 Hz). Descending bursting neurons were preferentially recruited at the end of swimming bouts, at low speeds (swimming frequency below 30 Hz), while bifurcating bursting neurons were recruited for a broader swimming frequency range. The two types of V2a neurons receive distinct patterns of excitatory and inhibitory inputs during fictive locomotion. In descending V2as, when speed increases: i) excitatory conductances increase in fast neurons and decrease in slow neurons; ii) inhibitory conductances increase in fast neurons and increase in slow neurons. In bifurcating V2as, when speed increases: i) excitatory conductances increase in fast neurons but do not change in slow neurons; ii) inhibitory conductances increase in fast neurons and do not change in slow neurons. 
      The timing of excitatory and inhibitory inputs was then studied. In descending V2as, fast neurons receive excitatory and inhibitory inputs that are in anti-phase with low contrast in amplitude and are both broadly distributed over the phase. The slow neurons receive two peaks of inhibition, one in anti-phase with the excitatory inputs and another just after the excitation. In bifurcating V2as, fast neurons receive two peaks of inhibition, while slow ones receive anti-phase inhibition.

      Strengths: 

      This study focuses on the diversity of V2a neurons in zebrafish, an interesting cell population playing important roles in locomotor control and beyond, from fish to mammals. The authors provide compelling evidence that two subtypes of V2as show distinct anatomical, electrophysiological, and speed-dependent spiking activity, and receive distinct synaptic inputs as a function of speed. This opens the door to future investigation of the inputs and outputs of these neurons. Finding ways to activate or inhibit specifically these cells would be very helpful in the years to come. 

      Weaknesses: 

      No major weakness was detected. The experiments were carefully done, and the data were of high quality. 

      We really appreciate the positive assessment and have addressed minor issues below.

      Reviewer #2 (Public Review): 

      Summary: 

      Animals exhibit different speeds of locomotion. In vertebrates, this is thought to be implemented by different groups of spinal interneurons and motor neurons. A fundamental assumption in the field has been that neural mechanisms that generate and sustain the rhythm at different locomotor speeds are the same. In this study, the authors challenge this view. Using rigorous in vivo electrophysiology during fictive locomotion combined with genetics, the authors provide a detailed analysis of cellular and synaptic properties of different subtypes of spinal V2a neurons that play a crucial role in rhythm generation. Importantly, they are able to show that speed-related subsets of V2a neurons have distinct cellular and synaptic properties and may utilize different mechanisms to implement different locomotor speeds. 

      Strengths: 

      The authors fully utilize the zebrafish model system and solid electrophysiological analyses to study the active and passive properties of speed-related V2a subsets. Identification of the V2a subtype is based directly on their recruitment at different locomotor speeds and not on indirect markers like soma size, D-V position etc. Throughout the article, the authors have cleverly used standard electrophysiological tests and analysis to tease out different neuronal properties and link it to natural activity. For example, in Figures 2 and 4, the authors make comparisons of V2a spiking with current steps and during fictive swims showing spike rates measured with current steps are physiologically relevant and observed during natural recruitment. The experiments done are rigorous and well-controlled.

      Weaknesses: 

      The authors claim that a primary result of their study is that reciprocal inhibition is important for rhythmogenesis at fast speeds while recurrent inhibition is key at slow speeds. This is shown in Figure 6, however, the authors do not show any statistical tests for this claim. The authors also do not show any conclusive evidence that reciprocal inhibition is required for rhythmogenesis at fast speeds and vice versa for slow speeds. Additional experiments or modeling studies that conclusively show the necessity of these different inhibitory sources to the generation of different rhythms would be needed to strengthen this claim. 

      We have added new loss-of-function experiments as requested to strengthen the claim that reciprocal inhibition is critical for rhythmogenesis at fast speeds, but dispensable at slow. Specifically, we use botulinum toxin selectively expressed in Dmrt3-labeled dI6 interneurons, which play a role in reciprocal inhibition at a variety of speeds (new Figure 7). These experiments demonstrate a selective impact on rhythmic burst generation and alternation during periods of swimming where the highest frequency motor activity occurs. During lower frequency activity, rhythm generation is preserved; however, motor output is selectively altered, consistent with the idea that reciprocal inhibition plays an important role in patterning at slow speeds.

      The authors do a great job of teasing out cellular and synaptic properties in the different V2a subsets, however, it is not clear if or how these match the final output. For example, V2aD neurons are tonic or bursting for fast and slow speeds respectively but it is not intuitive how these cellular properties would influence phasic excitation and inhibition these neurons receive. 

      This question gets at the heart of what we are trying to illustrate in Figure 6. Specifically, in the new Figure 6E,F we have aligned the cumulative distribution of spikes recorded in cell-attached mode with phasic excitatory and inhibitory currents to reveal how well cellular properties versus patterns of synaptic drive match the final output (spikes). Our expectation was that if intrinsic cellular properties were ultimately generating phasic spiking patterns, then patterns of excitatory and inhibitory drive need not be phasic. Instead, we see that synaptic drive is phasic, with spiking occurring between peaks in excitation and troughs in inhibition. Since post-synaptic cellular properties should not impact the pre-synaptic excitation they receive, this suggests that phasic spiking in all V2a neurons, regardless of the capacity for cellular rhythmogenesis, is a result of phasic input. In response to this concern, we have elaborated our discussion of what cellular properties may contribute and the impact on output in the Discussion (L502-511).

      It is not clear from the discussion why having different mechanisms of rhythm generation at different speeds could be an important circuit design. The authors use anguilliform and carangiform modes of swimming to denote fast and slow speeds but there are differences in these movements other than speed, like rostrocaudal coordination. The frequency and pattern of these movements are linked and warrant more discussion. 

      We appreciate the opportunity to elaborate on this point more in the Discussion. In particular, we have added more text to clarify differences in movement related to both pattern-formation and rhythm-generation (L373-398) and to also suggest potential reasons for differences in mechanisms of rhythm generation (L478-488).  

      Reviewer #3 (Public Review):

      The manuscript by Agha et al. explores mechanisms of rhythmicity in V2a neurons in larval zebrafish. Two subpopulations of V2a neurons are distinguishable by anatomy, connectivity, level of GFP, and speed-dependent recruitment properties consistent with V2a neurons involved in rhythm generation and pattern formation. The descending neurons proposed to be consistent with rhythm-generating neurons are active during either slow or fast locomotion, and their firing frequencies during current steps are well matched with the swim frequencies at which they fire. The bifurcating (patterning) neurons are active during a broader swim frequency range unrelated to their firing during current steps. All of the V2a neurons receive strong inhibitory input but the phasing of this input is based on neuronal type and swim speed when the neuron is active, with prominent in-phase inhibition in slow descending V2a neurons and bifurcating V2a neurons active during fast swimming. Antiphase inhibition is observed in all V2a neurons but it is the main source of rhythmic inhibition in fast descending V2a neurons and bifurcating neurons active during slow swimming. The authors suggest that properties supporting rhythmic bursting are not directly related to locomotor speed but rather to functional neuronal subtypes. 

      This is a well-written paper with many strengths including the rigorous approach. Many parameters, including projection pattern, intracellular properties, inhibition received, and activity during slow/fast swimming were obtained from the same neuron. This links up very well with prior data from the lab on cell position, birth order, morphology/projections, and control of MN recruitment to provide a comprehensive overview of the functioning of V2a interneuronal populations in the larval zebrafish. The overall conclusions are well supported by the data. Weaknesses are relatively minor and were largely related to terminology for some of the secondary conclusions. 

      (1) The assumption is made that all in-phase inhibition is recurrent and out-of-phase inhibition is reciprocal. The latter is likely true but the definition of recurrent may be a bit loose, as it could be multisegmental feed-forward inhibition as well. 

      This is an excellent point, which was also raised by Reviewer 1. We have now added references that justify this assertion (L281-283). We also add a new figure with schematics (Figure 8) to make it clearer how we are defining sources of recurrent versus reciprocal inhibition, as based on the anatomical constraints of the circuit. We agree that multi-segmental inputs could contribute to inhibition, but they will likely be more broadly distributed based on rostro-caudal location and contribute to tonic sources of drive.  We now clarify this (L285-286).

      (2). In a few places, it is mentioned that the properties of the V2a-D neurons are consistent with pacemakers. This could be true of both the V2a-D and -B neurons that burst in response to depolarizing steps but the properties of the remaining (fast) V2a-D neurons do not seem to be consistent with pacemakers, based on the properties shown. Tonic firing at a frequency related to the locomotor speed the neuron is active during and strong antiphase inhibition may instead suggest a stronger network component driving the rhythmicity. 

      We have been purposefully agnostic regarding the relative contribution of pacemaking to rhythm generation in the paper. Our measurements of bursting overlap with swim frequencies only in the V2a-D subtype. Similarly, the spike rates of V2a-D neurons alone overlap with their swim frequencies (Fig 2D,G,I). Since both respond to tonic input (current injection) by spiking in a pattern that resembles their natural spiking behavior, we have treated these cellular properties both as pacemaking. Although the bursting behavior is more consistent with what is normally considered pacemaking in rhythmic motor circuits, in the basal ganglia field tonic firing of dopaminergic neurons in the substantia nigra is referred to as pacemaking. Since the tonic firing pattern overlaps with swimming frequency in the same way the bursting pattern does, we are less inclined to discount its possible contribution to rhythmogenesis based on the fact they do not burst. We have made modifications to the document to make this point clearer (L409-416).  Regardless, our data argue that pacemaking is unlikely to be a major contributor to phasic firing in V2a neurons, at least at midbody, so we agree with you on this last point.

      Reviewer #1 (Recommendations For The Authors): 

      I only have very minor suggestions. 

      (1) It would be useful to add a table or a figure summarizing the main results (integration of anatomy, electrophysiological properties, synaptic inputs, firing, swimming speed). 

      We agree and have added a figure panel summarizing the main results (new Figure 8).

      (2) Some statistics to possibly add (only suggestions): Do bifurcating V2as display significantly weaker GFP labeling than descending V2as? Do descending V2as have a significantly smaller soma size? Do descending V2as have a significantly lower rheobase and significantly higher resistance? Are tonic descending neurons and chattering bifurcating neurons located significantly more dorsally than the bursting descending and bifurcating neurons? Is there a way to show that bifurcating bursting neurons are recruited statistically on a broader swimming frequency range than other cell types (e.g. SD, coefficient of variation, cumulative distribution function with Kolmogorov-Smirnov test)? 

      For the first question, in all cases when we targeted more dimly labeled neurons they were bifurcating. We now clarify this in the text (L119, L129-132). However, this is difficult to quantify, since absolute levels of fluorescence will vary from preparation to preparation based on the dissection and intensity of epifluorescence illumination. In addition, we did not always take images prior to recording and levels of GFP after recording will vary depending on relative state of dialysis. So, unfortunately, we cannot provide a rigorous statistical analysis beyond the qualitative statement we provide.

      For the remainder of the questions, we now provide statistical analysis for soma size, position, rheobase, and resistance for the data in Figure 2.  Please note, we have reported all our statistical analyses in the figure legends. We also provide analysis of the density distributions of swimming frequencies for slow bursting bifurcating neurons and slow bursting descending neurons as requested, which are significantly different following a K-S test (L162).
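
      As an illustration of the kind of comparison described above, the following is a minimal sketch of a two-sample Kolmogorov-Smirnov test written with the Python standard library only. It is not the authors' analysis code: the function names and the sample values are hypothetical, and the p-value uses the asymptotic Kolmogorov series, which is only approximate for small samples.

```python
import math


def ks_two_sample(a, b):
    """Return the two-sample KS statistic D (max gap between empirical CDFs)."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both empirical CDFs past every value <= x.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic p-value from the Kolmogorov series (approximate)."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.1:
        # The series converges slowly here and the true value is ~1.
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


# Hypothetical swim frequencies (Hz) for two neuron groups, for illustration only.
group_a = [18.0, 20.5, 22.0, 24.5, 26.0, 27.5]
group_b = [25.0, 29.0, 33.5, 38.0, 41.0, 45.5]
d_stat = ks_two_sample(group_a, group_b)
p_val = ks_pvalue(d_stat, len(group_a), len(group_b))
```

      A larger D means the two recruitment distributions differ more in location or spread, which is why the test suits a question about how broad a frequency range a cell type covers.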

      (3) Some details to possibly add (only suggestions): proportion of neurons in which single cell fills were done/checked anatomically? Proportions of bursting/chattering/tonic/bursting neurons? In Figure 1, maybe define visually bifurcating vs descending neurons. In Figure 2I, the recruitment of bifurcating chattering neurons is not plotted. Is that normal? Figures 6D, E, maybe specify more clearly which neurons are the fast and slow ones. In Figure 3C, the X-axis name is missing. 

      For the first question, the proportion is 100%, since the morphology of all neurons was confirmed post recording, which we now clarify in the Methods section (L573). For the second question, the numbers of bursting/chattering/tonic/bursting neurons are now reported in the legend of Figure 2, in addition to the total number of V2a-D and V2a-B types, so it is clear what proportion of the recording population this represents. For the third question, in Figure 1 we cannot define V2a neurons as bifurcating or descending yet; this was only possible to confirm after the recording (Figure 2), and was done for every neuron (as mentioned above). For the fourth question, for Figure 2I the chattering response was too variable to be meaningful in terms of averaging and plotting, which we now mention in the text (L169-171). The standard deviations are ridiculous. For the fifth question, we have modified Figures 6D, E to more clearly label fast and slow V2a neurons. Finally, we have included the X-axis label in Figure 3C, thank you!

      (4) Some text to possibly modulate (only suggestions): 

      A possible role for these V2a subtypes in the rhythm generation and pattern formation layer is an interesting idea but this may not be completely solved by the present experiments. Maybe the authors could suggest future experiments in the discussion that would establish how to tackle this important question (double bursts, deletions, etc...)? 

      We appreciate the opportunity to raise future experiments that could help further tease apart their contribution to rhythm and pattern and have now added potential experiments to the Discussion (L498-501; L527-529), which include more precise molecular identification, spatial perturbation, and computational modeling.

      It would be nice to cite the references in which the rhythm/pattern CPG concept was proposed initially (lines 49-50 and elsewhere, Cf. Perret and Cabelguen 1980 Brain Res; Perret et al. 1989 Stance and Motion, Plenum Press; McCrea et al. 2006 J Physiol). 

      Apologies for our poor scholarship here, we now credit the appropriate primary research articles (L50-51).

      In the abstract, it would be useful to say clearly which cells are descending vs. bifurcating ones. Same thing in the result section, maybe it would be nice to identify the two populations long before line 127. 

      We have modified the abstract and introduction sections accordingly. We also note that the two populations are defined in the first paragraph of the results (L90).

      About the possible mechanism of rhythm generation, it is mentioned in line 54 that a single mechanism was proposed to exist, but the authors also mention in lines 122-123 that several mechanisms were proposed for rhythm generation... Maybe adjust the introduction? 

      As requested, we have clarified our meaning in the introduction (L55-58). Several mechanisms exist, but the likelihood that different mechanisms operate at different speeds has not been considered.  Either cellular properties are tuned to different speeds (i.e., bursting is faster in neurons recruited at faster speeds) or network properties can explain different speeds (i.e., different frequencies and patterns emerge from the connectivity).

      About the convention that in fish in-phase currents originate from the ipsilateral and out-of-phase currents originate from the contralateral side (lines 271-275), is there any reference for this assumption? 

      Yes, we now provide references (L281-283).

      Lines 338-345 stating that reciprocal inhibition is important for rhythm generation as predicted by the half-center model can sound surprising to some authors considering that many studies showed that inhibition is not needed for rhythm generation, including lamprey hemicords stimulated electrically (Cangiano and Grillner 2003 J Neurophysiol; 2005 J Neurosci, Cangiano et al. 2012 Neuroscience), salamander hemicords or hemisegments stimulated chemically (Ryczko et al. 2010, 2015 J Neurophysiol), or rhythm activity evoked on each side of the cord using optogenetic stimulation of glutamatergic neurons (Hägglund et al. 2013 PNAS) etc. To demonstrate the importance of inhibition in rhythmogenesis, one would need to activate and/or deactivate the ipsilateral versus contralateral inhibitory neurons. It would be nice to maybe add citations to such studies if available in the zebrafish literature. Overall I would simply suggest modulating this section to be a bit more balanced conceptually. 

      We have included the above referenced studies for lampreys and added ones for tadpoles (L464-468), to stick with undulatory swimmers. We had focused on experiments with the most selective perturbations in the interests of space, but appreciate the opportunity to present both arguments. We also include new loss-of-function experiments that impact one spinal population linked to reciprocal inhibition (Dmrt3-labeled dI6 interneurons), which demonstrate a speed-specific impact on rhythmogenesis (L323-371; new Figure 7) and compare our findings to a recent study in the zebrafish literature examining the impact of spinal Dmrt3-ablations on axial rhythmogenesis (L426-433).

      Line 676 "episodies". 

      Thanks, corrected.

      Reviewer #2 (Recommendations For The Authors): 

      The authors make a claim that recurrent and reciprocal inhibition play key roles in rhythmogenesis at different speeds. This is not conclusively shown. Rayleigh's z-test can be used to test the significance of the directionality of circular data. Including more data from experiments or computational models to show the necessity of reciprocal or recurrent inhibition for timed spiking of V2a neurons would address this. 

      We have now modified Figure 6 so we can directly compare differences in reciprocal and recurrent inhibition between V2a types. We now report statistical analysis in the figure legends using Watson’s two-sample test of homogeneity to test differences in the circular data. As mentioned above, we have also added new loss-of-function experiments as requested to strengthen the claim that reciprocal inhibition is critical for rhythmogenesis at fast speeds, but dispensable at slow speeds. Specifically, we use botulinum toxin selectively expressed in Dmrt3-labeled dI6 interneurons, which play a role in reciprocal inhibition at a variety of speeds (new Figure 7). These experiments demonstrate a selective impact on rhythmic burst generation and alternation during periods of swimming where the highest frequency motor activity occurs. During lower frequency activity, rhythm generation is preserved; however, motor output is selectively altered, consistent with the idea that reciprocal inhibition plays an important role in patterning at slow speeds.
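
      For readers unfamiliar with circular statistics, the Rayleigh test raised by the reviewer can be sketched as follows. This is an illustrative standard-library Python sketch, not the analysis reported in the figure legends (which uses a two-sample Watson test). The function name and the phase values are hypothetical, and the p-value is the leading-order large-sample approximation exp(-z), so it should be treated as rough for small n.

```python
import math


def rayleigh_test(phases):
    """Rayleigh test for non-uniformity of circular data (phases in radians).

    Returns (mean resultant length, z statistic, approximate p-value).
    """
    n = len(phases)
    c = sum(math.cos(p) for p in phases)
    s = sum(math.sin(p) for p in phases)
    r_bar = math.hypot(c, s) / n   # 0 = no preferred phase, 1 = identical phases
    z = n * r_bar ** 2
    p = math.exp(-z)               # leading-order approximation, assumes large n
    return r_bar, z, p


# Hypothetical phases of inhibitory current peaks relative to the swim cycle.
clustered = [1.0, 1.1, 0.9, 1.2, 0.8, 1.05, 0.95, 1.15, 0.85, 1.0]
evenly_spread = [2.0 * math.pi * k / 8 for k in range(8)]

r_c, z_c, p_c = rayleigh_test(clustered)
r_u, z_u, p_u = rayleigh_test(evenly_spread)
```

      The Rayleigh test asks whether a single set of phases has a preferred direction, whereas a two-sample test such as Watson's asks whether two sets of phases come from the same circular distribution, so the two address different questions.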

      In Figure 4D, the authors show that V2a neurons, both subtypes, spike in advance of the center of the motor burst. Recent studies (Jay et al., 2023) have shown differences in the timing of V2aD and V2aB neurons. Are there differences in the methods or selection of cells that would reflect differences in results? 

      This is a great point and we appreciate the opportunity to reconcile our observations here with those in Jay et al., 2023. In the Jay et al. paper, we used drifting visual stimuli to evoke fictive swimming. These experiments allow you to uncouple rhythm generation (forward propulsion) and pattern formation (lateral direction). Notably, fictive swim frequencies during so-called optomotor responses are below 35 Hz, meaning that we are sampling exclusively from V2a neurons recruited during carangiform swim mode. In these experiments, slow V2a-D neurons fire well in advance of slow V2a-B neurons, compared to what we see here, which is relatively synchronous. Critically, however, the phase-advanced firing pattern revealed in the Jay et al. paper for V2a-D neurons aligns with the phase-advanced excitatory input reported here. In addition, the recruitment probabilities of slow V2a-D neurons are higher in the Jay et al. paper than what we report here. Collectively these observations suggest either more effective excitation during optomotor responses (Jay et al.) or more potent inhibition during escape responses (Agha et al.). Ultimately, differences in the relative synchrony of firing among slow V2a-D and slow V2a-B neurons appear to depend on the nature of the stimulus and range of swim frequencies, where in one case frequency and amplitude modulation are coupled over a broad range of frequencies (somatosensory stimuli delivered here), while in the other case frequency and amplitude modulation are uncoupled over a narrow range of frequencies (visual stimuli in Jay et al.). We now elaborate on this point in the Discussion (L485-498).

      Given the conserved nature of spinal circuits across vertebrates, it is also important to discuss these findings in the context of limbed animals. In tetrapods, changes in locomotor speed also involve pattern/gait changes, however, it is not known if or how these changes in frequency and pattern are linked. This study, by suggesting that different speeds are implemented not only by different neurons but possibly by different neuronal mechanisms, provides important cues for the missing link and would strengthen the discussion. 

      We agree and have made substantial edits to the beginning of the Discussion to provide better context for the impact of our work (L373-398).

      Minor points: 

      Line 122: of needs to be replaced by or. 

      Corrected, thanks!

      Figure 3B Top panel: What is the grey bar? 

      This has been removed for clarity.

      Figure 3B bottom panel is not referenced in the main text at all. 

      Now referenced (L187, L189).

      Line 260: 2nd inhibition needs to be replaced with excitation. 

      Done, thanks!

      Reviewer #3 (Recommendations For The Authors): 

      Minor comments: 

      - Figure 2 panel ordering is visually appealing but tough to follow. 

      We apologize and tried reconfigurations, but they just looked too kludgy.  Hoping for a pass on this one.

      - Lines 164-166 and 319-327 (related to comment 2 above): For the fast/tonic V2a-Ds, it is not clear that this is intrinsic and it is not consistent with pacemaker properties. This could also be (and likely is) synaptically/network-driven rhythmicity, although the firing frequencies match up well with the swim frequencies. 

      Fast/tonic V2a-Ds were tested with somatic current injection as with all other neurons, which we assume primarily reflects intrinsic cellular properties. The spike rates we observe in fast/tonic V2a-Ds overlap with spike rates observed during fictive swimming, so they are positioned as well as bursting neurons to contribute to pacemaking. We also elaborate on this point in response to Major Comment #2.

      - Lines 189-192: The patterning neurons receive excitatory drive before rhythm-generating neurons. The time constant explanation makes sense for why two neurons with a common drive would fire at different times but this does not support the proposed hierarchical arrangement or being consistent with V2a-Bs being downstream as mentioned in lines 49-56 and 218-219. 

      In response to this point, we have modified Figure 6 so we can directly compare the timing of presynaptic excitatory inputs between the types. Here it can be seen clearly that phasic excitatory inputs to both fast and slow V2a-Ds are phase-advanced relative to fast and slow V2a-Bs (Figure 6B,C). As the reviewer mentions, it is likely a combination of time constants and the relative balance of excitation and inhibition that ultimately leads to synchronous spiking despite differences in the timing of inputs.

      - Lines 338-339: It is not shown that the rhythm relies on inhibition during slow. 

      This line has been removed in the revision process.

      - Consistent with the importance of reciprocal (contralateral) inhibition in fast locomotion here, rodent fictive locomotion is slower in hemisect than in the full cord. However, the Rybak and O'Donovan groups suggest that this is due to loss of drive to ipsilateral inhibitory neurons by excitatory contralateral projections, rather than contralateral inhibitory interneurons (see Falgairolle and O'Donovan 2019, 2021, and Shevtsova et al 2022). 

      This is an interesting point that highlights how we are defining reciprocal versus recurrent inhibition. In this example, although ipsilaterally-projecting interneurons are responsible for inhibition, since they are excited by commissurally-projecting excitatory interneurons, we would classify this as feedforward (reciprocal) not feedback (recurrent) inhibition. So reciprocal (feedforward) inhibition is still important to get higher frequency rhythms; it is simply di-synaptic in this case. We have added a new figure (Figure 8) to clarify what we mean by reciprocal (feedforward) and recurrent (feedback) based on the ipsilateral projection patterns of V2a neurons, and point out in the Discussion that the definitions would be flipped for excitatory interneurons (L452-455).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      While there are many models for sequence retrieval, it has been difficult to find models that vary the speed of sequence retrieval dynamically via simple external inputs. While recent works [1,2] have proposed some mechanisms, the authors here propose a different one based on heterogeneous plasticity rules. Temporally symmetric plasticity kernels (that do not distinguish between the order of pre and post spikes, but only their time difference) are expected to give rise to attractor states, asymmetric ones to sequence transitions. The authors incorporate a rate-based, discrete-time analog of these spike-based plasticity rules to learn the connections between neurons (leading to connections similar to Hopfield networks for attractors and sequences). They use either a parametric combination of symmetric and asymmetric learning rules for connections into each neuron, or separate subpopulations having only symmetric or asymmetric learning rules on incoming connections. They find that the latter is conducive to enabling external inputs to control the speed of sequence retrieval.
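
      The distinction the reviewer draws between symmetric and asymmetric rules can be sketched with a toy Hopfield-style network. This is an assumption-laden illustration, not the authors' model: it uses binary ±1 units with a sign update instead of rate units, random patterns, and hypothetical function names. It shows only that a symmetric Hebbian rule holds the current pattern while an asymmetric rule pushes the network to the next pattern; it does not reproduce graded control of retrieval speed.

```python
import random


def make_patterns(n_units, n_patterns, seed=0):
    """Random +/-1 patterns (toy stand-ins for stored activity patterns)."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(n_units)]
            for _ in range(n_patterns)]


def hebbian_weights(patterns, shift):
    """Hebbian outer-product weights.

    shift=0: symmetric rule, pre and post activity from the same pattern.
    shift=1: asymmetric rule, post activity from the next pattern in the sequence.
    """
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for mu in range(len(patterns) - shift):
        post, pre = patterns[mu + shift], patterns[mu]
        for i in range(n):
            for j in range(n):
                w[i][j] += post[i] * pre[j] / n
    return w


def step(w, state):
    """One synchronous discrete-time update with a sign nonlinearity."""
    return [1 if sum(wij * sj for wij, sj in zip(row, state)) >= 0 else -1
            for row in w]


def overlap(a, b):
    """Normalized overlap between two +/-1 vectors (1.0 = identical)."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


patterns = make_patterns(n_units=200, n_patterns=5)
w_sym = hebbian_weights(patterns, shift=0)
w_asym = hebbian_weights(patterns, shift=1)

held = step(w_sym, patterns[0])       # symmetric weights: stay near pattern 0
advanced = step(w_asym, patterns[0])  # asymmetric weights: move toward pattern 1
```

      In the manuscript's setting the two kinds of connectivity coexist, and the balance of external input to the two subpopulations determines how quickly the network moves along the sequence. The toy above only isolates the two limiting behaviors.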

      Strengths:

      The authors have expertly characterised the system dynamics using both simulations and theory. How the speed and quality of retrieval vary across phase space has been well-studied. The authors are also able to vary the external inputs to reproduce a preparatory followed by an execution phase of sequence retrieval as seen experimentally in motor control. They also propose a simple reinforcement learning scheme for learning to map the two external inputs to the desired retrieval speed.

      Weaknesses:

      (1) The authors translate spike-based synaptic plasticity rules to a way to learn/set connections for rate units operating in discrete time, similar to their earlier work in [5]. The bio-plausibility issues of learning in [5] carry over here, e.g., the authors ignore any input due to the recurrent connectivity during learning and effectively fix the pre and post rates to the desired ones. While the learning itself is not fully bio-plausible, it does lend itself to writing the final connectivity matrix in a manner that is easier to analyze theoretically.

      We agree with the reviewer that learning is not ‘fully bio-plausible’. However, we believe that extending the results to a model in which synaptic plasticity depends on recurrent inputs is beyond the scope of this work. We have added a mention of this issue in the Discussion in the revised manuscript.

      (2) While the authors learn to map the set of two external input strengths to speed of retrieval, they still hand-wire one external input to the subpopulation of neurons with temporally symmetric plasticity and the other external input to the other subpopulation with temporally asymmetric plasticity. The authors suggest that these subpopulations might arise due to differences in the parameters of Ca dynamics as in their earlier work [29]. How these two external inputs would connect to neurons differentially based on the plasticity kernel / Ca dynamics parameters of the recurrent connections is still an open question which the authors have not touched upon.

      The issue of how external inputs could self-organize to drive the network to retrieve sequences at appropriate speeds is addressed in the Results section, paragraph ‘Reward-driven learning’. These inputs are not ‘hand-wired’: they are initially random and then acquire the necessary strengths to allow the network to retrieve the sequences at different speeds thanks to a simple reinforcement learning scheme. We have rewritten this section to clarify this issue.

      (3) The authors require that temporally symmetric and asymmetric learning rules be present in the recurrent connections between subpopulations of neurons in the same brain region, i.e. some neurons in the same brain region should have temporally symmetric kernels, while others should have temporally asymmetric ones. The evidence for this seems thin. Though, in the discussion, the authors clarify 'While this heterogeneity has been found so far across structures or across different regions in the same structure, this heterogeneity could also be present within local networks, as current experimental methods for probing plasticity only have access to a single delay between pre and post-synaptic spikes in each recorded neuron, and would therefore miss this heterogeneity'.

      We agree with the reviewer that this is currently an open question. We describe this issue in more detail in the Discussion of the revised manuscript.

      (4) An aspect which the authors have not connected to is one of the authors' earlier works:

      Brunel, N. (2016). Is cortical connectivity optimized for storing information? Nature Neuroscience, 19(5), 749-755. https://doi.org/10.1038/nn.4286, which argues that the experimentally observed over-representation of symmetric synapses indicates that cortical networks are optimized for attractors rather than sequences.

      We thank the reviewer for this suggestion. We have added a paragraph in the discussion that discusses work on statistics of synaptic connectivity in optimal networks. We expect that in networks that contain two subpopulations of neurons, the degree of symmetry should be intermediate between a network storing fixed point attractors exclusively, and a network storing sequences exclusively.

      Despite the above weaknesses, the work is a solid advance in proposing an alternate model for modulating speed of sequence retrieval and extends the use of well-established theoretical tools. This work is expected to spawn further works like extending to a spiking neural network with Dale's law, more realistic learning taking into account recurrent connections during learning, and experimental follow-ups. Thus, I expect this to be an important contribution to the field.

      We thank the reviewer for the insightful comments.

      Reviewer #2 (Public Review):

      Sequences of neural activity underlie most of our behavior. And as experience suggests we are (in most cases) able to flexibly change the speed for our learned behavior which essentially means that brains are able to change the speed at which the sequence is retrieved from the memory. The authors here propose a mechanism by which networks in the brain can learn a sequence of spike patterns and retrieve them at variable speed. At a conceptual level I think the authors have a very nice idea: use of symmetric and asymmetric learning rules to learn the sequences and then use different inputs to neurons with symmetric or asymmetric plasticity to control the retrieval speed. The authors have demonstrated the feasibility of the idea in a rather idealized network model. I think it is important that the idea is demonstrated in more biologically plausible settings (e.g. spiking neurons, a network with exc. and inh. neurons with ongoing activity).

      Summary

      In this manuscript the authors have addressed the problem of learning and retrieving sequential activity in neuronal networks. In particular, they have focussed on the problem of how sequence retrieval speed can be controlled.

      They have considered a model with excitatory rate-based neurons. Authors show that when sequences are learned with both temporally symmetric and asymmetric Hebbian plasticity, by modulating the external inputs to the network the sequence retrieval speed can be modulated. With the two types of Hebbian plasticity in the network, sequence learning essentially means that the network has both feedforward and recurrent connections related to the sequence. By giving different amounts of input to the feed-forward and recurrent components of the sequence, authors are able to adjust the speed.

      Strengths

      - Authors solve the problem of sequence retrieval speed control by learning the sequence in both feedforward and recurrent connectivity within a network. It is a very interesting idea for two main reasons: 1. It does not rely on delays or short-term dynamics in neurons/synapses 2. It does not require that the animal is presented with the same sequences multiple times at different speeds. Different inputs to the feedforward and recurrent populations are sufficient to alter the speed. However, the work leaves several issues unaddressed as explained below.

      Weaknesses

      - The main weakness of the paper is that it is mostly driven by a motivation to find a computational solution to the problem of sequence retrieval speed. In most cases they have not provided any arguments about the biological plausibility of the solution they have proposed e.g.:

      - Is there any experimental evidence that some neurons in the network have symmetric Hebbian plasticity and some temporally asymmetric? The authors have cited some references to support this. But usually the switch between temporally symmetric and asymmetric rules is dependent on spike patterns used for pairing (e.g. bursts vs single spikes). In the context of this manuscript, it would mean that in the same pattern, some neurons burst and some don't and this is the same for all the patterns in the sequence. As far as I see here authors have assumed a binary pattern of activity which is the same for all neurons that participate in the pattern.

      There is currently only weak evidence for heterogeneity of synaptic plasticity rules within a single network, though there is plenty of evidence for such heterogeneity across networks or across locations within a particular structure (see references in our Discussion). The reviewer suggests another interesting possibility, that the temporal asymmetry could depend on the firing pattern of the post-synaptic neuron. An example of such behavior can be found in a paper by Wittenberg and Wang in 2006, where they show that pairing single spikes of pre- and post-synaptic neurons leads to LTD at all time differences in a symmetric fashion, while pairing a pre-synaptic spike with a burst of post-synaptic spikes leads to temporally asymmetric plasticity, with an LTP window at short positive time differences. We now mention this possibility in the Discussion, but we believe that fully exploring this scenario is beyond the scope of the paper.

      - How would external inputs know that they are impinging on a symmetric or asymmetric neuron? Authors have proposed a mechanism to learn these inputs. But that makes the sequence learning problem a two stage problem -- first an animal has to learn the sequence and then it has to learn to modulate the speed of retrieval. It should be possible to find experimental evidence to support this?

      Our model does not assume that the two processes necessarily occur one after the other. Importantly, once the correct external inputs that can modulate sequence retrieval are learned, sequence retrieval modulation will automatically generalize to arbitrary new sequences that are learned by the network.

      - Authors have only considered homogeneous DC input for sequence retrieval. This kind of input is highly unnatural. It would be more plausible if the authors considered fluctuating input that differs for each neuron.

      We have modified Figure 1e and Figure 2c to show the effects of fluctuating inputs on pattern correlations and single unit activity. We find that these inputs do not qualitatively affect our results.

      - All the work is demonstrated using a firing rate based model of only excitatory neurons. I think it is important that some of the key results are demonstrated in a network of both excitatory and inhibitory spiking neurons. As the authors very well know it is not always trivial to extend rate-based models to spiking neurons.

      I think at a conceptual level authors have a very nice idea but it needs to be demonstrated in a more biologically plausible setting (and by that I do not mean biophysical neurons etc.).

      We have included a new section in the discussion with an associated figure (Figure 7) demonstrating that flexible speed control can be achieved in an excitatory-inhibitory (E-I) spiking network containing two excitatory populations with distinct plasticity mechanisms.

      Reviewer #1 (Recommendations For The Authors):

      In the introduction, the authors state: 'symmetric kernels, in which coincident activity leads to strengthening regardless of the order of pre and post-synaptic spikes, have also been observed in multiple contexts with high frequency plasticity induction protocols in cortex [21]'. To my understanding, [21]'s final model 3, ignores LTD if the post-spike also participates in LTP, and only considers nearest-neighbour interactions. Thus, the kernel would not be symmetric. Can the authors clarify what they mean and how their conclusion follows, as [21] does not show any kernels either.

      In this statement, we were not referring to the model in [21], but rather the experimentally observed plasticity kernels at different frequencies. In particular, we were referring to the symmetric kernel that appears in the bottom panel of Figure 7c in that paper.

      The authors should also address the weaknesses mentioned above. They don't need to solve the issues but expand (and maybe indicate resolutions) on these issues in the Discussion.

      For ease of reproducibility, the authors should make their code available as well.

      We intend to publish the code required to reproduce all figures on GitHub.

      Reviewer #2 (Recommendations For The Authors):

      -  Show the ground state of the network before and after learning.

      We have decided not to include such a figure, as we have not analyzed the learning process, but instead a network with a fixed connectivity matrix which is assumed to be the end result of a learning process.

      -  Authors have only considered a network of excitatory neurons. This does not make sense. I think they should demonstrate a network of both exc. and inh. neurons (spiking neurons) exhibiting ongoing activity.

      See our comment to Reviewer #2 in the previous section.

      -  Show how the sequence dynamics unfolds when we assume a non-zero ongoing activity.

      We are not sure what the reviewer means by 'non-zero ongoing activity'. We now show the dynamics of the network in the presence of noisy inputs, which can represent ongoing activity from other structures (see Fig 1e and 2c).

      -  From the correlation (==quality) alone it is difficult to judge how well the sequence has been recovered. Authors should consider showing some examples so that the reader can get a visual estimate of what 0.6 quality may mean. High speed is not really associated with high quality (Fig 2b). So it is important to show how the sequence retrieval quality is for non-linear and heterogeneous learning rules.

      We believe that some insight into the relationship between speed and quality for the case of non-linear and heterogeneous learning rules is provided by the correlation plots for chosen input configurations (see Fig. 3a and 5b). We leave a full characterization for future work.

      -  Authors should show how the retrieval and quality of sequences change when they are recovered with positive input, or positive input to one population and negative to another. In the current version sequence retrieval is shown only with negative inputs. This is a somewhat non-biological setting. The inhibitory gating argument (L367-389) is really weak.

      We would like to clarify that with the parameters chosen in this paper, the transfer function has half its maximal rate at zero input. This is due to the fact we chose the threshold to be zero, using the fact that any threshold can be absorbed in the external inputs. Thus, negative inputs really mean sub-threshold inputs, and they are consistent with sub-threshold external excitatory inputs. We have clarified this issue in the revised manuscript.
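
      The threshold argument above can be illustrated with a minimal sketch. It assumes a logistic transfer function (chosen here only for illustration; the manuscript's actual function may differ), and the names `transfer`, `absorb_threshold`, `theta` and `i_ext` are hypothetical. It shows that a positive but sub-threshold external input to a unit with threshold `theta` is equivalent to a negative input to a unit whose threshold has been set to zero.

```python
import math


def transfer(total_input, threshold=0.0, r_max=1.0, gain=1.0):
    """Logistic rate function (an assumed form): half-maximal at total_input == threshold."""
    return r_max / (1.0 + math.exp(-gain * (total_input - threshold)))


def absorb_threshold(external_input, threshold):
    """Equivalent external input once the threshold has been moved to zero."""
    return external_input - threshold


recurrent = 0.3   # hypothetical recurrent drive
theta = 1.0       # hypothetical non-zero threshold
i_ext = 0.5       # excitatory external input, but below threshold

# Same unit described two ways: explicit threshold vs. threshold absorbed in the input.
rate_original = transfer(recurrent + i_ext, threshold=theta)
rate_shifted = transfer(recurrent + absorb_threshold(i_ext, theta), threshold=0.0)

print(absorb_threshold(i_ext, theta))  # negative, although i_ext itself is positive
print(rate_original, rate_shifted)     # the two descriptions give the same rate
```

      With the threshold at zero, `transfer(0.0)` returns half of `r_max`, which matches the statement that the transfer function has half its maximal rate at zero input.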

      -  Authors should demonstrate how the sequence retrieval dynamics is altered when they assume a fluctuating input current for sequence retrieval instead of a homogeneous DC input.

      See our comment to Reviewer #2 in the previous section.

      -  Authors should show what are the differences in synaptic weight distribution for the two types of learning (bi-linear and non-linear). I am curious to know if the difference in the speed in the two cases is related to the weight distribution. In general I think it is a good idea to show the synaptic weight distribution before and after learning.

      As mentioned above, we do not study any learning process, but rather a network with a fixed connectivity matrix, assumed to represent the end result of learning. In this network, the distribution of synaptic weights converges to a Gaussian in the large p and cN limits, independently of the functions f and g, because of the central limit theorem, if there are no sign constraints on weights. In the presence of sign constraints, the distribution is a truncated Gaussian.
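
      The central limit theorem argument can be sketched numerically. The example below is an illustration under stated assumptions, not the model used in the manuscript: each weight is taken to be an unnormalized sum over p stored patterns of f(post) * g(pre), with independent standard Gaussian pattern entries, and `f` and `g` are hypothetical step nonlinearities. Sparse connectivity and sign constraints are left out. Excess kurtosis (zero for a Gaussian) is used as a simple measure of how close the weight distribution is to Gaussian.

```python
import random
import statistics


def f(x):
    # Hypothetical step nonlinearity applied to the post-synaptic pattern entry.
    return 1.0 if x > 0.5 else -0.3


def g(x):
    # Hypothetical step nonlinearity applied to the pre-synaptic pattern entry.
    return 1.0 if x > 0.5 else -0.3


def sample_weights(p, n_weights, rng):
    """Draw n_weights weights, each a sum of p independent terms f(post) * g(pre)."""
    return [
        sum(f(rng.gauss(0, 1)) * g(rng.gauss(0, 1)) for _ in range(p))
        for _ in range(n_weights)
    ]


def excess_kurtosis(values):
    """Fourth standardized moment minus 3; zero for a Gaussian distribution."""
    mean = statistics.fmean(values)
    var = statistics.fmean([(v - mean) ** 2 for v in values])
    fourth = statistics.fmean([(v - mean) ** 4 for v in values])
    return fourth / var ** 2 - 3.0


rng = random.Random(0)
kurt_few = excess_kurtosis(sample_weights(1, 2000, rng))     # one stored pattern
kurt_many = excess_kurtosis(sample_weights(100, 2000, rng))  # many stored patterns

print(kurt_few, kurt_many)
```

      With a single pattern the weights take only a few discrete values and the excess kurtosis is far from zero; with many patterns it shrinks toward zero whatever step functions are chosen, consistent with the independence from f and g described above.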

      -  I suggest the use of a monochromatic color scale for figure 2b and 3b.

      Figure 3: The sentence describing panel 2 seems incomplete.

      Also explain why there is a non-monotonic relationship between I_s and speed for some values of I_a in 3b.

      There is a non-monotonic relationship for retrieval quality, not speed. We have clarified this in the manuscript text, but don’t currently have an explanation for why this phenomenon occurs for these specific values of I_a.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Additional Discussion Points

      (1) There is not much exploration of potential mechanisms, i.e., the impact of PV neuron activity on the broader circuit. Additionally, the study exclusively focuses on PV cells and does not explore the role of other prefrontal populations, particularly those known to respond to cue-evoked fear states. The discussion should consider how PV activity might impact the broader circuit and whether the present findings are specific to PV cells or applicable to other interneuron subtypes.

      We have added an extensive discussion of potential mechanisms and the potential contributions of other interneuron subtypes:

      “For example, PV neurons aid in improving visual discrimination through sharpening response selectivity in visual cortex (Lee et al., 2012). In prefrontal cortex, PV neurons are critical for task performance, particularly during performance of tasks that require flexible behavior such as rule shift learning (Cho et al., 2020) and reward extinction (Sparta et al., 2014). Further, PV neurons play an essential role in the generation of cortical gamma rhythms, which contribute to synchronization of selective populations of pyramidal neurons (Sohal et al., 2009; Cardin et al., 2009). Courtin et al (2014) showed that brief suppression of dorsomedial prefrontal (dmPFC) PV neural activity enhanced fear expression, one of the main functions of the dmPFC, by synchronizing the spiking activity of dmPFC pyramidal neurons (Courtin et al., 2014). This result is potentially relevant to our findings, but likely involves different circuit mechanisms because of the difference in timescale, targeted area, and downstream projection targets (Vertes, 2004). These and other studies support the idea that PV neural activity supports the execution of a behavior by shaping rather than suppressing cortical activity, potentially by selecting among conflicting behaviors by the synchronization of different pyramidal populations (Warden et al., 2012; Lee et al., 2014).

      The roles of other inhibitory neural subtypes (such as somatostatin (SOM)-expressing and vasoactive intestinal peptide (VIP)-expressing IL GABA neurons) in avoidance behavior are currently unknown, but are likely important given the role of SOM neurons in gamma-band synchronization (Veit et al., 2017), and the role of VIP neurons in regulating PV and SOM neural activity (Cardin, 2018).” 

      (2) There is some discordance between changes in neural activity and behavior. For example, in Figure 4C, the relationship between PV neuron activity and movement emerges almost immediately during learning, but successful active avoidance emerges much more gradually. Why is this?

      We have added extensive text to the discussion that addresses this issue:

      “Interestingly, the rise in IL PV neural activity during movement does not require avoidance learning. IL PV neurons begin to respond during movement immediately after the animal has received a single shock in an environment, but learning to cross the chamber to avoid the signaled shock takes tens of trials. Why is there a discordance between the emergence of the IL PV signal during movement and avoidance learning?

      The components underlying active avoidance have been debated over the years, but are thought to involve at least two essential behaviors – suppressing freezing, and moving to safety (LeDoux et al., 2017). Freezing is the default response of mice upon hearing a shock-predicting tone, and can be learned in a single trial (Ledoux, 1996; Fanselow, 2010; Zambetti et al., 2022). When a predator is in the distance, freezing can increase the chance of survival by reducing the chances of detection. However, a strategic avoidance behavior may prevent a future encounter with the predator altogether. The importance of IL PV neural activity in defensive behavior may be to suppress reactive defensive behaviors such as freezing in order to permit a flexible goal-directed response to threat.

      The freezing suppression and avoidance movement components of the avoidance response are dissociable, both because freezing precedes avoidance learning, and because animals intermittently move prior to avoidance learning. Our finding that the rise in PV activity during movement emerges immediately after receiving a single shock, tens of trials before animals have learned the avoidance behavior, suggests that the IL PV signal is associated with the suppression of freezing. Further, IL PV neurons do not respond during movement toward cued rewards because in reward-based tasks there is no freezing response in conflict with reward approach behavior.” 

      (3) vmPFC was defined here as including the infralimbic (IL) and dorsal peduncular (DP) regions. While the role of IL has been frequently characterized for motivated behavior, relatively few studies have examined DP. Perhaps the authors are just being cautious, given the challenges involved in the viral targeting of the IL region without leakage to nearby regions such as DP. But since the optical fibers were positioned above the IL region, it is possible that DP did not contribute much to either the fiber photometry signals or the effects of the optogenetic manipulations. Perhaps DP should be completely omitted, which is more consistent with the definitions of vmPFC in the field.

      Yes, we included DP to be cautious as our viral expression sometimes leaks into DP, though the optic fiber targets IL. We have replaced vmPFC with IL throughout the manuscript. 

      (4) In the Discussion, the authors should consider why PV cells exhibit increased activity during both movement initiation and successful chamber crossing during avoidance. While the functional contribution of the PV signal during movement initiation was tested with optogenetic inhibition, some discussion on the possible role of the additional PV signal during chamber crossing is of interest to readers who are intrigued by the signaling of two events. Is the chamber crossing signal related to successful avoidance or learned safety (e.g., see Sangha, Diehl, Bergstrom, Drew 2020)?

      IL PV neural activity starts to increase at movement initiation, peaks at chamber crossing (when movement speed is highest), and decreases after chamber crossing (Figure 1E). Thus, the increases in PV neural activity at movement initiation and at chamber crossing are different phases of the same event.

      We think this signal is unlikely to be a safety signal, and have added text to the discussion to clarify this issue:

      “We think the IL PV signal is unlikely to be a safety signal (Sangha et al., 2020). First, the PV signal rises during movement not only in the avoidance context, but during any movement in a “threatening” context (i.e. a context where the animal has been shocked). For example, PV neural activity rises during movement during the intertrial interval in the avoidance task. Further, the emergence of the PV signal during movement happens quickly – after the first shock – and significantly before the animal has learned to move to the safe zone. This suggests a close association with enabling movement in a threatening environment, when animals must suppress a freezing response in order to move. Additionally, the rise in PV activity was specifically associated with movement and not with tone offset, the indicator of safety in this task. Finally, if IL PV neural activity reflects safety signals one would expect the response to be enhanced by learning, but the amplitude of the IL PV response was unaffected by learning after the first shock.”

      (5) The primary conclusion here that PV cells control the fear response should be considered within the context of prior findings by the Herry laboratory. Courtin et al (2014) demonstrated a select role of prefrontal PV cells in the regulation of fear states, accomplished through their control over prefrontal output to the basolateral amygdala. The observations in this paper, which used both ChR2 and Arch-T to address the impact of vmPFC PV activity on reactive behavior, are highly relevant to issues raised both in the Introduction and Discussion.

      Courtin et al (2014)’s finding is very important. We did not discuss this paper originally because Courtin et al. is about dmPFC, which has a different role in fear processing than IL/vmPFC. We have added text about this finding to the discussion:

      “Courtin et al (2014) showed that brief suppression of dorsomedial prefrontal (dmPFC) PV neural activity enhanced fear expression, one of the main functions of the dmPFC, by synchronizing the spiking activity of dmPFC pyramidal neurons (Courtin et al., 2014). This result is potentially relevant to our findings, but likely involves different circuit mechanisms because of the difference in timescale, targeted area, and downstream projection targets (Vertes, 2004).”

      Additional analyses

      (1) As avoidance trials progress (particularly on days 2 and 3), do PFC PV responses attenuate? That is, do continued unreinforced tone presentations lead to reduced reliance on PV cell-mediated suppression in order for successful avoidance to occur?

      We added Figure 1—Figure supplement 1M and 1N and a sentence on page 5: “IL PV neural activity during the avoidance movement was not attenuated by learning or repeated reinforcement (Figure 1—Figure supplement 1M and N, N = 8 mice, p = 0.8886, 1-way ANOVA).” We only included data from days 1 and 2, since we started to introduce short and long tone trials on day 3 which might interfere. 

      (2) In Figure 3D, it would be very informative and further support the claim of "no role for movement during reward" if the response of these cells during the "initiation of movement during reward-approach" was shown (similar to Figure 1F for threat avoidance).

      Thank you for the question. We added Figure 3—Figure supplement 1B and C to show IL PV neural activity aligned to initiation of movement during reward-approach. IL PV activity decreased after movement initiation for reward approach (N = 6 mice, p=0.0382, paired t-test). This further solidifies our claim that IL PV neuron activity only increases for threat avoidance.   

      Reviewer 1 (Recommendations For The Authors):

      (1) Fig1G shows the average response of PV cells during chamber crossing on an animal-to-animal basis. It would be informative to also see a similar plot for movement initiation.

      We have added the suggested figure in Figure 1—Figure supplement 1B.  

      (2) In the Results section (Page 5), there is a small issue with the logic. It says: "As vmPFC inactivation impairs avoidance behavior, the activity of inhibitory vmPFC PV neurons might be predicted to be low during successful avoidance trials." As opposed to "low", it should say "high", right? If inhibition impairs avoidance, then high responding by these cells would be presumed to drive the avoidance response, as supported by your findings.

      We have re-worded the text in this section. Based on prior findings that IL inactivation impairs avoidance (Moscarello et al., 2013), we predicted that inhibitory PV neurons would be less active during avoidance, because activating these neurons could suppress IL. However, we found that they were selectively active during avoidance.

      (3) In the caption/legend for Fig1E, it says that the "black ticks" indicate "tone onset". But it should say "movement initiation".

      We thank the reviewer for pointing out this error. The ticks do indicate tone onset, and we have corrected the figure to reflect this. 

      Reviewer 2 (Recommendations For The Authors):

      (4) Perhaps replace the term 'good outcomes' with 'reinforcing outcomes' or simply 'reinforcement'.

      Thank you for the suggestion. We have replaced ‘good outcomes’ with ‘reinforcing outcomes’.

      Reviewer 3 (Recommendations For The Authors):

      (5) It would be useful to provide some (perhaps speculative) explanation for the discordance between the PV activity-movement relationship and success of active avoidance in Fig. 4C

      We have added text to the discussion that addresses this issue:

      “Interestingly, the rise in IL PV neural activity during movement does not require avoidance learning. IL PV neurons begin to respond during movement immediately after the animal has received a single shock in an environment, but learning to cross the chamber to avoid the signaled shock takes tens of trials. Why is there a discordance between the emergence of the IL PV signal during movement and avoidance learning?

      The components underlying active avoidance have been debated over the years, but are thought to involve at least two essential behaviors – suppressing freezing, and moving to safety (LeDoux et al., 2017). Freezing is the default response of mice upon hearing a shock-predicting tone, and can be learned in a single trial (Ledoux, 1996; Fanselow, 2010; Zambetti et al., 2022). When a predator is in the distance, freezing can increase the chance of survival by reducing the chances of detection. However, a strategic avoidance behavior may prevent a future encounter with the predator altogether. The importance of IL PV neural activity in defensive behavior may be to suppress reactive defensive behaviors such as freezing in order to permit a flexible goal-directed response to threat.

      The freezing suppression and avoidance movement components of the avoidance response are dissociable, both because freezing precedes avoidance learning, and because animals intermittently move prior to avoidance learning. Our finding that the rise in PV activity during movement emerges immediately after receiving a single shock, tens of trials before animals have learned the avoidance behavior, suggests that the IL PV signal is associated with the suppression of freezing. Further, IL PV neurons do not respond during movement toward cued rewards because in reward-based tasks there is no freezing response in conflict with reward approach behavior.” 

      (6) I don't really understand what is shown in Figure 4D -- exactly what time points does this represent? Was habituation performed everyday?

      Figure 4D shows data from the approach task, not the avoidance task. This data is from well-trained mice, not the first day of training on this task. There was a pre-task recording period every day.

      (7) Why was optogenetic inhibition only delivered from 0.5-2.5 sec after the tone cue?

      We wanted to avoid any possibility that perception of the tone would be disrupted, so we delayed the onset of optogenetic inhibition. We chose 0.5 sec onset because animals typically begin to move ~1 second after tone onset.

      (8) The regression analysis with shuffled time points is not well explained -- some additional methodological details are needed (Fig. 2H).

      We added the following to the methods section to provide a clearer explanation: 

      “DF/F (t) was modeled as the linear combination of all event kernels. Given the event occurrence time points of all event types, we can use linear regression to decompose characteristic kernels for each event type. Kernel coefficients of the model were solved by minimizing the mean squared error between the model and the actual recorded signals. To test whether kernel ki is an essential component for the raw calcium dynamics, we compared the explanatory power of the full model to the reduced model where the time points of the occurrence of event ki were randomly assigned. Thus, the kernel coefficients should not reflect the response to the event in the reduced model.”
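
      The full-versus-reduced comparison described above can be sketched as follows. This is a minimal illustration on synthetic data, not the analysis code used in the study: the event names, kernel shapes, noise level and the small ridge term are all assumptions, and the helper names (`design_matrix`, `least_squares`, `mse`) are hypothetical. The signal is modeled as a sum of event kernels, the kernel coefficients are fit by least squares, and the fit is repeated after randomly reassigning the time points of one event type.

```python
import random


def design_matrix(n_samples, event_times, kernel_len):
    """One column per (event type, lag): counts events that occurred `lag` samples earlier."""
    rows = [[0.0] * (len(event_times) * kernel_len) for _ in range(n_samples)]
    for e, times in enumerate(event_times):
        for t0 in times:
            for lag in range(kernel_len):
                if t0 + lag < n_samples:
                    rows[t0 + lag][e * kernel_len + lag] += 1.0
    return rows


def least_squares(X, y, ridge=1e-6):
    """Solve (X^T X + ridge * I) b = X^T y by Gaussian elimination with pivoting."""
    n = len(X[0])
    # Augmented matrix [X^T X + ridge * I | X^T y]; the tiny ridge only guards against singularity.
    A = [[sum(r[i] * r[j] for r in X) + (ridge if i == j else 0.0) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= factor * A[c][k]
    b = [0.0] * n
    for i in reversed(range(n)):
        b[i] = (A[i][n] - sum(A[i][k] * b[k] for k in range(i + 1, n))) / A[i][i]
    return b


def mse(X, y, b):
    """Mean squared error between the model prediction X b and the signal y."""
    return sum((yi - sum(x * w for x, w in zip(r, b))) ** 2 for r, yi in zip(X, y)) / len(y)


rng = random.Random(1)
n_samples, kernel_len = 400, 5
valid = range(n_samples - kernel_len)
tone_times = rng.sample(valid, 20)   # hypothetical event type 1
move_times = rng.sample(valid, 20)   # hypothetical event type 2

# Synthetic signal: sum of assumed kernels plus Gaussian noise.
true_kernels = [0.2, 0.5, 0.3, 0.1, 0.0] + [1.0, 0.8, 0.5, 0.3, 0.1]
X_full = design_matrix(n_samples, [tone_times, move_times], kernel_len)
signal = [sum(x * w for x, w in zip(r, true_kernels)) + rng.gauss(0, 0.1) for r in X_full]

# Full model: true event times for both event types.
mse_full = mse(X_full, signal, least_squares(X_full, signal))

# Reduced model: movement event times randomly reassigned, then refit.
shuffled_moves = rng.sample(valid, len(move_times))
X_reduced = design_matrix(n_samples, [tone_times, shuffled_moves], kernel_len)
mse_reduced = mse(X_reduced, signal, least_squares(X_reduced, signal))

print(mse_full, mse_reduced)
```

      In this synthetic setting the reduced model fits worse than the full model, because the kernel attached to the shuffled event can no longer capture the response to that event. In practice the shuffle would be repeated many times to build a null distribution.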

      Editor's notes:

      -  Should you choose to revise your manuscript, please include full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05.

      Thank you for pointing this out. We have included all the test statistics and exact p values as suggested.

      -  Please note the sex of the mice and distribution of sexes in each group for each experiment.

      We have added the sex of mice for all experiments in the methods section.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer 2 (Public Review):

      Stress response in males versus females: The authors argue that the contextual control over behaviour was more robust in female rats as females show less within session variability and greater resistance to stress. What evidence is there that the restraint stress procedure caused a similar stress response in both sexes? That is, was the stress induction equally effective in males and females?

      The restraint protocol used in this study is a well-established stressor in rodents, known to produce robust behavioral and physiological effects (HPA axis activation) in both sexes. Although not measured in this study, the ACTH and cortisol responses are actually greater in females during restraint. To the extent that “stress induction” is interpreted as “HPA axis activation”, this strongly suggests that the stress induction in males and females was at least comparable, if not greater in females.

      We have added a few sentences (in the Results and Methods sections) to highlight this important point. We thank the reviewer for bringing this up.

      Minor corrections:

      (1) Please verify that the in-text reference to the figures is correct. I noticed a few mistakes, for example:

      - Line 120 (pdf) refers to Fig. 1 C-D but should refer to D only.

      - Line 312 (pdf) refers to Fig 1D for discrimination ratios but these are shown in Fig 1E

      - No reference in text to 2A

      Thank you for bringing this to our attention. We have fixed the in-text references to the figures.

      (2) In the results it states that the homecage c-Fos+ counts are shown in Figure 5 but I couldn't see these?

      The homecage c-Fos+ counts were initially shown as a pale gray band in the background of the main histograms. Because those counts are very low, it was hard to dissociate this gray band from the black horizontal axis. We have replaced the gray band with a more vivid blue line that is now in the foreground of the histograms. Moreover, we added a note in the figure legend to bring readers’ attention to this homecage count line, close to floor level. 

      (3) Line 306: It is stated that "the use of differential outcomes presumably allows animals to solve the task via simple (nonhierarchical) summation processes". I don't understand the use of "summation" here, isn't it simply that the rats are relying on direct context-outcome and/or cue-outcome associations?

      That’s right. These rats might be relying on direct context-outcome and cue-outcome associations and adding (or summing up) the converging expectations. We have added a few words in the text to clarify what we mean by summation (i.e. the addition of converging cue-evoked + context-evoked predictions).

    1. Author response:

      We thank the reviewers for their kind comments and advice. Like Reviewer 1, we acknowledge that while the exact involvement of Ih in allowing smooth transitions is likely not universal across all systems, our demonstration of the ways in which such currents can affect the dynamics of the response of complex rhythmic motor networks provides valuable insight. To address the concerns of Reviewer 2, we intend to include a sentence in the discussion to highlight the fact that cesium neither increased the pyloric frequency nor caused consistent depolarization in intracellular recordings. We will also highlight that these observations both suggest that cesium is not indirectly raising [K+]outside and support the conclusion that the effects of cesium are primarily through blockade of Ih rather than other potassium channels.

      Reviewer 3 raised some important points about modeling. While the lab has models that explore the effects of temperature on artificial triphasic rhythms, these models do not account for all the biophysical nuances of the full biological system. We have limited data about the exact nature of temperature-induced parameter changes and the extent to which these changes are mediated by intrinsic effects of temperature on protein structure versus protein interactions/modification by e.g. phosphorylation. With respect to the A current, we have seen in Tang et al., 2010 that the activation and inactivation rates are differentially temperature sensitive, but we do not have the data to suggest whether or not the time courses of such sensitivities are different as well. We intend to mention these facts in the paper, but plan to leave more comprehensive modeling as the purview of future work.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This work successfully identified and validated TRLs in hepatic metastatic uveal melanoma, providing new horizons for enhanced immunotherapy. Uveal melanoma is a highly metastatic cancer that, unlike cutaneous melanoma, has a limited effect on immune checkpoint responses, and thus there is a lack of formal clinical treatment for metastatic UM. In this manuscript, the authors described the immune microenvironmental profile of hepatic metastatic uveal melanoma by sc-RNAseq, TCR-seq, and PDX models. Firstly, they identified and defined the phenotypes of tumor-reactive T lymphocytes (TRLs). Moreover, they validated the activity of TILs by in vivo PDX modelling as well as in vitro coculture of 3D tumorsphere cultures and autologous TILs. Additionally, the authors found that TRLs are mainly derived from depleted and late activated T cells, which recognize melanoma antigens and tumor-specific antigens. Most importantly, they identified TRLs associated phenotypes, which provide new avenues for targeting expanded T cells to improve cellular and immune checkpoint immunotherapy.

      Strengths:

      Jonas A. Nilsson et al. have been working on new therapies for melanoma. The team has also previously performed the most comprehensive genome-wide analysis of uveal melanoma available, presenting the latest insights into metastatic disease. In this work, the authors performed paired sc-RNAseq and TCR-seq on 14 patients with metastatic UM, which is the largest single-cell map of metastatic UM available. This provides huge data support for other studies of metastatic UM.

      We thank the reviewer for these kind words about our work.

      Weaknesses:

      Although the paper does have strengths in principle, the weaknesses of the paper are that these strengths are not directly demonstrated. That is, insufficient analyses are performed to fully support the key claims in the manuscript by the data presented. In particular:

      The authors' description of the overall results of the article should be logical, not just a description of the observed phenomena. For example, the presentation related to the results of TRLs lacked logic. In addition, the title of the article emphasizes the three subtypes of hepatic metastatic UM TRLs, but these three subtypes are not specifically discussed in the results or in the discussion section. The title of the article is not a very comprehensive generalization and should be carefully considered by the authors.

      We thank the reviewer for the critical reading of our work. We have added more data and more discussion.

      The authors' claim that they are the first to use autologous TILs and sc-RNAseq to study immunotherapy needs to be supported by the corresponding literature to be more convincing. This can help the reader to understand the innovation and importance of the methodology.

      We have gone through the manuscript and found that we only refer to being first in using PDX models and autologous TILs to study immunotherapy responses by single-cell sequencing. While there are data to be deduced from other studies, we still believe this to be an accurate statement.

      In addition, the authors argue that TILs from metastatic UM can kill tumor cells. This is the key and bridging point to the main conclusion of the article. Therefore, the credibility of this conclusion should be considered.  Metastatic UM1 and UM9 remain responsive to autologous tumors under in vitro conditions with their autologous TILs.

      UM1 also responds in vivo in the subcutaneous model in the paper. We have also completed an experiment showing that UM1 also responds in a liver metastasis model. These data have been added to this revised version of the paper. We have added two main figures and one supplementary figure in which we characterize the response in vivo and by single-cell sequencing of TILs.

      In contrast, UM22, also as a metastatic UM, did not respond to TIL treatment. In particular, the presence of MART1-responsive TILs. The reliability of the results obtained by the authors in the model of only one case of UM22 liver metastasis should be considered. The authors should likewise consider whether such a specific cellular taxon might also exist in other patients with metastatic UM, producing an immune response to tumor cells. The results would be more comprehensive if supported by relevant data.

      The reviewer has interpreted the results correctly: the allogenic and autologous MART1-specific TILs, while reactive in vitro against UM22, cannot kill this tumor in either a subcutaneous or a liver metastasis model. We hypothesize that this has to do with an immune exclusion phenotype and show weak immunohistochemistry evidence that suggests this. We hope the addition of more UM1 data can be viewed as supportive of tumor reactivity in vivo as well.

      In addition, the authors in that study used previously frozen biopsy samples for TCR-seq, which may be associated with low-quality sequencing data, high risk of outcome indicators, and unfriendly access to immune cell information. The existence of these problems and the reliability of the results should be considered. If special processing of TCR-seq data from frozen samples was performed, this should also be accounted for.  

      We agree with the reviewer and acknowledge that we did not anticipate the development of single-cell sequencing techniques when we started the biobank in 2013. We performed dead cell removal before the 10x Genomics experiment. We have also performed extensive quality controls, and we believe that the data from the biopsies should be viewed as a whole and that quantitative intra-patient comparisons cannot be made.

      Reviewer #2 (Public Review):  

      Summary:  

      The study's goal is to characterize and validate tumor-reactive T cells in liver metastases of uveal melanoma (UM), which could contribute to enhancing immunotherapy for these patients. The authors used single-cell RNA and TCR sequencing to find potential tumor-reactive T cells and then used patient-derived xenograft (PDX) models and tumor sphere cultures for functional analysis. They discovered that tumor-reactive T cells exist in activated/exhausted T cell subsets and in cytotoxic effector cells. Functional experiments with isolated TILs show that they are capable of killing UM cells in vivo and ex vivo.

      Strengths:  

      The study highlights the potential of using single-cell sequencing and functional analysis to identify T cells that can be useful for cell therapy and marker selection in UM treatment. This is important and novel as conventional immune checkpoint therapies are not highly effective in treating UM. Additionally, the study's strength lies in its validation of findings through functional assays, which underscores the clinical relevance of the research. 

      We thank the reviewer for these kind words about our work.

      Weaknesses:  

      The manuscript may pose challenges for individuals with limited knowledge of single-cell analysis and immunology markers, making it less accessible to a broader audience.

      The first draft of the manuscript (excluding methods) was written by a person (J.A.N) who is not a bioinformatician. It has been corrected to include the correct nomenclature where applicable but overall it is written with the aim to be understandable. We have made an additional effort in this version. 

      Reviewer #1 (Recommendations For The Authors):  

      (1) Firstly, the authors should provide high-resolution pictures to ensure readability for readers. 

      We converted the figures to PDF ourselves, which improved the resolution. We are happy to provide high-resolution files to the office if needed for printing.

      (2) Furthermore, some parts of the article are more colloquial, and the authors should consider the logic and academic nature of the overall writing of the article. For example, authors should double-check whether the relevant expressions in the results are correct. For example, 'TCR' in the fourth part of the results should be 'TRLs'.

      We thank the reviewer for the recommendations and have gone through the manuscript.

      (3) Moreover, UM22 is described several times in the results as a metastatic UM and should be clearly defined in the methodology.

      The UM22 and UM1 samples are described in-depth in Karlsson et al., Nature Communications, 2020, a paper that is cited in the beginning of Results as part of the narrative. The current work can be viewed as an extension of that work.

      (4) Finally, it is recommended that authors describe a part of the results in full before citing the corresponding picture, otherwise, it will lead to confusion among readers.

      We have made an effort in the revised version to describe the new data in more detail.

      Reviewer #2 (Recommendations For The Authors):  

      The manuscript is very interesting and important to understanding key aspects of uveal melanoma immune profile and functionality. However, in my opinion, there are a few aspects that could be addressed.  

      - The manuscript lacks comprehensive details about the samples used, such as their disease progression, response to treatment, or any relevant information that could shed light on potential differences between samples. It would be valuable to know whether these samples were collected before any systemic treatment or if any of the patients underwent immunotherapy post-sample collection, along with the outcomes of such treatments. Providing this information would enrich the manuscript and provide a more holistic view of the research.

      We thank the reviewer for the recommendation and have included a new Supplementary table 7 with information about the samples. We have also added each individual sample’s contribution to the UMAP to provide a more holistic view.

      - The results presented and discussed in the manuscript seem to indicate that there were no significant differences across the various samples, including comparisons between lymph-node and liver metastases. However, this lack of variation or the reasons for not discussing any observed differences should be clarified. If there are distinctions between the samples, it would be beneficial to discuss these findings in the manuscript.

      We thank the reviewer for the recommendation. Although 14 samples are many for a uveal melanoma study, the study is not really powered to do intra-patient comparisons.

      - The manuscript may pose difficulties for individuals with limited knowledge of single-cell analysis and immunology markers, potentially limiting its accessibility. To make the research more inclusive, the authors might consider presenting the technical aspects of their work in a less descriptive manner and providing explanations for those less familiar with the technology. This would help a broader audience grasp the significance of the study's findings. 

      The manuscript is from a multidisciplinary team where all have read and commented. The draft was written by a tumor biologist and edited by a bioinformatician for accuracy. We honestly think it is more understandable than most studies in this bioinformatics era. But we have tried to describe the new data in an easier way.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      (1) Line 56: replace "pyomastitis" with "pyogenic skin infections".

      Corrected.

      (2) Line 58: replace "basal strains" with "ancestral strains".

      Corrected.

      (3) Line 62: population structure impacts gene acquisition too, however, gene acquisitions can be easier to connect with a phenotype. For example, acquisition of mecA is thought to be adaptive rather than just linked to a successful lineage. This same reasoning applies to resistance-associated mutations such as gyrA mutations in ST22 emergence.

      We completely agree with the reviewer that population structure also impacts gene acquisition. We wanted to convey that connecting the gain or loss of genes to a change in a particular phenotype is much easier than doing the same for a mutation, especially in the presence of strong linkage, and therefore gene-level analysis is the focus of many previous studies. We have rewritten the sentence to better convey this idea:

      “Due to this limitation, studies of emerging strains often focus on gene level analysis such as acquisition of mobile genetic elements or loss of gene function as their effect on phenotype is easier to determine than that of point mutations.”

      (4) Line 112 this might be simply due to the smaller size of the intergenic regions chosen. I suggest to correct for the size of the genome segment considered.

      We thank the reviewer for pointing this out. The size of the intergenic regions was indeed the simple explanation for this observation. We have added the following sentence to the manuscript:

      “This is reflective of the fact that most of S. aureus genome sequence comprises of ORFs e.g. ~84% of TCH1516 genome is part of an ORF.”

      (5) Line 189: please add p values to supp table 2.

      We have added the p and q values from DBGWAS into Supp table 2. It is under the ‘DBGWAS Result’ sheet.

      (6) Line 227: high entropy indicates that this site is polymorph, not necessarily that there is selective pressure. In the extreme, this might actually point to a neutral position, since any amino-acid could be equally present (see for example https://www.nature.com/articles/s41467-022-31643-3#Sec10 ).

      We agree that high entropy by itself may point to a position with neutral selection leading to some false positives. However, we were focused on positions that were mostly biallelic in CC8, and with differential prevalence in USA300 vs non-USA300 (albeit in the presence of strong linkage disequilibrium) in addition to having high entropy in non-CC8 strains. This helps us filter some of the positions that were mostly monoallelic or with rare mutations while preserving other sites of interest. The approach was able to find cap5E mutation which has been associated with disruption of capsule production.
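
      To make the filtering logic above concrete, the following is a minimal sketch and not the code used in the study. It assumes that per-site variability is measured as Shannon entropy over an alignment column and that "mostly biallelic" means the two most common residues cover a chosen fraction of the column; the function names and the 0.95 cutoff are hypothetical.

```python
import math
from collections import Counter


def column_entropy(residues):
    """Shannon entropy (in bits) of one alignment column; gaps and X are ignored."""
    observed = [r for r in residues if r not in "-X"]
    if not observed:
        return 0.0
    counts = Counter(observed)
    total = len(observed)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_mostly_biallelic(residues, min_fraction=0.95):
    """True if at least two residues occur and the top two cover min_fraction of the column."""
    observed = [r for r in residues if r not in "-X"]
    counts = Counter(observed)
    if len(counts) < 2:
        return False
    top_two = sum(n for _, n in counts.most_common(2))
    return top_two / len(observed) >= min_fraction
```

      A fully conserved column has zero entropy, while a column split evenly between two residues has an entropy of one bit and also passes the biallelic check, which is the kind of site the combined filter is meant to retain.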

      (7) Line 271: show USA500 on the tree.

      Our current study is mostly focused on differences between USA300 and non-USA300 strains and we want to highlight those differences in the tree.

      (8) Line 327: still not possible to infer causality.

      We have changed the language to remove mentions of causality and instead talk about the association of GWAS enriched genes with measured transcriptional changes. The revised sentence now reads:

      “Here, we demonstrated how a model of transcriptional regulation with iModulons can be used to make a headway through the impasse created by the high linkage disequilibrium and identify GWAS-enriched mutations that are also associated with measurable phenotypic changes in the TRN.”

      (9) Line 324: subclades reference.

      We are unsure what this means.

      (10) Line 366: the authors seem to have used a bespoke pan-genome analysis approach. Would they be able to validate it using established tools such as Roary, Pirate or Panaroo? Panaroo in particular appears to have superior accuracy thanks to its pan-genome graph approach (https://github.com/gtonkinhill/panaroo). 

      We have added the results of Roary to our analysis (Figure S1b). The Roary results largely agree with our biggest takeaway from pangenomics, which is that our collection of genomes has good coverage of the CC8 clade at the gene level.

      (11) Line 397: what was the size of the core genome?

      There were 24881 core sites. We have added the number to the manuscript.

      (12) Line 407: please add citation or website for SCCmecFinder.

      The citation of SCCmecFinder (45) is at the end of the sentence.

      (13) Line 421: I was not able to find the code used for this analysis in the github repository provided.

      The code can be found in “notebook/02_Preprocess_DBGWAS.ipynb” within the repo.

      (14) Line 427: this is a very complex analysis for a simple univariate comparison between USA300-vs-non USA300 strains with no correction for population structure. The authors should compare their results with a more established pipeline like Pyseer or Gemma that can handle kmers and show the added value of their approach.

      We wanted to take advantage of DBGWAS’s ability to collapse kmers into unitigs and further collapse significant unitigs within a genetic neighborhood into components. Unfortunately, we found that in many cases it became difficult to determine the exact mutation that was being enriched (e.g., T234G) without a lot of manual work. Our network analysis simply parses the DBGWAS graph to automatically extract these mutations, making the results more interpretable. It does not do any additional hypothesis testing.

      We also attempted to pass kmer data into GEMMA but without the compaction provided by DBGWAS the memory required (>168 GB) exceeded what we had available.

      (15) DBGWAS: please indicate DBGWAS version and the options used for kmer size and number of neighbour nodes retained in the subgraph. Also, I assume that no correction for population structure was applied.

      We have added the version and parameters for DBGWAS. The method section now reads:

      “DBGWAS (v0.5.4) was used to enrich mutations unique to USA300 strains using default kmer size of 31 (-k 31) and neighborhood size of 5 (-nh 5). Alleles with frequency less than 0.1 were filtered  (-maf 0.1) and all components enriched with q-values less than 0.05 were documented (-SFF q0.05).”

      (16) Could the authors provide the DBGWAS output for the most significant unitings in graph format? This would help readers understand the findings.

      The outputs are available in the github repo. The link to this specific data is (https://github.com/sapoudel/USA300GWASPUB/tree/master/data/dbgwas/dbgwas_output/visualisations)

      The text format of the output is part of Supplementary Table 2 under “DBGWAS Result” sheet.

      (17) Line 469: please provide more details on iModulons, it is not enough to simply reference the paper: specific QC criteria, mapping algorithm and parameters, ICA algorithm.

      We have now added a new Supplementary Note 2 section with more details about building iModulons.

      (18) Line 474: what is log-TPM?

      Log-Transcripts per Million. We have added the description in the text.
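
      For readers unfamiliar with the unit, a minimal sketch of one common way to compute log-TPM for a single sample is given here. The base-2 logarithm and the pseudocount of 1 are assumptions for illustration, and the function name is hypothetical; the exact transformation used for the iModulon analysis is described in the manuscript, not here.

```python
import math


def log_tpm(counts, lengths_bp, pseudocount=1.0):
    """Return log2(TPM + pseudocount) for each gene in one sample."""
    # Reads per kilobase: normalize each raw count by gene length.
    rpk = [count / (length / 1000.0) for count, length in zip(counts, lengths_bp)]
    scale = sum(rpk) / 1e6
    if scale == 0:
        # No reads at all: every gene has TPM 0.
        return [math.log2(pseudocount)] * len(counts)
    # Rescale so that the values in the sample sum to one million.
    tpm = [value / scale for value in rpk]
    return [math.log2(value + pseudocount) for value in tpm]
```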

      (19) Line 479: not sure what "Chapter 3" refers to.

      Thank you for correcting the mistake. The reference has been corrected.

      Reviewer #2 (Recommendations For The Authors):

      Line 45. The introduction is not well-structured, and there is a lack of coherence among the topics pertinent to the research objective. I would recommend rewriting this section addressing the following topics: the challenge of distinguishing lineages within the CC8, especially the CA-MRSA USA300 strains; discussing the state-of-the-art GWAS methodologies, elucidating the main confounding factors in the application of GWAS to bacterial studies, and finally, exploring how current methods aim to address these concerns.

      We would like to thank the reviewer for the suggestions. The main innovation of the paper is using iModulons to find phenotype associated mutations from a set of linked mutations. The challenge of distinguishing CC8 subclades has been largely resolved thanks to efforts by Bowers et al. (PMID: 29720527). We have made some revisions to address the GWAS methodologies (bugwas and DBGWAS), the effect of linkage disequilibrium in interpreting the output of these methods and how combining the results of these association tests with modeling of TRN with iModulons can lead to finding candidate mutations of interest that are linked to specific changes in gene regulation.

      Line 56. Replace "pyomastitis" with "pyomyositis".

      Corrected to “pyogenic skin infections.”

      Lines 71. What do the authors mean by "endemic USA300 strain"?

      We have removed references to endemic strains.

      Line 106. Please verify the number of genomes used in the DBGWAS analysis. In the text, the authors mention that 2038 genomes were utilized. However, in Supplementary Table 1, only 2030 genomes are listed.

      Thank you for catching the discrepancy. We started the analysis with 2037 genomes, including four “spiked-in” reference genomes- USA100 D592 (CC5 strain used for rooting the CC8 tree), TCH1516 (same accession number as the one used for ICA), COL and Newman. Before further analysis, we removed 6 genomes for being smaller than 2.5 million base-pairs (see preprocessing.ipynb) and the USA100 D592 strain as it is not part of CC8. This resulted in 2030 genomes being used for DBGWAS. We kept the other 3 spiked CC8 genomes to help annotate the unitigs from DBGWAS.  Lastly, we removed the other three CC8 clade spiked genomes for pangenomic analysis. To clarify this, we have made the following changes to the text:

      (1) Changed line 106: We downloaded 2033 S. aureus genomes for analysis and excluded six of them with genome length of less than 2.5 million base pairs. The remaining 2027 S. aureus CC8 genomes formed a closed pangenome, suggesting that the sampled genomes mostly captured the gene level variations within the clonal complex (Figure 1a).

      (2) DBGWAS section Line 177: We used 2030 genomes for this analysis; the 2027 genomes in pangenomics analysis above were “spiked” with three well known CC8 genomes- TCH1516, COL, and Newman- to help annotate the DBGWAS unitigs.
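
      The genome counts given above can be reconciled with a few lines of arithmetic. This is only a bookkeeping sketch based on the numbers stated in this response; the variable names are ours.

```python
# Numbers taken from the response above.
downloaded = 2033                                  # CC8 genomes downloaded
spiked_refs = {"USA100 D592", "TCH1516", "COL", "Newman"}
too_small = 6                                      # genomes shorter than 2.5 Mbp

# Starting set: downloads plus the four spiked-in references.
start_total = downloaded + len(spiked_refs)

# Pangenome set: downloads minus the short genomes, with no spiked references.
pangenome_set = downloaded - too_small

# DBGWAS set: pangenome set plus the three spiked CC8 references
# (USA100 D592 is CC5 and is excluded).
dbgwas_set = pangenome_set + len(spiked_refs - {"USA100 D592"})

print(start_total, pangenome_set, dbgwas_set)
```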

      Line 108. Could the authors provide a table with the genes that constitute the core, accessory genome, and unique genes for each of the strains?

      The gene presence/absence tables are very large files, and we have therefore only added them to our GitHub repo. The results can be found in the following files:

      Pangenomics: data/pangenome/Pangenomics/CC8_strain_by_gene.pickle.gz

      Lines 112 and 315. On what basis did the authors decide on the size of the upstream regulatory region? In the search for mutations, they extracted segments of 300 base pairs, whereas, in the search for the Fur binding motif, only 100 base pairs were considered. The RegPrecise database contains regulons for Staphylococcus aureus N315 (https://regprecise.lbl.gov/genome.jsp?genome_id=26), including the Fur regulon with multiple Transcription Factor Binding Sites (TFBSs) that extend beyond the 100 base-pair sequence. I would recommend reconsidering the search within the standardized upstream region of -400 base pairs. In the case of the Fur binding motif search, it might be beneficial to include the TFBSs available in the RegPrecise database.

      For Fur motif search, we chose 100 base-pairs because the Fur motif in non-USA300 strains were within ~20 base-pairs of isdH translation start site (Figure 4C). In our search of Fur motif in this analysis, we were not looking to see if any exists, we were simply looking to see if the one proximal to the translation start site exists as our DBGWAS analysis suggested that specific region was deleted in USA300 strains.

      Line 175. This work aimed to identify potential mutations associated with the success of a specific lineage rather than a phenotype, where correction for population structure effects is necessary. Would the implementation of the bugwas method in DBGWAS for controlling bacterial population structure not potentially impact the results? How was this issue addressed in your analysis? Would it not be pertinent to run a program without population structure correction to enable a comparison of results?

      We initially tried to use Linear Mixed Models to find kmers that were only enriched in USA300 strains. These efforts were hampered by extreme linkage disequilibrium, which led to high collinearity between kmer abundances and made it extremely difficult to get a good estimate of the coefficients. We also tried to run chi-squared tests individually on each kmer, which led to an unmanageable number (>100k) of kmers that were significantly different. DBGWAS, on the other hand, was able to compress unbranched kmers in the De Bruijn graph into unitigs and further reduce the number of tests by testing at the pattern level instead of the unitig level. We found no straightforward way to run DBGWAS (or GEMMA) without population structure correction. Therefore, it is likely that we are underestimating the number of significant unitigs with this approach.
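
      As an illustration of the per-kmer test mentioned above, the sketch below computes a Pearson chi-squared statistic for a single 2x2 table (USA300 vs. non-USA300 by kmer present vs. absent). It is a minimal example under our own assumptions (no continuity correction, hypothetical function name) and is not the code used in the study. Because kmers in strong linkage share nearly identical presence/absence patterns, running such a test on each kmer separately yields many redundant significant hits.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test for the 2x2 table [[a, b], [c, d]].

    Rows: USA300 / non-USA300 genomes. Columns: kmer present / absent.
    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        # A degenerate table carries no information about association.
        return 0.0, 1.0
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```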

      Line 189. Please italicize the gene name cap5E.

      Corrected.

      Line 277. Please clarify the QC/QA criteria and curation process employed for the selection of RNA-seq experiments, as this constitutes a crucial step in the reconstruction of the network.

      We have now added a new supplementary material section, Supplementary Note 2 titled “Creating iModulons for CC8 Clade Staphylococcus aureus” with details of QC/QA.

      Line 279. In Supplementary Table 3, please label the first column and standardize the use of either the experiment ID or the run ID. Furthermore, verify the experiment identifiers from rows 19 to 26, as I could not locate them in the SRA database.

      We have changed all accession to experiment ID including rows 19 to 26.

      Lines 290, 330, 424, and 437. Please correct "SCCMec" to "SCCmec IVa" (italicize "mec").

      Corrected.

      Line 298. What is the size of the upstream regulatory region considered for this analysis? It is important to standardize this value for all analyses involving the upstream regulatory region. In this regard, I recommend maintaining a consistent size of -400 base pairs.

      For Fur motif search we chose 100 base-pairs because the Fur motif in non-USA300 strains were within ~20 base-pairs of isdH translation start site (Figure 4C). In our search of Fur motif in this analysis, we were not looking to see if any exists, we were simply looking to see if the one proximal to the translation start site exists as our DBGWAS analysis suggested that specific region was deleted in USA300 strains. In our usual analysis, we use -300 base pairs.

      Line 321. The discussion is rather concise and lacks an in-depth comparative perspective with relevant literature on any of the obtained results, whether concerning the proposed methodology or the potential new markers associated with the success of the USA300 lineage. The authors must underscore the method is not applicable to all GWAS analyses, due to the issue of correction for population structure.

      We have now added sections discussing the importance of isdH in S. aureus infection and a section addressing the limitations of the current approach when applied to other GWAS-type studies.

      Line 366. The authors employed the methodology described in the article by Hyun et al. 2022 (https://doi.org/10.1186/s12864-021-08223-8) to construct the pangenome. However, this methodology was designed for comparative analysis of pangenomes across various species, which does not align with the objective of this study, focusing solely on S. aureus genomes. Consequently, it remains unclear to me why the authors made this particular choice and, more importantly, what advantages it offers over well-established tools for individual pangenomes, such as Roary. I would strongly recommend validating the results using at least one established tool.

      With our analysis, we can determine proper thresholds for core/accessory/unique genes based on the observed data (Supplementary Figure 1a). However, we agree that it would be proper to include a more established pangenome package. We have added the results of Roary to our analysis. The Roary results largely agree with our biggest takeaway from pangenomics, which is that our collection of genomes has good coverage of the CC8 clade at the gene level.

      Line 370. Please include the version of CD-HIT that was utilized.

      Added. CD-HIT version 4.6 was used for the analysis.

      Line 372. What tool did the authors use to extract these regions?

      The list of CDS, 5’ and 3’ sequences can be extracted easily with a combination of a FASTA file and a GFF file. The GFF file was used to find the position of each of these sequences, and the sequences were extracted from the FASTA file with Python scripts.
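
      The extraction described above can be sketched as follows. This is an illustrative example rather than the scripts in our repository: the function names are hypothetical, contigs are assumed to be already loaded into a dictionary of sequences, and the 300 bp flank on both sides is an assumption (the text above only states 300 bp for the upstream region).

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def parse_gff_cds(gff_text):
    """Yield (contig, start, end, strand) for each CDS feature (1-based, inclusive)."""
    for line in gff_text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "CDS":
            continue
        yield fields[0], int(fields[3]), int(fields[4]), fields[6]


def extract_regions(contigs, gff_text, flank=300):
    """Return the CDS plus its 5' and 3' flanking sequences for every CDS feature."""
    regions = []
    for contig, start, end, strand in parse_gff_cds(gff_text):
        seq = contigs[contig]
        cds = seq[start - 1:end]
        left = seq[max(0, start - 1 - flank):start - 1]
        right = seq[end:end + flank]
        if strand == "-":
            # On the minus strand the 5' region lies to the right on the contig.
            cds = reverse_complement(cds)
            five, three = reverse_complement(right), reverse_complement(left)
        else:
            five, three = left, right
        regions.append({"cds": cds, "five_prime": five, "three_prime": three})
    return regions
```

      Flanks are truncated at contig ends rather than wrapped around, which is a simplification for draft assemblies.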

      Line 395. What were the QC/QA criteria used to select the sequences?

      The QC/QA criteria for the sequences are mentioned at the beginning of the Pangenomic analysis subsection and are as follows:

      “Briefly, “complete” or “WGS” samples from CC8/ST8 were downloaded from the PATRIC database. Sequences with lengths that were not within 3 standard deviations of the mean length or those with more than 100 contigs were filtered out.”
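
      The two filters quoted above can be expressed in a short sketch. The function name and input layout are hypothetical, and the use of the population standard deviation is an assumption; the preprocessing notebook in the repository remains the reference for what was actually run.

```python
import statistics


def filter_genomes(genomes, max_contigs=100, n_sd=3.0):
    """Return the names of genomes passing the length and contig-count filters.

    genomes maps a genome name to a (length_bp, n_contigs) tuple.
    """
    lengths = [length for length, _ in genomes.values()]
    mean_length = statistics.mean(lengths)
    sd_length = statistics.pstdev(lengths)
    kept = []
    for name, (length, n_contigs) in genomes.items():
        if abs(length - mean_length) > n_sd * sd_length:
            continue  # length is more than n_sd standard deviations from the mean
        if n_contigs > max_contigs:
            continue  # assembly is too fragmented
        kept.append(name)
    return kept
```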

      Line 407. Please correct the tool name to "SCCmecFinder" (italicize "mec").

      The name has been corrected.

      Line 409. I believe BLASTp was run locally, so please specify the version used and the search parameters.

      As corrected further down, we used BLASTn not BLASTp. The version v2.2.31 has been added to the methods section.

      Line 416. There is conflicting information with line 409, which mentions that PVL was identified through a protein BLAST, but right below, it states it was a BLASTn. Please verify which information is correct and consider the previous comment to specify the version and parameters.

      Thank you for catching the discrepancy. We have corrected the text:

      “PVL was detected using nucleotide BLAST.”

      Line 418. Please provide the column identifiers for the Supplementary Table 5 (PVL worksheet).

      Column names are added.

      Line 418. Please remove the repeated word "and" in Supplementary Table 5 (mecA worksheet) and italicize the gene names in this table.

      Corrected.

      Line 419. You can use the abbreviation "SNPs" since it was introduced in line 65.

      Corrected.

      Line 420. In my view, this analysis could benefit from a more detailed and clearer explanation.

      We have added to the explanation. The section now reads:

      “To find the root of the USA300 strains in the phylogenetic tree, the genomes in the tree were first annotated by their PVL and SCCmec status. Then the tree traversed from leaf to root starting from known USA300 strains – TCH1516 and FPR3757- while keeping track of the number of descendant genomes from the current root that contained known markers SCCmec IVa and PVL. The node where the number of genomes with the markers started flatlining was marked as the root of USA300.”
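
      The traversal described above can be illustrated with a small sketch. It is not the code from our repository: the tree is represented with plain dictionaries, the function names are hypothetical, and "flatlining" is interpreted here, as an assumption, as the lowest ancestor after which the number of marker-positive descendants no longer increases.

```python
def leaves_under(node, children):
    """Return all leaf names below node (the node itself if it is a leaf)."""
    stack, leaves = [node], []
    while stack:
        current = stack.pop()
        kids = children.get(current, [])
        if kids:
            stack.extend(kids)
        else:
            leaves.append(current)
    return leaves


def find_clade_root(start_leaf, parent, children, has_markers):
    """Walk from start_leaf to the tree root and locate the plateau node.

    has_markers maps a leaf name to True if the genome carries both markers.
    Returns (node, counts), where counts lists the marker-positive descendants
    for every node on the path from the leaf to the root.
    """
    path, node = [], start_leaf
    while node is not None:
        path.append(node)
        node = parent.get(node)
    counts = [
        sum(1 for leaf in leaves_under(n, children) if has_markers.get(leaf, False))
        for n in path
    ]
    # Counts never decrease towards the root, so the last value is the plateau.
    plateau = counts[-1]
    for n, count in zip(path, counts):
        if count == plateau:
            return n, counts
```

      Starting from both TCH1516 and FPR3757 and checking that the two walks agree on the same node is a simple consistency check on the result.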

      Line 428. Specify the version and parameters used in the analysis with DBGWAS.

      Added. The text now reads:

      “DBGWAS (v0.5.4) was used to enrich mutations unique to USA300 strains using default kmer size of 31 (-k 31) and neighborhood size of 5 (-nh 5). Alleles with frequency less than 0.1 were filtered  (-maf 0.1) and all components enriched with q-values less than 0.05 were documented (-SFF q0.05).”

      Line 431. What tools were employed to calculate Pearson correlation and distances relative to the reference genome?

      Added. The text now reads:

      “Genome-wide linkage was estimated by Pearson correlation (calculated with built-in Pandas function) of the presence/ absence of enriched kmers and distance was measured based on the kmer alignment to the reference TCH1516 genome as determined by BLASTn.”
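
      As a minimal sketch of this calculation, the example below pairs the Pearson correlation between kmer presence/absence vectors with the distance between their positions on the reference. It is written in plain Python so that it is self-contained, whereas the analysis itself used the Pandas built-in correlation. The function names and input layout are hypothetical, and distance is taken as a simple linear difference, ignoring chromosome circularity.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors (nan if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def linkage_vs_distance(presence, positions):
    """Return (kmer_a, kmer_b, distance_bp, r) for every pair of kmers.

    presence maps a kmer to a list of 0/1 values across genomes;
    positions maps a kmer to its aligned coordinate on the reference genome.
    """
    names = sorted(presence)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            distance = abs(positions[a] - positions[b])
            pairs.append((a, b, distance, pearson(presence[a], presence[b])))
    return pairs
```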

      Line 450. What type of BLAST was used?

      Added. Nucleotide blast was used for all kmer analysis.

      Line 452. I didn't quite understand the reason for making this analysis available in a separate repository. It would be easier for readers looking to reproduce the work if all the codes were in a single repository.

      We kept the repository separate in case we wanted to further develop the network analysis code in the future. We have added the link to the network analysis repository in the README of the publication repo.

      Line 460. Please specify the version and parameters, if run locally, or indicate if a web page was used.

      Corrected to indicate that we used the PATRIC website for this.

      Line 470. Specify the version and provide a detailed account of all parameters used, along with the QC/QA criteria and curation methods applied.

      We have added Supplementary Note 2 with all the details about packages and parameters used to calculate the iModulons.

      Line 479. The phrase "ICA was then run as previously described in chapter 3" does not make sense. Please clarify.

      We have corrected the mistake and added a new supplementary note with details about our ICA run. The line now reads:

      “A detailed version of the methods for RNA-sequencing and ICA analysis is available as Supplementary Note 2. ICA of RNA sequencing data was performed using the pymodulon package.”

      Line 484. Specify the version of CD-HIT.

      Added. The version used was v4.6.

      Line 494. To enable reproducibility, the repository should be better organized, especially the directory containing the code. Numbering each script in the order it was run would assist the reader in comprehending the overall analysis flow and adapting it to their needs. If creating a manual for method usage is not feasible, the code could be more extensively commented on to explain the parameters, choices made, and how these could be modified. The "Data" folder seems to contain some test files, such as those in the "isdh_fimo" folder, so removing test files would aid the understanding of the reader.

      Thank you for the suggestions. We have now numbered the notebooks that generate the figures, we have added more comments to the code, removed testing code and test datasets.

      Throughout the article, please correct "SCCMec" to "SCCmec" (italicize "mec").

      Corrected.

    1. Author response:

      (1) The manuscript emphasizes the hypothesis that stable super-complexes, maintained through sequential replacement of subunits, might underlie the long-term storage of memory. While an interesting idea, this notion requires considerably more research. The presented experimental data are indeed consistent with this notion, but there is no evidence that these complexes are causally related to memory storage. 

      We agree with the reviewer that, while our data support the idea that subunit exchange in supercomplexes could underlie long-term memory storage, more research is necessary to conclusively validate this hypothesis. The experimental data presented are consistent with the idea that stable supercomplexes, maintained through sequential replacement of subunits, play a role in memory retention. However, establishing a causal relationship between these supercomplexes and memory storage will require additional experiments and in-depth analyses.

      (2) Much of the presented work is performed on biochemically isolated protein complexes. The biochemical isolation procedures rely on physical disruption and detergents that are known to alter the composition and structure of complexes in certain cases. Thus, it remains unclear how the protein complexes described in this study relate to PSD95 complexes in intact synapses. 

      Whilst it could be the case that biochemical isolation procedures have the potential to alter the composition and structure of protein complexes, we have previously published the protocol used to isolate PSD95-containing supercomplexes (Nat Commun. 2016; 7: 11264). In that study, we demonstrated that the isolated supercomplexes are approximately 1.5 MDa in size and contain multiple proteins, including other scaffolding proteins (e.g., PSD93) and receptors (e.g., NMDARs). Importantly, these supercomplexes remain stable when exposed to detergents and dilution, strongly indicating that they represent the native complexes present in intact synapses.

      (3) Because not all GFP molecules mature and fold correctly in vitro and the PSD95-mEos mice used were heterozygous, the interpretation of the corresponding quantifications is not straightforward. 

      Although genetic tagging ensures a 1:1 labeling stoichiometry, we acknowledge that the presence of unfolded GFP and the use of heterozygous PSD95-mEos mice can complicate the analysis. We have highlighted this limitation in the manuscript. Nonetheless, our results show a high level of consistency across the different genetic fusions used in this study.

      (4) It was not tested whether different numbers of PSD95 molecules per super-complex might contribute to different retention times of PSD95, e.g. in synaptic vs. total-forebrain super-complexes. 

      The potential impact of varying numbers of PSD95 molecules per super-complex on retention times was considered. However, our analysis showed minimal differences in the distribution of molecule numbers per super-complex between the synaptic and forebrain samples.

      (5) The conclusion that the population of 'mixed' synapses is higher in the isocortex than in other brain regions is not supported by statistical analysis. 

      The conclusion that the population of 'mixed' synapses is higher in the isocortex than in other brain regions is indeed supported by statistical analysis. All relevant statistical data are detailed in Table S2, and the finding is statistically significant. We will emphasize this point in the revised manuscript.

      (6) The validity of conclusions regarding PSD95 degradation based on relative changes in the occurrence of SiR-Halo-positive puncta is limited.

      We recognize that conclusions based solely on the relative changes in SiR-Halo-positive puncta concerning PSD95 degradation have limitations. To address this, we also quantified the “new” PSD95 by analyzing AF488-Halo-positive molecules.

    1. Author Response:

      Thank you for the reviews and the eLife assessment. We want to take this opportunity to acknowledge the weaknesses pointed out by the reviewers and we will make small changes to the manuscript to account for these as part of the Version of Record.

      The tools are command-based and store outcomes locally

      We consider this to be an advantage of our ecosystem, which is intended for individuals or small groups of authors. These features facilitate easy installation and integration with other tools. Further, our tool labelbuddy is a graphical user interface. Our tools may also be integrated into web-based systems as backends. Pubget is already being used in this way in the NeuroSynth Compose platform for semi-automated neuroimaging meta-analyses.

      pubget only gathers open-access papers from PubMed Central

      We recognize this as a limitation, and we acknowledge it in the original manuscript (in the discussion section, starting with "A limitation of Pubget is that it is restricted to the Open-Access subset of PMC"). We chose to limit the scope of our tools in order to ensure maintainability. Further, we are currently expanding pubget so it will also be able to access the abstracts and meta-data from closed-access papers indexed on PubMed. Future research could build other tools to work alongside pubget, to access other databases.

      Logic flow is difficult to follow

      We thank the reviewer for this feedback. Our paper describes an ecosystem of literature mining tools, which does not lend itself to narrative flow, nor does it readily fit into the standard "Intro, Results, Discussion, Methods" structure that is typical in the scientific literature. We have done our best to conform to this expected format, but we have also provided detailed section and subsection headings to enable the reader to digest the paper nonlinearly. Each of the tools we describe also has detailed documentation on GitHub that we update continuously.

      Results were not validated

      For the example where we automatically extracted participant demographics from papers, we validated the results on a held-out dataset of 100 manually-annotated papers. For the example with automatic large-scale meta-analyses (neuroquery and neurosynth), these methods are described together with their validation in the original papers. If this ecosystem of tools is integrated into other workflows, it should be validated in those contexts. We recognize that validating meta-analyses is a difficult problem because we do not have ground truth maps of the brain.

      Efficiency was not quantified

      Creators of tools do not always run experiments to quantify their efficiency and other qualities. We have chosen not to do this here, first because it is outside the scope of this paper, as it would require specifying very precise tasks and how efficiency is measured, and second because, at least for the data collection part, the benefit of using an automated tool over manually downloading papers one by one is clear even without quantifying it. Compared to the approach of re-using existing datasets, our ecosystem is not necessarily more or less efficient. But it has other advantages, such as providing datasets that contain the latest literature, whereas the existing datasets are static and quickly out of date.

      We do not highlight the strength of AI functions

      We provide an example of using our tools to gather data and manually annotate a validation set for use with large language models (in our case, GPT). We are further exploring this domain in other projects; for example, for performing semi-automated meta-analyses using the NeuroSynth Compose platform. However, we did not deem it necessary to include more AI examples in the current paper; we only wanted to provide enough examples to demonstrate the scope of possible use cases of our ecosystem.

      We thank the reviewers for their time and valuable feedback, which we will keep in mind in our future research.

    1. Author response:

      Thank you for handling our paper and our thanks to the reviewers for their engagement, comments and valuable suggestions. We will take the opportunity to provide a full response and submit a revised version in the coming weeks.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      The authors provide solid data on a functional investigation of potential nucleoid-associated proteins and the modulation of chromosomal conformation in a model cyanobacterium. While the experiments presented are convincing, the manuscript could benefit from restructuring towards the precise findings; alternatively, additional data buttressing the claims made would significantly enhance the study. These valuable findings will be of interest to the chromosome and microbiology fields.

      We thank the editors for taking the time to assess our work and the reviewers for their critical suggestions. Both reviewers were concerned about our interpretation of the 3C data, and Reviewer #2 suggested biochemical analysis of cyAbrB2 to reinforce our claim. We agree with the concern and suggest that the editors add the sentence “How cyAbrB2 affects chromosome structure is still elusive from this study, and the biochemical assays are needed in the future experiment.” to the eLife assessment.

      The major revision points are the following:

      Reconstruction of Figures

      Previous Figure 5E has been omitted

      Additional 3C data on the nifJ region

      Rephrasing the conclusion of 3C data

      Additional discussion on cyAbrB2 and NAPs

      Reviewer #1 (Public Review): 

      Strength: 

      At first glance, I had a very positive impression of the overall manuscript. The experiments were well done, the data presentation looks very structured, and the text reads well in principle.

      Weakness: 

      Having a closer look, the red line of the manuscript is somewhat blurry. Reading the abstract, the introduction, and parts of the discussion, it is not really clear what the authors exactly aim to target. Is it the regulation of fermentation in cyanobacteria because it is under-investigated? Is it to bring light to the transcriptional regulation of hydrogenase genes? The regulation by SigE? Or is it to get insight into the real function of cyAbrB2 in cyanobacteria? All of this would be good of course. But it appears that the authors try to integrate all these aspects, which in the end is a little bit counterintuitive and in some places even confusing. From my point of view, the major story is a functional investigation of the presumable transcriptional regulator cyAbrB2, which turned out to be a potential NAP. To demonstrate/prove this, the hox genes have been chosen as an example due to the fact that a regulatory role of cyAbrB2 has already been described. In my eyes, it would be good to restructure or streamline the introduction according to this major outcome. 

      As you pointed out, the major focus of this study is cyAbrB2 as a potential NAP. To focus on NAPs, we simplified the first paragraph of the discussion (ll.246-263) and added a section comparing cyAbrB2 with other known NAPs (ll.269-299). To emphasize the description of cyAbrB2, we also rearranged the figures and divided the analysis of cyAbrB2 ChIP into two figures. We shortened the first paragraph of the introduction but mostly preserved the composition of the introduction to keep the general-to-specific pattern, even though the manuscript's focus was described as blurry.

      Points to consider: 

      The authors suggest that the microoxic condition is the reason for the downregulation of e.g. photosynthesis (l.112-114). But of course, they also switched off the light to achieve a microoxic environment, which presumably is the trigger signal for photosynthesis-related genes. I suggest avoiding making causal conclusions exclusively related to oxygen and recommend rephrasing (for example, "were downregulated under the conditions applied").

      We agree with this point. We rephrased l.114 to “by the transition to dark microoxic conditions from light aerobic conditions” (ll.108-109).

      The authors hypothesized that cyAbrB2 modulates chromosomal conformation and conducted a 3C analysis. But if I read the data in Figure 5B & C correctly, there is a lot of interaction in a range of 1650 and 1700 kb, not only at marked positions c and j. Positions c and j have been picked because it appears that cyAbrB2 deletion impacts this particular interaction. But is it really significant? In the case of position j the variation between the replicates seems quite high, in the case of position c the mean difference is not that high. Moreover, does all this correlate with cyAbrB2 binding, i.e. with positions of gray bars in panel A? If this was the case, the data obtained for the cyabrB2 mutant should look totally different but they are quite similar to WT. That's why the sentence "By contrast, the interaction frequency in Δcyabrb2 mutant were low and unchanged in the aerobic and microoxic conditions" does not fit to the data shown. But I have to mention that I am not an expert in these kinds of assays. Nevertheless, if there is a biological function that shall be revealed by an experiment, the data must be crystal clear on that. At least the descriptions of the 3C data and the corresponding conclusions need to be improved. For me, it is hard to follow the authors' thoughts in this context. 

      Following your suggestion, we carefully re-examined the 3C data. Furthermore, we conducted an additional 3C experiment on the nifJ region (Figures 7F-J). We admit that we had overinterpreted the 3C data. Therefore, we rewrote the results and discussion of the 3C assay in line with the data (ll.220-245) and removed the previous Figure 5E. Our individual responses follow.

      Positions c and j have been picked because it appears that cyAbrB2 deletion impacts this particular interaction. But is it really significant?

      We could not find statistically significant differences at loci c and j. Therefore, we added the following to the Results section: “Note that the interaction scores exhibit considerable variability and we could not detect statistical significance at those loci.” (ll.231-232)

      does all this correlate with cyAbrB2 binding, i.e. with positions of gray bars in panel A?

      As you suspected, interaction frequency and cyAbrB2 binding do not correlate. Therefore, we withdrew the previous claim and stated the following: “Moreover, our 3C data did not support bridging at least in hox region and nifJ region, as the high interaction locus and cyAbrB2 binding region did not seem to correlate (Figure 7).” (ll.280-282)

      If this was the case, the data obtained for the cyabrB2 mutant should look totally different but they are quite similar to WT.

      We rewrote it as follows: “Then we compared the chromatin conformation of wildtype and cyabrb2∆. Although overall shapes of graphs did not differ, some differences were observed in wildtype and cyabrb2∆ (Figures 7B and 7G); interaction of locus (c) with hox region were slightly lower in cyabrb2∆ and interaction of loci (f’) and (g’) with nifJ region were different in wildtype and cyabrb2∆. Note that the interaction scores exhibit considerable variability and we could not detect statistical significance at those loci.” (ll.228-232)

      That's why the sentence "By contrast, the interaction frequency in Δcyabrb2 mutant were low and unchanged in the aerobic and microoxic conditions" does not fit to the data shown.

      We rewrote the sentence as follows: “While the interaction scores exhibit considerable variability, the individual data over time demonstrate declining trends of the wildtype at locus (c) and (j) (Figure S8). In ∆cyabrb2, by contrast, the interaction frequency of loci (c) and (j) was unchanged in the aerobic and microoxic conditions (Figure 7E). The interaction frequency of locus (c) in ∆cyabrb2 was as low as that in the microoxic condition of wildtype, while that of locus (j) in ∆cyabrb2 was as high as that in the aerobic condition of wildtype (Figures 7B and 7C).” (ll.238-243)

      The figures are nicely prepared, albeit quite complex and in some cases not really supportive of the understanding of the results description. Moreover, they show a rather loose organization that sometimes does not fit the red line of the results section. For example, Figure 1D is not mentioned in the paragraph that refers to several other panels of the same figure (see lines 110-128). Panel 1D is mentioned later in the discussion. Does 1D really fit into Figure 1 then? Are all the panels indeed required to be shown in the main document? As some elements are only briefly mentioned, the authors might also consider moving some into the supplement (e.g. left part of Figure 1C, Figure 2A, Figure 3B ...) or at least try to distribute some panels into more figures. This would reduce complexity and increase comprehensibility for future readers. Also, Figure 3 is way too complex. Panel G could be a stand-alone figure. The latter would also allow for an increase in font sizes or to show ChIP data of both conditions (L+O2 and D-O2) separately. Moreover, a figure legend typically introduces the content as a whole by one phrase but here only the different panels are described, which fits the impression that all the different panels are not well connected. Of course, it is the decision of the authors what to present and how, but they may wish to consider restructuring and simplifying.

      Following this advice, we have rearranged the figure composition.

      The left side of Figure 1C has been moved to the supplement. Instead, representative expression fold changes of “Transient”, “Plateau”, “Continuous”, and “Late” genes are shown for comprehensibility. We left Figure 1D in Figure 1, as this diagram shows our motivation for focusing on hox and nifJ. We moved Figure 2A to the supplement. We did not move Figure 3B, as this figure shows the distribution of cyAbrB2 (“long tract of AT-rich DNA”) comprehensively and simply. We agree that Figure 3 was too complex. Therefore, we moved Figures 3F and 3G to a new independent figure (Figure 4). In Figure 4C (former 3G), we show the ChIP data of the L+O2 condition only, and the change of ChIP data under the D-O2 condition is shown in Figure 5. The schematic image showing the cyanobacterial chromosome and NAPs (previous Figure 5E) was omitted because it overinterpreted the data.

      The authors assume a physiological significance of transient upregulation of e.g. hox genes under microoxic conditions. But does the hydrogenase indeed produce hydrogen under the conditions investigated and is this even required? Moreover, the authors use the term "fermentative gene". But is hydrogen indeed a fermentation product, i.e. are protons the terminal electron acceptor to achieve catabolic electron balance? Then huge amounts of hydrogen should be released. Comment should be made on this.

      This is a very important point. Yes, hydrogenase indeed produces hydrogen under the conditions we investigated, and protons accept the majority of the reducing power under the dark microoxic condition. We wrote the following in the introduction: “Hydrogen is generated in quantities comparable to lactate and dicarboxylic acids as the result of electron acceptance in the dark microoxic condition (Akiyama and Osanai 2023; Iijima et al. 2016)” (ll.54-55). A detailed explanation is given below, although it is omitted from the manuscript.

      A recent study (Akiyama and Osanai 2023) quantified the consumed glycogen and secreted fermentative products (hydrogen, lactate, dicarboxylic acid, and acetate) in Synechocystis under the dark microoxic condition, the same condition that we investigated. The system in that study consists of a 10 mL liquid layer and a 10 mL gas layer, cultivated for 3 days under dark microoxic conditions. The amounts of lactic acid, dicarboxylic acid, and hydrogen were approximately 2 µmol, 3.5 µmol, and 11 µmol (assuming the gas layer was at 1 atm and ignoring the aqueous fraction), respectively. Meanwhile, glycogen equivalent to 15 µmol of glucose was consumed in the system. This estimate supports the view that hydrogen accounts for a substantial portion of fermentative products under dark microoxic conditions.

      The necessity of hydrogen production under dark microoxic conditions was demonstrated by Gutekunst et al. (2014), who showed that hydrogenase activity is required for mixotrophic growth under light-dark and microoxic cycles with arginine. The necessity remains unclear in our conditions because we only tested continuous dark microoxic conditions without glucose.
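      The arithmetic behind this estimate can be sketched as follows. This is only an illustration using the approximate amounts quoted above; acetate is not quantified in the response and is left out, and the incubation temperature (30 °C) used for the headspace check is an assumption, not a value taken from the cited study.

```python
# Illustrative arithmetic only; amounts (µmol) are the approximate values quoted above.
# Acetate is not quantified in the response, so it is excluded from the total.
products_umol = {"lactate": 2.0, "dicarboxylic_acids": 3.5, "hydrogen": 11.0}
glucose_consumed_umol = 15.0  # glycogen consumed, in glucose equivalents

total_quantified = sum(products_umol.values())
hydrogen_share = products_umol["hydrogen"] / total_quantified
hydrogen_per_glucose = products_umol["hydrogen"] / glucose_consumed_umol

# Headspace check with the ideal gas law, n = PV / RT, for the 10 mL gas layer at 1 atm.
# ASSUMPTION: 30 °C incubation temperature (not stated in the response).
R = 8.314                 # J mol^-1 K^-1
pressure_pa = 101_325.0   # 1 atm
volume_m3 = 10e-6         # 10 mL
temperature_k = 303.15    # assumed 30 °C
headspace_umol = pressure_pa * volume_m3 / (R * temperature_k) * 1e6
hydrogen_headspace_fraction = products_umol["hydrogen"] / headspace_umol

print(f"Hydrogen share of quantified products: {hydrogen_share:.1%}")
print(f"Hydrogen per glucose equivalent: {hydrogen_per_glucose:.2f} mol/mol")
print(f"Hydrogen fraction of headspace gas: {hydrogen_headspace_fraction:.1%}")
```

      Under these assumptions, hydrogen is the largest of the three quantified products on a molar basis, which is the sense in which it accounts for a substantial portion of the fermentative output.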

      The authors also mention a reverse TCA cycle. But is its existence an assumption or indeed active in cyanobacteria, i.e. is it experimentally proven? The authors are a little bit vague in this regard (see lines 241-246).

      We misused the terminology. We meant the “reductive branch of TCA”. Cyanobacteria run a branched TCA cycle under microoxic conditions. One of the branches is the reductive branch, which reduces oxaloacetate to produce malate. We corrected “reverse TCA cycle” to “reductive branch of TCA”. (Figure 1D and ll.260-262)

      Reviewer #2 (Public Review): 

      This work probes the control of the hox operon in the cyanobacterium Synechocystis, where this operon directs the synthesis of a bidirectional hydrogenase that functions to produce hydrogen. In assessing the control of the hox system, the authors focused on the relative contributions of cyAbrB2, alongside SigE (and to a lesser extent, SigA and cyAbrB1) under both aerobic and microoxic conditions. In mapping the binding sites of these different proteins, they discovered that cyAbrB2 bound many sites throughout the chromosome repressed many of its target genes, and preferentially bound regions that were (relatively) rich in AT-residues. These characteristics led the authors to consider that cyAbrB2 may function as a nucleoid-associated protein (NAP) in Synechocystis, given its functional similarities with other NAPs like H-NS. They assessed the local chromosome conformation in both wild-type and cyabrB2 mutant strains at multiple sites within a 40 kb window on either side of the hox locus, using a region within the hox operon as bait. They concluded that cyAbrB2 functions as a nucleoid-associated protein that influences the activity of SigE through its modulation of chromosome architecture.

      The authors approached their experiments carefully, and the data were generally very clearly presented and described.

      Based on the data presented, the authors make a strong case for cyAbrB2 as a nucleoid-associated protein, given the multiple ways in which it seems to function similarly to the well-studied Escherichia coli H-NS protein. It would be helpful to provide some additional commentary within the discussion around the similarities and differences of cyAbrB2 to other nucleoid-associated proteins, and possible mechanisms of cyAbrB2 control (post-translational modification; protein-protein interactions; etc.). The manuscript would also be strengthened with the inclusion of biochemical experiments probing the binding of cyAbrB2, particularly focusing on its oligomerization and DNA polymerization/bridging potential.

      We agree with the comment that biochemical experiments will deepen our insights into cyAbrB2 and chromatin conformation. As the reviewer pointed out, biochemical assays will provide valuable information on the mechanisms of cyAbrB2 control, such as post-translational modification, cooperation with cyAbrB1, oligomerization, and the structure of cyAbrB2-bound DNA. However, we think those potential findings are worthy of a new, independent research paper rather than a part of this paper. Therefore, we added a discussion mentioning biochemistry as future work (ll.275-290; the section “The biochemistry of cyAbrB2 will shed light on the regulation of chromatin conformation in the future”).

      Previous work had revealed a role for SigE in the control of hox cluster expression, which nicely justified its inclusion (and focus) in this study. However, the results of the SigA studies here suggested that SigA both strongly associated with the hox promoter, and its binding sites were shared more frequently than SigE with cyAbrB2. The focus on cyAbrB2 is also well-justified, given previous reports of its control of hox expression; however, it shares binding sites with an essential homologue cyAbrB1. Interestingly, while the B1 protein appears to bind similar sites, instead of repressing hox expression, it is known as an activator of this operon. It seems important to consider how cyAbrB1 activity might influence the results described here.

      We infer that the minor side of the bimodal SigE peak is the genuine population that contributes to hox transcription, as hox genes are expressed in a SigE-dependent manner (Figure S2). We consider that the strong SigA peak upstream of the hox operon reflects binding to the promoter of TU1715, which is oriented in the opposite direction to the hox operon. We added a description of the single SigA peak and the bimodal SigE peak near the TSS of the hox operon as follows:

      “A bimodal peak of SigE was observed at the TSS of the hox operon in a microoxic-specific manner (Figure 6C bottom panel). The downstream side of the bimodal SigE peak coincides with SigA peak and the TSS of TU1715. Another side of the bimodal peak lacked SigA binding and was located at the TSS of the hox operon (marked with an arrow in Figure 6C), although the peak caller failed to recognize it as a peak.” (ll.206-209)

      The point that cyAbrB1 binds similar sites to cyAbrB2, despite regulating hox expression in the opposite direction, is very interesting. Therefore, we referred to the transcriptome data of the cyAbrB1 knockdown strain and compared the impact of cyAbrB1 knockdown and cyAbrB2 deletion. We described this in the Results and Discussion as follows:

      “we referred to the recent study performing transcriptome of cyAbrB1 knockdown strain, whose cyAbrB1 protein amount drops by half (Hishida et al. 2024). Among 24 genes induced by cyAbrB1 knockdown, 12 genes are differentially downregulated genes in cyabrb2∆ in our study (Figure S5D).” (ll.162-165)

      “CyAbrB1, the homolog of cyAbrB2, may cooperatively work, as cyAbrB1 directly interacts with cyAbrB2 (Yamauchi et al. 2011), their distribution is similar, and they partially share their target genes for suppression (Figures 3A S5C and S5D). The possibility of cooperation would be examined by the electrophoretic mobility shift assay of cyAbrB1 and cyAbrB2 as a complex. Despite their similar repressive function, cyAbrB1 and cyAbrB2 regulate hox expression in the opposite directions, and their mechanism remains elusive.” (ll.292-296)

      The hox operon differs from this general tendency. To see whether cyAbrB1 behaves differently from cyAbrB2 at the hox operon, we performed an additional ChIP-qPCR experiment on cyAbrB1 in the aerobic condition and the dark microoxic condition (Figure 5C). However, we could not find a difference.

      Reviewer #1 (Recommendations For The Authors): 

      Figure 1B: I recommend changing the header in the grey bar to terms like "upregulated" and "downregulated", which are also used in the legend description. Upregulation of genes can also be a result of de-repression, which is why the term "activated" is somewhat misleading.

      Corrected.

      Lines 114-116: It is unclear what the authors exactly mean here. Please clarify. 

      We rephrased the sentence as follows: “The enrichment in the butanoate metabolism pathway indicates the upregulation of genes involved in carbohydrate metabolism. We further classified genes according to their expression dynamics.” (ll.110-111)

      Reviewer #3 (Recommendations For The Authors): 

      Major/experimental comments: 

      (1) For the chromosome conformation capture experiments, it is indicated that these were conducted at aerobic (1hr) and microoxic (4 hr) conditions. But the data presented in Figure 1 suggest that 1 hr corresponds to the beginning of microoxic growth, and that time 0 is aerobic. The composite 3C data in Figure 5 show some interesting but specific differences. It is appreciated that the authors presented the profiles for individual samples in Figure S7, and the differences here do not seem to be as compelling. Are the major differences being highlighted significantly (statistically) different (e.g. at the (c) and (j) loci)? Might the differences be starker if an earlier aerobic condition (e.g. time 0) had been used instead of the 1 hr - microoxic - timepoint?

      The previous Figure 5 consisted of three time points (solid line: aerobic condition; dashed line: 1 hr of microoxic condition; dotted line: 4 hr of microoxic condition). We omitted the 4 hr data from the main figure (Figure 7), as the 4 hr microoxic time point complicates the data. All three time points are shown in the profiles of individual loci (Figure S8).

      No statistically significant difference was found at loci (c) and (j) by t-test. Therefore, we have toned down the interpretation of the 3C data as follows: “Our 3C result demonstrated that cyAbrB2 influences the chromosomal conformation of hox and nifJ region to some extent (Figure 7).” (ll.325-326)
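      For readers unfamiliar with this kind of comparison, a two-sample t statistic for replicate interaction scores at a single locus can be sketched as below. This is a minimal illustration, not the authors' analysis: the response does not say which t-test variant was used, so Welch's unequal-variance form is an assumption, and the replicate values and the function name `welch_t` are hypothetical placeholders.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Per-group squared standard errors, using the sample variance (n - 1 denominator).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, dof

# HYPOTHETICAL replicate interaction scores at one locus (not data from the study).
wildtype_scores = [1.8, 1.1, 2.6]
mutant_scores = [1.2, 0.9, 1.9]

t_stat, dof = welch_t(wildtype_scores, mutant_scores)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof:.2f}")
```

      Converting the statistic to a p-value requires the t distribution; in practice a library routine such as `scipy.stats.ttest_ind(..., equal_var=False)` does both steps. With only a few replicates and high variability, as described for loci (c) and (j), even a visible difference in means may not reach significance.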

      (2) This is a complicated system that involves multiple regulatory proteins, each of which is differentially affected by the growth conditions (aerobic/microoxic). It is obviously beyond the scope of this work to probe deeply into all of these proteins. The focus here was on cyAbrB2, and to a slightly lesser extent SigE; however, based on the data presented, it seems that SigA and cyAbrB1 may be equally important contributors to hox control/expression, and in the case of cyAbrB1, possibly also to chromosome conformation. cyAbrB1 appears to have the same binding sites as cyAbrB2, and has been reported to interact with cyAbrB2. Given this association, it is possible that the two proteins may affect the binding of each other, and that loss of one might lead to enhanced binding by the other (or binding may require heterooligomerization?). Probing the regulatory interplay between these two proteins (or at least discussing it) feels important. Conducting e.g. mobility shift assays with each protein, both individually and together, could possibly allow for some understanding of how they function together. 

      We agree that the biochemistry of cyAbrB2 and cyAbrB1 may explain why cyAbrB1 and cyAbrB2 bind long tracts of AT-rich genome regions in vitro. We would like to leave the biochemistry as future work, as we think biochemical data are beyond the scope of the present study.

      The idea that cyAbrB1 and cyAbrB2 cooperate to form heterooligomers and bind broadly to the genome is a very rational and interesting prediction. We added this idea to the discussion: “Overall, the biochemistry integrating assay conditions (PTM, buffer condition, and cooperation with cyAbrB1) and output (DNA binding, oligomerization, and DNA structure) will deepen the understanding of cyAbrB2 as cyanobacterial NAPs.” (ll.287-290). We also compared our transcriptome of ∆cyabrb2 with the recent study of cyabrb1 knockdown (ll.162-165), and concluded “they partially share their target genes for suppression (Figures 3A S5C and S5D)” (l.293).

      (3) Throughout the manuscript, there is reference made to cyAbrB2 binding becoming 'blurry' or non-specific under microoxic conditions. It is not clear what this means. It appears that when cyAbrB2 binds, any given protected region can be quite extensive, which can be suggestive of polymerization along the chromosome. Are the boundaries for binding sites typically clearly delineated, and this changes when the cultures are growing under microoxic conditions? There is also no mention made anywhere about oligomerization potential for cyAbrB2, which would be important for the polymerization, and bridging suggested for cyAbrB2 in the model presented in Figure 5. Previous publications (Song et al., 2022; Ishi et al., 2008) have suggested that it can exist as a dimer in vivo, but that in vitro it is largely monomeric. The manuscript would benefit from some additional biochemical analyses of cyAbrB2 binding activity, with a particular focus on DNA binding and oligomerization/bridging potential, and some additional discussion about these characteristics as well. 

      Throughout the manuscript, there is reference made to cyAbrB2 binding becoming 'blurry' or non-specific under microoxic conditions. It is not clear what this means.

      To describe clearly what “cyAbrB2 binding becomes blurry” means, we rearranged the figure composition and made a dedicated figure (Figure 5). We also rephrased the description by adopting the reviewer’s phrase “boundaries for binding sites”, as it describes the change well: “When cells entered microoxic conditions, the boundaries of the cyAbrB2 binding region and cyAbrB2-free region became obscure (Figure 5),” (ll.319-320)

      There is also no mention made anywhere about oligomerization potential for cyAbrB2,

      We added the discussion about oligomerization “DNA-bound cyAbrB2 is expected to oligomerize, based on the long tract of cyAbrB2 binding region in our ChIP-seq data. However, no biochemical data mentioned the DNA deforming function or oligomerization of cyAbrB2 in the previous studies and preference for AT-rich DNA is not fully demonstrated in vitro (Dutheil et al. 2012; Ishii and Hihara 2008; Song et al. 2022)”(ll. 277-280) and “Overall, the biochemistry integrating assay conditions (PTM, buffer condition, and cooperation with cyAbrB1) and output (DNA binding, oligomerization, and DNA structure) will deepen the understanding of cyAbrB2 as cyanobacterial NAPs.” (ll.287-290)

      The manuscript would benefit from some additional biochemical analyses of cyAbrB2 binding activity, with a particular focus on DNA binding and oligomerization/bridging potential, and some additional discussion about these characteristics as well. 

      We added the discussion integrally considering known features of cyAbrB2, novel findings on cyAbrB2, and the comparison with known NAPs (ll.269-290).

      (4) Given that the major take-away for the authors (based on the title) seems to be the nucleoid-associated protein potential for cyAbrB2, the Discussion would benefit from some additional focus in this area. How similar is cyAbrB2 to other nucleoid-associated proteins (e.g. H-NS, Lsr2)? How does counter-silencing work for other nucleoid-associated proteins? Can the authors definitively exclude the possibility of binding site competition/occlusion, given that cyAbrB2 covers the promoter region of hox? What other nucleoid-associated proteins have been characterized in the cyanobacteria? 

      We agree with the point, so we additionally discussed cyAbrB2 comparing with H-NS and Lsr2, the canonical NAPs (ll. 269-290).

      We did not deny the possibility of the exclusion of RNAP by cyAbrB2, but the previous version of the manuscript did not discuss it sufficiently. To emphasize that cyAbrB2 excludes RNA polymerase, we simplified Figure 6 and employed mosaic plots showing anti-co-occurrence of cyAbrB2 binding regions and SigE peaks. Furthermore, we added discussion about SigE exclusion by cyAbrB2 (ll. 355-359).

      We mention the possibility of other nucleoid-associated proteins in cyanobacteria in the discussion. “Furthermore, the conformational changes by deletion of cyAbrB2 were limited, suggesting there are potential NAPs in cyanobacteria yet to be characterized.” (ll.336-339)

      (5) Previous work (Song et al., 2022) showed that changing the AT content of cyAbrB2 binding sites did not affect its ability to bind DNA. There are also previous papers suggesting that cyAbrB2 may be subject to diverse post-translational modifications (e.g. phosphorylation - Spat et al., 2023; glutathionylation - Sakr et al., 2013), as well as association with cyAbrB1. These collectively suggest there may be other factors that contribute to cyAbrB2 binding specificity/activity. These seem like relevant points to discuss, particularly given the transient nature of the cyAbrB2 effects on some genes.

      We have included the discussion about AT content, post-translational modifications and transient regulations, and association with cyAbrB1 (ll. 284-295)

      (6) Given the major binding site for SigA upstream of the hox operon, it seems that it likely also contributes to hox cluster expression, together with SigE. Is there a sense for the relative contribution of each sigma factor to hox cluster expression? And whether both are subject to the same inhibitory effect of cyAbrB2? 

      As described above in our response to the public review, the SigA binding site upstream of the hox operon should be assigned to the TSS of TU1715 (Figure 6C). Transcription of the hox operon is highly dependent on SigE, as shown in Figure S2, and residual transcription in the sigE∆ strain is derived from other sigma factors (SigABCD). Estimating the relative contribution of sigma factors other than SigE is difficult at present because SigABCDE can partially compensate for each other.

      As the different impact of NAPs on the primary and alternative sigma factor is observed in H-NS (Shin et al. 2005), whether both the primary sigma factor (SigA) and the alternative sigma factor (SigE) are inhibited by cyAbrB2 to the same extent is a very interesting question.

      We calculated the odds ratios of SigE and SigA being in the cyAbrB2-free region and reported them in the Results: “SigE preferred the cyAbrB2-free region in the aerobic condition more than SigA did (Odds ratios of SigE and SigA being in the cyAbrB2-free region were 4.88 and 2.74, respectively).” (ll. 193-195). We also discussed this point: “The higher exclusion pressure of cyAbrB2 on SigE may contribute to sharpening the transcriptional response of hox and nifJ on entry to microoxic conditions.” (ll. 357-359)
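
      For readers unfamiliar with the statistic, the following is a minimal sketch of how an odds ratio is obtained from a 2x2 contingency table. The layout of the table (genomic bins with or without a sigma factor peak, split by cyAbrB2-free versus cyAbrB2-bound regions) and all counts are assumptions for illustration; they are not the counts behind the reported values of 4.88 and 2.74, and the function name is hypothetical.

```python
def odds_ratio(free_with_peak, free_without_peak, bound_with_peak, bound_without_peak):
    """Odds ratio of a peak lying in the cyAbrB2-free region.

    The four arguments are the cells of a 2x2 contingency table.
    The table layout is an assumption made for this sketch.
    """
    if min(free_without_peak, bound_with_peak) <= 0:
        raise ValueError("cells in the denominator must be positive")
    # Odds of a peak in the free region divided by odds of a peak in the bound region.
    odds_free = free_with_peak / free_without_peak
    odds_bound = bound_with_peak / bound_without_peak
    return odds_free / odds_bound


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = odds_ratio(free_with_peak=40, free_without_peak=60,
                     bound_with_peak=10, bound_without_peak=90)
print(example)  # (40/60) / (10/90) = 6.0
```

      An odds ratio above 1 indicates enrichment of peaks in the cyAbrB2-free region; comparing the ratios of two sigma factors, as done in the response above, then indicates which factor is excluded more strongly.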

      (7) The 3C experiments suggest there are indeed changes in chromosome architecture in the hox region as growth conditions change and when different regulators are present. Across the chromosome, analogous changes are expected; however, it may be premature to draw this conclusion based on changes at one locus. Is there a reason that the authors did not take full advantage of their 3C samples and sequence them, to capture the full chromosome interactome at the two time-points? This would allow broader conclusions to be drawn regarding changes in chromosome structure and the impact of cyAbrB2.

      In response to the suggestion, we performed an additional 3C assay on the nifJ region using the remaining 3C samples. Expanding to genome-wide sequencing (Hi-C) requires enrichment of ligated fragments by biotinylation, a step that was omitted from our 3C samples.

      We rewrote the result as obtained from the 3C data of hox and nifJ (ll.220-245) and omitted the schematic image of an entire chromosome of cyanobacteria (previous Figure 5E).

      Editorial comments: 

      (1) The data presentation in Figure 1 is very effective. 

      (2) Line 87: please rephrase - you can have 'high similarity' or 'high levels of identity', but not high levels of homology - genes/proteins are either homologous or not.

      (3) Line 118: classified into four 'groups'? 

      (4) Line 590: remove 'the'. 

      (5) Figure 2S, panel B: please define acronyms in the legend (GT, IP) and write out 'FLAG' in full for AbrB1.

      (2) to (5) have been corrected.

      (6) Please provide information on or a reference for the tagging of SigA for use in the ChIP-seq experiments within the Materials and Methods.

      Added (l.365)

      (7) Line 648: space between 'binding' and 'regions'. 

      corrected.

      (8) Fig 4E: please make the solid lines thicker - they are currently difficult to see.

      We have made Figure 6C (former 4E) larger and the line thicker.

      (9) Line 666: location. 

      (10) Line 673: Individual. 

      (11) Figure S5, panel C graph title: should this be 'Relative'? 

      (12) Figure S7: What is 'GT'? Should this be 'WT'? 

      (9) to (12) have been corrected.

      (13) In addition to the data presented in Figure 3G, it would be nice to have a small table or Venn diagram summarizing the number of cyAbrB2 binding sites that fall into the different categories (full gene/operon; downstream of a gene; within a gene; promoter region). 

      In response to the comment, we noticed the categories we had applied (full gene/operon; downstream of a gene; within a gene; promoter region) were arbitrary. Therefore, we categorized transcriptional units (TUs) according to the extent of occupancy by cyAbrB2. (Figures 4B and 4C)

      (14) Line 280-281: suggest replacing 'mediates' with 'influences'. 'Mediates' sounds like a direct interaction (for which the evidence is not currently strong without some additional biochemical data), but 'influences' could better accommodate both direct and indirect possibilities. 

      (15) Line 410: it is not clear what this means. 

      We have omitted “As a result, DNA ~600-fold condensed DNA than 3C samples were ligated.”, as it does not give any information about the experimental procedure.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The authors provided a detailed analysis of the real-time structural changes in actin filaments resulting from cofilin binding, using High-Speed Atomic Force Microscopy (HSAFM). The cofilin family controls the lifespan of actin filaments in the cells by severing the filament and promoting depolymerization. Understanding the effects of cofilin on actin filament structure is critical. It is widely acknowledged that cofilin binding significantly shortens the pitch of the actin helix. The authors previously reported (1) that this shortening extends to the unbound region of the actin filament on the pointed end side of the cluster. In this study, the authors presented substantially improved AFM images and provided detailed accounts of the dynamics observed. It was found that a minimal cofilin-binding cluster, consisting of 2-4 molecules, could induce changes in the helical parameters over one or more actin crossover repeats. Adjacent to the cofilin-binding clusters, the actin crossovers were observed to shorten within seconds, and this shortening was limited to one side of the cluster. Additionally, the phosphate binding to the actin filament was observed to stabilize the helical twist, suggesting a mechanism in which cofilin preferentially binds to ADP-bound actin filaments. These findings significantly advance our understanding of actin filament dynamics, which is essential for a wide range of cellular processes. However, I propose that the sections about MAD and certain parts of the discussions need substantial revisions.

      In this study, we leverage high spatiotemporal resolutions of high-speed atomic force microscopy (HS-AFM) to analyze real-time structural changes in actin filaments induced by cofilin binding. Furthermore, we experimentally demonstrate the inherent variability in twist conformations of bare actin filaments. Our study integrates HS-AFM with Principal Component Analysis (PCA) to elucidate the actin structure-dependent preferential cooperative binding of cofilin. We provide experimental evidence to substantiate a "proof of principle" regarding the flexible helical twists of actin filaments that regulate the functions of actin-binding proteins. This important study enhances our understanding of actin filaments’ dynamics and polymorphic structures which play crucial roles in a broad spectrum of cellular activities.

      We appreciate the comments from Reviewer 1. Below, we address their concerns point by point.

      MAD analysis

      The authors have presented findings that the mean axial distance (MAD) within actin filaments exhibits a significant dependency on the helical twist, a conclusion not previously derived despite extensive analyses through electron microscopy (EM) and molecular dynamics (MD) simulations. Notably, the MAD values span from 4.5 nm (8.5 pairs per half helical pitch, HHP) to 6.5 nm (4.5 pairs/HHP) as depicted in Figure 3C. The inner domain (ID) of actin remains very similar across C, G, and F forms (2, 3), maintaining similar ID-ID interactions in both cofilactin and bare actin filaments, keeping the identical axial distance between subunits in the both states. This suggests that the ID is unlikely to undergo significant structural changes, even with fluctuations in the filament's twist, keeping the ID-ID interactions and the axial distances. The broad range of MAD values reported poses a challenge for explanation. A careful reassessment of the MAD analysis is recommended to ensure accuracy.

      The central challenge to study “Protein Dynamics” in real time lies in bridging the gap in time scales: HS-AFM captures dynamics of proteins within the milliseconds to seconds range, whereas molecular dynamics (MD) simulations typically operate within the femtoseconds to microseconds domain. Protein dynamics encompass a spectrum of temporal scales, from atomic vibrations to molecular tumbling and collective motions in simulations. HS-AFM stands out as a potent technique for delving into protein dynamics, including processes like protein folding and conformational changes triggered by drugs or protein interactions. Additionally, a significant limitation of MD simulation is the spatial modeling constraint (~50 x 50 nm unit), which restricts the study of large complex biological systems. However, utilizing HS-AFM enables the construction of intricate protein models facilitating the real time imaging of their structures and dynamics during functional activity.

      Regarding the suggestion about ID-ID interactions in both cofilactin and bare actin filaments, maintaining identical axial distances (ADs) between subunits in both states, our HS-AFM cannot provide atomic-level structural insights to address this issue. However, we demonstrate that the variability of OD twists in actin protomers could potentially lead to globally shorter half helical pitches (HHPs) and fewer protomer pairs per HHP (Figure 2, Figure supplement 2) (see lines 218-222). The fluctuation in filament’s twist is further supported by currently available experimental data, including our findings (Figure 3C) in this study (see our Discussion in lines 555-560).

      The minimal change in local ID-ID interactions results in an unchanged global length of actin filaments in both cofilin-bound and unbound cases (Figure supplement 2). However, filament’s twists, as experimentally detected by EM, high-resolution interferometric scattering microscopy (iSCAT), HS-AFM, and in pseudo AFM, are changeable (see lines 555-560).

      We have additionally reassessed the fluctuation and dynamics of MAD in F-ADP-actin and F-ADP.Pi-actin over time at high temporal resolution (Figure supplement 3, Video 3, Table supplement 5). These data are further explained in the Results section (lines 264-270).

      Furthermore, we reassessed the broad range of MAD values in F-ADP-actin segments on both sides of large cofilin clusters over time (Figure supplement 8, Video 5). These findings are explained in the Results section (lines 333-337) and further discussed in the new results (lines 555-560).

      In determining axial distances, the authors extracted measurements from filament line profiles. It is advised to account for potential anomalies such as missing peaks or pseudo peaks, which could arise from noise interference. An example includes the observation of three peaks in HHP6 of Figure Supplement 5C, corresponding to 4.5 pairs. Peak intervals measured from the graph were 5, 11.8, 8.7, and 5.7 nm. The second region (11.8 nm) appears excessively long. If one peak is hidden in the second region, the MAD becomes 5.5 nm.
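
      The check described in this comment can be sketched as follows: compute the mean of the measured peak intervals and flag any interval long enough to suggest a hidden peak. This is only an illustration, not the procedure used in the manuscript. The nominal spacing of 5.5 nm and the flagging factor of 2 are assumptions, and the function names are hypothetical; the four intervals are the ones quoted in the comment.

```python
def mean_axial_distance(intervals_nm):
    """Mean of the peak-to-peak intervals along the filament axis (nm)."""
    if not intervals_nm:
        raise ValueError("at least one interval is required")
    return sum(intervals_nm) / len(intervals_nm)


def flag_long_intervals(intervals_nm, expected_nm=5.5, factor=2.0):
    """Indices of intervals at least `factor` times the expected spacing.

    `expected_nm` and `factor` are assumptions for this sketch; an interval
    this long is a candidate for containing a missing peak.
    """
    threshold = factor * expected_nm
    return [i for i, d in enumerate(intervals_nm) if d >= threshold]


# Peak intervals quoted in the reviewer's comment (nm).
intervals = [5.0, 11.8, 8.7, 5.7]
print(round(mean_axial_distance(intervals), 2))  # 7.8
print(flag_long_intervals(intervals))            # [1], the 11.8 nm interval
```

      Under these assumed thresholds only the second interval is flagged, which matches the region the comment singles out as excessively long.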

      We acknowledge the difficulty in identifying peaks within the regions of bare actin segments adjacent to cofilin clusters or within the cofilactin region. In the revised Figure supplement 6C (originally Figure supplement 5C), we did not assess peak intervals as suggested by Reviewer 1. The measurement of axial distance (AD) and the number of peaks within a HHP to calculate the correct MAD is further detailed in the Methods section (see HS-AFM data analysis and processing, highlighted in purple).

      Additionally, the purpose of presenting Figure supplements 6-7 is to directly compare the half helices and the number of protomer pairs per HHP between bare actin filaments and actin segments near the boundary between cofilactin and bare actin segments on the PE side in the same AFM images. In the original version of this paper, we avoided including the MAD values measured in the cofilactin region (HHP6, HHP7) in Figure Supplement 7E, to mitigate measurement errors.

      Compiling histograms of axial distances (ADs) rather than focusing solely on MAD may provide deeper insights. If the AD is too long or too short, the authors should suspect the presence of missing peaks or pseudo-peaks due to noise. If 4.4 or 5.5 pairs/HHP regions tend to contain missing peaks and 7.5-8.5 pairs/HHP regions tend to contain pseudo peaks, this may explain the MAD dependency on the helical twist.

      The measurement of axial distance (AD) and the number of peaks within a HHP to calculate the correct MAD is further detailed in the Methods section (see Analyses of pseudo AFM images of F-actin and C-actin structures constructed from existing PDB structures (e.g., Figure supplement 2); and HS-AFM data analysis and processing, highlighted in purple).

      We disagree with Reviewer 1’s suggestion that compiling histograms of ADs, rather than focusing solely on MAD, may provide deeper insights. AFM imaging provides only a 2-dimensional (2D) surface structure, unlike the 3-dimensional (3D) structure offered by Cryo-EM. In AFM imaging, we cannot capture the object from different angles as Cryo-EM does. Therefore, AD values measured in 2D AFM images do not accurately represent the axial distance between two adjacent protomers along the same actin filament. Consequently, we relied on MAD values. Our results, including the fluctuation in the number of protomer pairs per HHP, are further supported by other studies (see our Discussion in lines 555-560).

      Additionally, Figure 3E indicates a first decay constant of 0.14 seconds, substantially shorter than the frame rate (0.5 sec/frame). This suggests significant variations in line profiles between frames, attributable either to overly rapid dynamics or a low signal-to-noise ratio. Implementing running frame averages (of 2-3 frames) is recommended to distinguish between these scenarios. If the dynamics are indeed fast, the averaged frame's line profile may degrade, complicating peak identification. Conversely, if poor signal-to-noise ratio is the cause, averaging frames could facilitate peak detection. In the latter case, the authors can find the optimal number of frame averages and obtain better line profiles with fewer missing and pseudo-peaks.
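
      A running frame average of the kind recommended here can be sketched as below. The sketch assumes that each frame has already been reduced to a line profile of equal length (a list of heights along the filament axis); the function name and the toy data are hypothetical and do not come from the manuscript.

```python
def running_frame_average(profiles, window=3):
    """Average consecutive line profiles over a sliding window of frames.

    `profiles` is a list of equal-length lists, one line profile per frame.
    Returns len(profiles) - window + 1 averaged profiles.
    """
    if window < 1 or window > len(profiles):
        raise ValueError("window must be between 1 and the number of frames")
    averaged = []
    for start in range(len(profiles) - window + 1):
        chunk = profiles[start:start + window]
        # zip(*chunk) groups the heights at the same axial position across frames.
        averaged.append([sum(heights) / window for heights in zip(*chunk)])
    return averaged


# Toy data: four frames, two axial positions each.
frames = [[0.0, 3.0], [3.0, 3.0], [6.0, 3.0], [9.0, 3.0]]
print(running_frame_average(frames, window=3))  # [[3.0, 3.0], [6.0, 3.0]]
```

      If the frame-to-frame variation is mostly noise, averaging should sharpen the peaks; if the underlying dynamics are faster than the window, the averaged profile will blur instead, which is the distinction the comment asks for.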

      We utilized state-of-the-art HS-AFM with high temporal and spatial resolution to capture the dynamic structures of F-ADP-actin and F-ADP.Pi-actin segments at higher frame rates of 0.2 sec/frame and 0.1 sec/frame, respectively (Figure supplement 3). As suggested, we implemented running frame averages (3 frames) in the ACF analyses. Consistently, our results indicate that the first time constant (t1) remains around 0.1-0.4 seconds, independent of the imaging rate (0.1-0.5 sec/frame), for the AD between two adjacent actin protomers in F-actin bound with ADP or ADP.Pi (Table Supplement 5), which is in a similar range to the t1 shown in Figure 3E. These significant experimental results support the notion that helical twists, the number of actin protomers per HHP, and MAD in bare F-actin segments are intrinsically dynamic and fluctuate around the mean values over time (see further in lines 264-270; 333-337; and 555-560). It should be noted that our original ACF analyses did not include the averaging of running frames, thus eliminating the possibility of low signal/noise ratio in our analysis, as shown in Figure 3E-F.
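
      As a generic illustration of the quantity discussed above, a normalized autocorrelation function (ACF) of a time series can be computed as in the sketch below. The estimator shown is a standard textbook form and is an assumption here; the exact ACF procedure and the fitting used to extract t1 are described in the manuscript's Methods, not in this sketch, and the function name and toy series are hypothetical.

```python
def autocorrelation(series, max_lag):
    """Normalized autocorrelation of a 1D time series for lags 0..max_lag."""
    n = len(series)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    variance_sum = sum(d * d for d in deviations)
    if variance_sum == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Sum of lagged products, normalized so that the value at lag 0 equals 1.
    return [
        sum(deviations[i] * deviations[i + lag] for i in range(n - lag)) / variance_sum
        for lag in range(max_lag + 1)
    ]


# Toy axial-distance series (nm) that alternates from frame to frame.
acf = autocorrelation([5.0, 6.0, 5.0, 6.0, 5.0, 6.0], max_lag=2)
print(acf[0])  # 1.0
```

      A time constant is then typically estimated by fitting a decaying function to the ACF values versus lag time (lag index multiplied by the frame interval).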

      Discussions

      The authors suggest a strong link between the C-form of actin and the formation of a short pitch helix. However, Oda et al. (3) have demonstrated that the C-form is highly unstable in the absence of cofilin binding, casting doubt on the possibility of the C-form propagating without cofilin binding. Moreover, in one strand of the cofilactin, interactions between actin subunits are limited to those between the inner domains (ID-ID interactions), which are quite similar to the interactions observed in bare actin filaments. This similarity implies that ID-ID interactions alone are insufficient to determine the helical parameters, suggesting that the presence of cofilin is essential for the formation of the short pitch helix in the cofilactin filament. Thus, crossover repeats are not necessarily shortened even if the actin form is C-form.

      We have experimentally observed a shortened bare half helix adjacent to cofilin clusters on the PE side at high spatial resolution, comprising fewer protomers than normal half helices. Thus, we hypothesized that crossover repeats are shortened if the actin protomers in the bare half helix neighboring the cofilin cluster on the PE side resemble a C-actin structure. This assumption is further explained by referring to the C-actin structure in Figure 2 and Figure supplement 2. Even though the C-form, as suggested in Oda et al., 2019, is unstable, it intrinsically fluctuates around the mean value over time and adopts various conformations. A single PDB structure resolved by Cryo-EM through the averaging of ensembles of structural images should be regarded as a single atomistic structure, one of many possible conformations, regardless of whether it is resolved by Cryo-EM, X-ray diffraction or crystallography, or NMR (see Figure 1, legend of Figure supplement 1).

      We highlight two main points regarding this issue: (1) The short helical pitch at the global scale is associated with the twisting of the OD at the local scale for individual protomers; (2) Actins in different nucleotide- or cofilin-bound states exhibit varying ranges, distributions, and spectra of both local OD twist and global helical pitch (Figure 1-2, Figure supplement 1-2). The first point underscores that the twist/untwist of the OD determines the shortness of the helical pitches, rather than the ID-ID interactions. The latter point is more related to the global length of the filament. The minimal change in local ID-ID interactions results in an unchanged global length of actin filaments in both cofilin-bound and unbound cases (see pseudo AFM images in Figure supplement 2 for canonical actin filament and cofilactin segments with the same length, comprising 62 protomers). However, the filament’s twists, as experimentally detected by EM, high-resolution interferometric scattering microscopy (iSCAT), HS-AFM, and in pseudo AFM, are changeable (see lines 555-560) and independent of the ID-ID interactions.

      Narita (4) proposes that the facilitation of cofilin binding may occur through a shortening in the helix pitch, independent of a change to the C-form of actin. Furthermore, the dissociation of the D-loop from an adjacent actin subunit leads directly to the transition of actin to the G-form, which is considered the most stable configuration for the actin molecule (3).

      See also our explanation above. We have incorporated these points in a Discussion section. See lines 497-499; 510-511.

      Furthermore, our PCA analysis indicates that the transition from C-actin to G-actin necessitates the opening of the nucleotide cleft (resulting in a decrease in PC1) and is more readily achieved than the direct transition from F-actin to G-actin (which requires decreases in both PC1 and PC2). Whether this transition is directly triggered by the dissociation of the D-loop remains a topic for our future investigations. Our PCA analysis reveals that the D-loop is deeply buried within the core of the filament (Figure 2). Further experiments will be conducted to elucidate its roles.

      The mechanism by which the shortened pitch propagates remains a critical and unresolved issue. It appears that this propagation is not a result of the C-form's propagation but likely involves an unidentified mechanism. Identifying and understanding this mechanism represents an essential direction for future research.

      It's worth mentioning that our HS-AFM data and spatial ACF analysis lend support to a hypothesis suggesting that 2-4 bare actin protomers adjacent to cofilin clusters on the PE side adopt C-actin-like structures. Additionally, we have proposed several hypotheses aimed at better understanding the mechanisms driving the unidirectional binding and expansion of cofilin clusters toward the PE side. These hypotheses will require further examination in future experiments. Additional information can be found in lines 328-329; 344-351; and 416-430.

      (1) K. X. Ngo et al., Cofilin-induced unidirectional cooperative conformational changes in actin filaments revealed by high-speed atomic force microscopy. eLife 4, (2015).
      (2) K. Tanaka et al., Structural basis for cofilin binding and actin filament disassembly. Nature communications 9, 1860 (2018).
      (3) T. Oda et al., Structural Polymorphism of Actin. Journal of molecular biology 431, 3217-3228 (2019).
      (4) A. Narita, ADF/cofilin regulation from a structural viewpoint. Journal of muscle research and cell motility 41, 141-151 (2020).

      We have cited them accordingly in the paper.

      Reviewer #2 (Public Review):

      Summary:

      This study by Ngo et al. uses mostly high-speed AFM to estimate conformational changes within actin filaments, as they get decorated by cofilin. The authors build on their earlier study (Ngo et al. eLife 2015) where they used the same technique to monitor the expansion of cofilin clusters on actin filaments, and the propagation of the associated conformational changes in the filament (reduction of the helical pitch). Here, they propose a higher-resolution description of the binding of cofilin to actin filaments.

      Strengths:

      The high speed AFM technique used here is quite original to address this question, compared to classical light and electron microscopy techniques. It can certainly bring valuable information as it provides a high spatial resolution while monitoring live events. Also, in this paper, a nice effort was made to make the 3D structures and conformational changes clear and understandable.

      We are grateful for the positive feedback from Reviewer 2.

      Weaknesses:

      The paper also has a number of limitations, which I detail below.

      In addition to AFM, the authors also propose a Principal Component Analysis (PCA) of existing structural data on actin protomers. However, this part seems very similar to another published work by others (Oda et al. JMB 2019), which is not even cited.

      We addressed this issue and explained it in Methods section, lines 612-621.

      The asymmetrical growth of cofilin clusters has so far only been seen using AFM, by the same authors (Ngo et al. eLife 2015). Using fluorescent microscopy, others have reported a very symmetrical expansion of cofilin clusters (Wioland et al. Curr Biol 2017). This is not mentioned at all, here. It should be discussed, and explanations for this discrepancy could be proposed.

      We have cited this paper (Wioland et al. Curr Biol 2017) in the current manuscript (see lines 361-362). However, we are unable to evaluate the technical distinctions between our methods and theirs. Instead, we have referred to a more recent paper that employed techniques similar to those used by Wioland et al. in Current Biology 2017. Our findings align with those reported by Bibeau JP et al. in the Journal of Molecular Biology 2021 (see their Results on page 7, titled “Cofilin clusters elongate preferentially towards the actin filament pointed end”). At the minimum, we believe this is appropriate.

      Regarding the AFM technique, I have the following concerns.

      The filaments appear densely packed on the surface, and even clearly in register in some images (if not most images, e.g., Figs 3A, 4BC, 5A). Why is that? Isn't there a risk that this could affect the result? This suggests there is some interaction between the filaments.

      In this study, as well as in many similar studies of actin filaments alone or in interaction with other actin binding proteins (ABPs) including cofilin, we have carefully considered the density of filaments when designing experiments. We used highly dense, but not packed, actin filaments to minimize free space between filaments and the surface, which helps maintain stable tip-scanning during AFM imaging. This strategy technically allows us to capture high spatial and temporal resolutions of actin filaments’ structures.

      The actin filaments, which resemble paracrystal structures, appear as densely packed filaments (see our data in Ngo and Kodera et al., eLife 2015, Figure 1C). Thus, the data presented in this paper are technically appropriate and do not risk misinterpretation due to lateral interactions impacting the structures and function of actin filaments and cofilin.

      The properties of the lipid layer and its interaction with the actin filaments are not clear at all. A poor control of these interactions is a problem if one aims to measure conformational changes at high resolution. The strength of the interaction appears tuned by the ratio of lipids put on the surface to change its electrostatic charge. A strong attachment likely does more than suppress torsional motion (as claimed in Fig 8A). It may also hinder cofilin binding in several ways (lower availability of binding sites on the filament facing the surface, electrostatic interactions between cofilin and the surface, etc.)

      We are confident that our lipid membrane bilayer is the optimal choice for immobilizing actin filaments in a controlled manner for HS-AFM experiments, achieved through the variation of positively charged lipids. In this study, we have fine-tuned the surface charge for our specific purposes.

      As an example, to capture high-spatial resolution images of actin structures (Figure 5-6, Figure supplement 5B, 6), we strongly fixed the filaments on DPPC/DPTAP (50/50 wt%) after the binding reaction between actin filaments and cofilin in solution was completed. This experiment yielded valuable information, including: (i) the ability to replicate the conformation of cofilactin and hybrid cofilactin/bare actin segments in solution, akin to the first steps in sample preparation for Cryo-EM techniques; and (ii) the capability to capture these structures, reflecting their solution states, by firmly fixing them on a lipid surface. On the lipid surface, these structures were retained stably during AFM imaging.

      If there is a choice, we advise against using amino-silane and other positively charged polymers typically used for modifying glass surfaces to fix actin filaments in studies using fluorescence microscopy. The strong immobilization by these chemicals can alter the structural dynamics and functions of actin filaments, lead to non-specific binding of cofilin on the modified glass surface, and potentially affect data interpretation.

      On a local scale, the reviewer may argue about the "lower availability of binding sites on the filament facing the surface". However, on a global scale, we maintain that two single strands forming helical twists of long F-actin segments should have an equal chance to bind cofilin even when fixed on a lipid membrane. The evidence shown in Figure 8A and Video 7, which demonstrates that small cofilin clusters associate and dissociate locally without developing into large clusters along the actin filament, supports our conclusion that flexibility and dynamics in helical twists play a crucial role in facilitating the binding and growth of cofilin clusters.

      The lipid surface utilized in our study with actin filaments and cofilin provides an ideal surface, as it is flat and minimizes the nonspecific binding of cofilin to the lipid membrane (see an example of the lipid surface in Video 5).

      How do we know that the variations over time are not mostly experimental noise, i.e. variations between repeats of the same measurement? As shown in Fig 3, correlation is mostly lost from one image to the next, and rather stable after that.

      This question is similar to the above question of Reviewer 1. Please also refer to our response in lines 264-270; 333-337; 555-560, measurement Methods, and Figure supplement 3 and Table supplement 5.

      The identification of cofilactin regions relies on the additional height of the "peaks", due to the presence of cofilin. It thus seems that cofilin is detected every half helical pitch (HHP), but not in between, thereby setting the resolution for the localization of cluster borders to one HHP. It thus seems difficult to claim that there is a change in helicity without cofilin decoration over this distance. In Fig 7, the change in helicity could be due to cofilin decoration that is undetected because cofilins have not yet reached the next peak.

      There are several important criteria to distinguish the "supertwisted half helix" in cofilactin region from the "normal half helix". As illustrated in the pseudo AFM images constructed for normal F-actin and C-actin segments (with and without cofilin decoration) from PDB structures, it is evident that these two structures differ significantly in length and the number of protomer pairs per HHP (see Figure Supplement 2). In both pseudo and experimental AFM images, these parameters can be easily detected by measuring the distance between two cross-over points. Furthermore, the height or thickness difference between the cofilactin and bare actin regions is approximately 10-15 Å, which is well resolved by HS-AFM due to its exceptional z-axis resolution of ~1 Å. Technically, we were able to detect these differences by creating a longitudinal section profile that covered both bare actin and cofilactin areas, as shown in Figure supplement 6.

      We experimentally reveal that a critical cofilin cluster comprising 2-4 molecules (Figures 5-6) or larger cofilin clusters (Figures 7-8, Figure Supplements 6-8) could equally supertwist a bare half helix on the PE side. The observation that a small cofilin cluster (2-4 molecules) can shorten a half helix by reducing the number of protomers per HHP to 9 or 11 (4.5 or 5.5 protomer pairs), which typically requires full decoration by 9-11 cofilin molecules, strongly suggests that supertwisting or the change in helicity does not always require complete cofilin decoration. We predicted that 2-4 bare actin protomers neighboring a cofilin cluster on the PE side can adopt the C-actin-like structure. See further in lines 324-329.

      Figure 7 captures a live binding event of cofilin at low spatial resolution, yet (i) the half helical pitches and (ii) the thickness of the cofilactin and bare actin segments can still be clearly distinguished. This demonstrates that changes in helicity within the cofilactin region propagate to an unbound half helix on the PE side, rearranging the helical twist by reducing the number of actin protomers per HHP, prior to recruiting additional cofilin for binding and expanding clusters.

      Reviewer #1 (Recommendations For The Authors):

      I believe C-form and G-form are better than C-actin like structure or G-actin like structure.

      We avoid using terms like "G-form", "F-form", or "C-form", as defined by Cryo-EM (Oda et al., 2019), because they refer to specific nucleotide and cofilin-bound states in other original papers. Instead, we use “G-actin”, “F-actin”, “C-actin”, “G-actin-like”, and “C-actin-like” to emphasize "Structural Dynamics" and "Structural Polymorphism". This highlights that even F-actin structures without cofilin bound can adopt "C-actin-like" conformations with fewer OD twists, resulting in a shorter global helical pitch. ADP-bound F-actins exhibit greater variability in helical twists than ADP-Pi-bound F-actin (Figure 9), indicating that ADP-bound F-actin protomers can adopt more C-actin-like conformations than ADP-Pi-bound F-actin protomers (Figure 1, Figure supplement 1).

      Technical terms describing actin structures do not need to be the same between Cryo-EM and HS-AFM, as the two techniques are fundamentally different. Our work underscores the importance of considering “structural dynamics and heterogeneity” in different nucleotide states of filamentous actin structures, both with and without cofilin, over time.

      Figure 1A

      A very similar analysis has already been performed by Oda et al (1). The authors should describe the relationships with the previous analysis.

      We addressed this issue in Methods – Principal component analysis – in lines 612-621.

      Figure 1B, C

      A very similar analysis has already been performed by Tanaka et al. (2). The authors should describe the relationship with the previous analysis.

      We addressed this issue in Methods – Principal component analysis – in lines 612-621 and legend of Figure 1.

      Lines 397-398

      "However, we noted that in rare instances, cofilin clusters also grew on both sides in the regular bare half helices when ATP or ADP was present."

      I believe other experiments also contain ATP in the solution. I could not catch the meaning of this sentence.

      We addressed this issue in the Results section, line 412. "However, we noted that in rare instances, cofilin clusters also grew on both sides in the regular bare half helices when only ADP was present."

      Additionally, we enhanced the description in the Methods section to avoid any confusion regarding nucleotides in the buffer. Please refer to the Methods section under “HS-AFM imaging”, lines 702-738.

      Lines 427-429

      "Consequently, the proportion of naturally supertwisted half helices with HHPs shorter than 30 nm was 5.8% for F-ADP-actin but only 1.1% and 0.2% for F-ADP.Pi-actin and phalloidin-stabilized F-actin, respectively."

      A similar discussion was made in (3) for actin filaments under tension. It might be comparable with the current data.

      We cited it accordingly, line 447 for Okura et al., 2023.

      Lines 553-557

      "Nonetheless, it remains plausible that the structural flexibility exhibited by ADP-bound actin protomers could result in subtle variations in the conformations of the DNase binding loop (Dloop) G46-M47-G48-N49, as suggested in (Chou and Pollard, 2019). We suggest that the absence of bound Pi possibly increases the torsional flexibilities during helical twisting of ADP bound actin filaments in contrast to their ADP.Pi-bound counterparts."

      The crystal structure of the F-form (4) showed that Pi in ADP.Pi connects the two large domains of the actin molecule, stabilizing F-form. Pi release largely weakens the connection. This might be useful for the discussion.

      We incorporated this point with the suggested citation in lines 582-584.

      (1) T. Oda et al., Structural Polymorphism of Actin. Journal of molecular biology 431, 3217-3228 (2019).

      (2) K. Tanaka et al., Structural basis for cofilin binding and actin filament disassembly. Nature communications 9, 1860 (2018).

      (3) K. Okura et al., Mechanical Stress Decreases the Amplitude of Twisting and Bending Fluctuations of Actin Filaments. Journal of molecular biology 435, 168295 (2023).

      (4) Y. Kanematsu et al., Structures and mechanisms of actin ATP hydrolysis. Proceedings of the National Academy of Sciences of the United States of America 119, e2122641119 (2022).

      Reviewer #2 (Recommendations For The Authors):

      Line 190: "Noticeably, PCA analysis revealed higher structural flexibility in F-ADP-actin (red dots), exploring a larger space than F-ADP-Pi-actin structures (orange dots) within the F-actin cluster (inset in Figure 1A)". Is there a quantification to support this claim? Visually, things are not so clear.

      We have improved Figure 1 by adding 2 circles to an inset, providing clearer quantification to support our claim.

      In the PCA part: isn't it a bit obvious, or at least expected, that the conformation adopted by actin in the cofilactin structure is the most favorable one for binding cofilin?

      We agree with the reviewer on this point and have added it to the Results section, lines 202-204.

      I found it a bit unclear how the structures in Fig 2 were obtained.

      We further explained it by adding “Zoom-in views of these long filaments are shown in Figure 2” in Methods section, line 661.

      In the AFM images, the authors always seem to know the polarity of the filaments. Unless I missed it, how they know this is not explained. In their earlier work (Ngo et al. 2015) they used a subfragment of myosin II which indicates polarity when bound to F-actin. I found no such explanation here.

      We have addressed this issue in the legend of each figure accordingly.

      For clarity, I suggest writing "C-actin-like structures" (with two hyphens) rather than "C-actin like structures".

      We agree and are currently incorporating this change in the text.

      The term "cluster" in PCA can be confusing because it is used for cofilin clusters throughout the text.

      "Cluster" is a common term used in PCA analysis. To clarify, we revised the legends of Figure 1 and Figure Supplement 1, using the term "PCA clusters" to distinguish them from “cofilin clusters” or “F-actin clusters”.

      There are many acronyms. Readability of the figure legends (which can be consulted independently from the main text) would be improved if acronyms were spelled out there as well.

      We have revised some of the acronyms in the legend of each figure accordingly. We believe this level of revision is appropriate.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      This manuscript builds upon the authors' previous work on the cross-talk between transcription initiation and post-transcriptional events in yeast gene expression. These prior studies identified an mRNA 'imprinting' phenomenon linked to genes activated by the Rap1 transcription factor (TF), a surprising role for the Sfp1 TF in promoting RNA polymerase II (RNAPII) backtracking, and a role for the non-essential RNAPII subunits Rpb4/7 in the regulation of mRNA decay and translation. Here the authors aimed to extend these observations to provide a more coherent picture of the role of Sfp1 in transcription initiation and subsequent steps in gene expression. They provide evidence for (1) a physical interaction between Sfp1 and Rpb4, (2) Sfp1 binding and stabilization of mRNAs derived from genes whose promoters are bound by both Rap1 and Sfp1 and (3) an effect of Sfp1 on Rpb4 binding or conformation during transcription elongation. 

      Strengths: 

      This study provides evidence that a TF (yeast Sfp1), in addition to stimulating transcription initiation, can at some target genes interact with their mRNA transcripts and promote their stability. Sfp1 thus has a positive effect on two distinct regulatory steps. Furthermore, evidence is presented indicating that strong Sfp1 mRNA association requires both Rap1 and Sfp1 promoter binding and is increased at a sequence motif near the polyA track of many target mRNAs. Finally, they provide compelling evidence that Sfp1-bound mRNAs have higher levels of RNAPII backtracking and altered Rpb4 association or conformation compared to those not bound by Sfp1. 

      Weaknesses: 

      The Sfp1-Rpb4 association is supported only by a two-hybrid assay that is poorly described and lacks an important control. Furthermore, there is no evidence that this interaction is direct, nor are the interaction domains on either protein identified (or mutated to address function). 

      Indeed, our two-hybrid, immunoprecipitation and imaging results do not allow us to conclusively discern whether the interaction between Rpb4 and Sfp1 is direct or indirect. While the interaction holds significance, we consider the direct versus indirect distinction to be of secondary importance in the context of this paper. In the current text we indicated that 'our two hybrid, immunoprecipitation and imaging results do not differentiate between a direct or indirect interactions' (see page 6, sentences highlighted in blue).

      The contention that Sfp1 nuclear export to the cytoplasm is transcription-dependent is not well supported by the experiments shown, which are not properly described in the text and are not accompanied by any primary data. 

      This section has been re-written for better clarity (see page 7). We note that this assay was originally developed and published by Lee, M. S., M. Henry, and P. A. Silver in their 1996 paper in G&D and has since been reported in numerous subsequent studies. Reassuringly, our conclusion is bolstered by the observation that Sfp1 binds to Pol II transcripts co-transcriptionally, suggesting that Sfp1 is exported in the context of the mRNA.

      The presence of Sfp1 in P-bodies is of unclear relevance and the authors do not ask whether Sfp1-bound mRNAs are also present in these condensates. 

      P-bodies consist of both RNA and proteins (reviewed in doi: 10.1021/acs.biochem.7b01162). The significance of this experiment lies in its contribution to further confirming the co-localization of Sfp1 with mRNAs and Rpb4. This observation could also yield valuable insights for future investigations into the role of Sfp1.

      Further analysis of Sfp1-bound mRNAs would be of interest, particularly to address the question of whether those from ribosomal protein genes and other growth-related genes that are known to display Sfp1 binding in their promoters are regulated (either stabilized or destabilized) by Sfp1. 

      Fig. 4A, C and D show that RP mRNAs become destabilized in sfp1Δ cells.

      The authors need to discuss, and ideally address, the apparent paradox that their previous findings showed that Rap1 acts to destabilize its downstream transcripts, i.e. that it has the opposite effect of Sfp1 shown here. 

      We would like to thank Reviewer 1 for this valuable comment. In the revised paper, we discuss our hypothesis that Rap1 is likely responsible for regulating the imprinting of other proteins, such as Rpb4, that in turn lead to the destabilization of mRNAs. See the blue paragraph on page 20.

      Finally, recent studies indicate that the drugs used here to measure mRNA stability induce a strong stress response accompanied by rapid and complex effects on transcription. Their relevance to mRNA stability in unstressed cells is questionable. 

      Half-lives were determined mainly by GRO analysis of optimally proliferating cells. This method does not require any drug or stressful treatment. The results obtained by this method were consistent with those obtained after thiolutin addition. Using both methods, we discovered that disruption of Sfp1 results in substantial mRNA destabilization. Nevertheless, in our revised manuscript, we show results obtained by subjecting cells to a temperature shift to 42°C, a natural method to inhibit transcription. This approach to determine half-lives has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008). This may rule out effects of the drug on half-lives. Indeed, this assay determines half-lives under heat stress. Thus, it can clearly demonstrate that, at least during heat shock, Sfp1 stabilizes mRNAs. Since the results are similar to those obtained by the GRO method at 30°C, we concluded that Sfp1 stabilizes mRNA under both optimal and heat-stress conditions.

      Reviewer #2 (Public Review): 

      Summary: 

      The manuscript by Kelbert et al. presents results on the involvement of the yeast transcription factor Sfp1 in the stabilisation of transcripts whose synthesis it stimulates. Sfp1 is known to affect the synthesis of a number of important cellular transcripts, such as many of those that code for ribosomal proteins. The hypothesis that a transcription factor can remain bound to the nascent transcript and affect its cytoplasmic half-life is attractive, but the methods used to demonstrate the half-life effects and the association of Sfp1 with cytoplasmic transcripts remain to be fully validated, as explained in my comments on the results below: 

      Comments on methodology and results: 

      (1) A two-hybrid-based assay for protein-protein interactions identified Sfp1, a transcription factor known for its effects on ribosomal protein gene expression, as interacting with Rpb4, a subunit of RNA polymerase II. Classical two-hybrid experiments depend on the presence of the tested proteins in the nucleus of yeast cells, suggesting that the observed interaction occurs in the nucleus. Unfortunately, the two-hybrid method cannot determine whether the interaction is direct or mediated by nucleic acids. 

      Indeed, our two-hybrid, immunoprecipitation and imaging results do not allow us to conclusively discern whether the interaction between Rpb4 and Sfp1 is direct or indirect. While the interaction holds significance, we consider the direct versus indirect distinction to be of secondary importance in the context of this paper. In the current text we indicated that 'our two hybrid, immunoprecipitation and imaging results do not differentiate between a direct or indirect interactions' (see page 6).

      (2) Inactivation of nup49, a component of the nuclear pore complex, resulted in the redistribution of GFP-Sfp1 into the cytoplasm at the temperature non-permissive for the nup49-313 strain, suggesting that GFP-Sfp1 is a nucleo-cytoplasmic shuttling protein. This observation confirmed the dynamic nature of the nucleo-cytoplasmic distribution of Sfp1. For example, a similar redistribution to the cytoplasm was previously reported following rapamycin treatment and under starvation (Marion et al., PNAS 2004). In conjunction with the observation of an interaction with Rpb4, the authors observed slower nuclear import kinetics for GFP-Sfp1 in the absence of Rpb4 when cells were transferred to a glucose-containing medium after a period of starvation. Since the redistribution of GFP-Sfp1 was abolished in an rpb1-1/nup49-313 double mutant, the authors concluded that Sfp1 localisation to the cytoplasm depends on transcription. The double mutant yeast cells may show a variety of non-specific effects at the restrictive temperature, and whether transcription is required for Sfp1 cytoplasmic localisation remains incompletely demonstrated. 

      We agree with Reviewer 2 that any heat inactivation of a temperature-sensitive (ts) protein can lead to non-specific effects. It is evident that nup49-313 does not prevent Sfp1 export to the cytoplasm. In the case of rpb1-1, these non-specific effects are expected due to transcriptional arrest, which can eventually result in a reduction in protein content. However, this process takes some time, while the impact on export is more rapid. It is worth noting that this assay was developed and previously published by Pam Silver (Henry and Silver G&D 1996) and has been reported in many subsequent papers. Importantly, our conclusion is supported by the observation that Sfp1 binds both nascent RNA (co-transcriptionally) and mature mRNA (cytoplasmic). These observations, along with the reduced mRNA export upon transcription blocking, are consistent with our proposal that Sfp1 is exported in association with mRNA.

      (3) Under starvation conditions, which led to the presence of Sfp1 in the cytoplasm and have previously been correlated with a decrease in the transcription of Sfp1 target genes, the authors observed that a plasmid-based expressed GFP-Sfp1 accumulated in cytoplasmic foci. These foci were also labelled by P-body markers such as Dcp2 and Lsm1. The quality of the microscopic images provided does not allow to determine whether Rpb4-RFP colocalises with GFP-Sfp1. 

      The submitted PDF figure is of low quality. We believe that the high-quality figure in the final submission is convincing.

      (4) To understand to which RNA Sfp1 might bind, the authors used an N-terminally tagged fusion protein in a cross-linking and purification experiment. This method identified 264 transcripts for which the CRAC signal was considered positive and which mostly correspond to abundant mRNAs, including 74 ribosomal protein mRNAs or metabolic enzyme-abundant mRNAs such as PGK1. The authors did not provide evidence for the specificity of the observed CRAC signal, in particular, what would be the background of a similar experiment performed without UV cross-linking. In a validation experiment, the presence of several mRNAs in a purified SFP1 fraction was measured at levels that reflect the relative levels of RNA in a total RNA extract. Negative controls showing that abundant mRNAs not found in the CRAC experiment were clearly depleted from the purified fraction with Sfp1 would be crucial to assessing the specificity of the observed protein-RNA interactions. The NON-CRAC+ selected mRNAs were enriched for genes whose expression was previously shown to be upregulated upon Sfp1 overexpression (Albert et al., 2019). The presence of unspliced RPL30 pre-mRNA in the Sfp1 purification was interpreted as a sign of co-transcriptional assembly of Sfp1 into mRNA, but in the absence of valid negative controls, this hypothesis would require further experimental validation.

      We would like to thank Reviewer 2 for bringing this issue up, as it helped us to clarify it in the revised paper.

      First, we emphasized in the Discussion that many CRAC+ genes do not fall into the category of highly transcribed genes. Please see more detailed discussion below.

      Secondly, we examined various features of the 264 genes - classified as CRAC+ - to estimate their specificity and biological significance. Our various experiments revealed that the CRAC+ genes represent a distinct group with many unique features.

      The biological significance of the 264 CRAC+ mRNAs was demonstrated by various experiments; all are inconsistent with technical flaws. In fact, all the experiments and analyses that we have pursued indicate the unique nature of the CRAC+ genes. Some examples are:

      (1) Fig. 2A and B show that most reads of CRAC+ mRNAs were mapped to a specific location, close to the pA sites.

      (2) Fig. 2C shows that most reads of CRAC+ mRNAs were mapped to a specific RNA motif located near the 3’ ends of the mRNAs.

      (3) Most RiBi CRAC+ promoters contain Rap1 binding sites (p = 1.9 × 10⁻²²), whereas the vast majority of RiBi non-CRAC+ promoters do not (Fig. 3C).

      (4) Fig. 4A shows that RiBi CRAC+ mRNAs become destabilized due to Sfp1 deletion, whereas RiBi non-CRAC+ mRNAs do not. Fig. 4B shows similar results due to Sfp1 depletion.

      (5) Fig. 6B shows that the impact of Sfp1 on backtracking is substantially higher for CRAC+ than for non-CRAC+ genes. This is most clearly visible in RiBi genes.

      (6) Fig. 7A shows that the Sfp1-dependent changes along the transcription units are substantially more pronounced for CRAC+ than for non-CRAC+ genes.

      (7) In Fig. S4B, the chromatin binding profile of Sfp1 is shown to be different for CRAC+ and non-CRAC+ genes.

      Taken together, these many unique features (indeed, every feature that we examined) indicate the specificity and significance of this group, demonstrating that our CRAC results are biologically significant.

      Most importantly, these genes do not all fall into the category of highly transcribed genes.  On the contrary, as depicted in Figure 6A (green dots), it is evident that CRAC+ genes exhibit a diverse range of Rpb3 ChIP and GRO signals. Furthermore, as illustrated in Figure 7A, when comparing CRAC+ to Q1 (the most highly transcribed genes), it becomes evident that the Rpb4/Rpb3 profile of CRAC+ genes behaves differently from the Q1 group. Evidently, despite the heterogeneous transcription of CRAC+ genes (as mentioned above), the Rpb4/Rpb3 profile decreases more substantially than that of the highly transcribed genes (Q1).  Moreover, despite similar expression levels among all RiBi mRNAs, only a portion of them binds Sfp1.

      Thus, all our results indicate that CRAC+ genes represent a biologically significant group, irrespective of the expression of its members. In response to this comment, we included a new paragraph discussing the validity of our conclusions. See page 18, blue paragraph.

      (5) To address the important question of whether co-transcriptional assembly of Sfp1 with transcripts could alter their stability, the authors first used a reporter system in which the RPL30 transcription unit is transferred to vectors under different transcriptional contexts, as previously described by the Choder laboratory (Bregman et al. 2011). While RPL30 expressed under an ACT1 promoter was barely detectable, the highest levels of RNA were observed in the context of the native upstream RPL30 sequence when Rap1 binding sites were also present. Sfp1 showed better association with reporter mRNAs containing Rap1 binding sites in the promoter region. However, removal of the Rap1 binding sites from the reporter vector also led to a drastic decrease in reporter mRNA levels. Whether the fraction of co-purified RNA is nuclear and co-transcriptional or not cannot be inferred from these results.

      The proposed co-transcriptional binding of Sfp1 is based on the findings presented in Figure 5C and Figure S2D, as well as the observed binding of Sfp1 to transcripts containing introns, as shown in Figures 2D and 3B.  The results of Fig. 3 led us to the assertion that the "RNA-binding capacity of Sfp1 is regulated by Rap1-binding sites located at the promoter." We maintain our stance on this conclusion. Indeed, the Rap1 binding site does impact mRNA levels, as highlighted by Reviewer 2. However, "construct E," which possesses a promoter with a Rap1 binding site, exhibits lower transcript levels compared to "construct F," which lacks such a binding site in its promoter. Despite this difference in transcript levels, Sfp1 was able to pull down the former transcript but not the latter, even though expression of the former gene is relatively low. Thus, the results appear to be more reliant on the specific capacity of Sfp1 to interact with the transcript rather than on the transcript's expression level.

      (6) To complement the biochemical data presented in the first part of the manuscript, the authors turned to the deletion or rapid depletion of SFP1 and used labelling experiments to assess changes in the rate of synthesis, abundance, and decay of mRNAs under these conditions. An important observation was that in the absence of Sfp1, mRNAs encoding ribosomal protein genes not only had a reduced synthesis rate but also an increased degradation rate. This important observation needs careful validation, as genomic run-on experiments were used to measure half-lives, and this particular method was found to give results that correlated poorly with other measures of half-life in yeast (e.g. Chappelboim et al., 2022 for a comparison). Similarly, the use of thiolutin to block transcription as a method of assessing mRNA half-life has been reported to be problematic, as thiolutin can specifically inhibit the degradation of ribosomal protein mRNA (Pelechano & Perez-Ortin, 2008). Specific repressible reporters, such as those used by Baudrimont et al. (2017), would need to be tested to validate the effect of Sfp1 on the half-life of specific mRNAs. Also, it would be very difficult to infer from the images presented whether the rate of deadenylation is altered by Sfp1.

      Various methods exist for assessing mRNA half-lives (HLs), and each of them carries its own set of challenges and biases. Consequently, it becomes problematic to directly compare HL values of a specific mRNA when different methods are employed. The superiority of one particular method over others remains unclear (in my opinion). However, each method is highly reliable when it is used on its own to compare different strains under identical conditions.

      Estimating HLs through the GRO approach is a non-invasive method, applied to optimally proliferating cells, which has been employed in numerous publications. While no method is without its limitations, our experience over the years has reassured us that this approach is among the most dependable. Our HL determination using thiolutin to block transcription provided results that were consistent with the values obtained by the GRO approach.

      Nevertheless, in our revised manuscript, we supplemented the HL data obtained with thiolutin with results obtained by subjecting cells to a temperature shift to 42°C, a natural method to block transcription in wild-type (WT) cells. This approach to determine HLs has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008). The new results are shown in Fig. S3B. They are consistent with our conclusion that Sfp1 stabilizes mRNAs.
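
      For readers unfamiliar with how a half-life is typically derived from a transcription shut-off time course, the following is a generic sketch, not the analysis code used in the manuscript. It assumes first-order decay, so that ln(abundance) falls linearly with time at rate k and the half-life equals ln 2 / k. The function name and the example data are hypothetical.

```python
import math

def half_life_from_decay(times_min, abundances):
    """Estimate an mRNA half-life from a transcription shut-off time course.

    Assumes first-order decay: ln(abundance) = ln(A0) - k * t.
    The decay rate k is the negative least-squares slope, and the
    half-life is ln(2) / k, in the same units as times_min.
    """
    logs = [math.log(a) for a in abundances]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    k = -sxy / sxx  # decay rate constant
    return math.log(2) / k

# Hypothetical time course: abundance halves every 10 minutes.
times = [0, 10, 20, 30]
signal = [100 * 0.5 ** (t / 10) for t in times]
print(round(half_life_from_decay(times, signal), 2))  # about 10 minutes
```

      Because the estimate depends only on the slope, it is insensitive to the absolute signal level, which is one reason comparisons between strains measured with the same method tend to be more robust than comparisons across methods.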

      Using a repressible promoter to determine mRNA HL is, unfortunately, not suitable in this paper because the promoter itself is involved in HL regulation. This observation is supported by Bregman et al. (2011) and depicted in Fig. 3, which illustrates that the promoter is critical for mRNA imprinting, consequently regulating HL.

      (7) The effects of SFP1 on transcription were investigated by chromatin purification with Rpb3, a subunit of RNA polymerase, and the results were compared with synthesis rates determined by genomic run-on experiments. The decrease in polII presence on transcripts in the absence of SFP1 was not accompanied by a marked decrease in transcript output, suggesting an effect of Sfp1 in ensuring robust transcription and avoiding RNA polymerase backtracking. To further investigate the phenotypes associated with the depletion or absence of Sfp1, the authors examined the presence of Rpb4 along transcription units compared to Rpb3. One effect of sfp1 deficiency was that this ratio, which decreased from the start of transcription towards the end of transcripts, increased slightly. The results presented are largely correlative and could arise from the focus on very specific types of mRNAs, such as those of ribosomal protein genes, which are sensitive to stress and are targeted by very active RNA degradation mechanisms activated, for example, under heat stress (Bresson et al., 2020).

      Figure 7A illustrates a significant reduction in Rpb4/Rpb3 ratios along the transcription unit in WT cells. This reduction is notably more pronounced in CRAC+ genes compared to the highly transcribed quartile (Q1), which includes all ribosomal protein (RP) genes, and it is completely absent in sfp1∆ cells. Furthermore, it's important to highlight that the CRAC+ gene group displays a wide range of transcription rates, as measured by either Rpb3 ChIP or GRO (Figure 6A). Given these observations, we do not think that heightened sensitivity of RP mRNA degradation in response to stress is responsible for the pronounced difference in the configuration of the Pol II elongation complex that is detected in CRAC+ genes, mainly because this experiment was performed under standard (non-stress) culture conditions.

      Correlative studies are particularly informative when a gene mutation eliminates a correlation, and this is precisely the type of study depicted in Figure 7B-C. The correlations shown in these panels are dependent on Sfp1. Indeed, RP genes are sensitive to stress. However, we used non-stressed conditions. Furthermore, CRAC+ genes did not display any apparent unusual destabilization but rather exhibited higher (not lower) mRNA stability compared to non-CRAC+ genes (Figure 7C).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The paper combines phenotypic and genomic analyses of the "sheltered load" (i.e. the accumulation of deleterious mutations linked to S-loci that are hidden from selection in the homozygous state) in Arabidopsis. The authors compare results to previous theoretical predictions concerning the extent of the load in dominant vs recessive S-alleles, and further develop exciting theory to reconcile differences between previous theory and observed results.

      Strengths:

      This is a very nice combination of theory and data to address a classical question in the field.

      We thank the reviewer for this positive feedback.

      Weaknesses:

      The "genetic load" is a poorly defined concept in general, and its quantification via the number of putatively deleterious mutations is quite difficult. Furthermore counting up the number of derived mutations at fully constrained nucleotides may not be a great estimate of the load, and certainly does not allow for evaluation of recessivity -- a concept critical to ideas concerning the sheltered load. Alternative approaches - including estimating the severity of mutations - could be helpful as well. This imperfection in available approaches to test theory must be acknowledged more strongly by the authors.

      As suggested by the reviewer, we implemented alternative approaches to estimate the severity of deleterious mutations and now report the results of SNPeff and SIFT4G analyses in Table S6. The results we obtained with these other metrics were overall very similar to those based on our previous counting of mutations at 0-fold and 4-fold degenerate sites. More generally, we tried to improve the presentation of our strategy to estimate the genetic load (clarified in lines 262-268, 271, 292-295 and 297). In particular, we made it clear that our population genetic analysis cannot assess the recessivity of the observed mutations (lines 428-434).

      Reviewer #2 (Public Review):

      Summary:

      This study looks into the complex dominance patterns of S-allele incompatibilities in Brassicaceae, through which it attempts to learn more about the sheltering of deleterious load. I found several weak points in the analyses that diminished my excitement about the results. In particular, the way in which deleterious mutations were classified lacked the ability to distinguish the severity of the mutations and thus their expected associated dominance.

      First, we would like to clarify that our goal with this study is NOT to learn something about dominance of the linked deleterious mutations (we cannot). Instead, we compare the accumulation of deleterious mutations linked to dominant vs recessive S-ALLELES, but are agnostic regarding the dominance level of the LINKED mutations themselves. The rationale is that the different intensities of natural selection between dominant vs recessive S-alleles provide a powerful way to examine the process by which deleterious mutations are sheltered in general. We further clarified this aspect on lines 70-73 and 399-401.

      Second, as mentioned above in response to Reviewer 1, we complemented the analysis by predicting the severity of the deleterious mutations by SIFT4G and SNPeff. The results were largely consistent, with the exception that the number of sites included in SIFT4G was low, such that the statistical power was reduced (lines 296-300).

      Furthermore, the simulation approach could have provided this exact sort of insight but was not designed to do so, making this comparison to the empirical data also less than exciting for me.

      As explained above, studying dominance of the linked mutations we observed is an interesting research question (albeit a difficult one), but it was not our goal here. Instead, our study was designed as an empirical test of the predictions presented in Llaurens et al (2009), and we re-analysed some aspects of the model outcome to illustrate our points.

      We now better explain that we based our choice of parameters on the fact that in the theoretical study by Llaurens et al (2009), recessive deleterious mutations are predicted to accumulate in a much more straightforward manner (line 316-318).

      We now dedicate a paragraph of the discussion to explain how our stochastic simulations could be improved, and acknowledge that a full exploration of the interaction between dominance of the S-alleles and dominance of the linked deleterious mutations would be an interesting follow-up - albeit beyond the scope of our study (line 437-441).

      Major and minor comments:

      I think the introduction (or somewhere before we dive into it in the results) of the dominance hierarchy for the S-alleles needs a more in-depth explanation. Not being familiar with this beforehand really made this paper inaccessible to me until I then went to find out more before continuing. I would expect this paper to be broad enough that self-contained information makes it accessible to all readers. For example, lines 110-115 could be in the Introduction.

      We thank the reviewer for this useful remark. We now give a more comprehensive description of the dominance hierarchy and introduce the classes of dominance in A. lyrata already in the introduction, on lines 64-70.

      Along with my above comment, perhaps it is not my place to comment, but I find the paper not of a broad enough scope to be of interest to a broad readership. This S-allele dominance system is more than simple balancing selection, it is a very complex and specific form of dominance between several haplotypes, and the mechanism of dominance does not seem to be genetic. I am not sure that it thus extrapolates to broad comments on general dominance and balancing selection, e.g. it would not be the same as considering inversions and this form of balancing selection where we also expect recessive deleterious mutations to accumulate.

      We disagree with these interpretations by the reviewer, for two reasons:

      First, the mechanism of dominance is actually entirely genetic. In fact, we uncovered some years ago that it is based on the molecular interaction between small non-coding RNAs from dominant alleles and their target sites on recessive alleles (Durand et al. Science 2014, see lines 68-70). If there is something specific with this system, it is that the dominance phenomenon is better understood at the mechanistic level than in most other cases, but the resulting phenomenon in itself (a dominance hierarchy) is rather common.

      Second, the kind of variation in the intensity of linked selection created by this mechanism is actually a general phenomenon, so our results have broad relevance beyond our particular study system. We modified the introduction to explain this point more clearly, highlighting in particular the fact that the situation we study closely resembles the case of sex chromosomes, where X (or Z) chromosomes are genetically recessive and Y (or W) chromosomes are genetically dominant. We cite this example in lines 83-87 of the introduction and also several well-studied other examples on lines 480-489 of the discussion.

      It would have been particularly interesting, or a nice addition, to see deleterious mutations classed by something like SNPeff or GERP where you can have different classes of moderate to severe deleterious variants, which we would expect also to be more recessive the more deleterious they are. In line with my next comment on the simulations, I think relative differences between mutations expected to be more or less dominant may be even more insightful into the process of sheltering which may or may not be going on here.

      We agree with the reviewer, and as detailed above we have now integrated such analyses with SNPeff and SIFT4G (Table S6). These new results reinforce our conclusion that while S-allele dominance influences the fixation of deleterious mutations, it has no effect on their total number. See lines 270-272 and 296-300.

      In the simulations, h=0 and s=0.01 (as in Figure 5) for all deleterious mutations seems overly simplistic, and at the convenient end for realistic dominance. I think besides recessive lethals which we expect to be close to h=0 would have a much larger selection coefficient, and other deleterious mutations would only be partially recessive at such an s value. I expect this would change some of the simulation results seen, though to what degree I am not certain. It would be nice to at least check the same exact results for h=0.3 or 0.2 (or additionally also for recessive lethals, e.g. h=0 and s=-0.9). I would also disagree with the statement in line 677, many studies have shown, particularly those on balancing selection, that partially recessive deleterious mutations are not eliminated by natural selection and do play a role in population genetic dynamics. I am also not surprised that extinction was found for higher s values when the mutation rate for such mutations was very high and the distribution of s values was constant. An influx of such highly deleterious mutations is unlikely to ever let a population survive, yet that does NOT mean that in nature, the rare influx of such mutations does lead to them being sheltered. I find overall that the simulation results contribute very little, to none, to this paper, as without something more realistic, like a simultaneous distribution of s and h values, you cannot say which, if any class of these mutations are the ones expected to accumulate because of S-allele dominance.

      We understand that the previous version of our manuscript was confusing between dominance of the S-alleles and dominance of the linked deleterious mutations. We clarified that our study focuses on the effect of the former only (lines 99, 263-264 and 581-583).

      We agree that a complete exploration of the interaction between dominance of the S-alleles and dominance of the linked mutations being sheltered would have been an asset, but as explained above this is not the focus of our study. The previous work by Llaurens et al (2009) has already established that deleterious mutations can fix within S-allele lineages, especially when linked to dominant S-alleles, and when the number of S-alleles is large. Under the conditions they examined, deleterious mutations were much more strongly eliminated if not fully recessive (h=0 vs h=0.2), so for the present study we decided to simulate fully recessive mutations only. We now formally acknowledge the possibility that some complex interaction may take place between dominance of the S-alleles and dominance of the linked deleterious mutations (lines 440-442). However, as explained above we feel that fully exploring this complex interaction would require a detailed investigation, which is clearly beyond the scope of the present study.
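
      To make the parameters in this exchange concrete, the short sketch below uses the textbook single-locus viability model, in which genotypes carrying 0, 1 or 2 copies of a deleterious allele have relative fitness 1, 1 - hs and 1 - s. It is only an illustration of what h and s mean, not the simulation code used in the study, and the function name `genotype_fitness` is hypothetical.

```python
def genotype_fitness(copies: int, s: float, h: float) -> float:
    """Relative fitness for 0, 1 or 2 copies of a deleterious allele.

    s is the selection coefficient and h the dominance coefficient
    (standard textbook parameterisation, assumed here for illustration).
    """
    if copies == 0:
        return 1.0
    if copies == 1:
        return 1.0 - h * s
    if copies == 2:
        return 1.0 - s
    raise ValueError("copies must be 0, 1 or 2")


# Compare the heterozygote cost for the two dominance values discussed above.
for h in (0.0, 0.2):
    het_cost = 1.0 - genotype_fitness(1, s=0.01, h=h)
    print(f"h={h}: fitness cost in heterozygotes = {het_cost:.4f}")
```

      With h = 0 the heterozygote pays no fitness cost, which is why fully recessive mutations linked to the S-locus (a region that is almost always heterozygous) are the easiest to shelter; with h = 0.2 each heterozygous carrier is already exposed to a fraction of the cost.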

      Rather they only show the disappointing or less exciting result that fully recessive, weakly deleterious mutations (which I again think do not even exist in nature as I said above) have minor, to no effect across the classes of S-allele dominance. They provide no insight into whether any type of recessive deleterious mutation can accumulate under the S-allele dominance hierarchy, and that is the interesting question at hand. I would either remove these simulations or redo them in another approach. The authors never mention what simulation approach was used, so I can only assume this is custom, in-house code. Yet I do not find that code provided on the github page. I do not know if the lack of a distribution for h and s values is then a choice or a programming limitation, but I see it as one that should be overcome if these simulations are meant to be meaningful to the results of the study.

      The code we used (in C) was adapted from the previous study by Llaurens et al. (2009), which unfortunately was not deposited in a data repository at the time. With the agreement of the authors of that study, this code is now available on GitHub (https://github.com/leveveaudrey/model_ssi_Llaurens; line 723).

      It is correct that our simulations were not aimed at determining whether “any type of recessive deleterious mutation can accumulate”, but we strongly believe that they help interpreting the observations made in the genomic data.

      Recommendations for the authors:

      Notes from the editor:

      I found Table 1 confusing, with column headings of observed proportion but perhaps numbers reflecting counts.

      Thank you for pointing out this confusion. There was indeed an error in the last column, which we have now corrected.

      I found Figure 2 a bit hard to parse, with the vertical lines being unclear and the x-axis ticks of insufficient resolution to evaluate the physical extent of the signals.

      We increased the size of the x-axis label and added detail to it in Figure 2, which is now hopefully clearer. We also increased the size of the vertical lines.

      Finally, I wonder, given the rapid decay of signal in lyrata, whether 25kb is the right choice for evaluating load and whether the pattern may look different on a smaller scale.

      It is true that the signal decays rapidly in A. lyrata, as can be seen in the haplotype structure analysis and in line with our previous analysis of the same populations (Le Veve et al., MBE 2023; in this study we explored the effect of the choice of the size of the chromosomal region analyzed; lines 266-269). However, for the sake of comparison, we prefer to stick to the same window size. The fact that we still see an effect of dominance in spite of the lower statistical power associated with the more rapid decay (because a smaller number of genes is expected to be impacted) actually reinforces our conclusions.

      Reviewer #1 (Recommendations For The Authors):

      I have a few additional suggestions to improve the manuscript.

      (1) How does the load linked to the S-locus compare to that observed in other genomic regions? It would be useful to provide a comparison of the results quantified in Figures three and four to comparable genomic regions unlinked to the S-locus. How severe is the linked load?

      This comparison to the genomic background was actually the core of our previous study (Le Veve et al., MBE 2023), which was based on the same populations. This analysis revealed that polymorphism of the 0-fold degenerate sites was more than twice as high in the 25kb immediately flanking the S-locus as in a series of 100 unlinked control regions. The main focus of the present study is on the effect of linkage to particular S-alleles (which was not possible previously because haplotypes had to be phased).

      (2) Details of the GLM for data underlying Figures 3 and 4 are somewhat unclear. Is the key explanatory variable (Dominance) treated as continuous? Categorical? Ordinal etc…

      Dominance is considered as a continuous variable. We specify this in line 162 of the results, in the legends of Figures 3 and 4, in the Material and Method (lines 627 and 660) and in the legend of Table S4.

      (3) I had some trouble understanding the two different p-values in columns five and six of table one. Please provide more detail.

      We understand that the two p-values in Table 1 were confusing. The first was related to the binomial test and the second to the permutation test. To be consistent with the rest of the manuscript, we conserved only the p-value of the permutation test.

      (4) As mentioned in the "weaknesses" above, the authors should be more clear about what they are quantifying. They are explicitly counting the number of variants at 0-fold degenerate sites as a proxy for the genetic load. How good this proxy is is unclear. The most egregious misstatement here was on line 314 in which they make reference to the "total load." However, this limitation should be acknowledged throughout the manuscript and deserves more attention in the methods and discussion.

      As mentioned above, we now integrate additional methods to define and quantify the load (SIFT4G and SNPeff), which reinforced our previous conclusions (lines 271-272, 297-302).

      We clarified our wording and replaced the mention of “total load” by “mean number of linked deleterious mutations per copy of S-allele” (line 324-325). In the discussion we tried to better explain the limitations of approaches to estimate the genetic load (line 431-437).

      Reviewer #2 (Recommendations For The Authors):

      Line 60, it should be specified that this is only for recessive deleterious mutations. Non-recessive deleterious mutations would certainly not be expected to accumulate.

      As explained in detail above, the question of whether and how non-recessive deleterious mutations can accumulate when linked to the S-locus is difficult and would in itself deserve a full treatment, which is clearly beyond the scope of the present study. We clarified this point on line 56.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      More details should be provided in terms of inclusion and exclusion criteria for the participants, as well as missing data due to the non-cooperation of newborns during the experimental process. Potential differences between preterm and full-term infants are worth exploring. Several aspects of EEG data analyses and data interpretation should be better clarified.

      Here I have several comments and questions to improve the manuscript.

      (1) It would be wise to know whether there was any missing data due to the non-cooperation of newborns during the experimental process.

      Thank you for the suggestion. While our initial aim was to include 120 neonates in the final data analysis, we actually recruited 198 neonatal participants for this study. The remaining 78 EEG datasets were excluded from the data analysis due to non-cooperation of neonates (n = 75) or technical issues (n = 3). We have incorporated this detailed information in the Subjects subsection (lines 375-383) in the revised manuscript.

      (2) The authors investigated the impact of gestational age on emotional perceptual sensitivity in newborns by grouping infants of varying gestational ages in the experiment. The methods section mentions that the study conducted experiments within 24 hours after the birth of the newborns. When do preterm infants (with a gestational age of 35 and 36 weeks) begin to exhibit emotional discrimination comparable to full-term newborns? 

      This is indeed an intriguing question that merits exploration. However, in our study, we recruited relatively healthy preterm neonates, many of whom were discharged from the hospital with their mothers within 3-5 days after birth. It would have been challenging to arrange for another EEG testing session once these preterm infants reached full-term age, as their parents were unwilling to return to the hospital.

      (3) When analyzing EEG data, excluding artifacts with peak deviations exceeding ±200 μV is a relatively lenient criterion, potentially resulting in the retention of some large-amplitude artifacts or noise. What is the rationale behind the author's choice of this criterion? Or, in other words, what considerations led to this specific selection?

      In our standard practice, we typically employ a stricter threshold of ±100 μV for artifact removal in studies involving healthy adults and a median threshold of ±150 μV for data from adult patients, such as those with schizophrenia. However, when analyzing neonatal data, we often resort to the loosest criterion of ±200 μV. This decision is primarily due to the inherent challenges associated with neonatal EEG recordings, as we cannot expect newborns to cooperate or remain quiet during the recording process. Consequently, neonatal EEG data tend to contain more artifacts compared to those from healthy adults. Furthermore, the excitability of the newborn brain is notably elevated. This heightened excitability arises from an imbalance in the distribution and function of excitatory and inhibitory neurotransmitter systems. Typically, the expression of excitatory neurotransmitters and their receptors surpasses that of inhibitory neurotransmitters, resulting in increased excitability in the immature brain. This heightened excitability can occasionally lead to the occurrence of paroxysmal electrical activity. As a result, neonatal EEG recordings may at times display large amplitudes, exceeding even 100 μV. In this revision, we have referenced other neonatal/infant EEG studies or technique pipelines that have used the threshold of ±200 μV to support this criterion (lines 483-484).    
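
      The rejection rule described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' preprocessing pipeline: epochs are taken to be lists of channels holding baseline-corrected samples in microvolts, an epoch is dropped if any sample exceeds the threshold in absolute value, and the function name `keep_epoch` and the toy data are hypothetical.

```python
def keep_epoch(epoch_uv, threshold_uv=200.0):
    """Return True if no sample in the epoch exceeds +/- threshold_uv.

    epoch_uv: list of channels, each a list of samples in microvolts
    (assumed to be baseline-corrected already).
    """
    return all(abs(x) <= threshold_uv for channel in epoch_uv for x in channel)


# Two toy epochs (2 channels x 3 samples); values are made up for illustration.
epochs = [
    [[12.0, -35.5, 80.1], [5.0, -150.0, 99.0]],  # largest deviation: 150 uV
    [[10.0, 240.0, -3.0], [0.0, 1.0, 2.0]],      # largest deviation: 240 uV
]

kept_200 = [keep_epoch(e, 200.0) for e in epochs]  # lenient neonatal threshold
kept_100 = [keep_epoch(e, 100.0) for e in epochs]  # stricter adult threshold
print(kept_200, kept_100)
```

      In this toy case the first epoch survives the ±200 μV criterion but would be discarded at ±100 μV, which illustrates the trade-off the response describes: a stricter threshold removes more artifacts but also more of the large-amplitude activity that can be genuine in neonatal recordings.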

      (4) In the Discussion section, the authors mentioned the biomarkers, such as the fusiform gyrus and hippocampus, which have been identified as potential predictors of autism risk. It is suggested that the authors briefly elucidate the crucial role of these biomarkers in processing social information, which would enhance the readability and logicality of this manuscript.

      Thank you for the thoughtful suggestion. We have expanded the discussion concerning the involvement of the fusiform gyrus and hippocampus in social information processing (lines 314-319).

      Reviewer #2 (Public Review):

      First, readers need to see spectrograms that show the 0-4000 Hz in more detail, rather than what is now shown (0-10,000 Hz). The vocal signals in clearer spectrograms will show I believe the initial consonant burst and formant frequencies that are unique to human speech and give rise to the perception of the consonant sounds in the vocal signals like 'dada' and 'tutu' that were tested. The control signals will presumably not show these abrupt acoustic changes at their onset, even though they appear (from the oscillograms) to approximate the amplitude envelope. The primary cue distinguishing the happy and neutral signals in both the vocal and control signals is the pitch of the signals (high vs low), but the burst of energy representing the consonants is only contained in the vocal signals; it has no comparable match in the control signals. It is possible that the presence of a sharp acoustic onset (a unique characteristic of consonants in human speech) is especially alerting to the infants, and that this acoustic cue, in the context of the pitch change, enhances discrimination in the vocal case. One way to test this would be to use only vowel sounds to represent the vocal signals, without consonants.

      Thank you for your expert comments and considerations. We have redrawn Figure 3 using Praat software with a frequency range of 0-5000 Hz, as suggested by Praat’s default parameters. Based on the spectrograms, we acknowledge the potential role of consonants in accounting for differences in stimuli. Consequently, we have included this consideration as one of the limitations of our study in this revised version (lines 325-330).

      Another critical detail that the authors need to include about the signals is an explanation of how the control signals were generated. The text states that the Fo and amplitude envelope of the vocal signals were mimicked in the control signals, but what was the signal used for the controls? Was a pure tone complex modulated, or was pink noise used to generate the control signals? Or were the original vocal signals simply filtered in some way to create the controls, which would preserve the Fo and amplitude envelope? If merely filtered, the control signals still may be perceived as 'vocal' signals, rather than as nonspeech (the Supplement contains the sounds, and some of the control sounds can be perceived, to my ear, as 'vocal' signals).

      We sincerely appreciate your attention to detail regarding the generation of control signals. As a non-specialized laboratory in audio editing, our approach involved filtering the original vocal sounds around the fundamental frequency (f0) and ensuring a balanced mean intensity between vocal and nonvocal stimuli (as now stated in lines 432-437). However, it became evident that certain “vocal” components persisted in the control sounds, particularly noticeable in the sound “tutu”. In this revision, we openly acknowledge this oversight (lines 331-333). We extend our gratitude once again for highlighting the importance of meticulous consideration when generating control sounds for a study.

      Second, there is no information in the manuscript or supplement about the auditory environment of the participants, nor discussion of the fetus' ability to hear in the womb. In the womb, infants are listening to the mothers' bone-conducted speech (which is full of consonant sounds), and we know from published studies that infants can discern differences not only in the prosody of the speech they hear in the womb, but the phonetic characteristics of the mother's speech. The ability at 37 weeks GA or beyond to discriminate the pitch changes in the vocal, but not control signals, could thus be due to additional experience in utero to speech. Another experiential explanation is that the infants born at 37 weeks GA and beyond may be exposed to greater amounts of speech after birth, when compared to those born at 35 and 36 weeks GA, from the attending nurses and from their caregivers, and this speech is also full of consonant sounds. What these infants hear is likely to be 'infant-directed speech,' which is significantly higher in pitch, mirroring the signals tested here. At 37 weeks GA, infants are likely more robust, may sleep less, and are likely more alert. If infants' exposure to speech, either after birth, or their auditory ability to discern differences in speech in utero, is enhanced at 37 weeks GA and beyond, then an 'experience-related' explanation is a viable alternative to a maturational explanation, and should be discussed. Perhaps both are playing a role. As the authors state, many more signals need to be tested to discern how the effect should be interpreted, and other viable interpretations of the current results discussed.

      We acknowledge the importance of considering the auditory environment of participants and the fetus' ability to hear in the womb. In our study, neonates were exposed to a native language environment both before and after birth (as added in lines 385-386), and we took efforts to minimize their exposure to speech stimuli other than those used in the experiment. Specifically, all neonates participated in the experiment and underwent EEG recording within the first 24 hours after birth (lines 386-387). They were promptly transported to a dedicated testing room for EEG recording as soon as their condition stabilized after birth. During recording sessions, they were separated from their mothers to minimize exposure to natural speech (as added in lines 459-461). As a result, we believe that both preterm and term neonates were exposed to comparable amounts of speech after birth and before the experiment. We also ensured that all participants were in a natural sleep state during EEG recording. However, it is possible that term neonates slept less and were more attentive to the limited speech stimuli in their environment before the experiment compared to preterm newborns.

      The debate surrounding nature versus nurture in neonate and infant development persists. We recognize the potential impact of prenatal auditory experiences on neonatal perceptual sensitivity. Therefore, we have added a brief discussion regarding innate- or experience-related explanations for emotional prosodic discrimination in neonates, aiming to shed light on future research directions (lines 343-351).

    1. Author response:

      The following is the authors’ response to the previous reviews

      It is unclear to us why you did not adjust the title to better reflect the well-supported claims of the paper, i.e., that this is a valuable model for human loss-of-function mutations in IQCH.

      Thanks for the editor’s suggestion. We have changed the title to “Deficiency of IQCH causes male infertility in humans and mice.” Additionally, we have provided the original images of the gels or blots as a zipped folder.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors explore ER stress signalling mediated by ATF6 using a genome-wide gene depletion screen. They find that the ER chaperone Calreticulin binds and directly represses ATF6; this proposed function for Calreticulin is intriguing and constitutes an important finding. The evidence presented is based on CHO genetic evidence and biochemical results and is convincing. 

      We thank the editors for their favourable assessment of our work.

      Reviewer #1 (Public Review): 

      Summary: 

      In this manuscript, Tung and colleagues identify Calreticulin as a repressor of ATF6 signalling using a CRISPR screen and characterize the functional interaction between ATF6 and CALR. 

      Strengths: 

      The manuscript is well written and interesting with an innovative experimental design that provides some new mechanistic insight into ATF6 regulation as well as crosstalk with the IRE1 pathway. The methods used were fit for purpose and reasonable conclusions were drawn from the data presented. Findings are novel and bring together glycoprotein quality control and activation of one sensor of the UPR. This is a novel perspective on how the integration of ER homeostasis signals could be sensed in the ER. 

      We thank the reviewer for their favourable assessment of our work.

      Weaknesses: 

      Several points remain to be documented to support the authors' model. 

      Major comments 

      (1) It is interesting that BiP, PDIs, and COPII are not identified in the screen. Might this indicate some bias in the system perhaps limiting its sensitivity or pleiotropic effects of the reporter? 

      The reviewer raises a valid concern. Our CRISPR screen aimed to identify genes that selectively modulate ATF6⍺. Therefore, we excluded from consideration genes whose inactivation had effects on the broader ER environment. This would disfavour the selection of genes encoding BiP, PDI and COPII components. Additionally, a positive selection screen inherently removes essential genes like BiP. The absence of COPII components among the hits could be due to essentiality, or to those components not being strong selective modulators of ATF6⍺ activation, whereas stronger ATF6⍺ modulators such as S1P, S2P and the transcription factor NFY were among our top hits. Cell type specificity may also play a role. For example, ERp18, a small PDI previously implicated in ATF6⍺ activation (Oka et al 2019; PMID: 31368601), was not among our hits despite the presence of sgRNAs targeting hamster ERp18 in the library. Interestingly, depletion of ERp18 in our dual UPR reporter CHO-K1 cell line did not affect the ATF6⍺ and IRE1⍺ UPR branches. This new information has been incorporated into the revised manuscript as Supplemental Figure S6E and the discussion has been edited in line with these comments.

      (2) CLR interacts with ATF6 independently of ATF6 glycans (and cysteines). How do the authors reconcile this observation with the lectin functions of CALR? What is the interaction mode then - if the CALR N (lectin) domain is not involved, is it the P domain that is responsible for the interaction? All the binding experiments are performed in the presence of 1 mM CaCl2, is calcium necessary for CALR to achieve binding? 

      These points merit clarification. The Biolayer Interferometry (BLI) assay reported on an interaction between ATF6 and CRT that is independent of ATF6⍺ glycans. However, cell-based experiments revealed a contribution of glycan-dependent interactions to the binding and repression. Therefore, we conclude that the interaction of CRT with ATF6⍺ likely involves both lectin-dependent and lectin-independent interactions (dependent on the P-domain). Indeed, this hybrid model has previously been suggested as the mode of stable interaction of CRT with other substrates, as cited in the discussion section (Wijeyesakere et al., 2013; PMID: 24100026). CRT is a known calcium-dependent protein, and all the in vitro experiments were conducted in the presence of 1 mM CaCl2. We do not have data from experiments without CaCl2.

      (3) Does the introduction of the reporter system affect the normal BiP (or ATF6) protein levels in the cells? 

      To address this question, we have conducted new experiments comparing endogenous BiP protein levels between the reporter-containing cells and the parental CHO-K1 cells using immunoblotting and an anti-BiP antibody. These data indicate that the reporter system does not affect endogenous BiP protein levels. This new information has been incorporated as revised Supplemental Figure S1C.

      (4) Does the depletion of CRT affect BiP interaction with ATF6? The absence of CRT may lead to misfolding of glycoproteins and titration of BiP away from ATF6 leading to activation. An indicator of ER stress levels that is independent of ATF6 and IRE1 might be useful. 

      To further assess ER stress levels in CRT-depleted cells, we compared expression levels of endogenous ER resident proteins containing a KDEL signal (e.g., P3H1, GRP94, BiP and PDI) in parental CHO-K1 cells, dual UPR reporter cell lines (XC45-6S) and CRT-depleted cells (CRT∆#2P) under basal conditions and during ER stress by immunoblotting. This comparison confirmed the basal elevation in BiP protein level in cells lacking CRT, consistent with previous findings (Figure 2D) and more broadly the integrity of UPR signalling in cells lacking CRT. In the interest of time, we did not extend the analysis to other branches of the UPR. This new information has been incorporated as Supplemental Figure S5 and in the text of the revised manuscript.

      (5) Does CALR depletion alter ATF6 redox status. 

      We thank the reviewer for raising this interesting point. In response, we compared ATF6⍺ redox status in parental and CRT-depleted cells using non-reducing SDS-PAGE. Overall, the redox pattern was similar in parental and CRT-depleted cells, with the detection of two redox forms: an inter-chain disulfide-stabilised dimer and the monomer. Under basal conditions, ATF6⍺ predominantly existed as a monomer, while under ER stress, the monomer band decreased with a corresponding increase in a disulfide-stabilised dimer form in parental cells, as previously reported (Oka et al, 2022; PMID: 35286189). However, under ER stress, CRT-depleted cells showed a significantly higher fraction of monomer versus dimer compared to parental cells. Taken together, these data suggest that the loss of CRT may favour the monomeric form of ATF6α, which is proposed to be more efficiently trafficked (Nadanaka, et al 2007; PMID: 17101776), aligning with our observations that CRT depletion is associated with constitutive activation of ATF6α. These new data have been included as Supplemental Figure S7 and are explained in detail in the results section of the revised manuscript.

      (6) Figure 4C would benefit from some immunoblotting against BiP.

      Although we acknowledge the validity of this suggestion and understand the referee's interest in comparing the amount of CRT in pulldown with that of BiP, the necessity of generating additional samples makes this experiment impractical. Consequently, we opted not to include in our conclusion any comparison regarding the retention of ATF6α by BiP relative to CRT.

      (7) Overlooked requirement of cysteines for ATF6 functionality (Figure 5B). 

      We interpret this comment to refer to the inactivity of the cysteine-free allele of ATF6⍺. Whilst this is a reproducible observation of significance to the structure-activity features of ATF6⍺’s luminal domain, it is less informative in terms of understanding trans-active regulators of ATF6⍺ and was therefore not explored further.

      (8) Without a clear definition of the role of CRT in ATF6 folding, one cannot infer that the observed phenotype is not based on defects in ATF6 "folding" and glycosylation considering the possibility of activation of newly synthesised un-glycosylated ATF6. 

      If the main role of CRT were to assist ATF6⍺ folding, one would expect that depletion of CRT would lead to a non-functional ATF6⍺, resulting in ER retention and less activity. However, our data indicate that the loss of CRT correlates with the constitutive activation of the ATF6⍺ fluorescent reporter and increased Golgi trafficking and processing of ATF6⍺. Therefore, these data suggest that in CRT-depleted cells, the majority of ATF6⍺ is likely to fold to a functional state.

      (9) ATF6 was defined in several studies as a natively unstable protein and shows a close relationship with the ERAD machinery, is the role of CALR also involved in a quality control mechanism for natively unfolded ATF6? 

      The reviewer brings up another valid point. Although we have not closely evaluated the role of CRT in the quality control machinery, we observed that the loss of CRT was not associated with increased levels of ATF6⍺ in CRT-depleted cells in basal conditions compared with parental cells (Fig 3B.1, compare lane 1 and lane 7; Figure 3B.2, compare lane 1 and lane 5). If ATF6⍺ were degraded by ERAD and loss of CRT compromised ERAD functionality, CRT-depleted cells should exhibit increased levels of endogenous ATF6⍺. The fact that endogenous ATF6⍺ levels are slightly reduced in CRT-depleted cells does not support a role for CRT in the quality control mechanism for natively unfolded ATF6⍺.

      (10) C618 in ATF6 is located within the BiP binding site and in close proximity of an N-glycosylation site. Is this region of particular importance for CALR binding? 

      It is an interesting point that we have not explored in this study. Consequently, without experimental data, we cannot infer the possible implications of C618 in CRT binding.

      (11) The authors have mutated all the N glycosylation sites at once; they should be mutated one by one and the impact on ATF6 stability evaluated independently of the CALR status. 

      We agree that analysing each N-glycosylation site individually would provide further insight into their contributions to ATF6⍺ stability/functionality. However, given the scope of the paper in its present form, we have elected not to address this point.

      (12) The relationship between the absence of CALR and IRE1 remains weak. The authors do not exclude the possibility that CALR could have a direct effect on IRE1 itself. This should be either removed or further investigated. 

      We beg to differ. The relationship between the absence of CRT and IRE1 is not weak: loss of CRT in CHO-K1 cells represses IRE1. We readily concede, however, that the relationship is incompletely understood. ATF6⍺ signalling involves crosstalk with the IRE1 pathway, partly mediated by direct heterodimerisation of N-ATF6⍺ with XBP1s (Yamamoto et al., 2007, 2004). Additionally, recent research has shown that ATF6⍺ activity can repress IRE1 signalling (Walter et al., 2018). Therefore, given that our results indicate that the loss of CRT leads to constitutive activation of ATF6⍺, we suggest that a negative feedback loop in which ATF6⍺ represses IRE1 contributes to the observations made here on the relationship between CRT and IRE1. This does not exclude other aspects to the relationship, a point that is now clarified further in the revised manuscript. 

      Minor point 

      In the introduction on page 3 it is mentioned that loss of ATF6 impairs survival in cellular and animal models, this is not completely true as ATF6a ko in mice has no clear deleterious phenotype and only the double ko ATF6a/b has some dramatic impact.

      We have modified that sentence in the revised manuscript. 

      Reviewer #2 (Public Review): 

      Summary: 

      In this study, the authors set out to use an unbiased CRISPR/Cas9 screen in CHO cells to identify genes encoding proteins that either increase or repress ATF6 signalling in CHO cells. 

      Strengths: 

      The strengths of the paper include the thoroughness of the screens, the use of a novel, double ATF6/IRE1 UPR reporter cell line, and follow-up detailed experiments on two of the findings in the screens, i.e. FURIN and CRT, to test the validity of involvement of each as direct regulators of ATF6 signalling. Additional strengths are the control experiments that validate the ATF6 specificity of the screens, as well as, for CRT, the finding of focus, determining roles for the glycosylation and cysteines in ATF6 as mechanistically involved in how CRT represses ATF6, at least in CHO cells. 

      We thank the reviewer for their favourable assessment of our work.  

      Weaknesses: 

      (1) The weaknesses of the paper are that the authors did not describe why they focused only on the top 100 proteins in each list of ATF6 activators and repressors. 

      We concede that the more genes one studies the better. However, in whole genome CRISPR screens where thousands of hits arise, it is common practice for researchers to prioritise the candidates with the greatest significance, as those genes are likely to have a more meaningful impact on the phenotype under investigation. Therefore, our decision to focus on the top 100 genes was based on a desire to identify the most prominent and potentially impactful candidates for further analysis, ensuring a manageable scope for in-depth study while maintaining a measure of relevance and significance. Moreover, setting the threshold at 100 hits to perform GEO enrichment analysis is a practice used by previous researchers (PMID: 30323222; PMID: 37251921). In our case, the top 100 hits included the genes with an adjusted P < 0.005. For interested readers, the full ranked list is accessible in the GEO databank (GSE254745) and as supplemental Table S1.

      (2) Additionally, there were a few methodology items missing, such as the nature of where the insertion site in the CHO cell genome of the XBP1::mCherry reporter. Since the authors go to great lengths to insert the other reporter for ATF6 activation in a "safe harbor" location, it leads to questions about whether the XBP1::mCherry reporter insertion is truly innocuous. 

      We appreciate the opportunity to clarify certain aspects of our experimental procedures. In order to generate a double UPR reporter cell line, we employed the previously established XC45 CHO-K1 clone with an integrated XBP1s::mCherry reporter (Harding et al., 2019; PMID: 31749445). Since the ROSA26 safe harbor locus was available in the XC45 CHO-K1 cell line, we integrated the ATF6⍺ reporter directly there. To provide further clarity, the revised manuscript includes additional details in the Methods section regarding the creation of the XBP1 reporter.

      (3) An additional weakness is that the evidence for the physical interaction between ATF6LD and CRT is not strong, being dependent mainly on a single IP/IB experiment in Figure 4C that comprises only 1 lane on the gel for each of the test cases. Moreover, while that figure suggests that the interaction between CRT and ATF6 is decreased by mutating out the glycosylation sites in the ATF6LD, the BLI experiment in the same figure, 4B, suggests that there are no differences in the affinities of CRT for ATF6LD WT, deltaGly and deltaCys. 

      We would like to highlight that in the IP/IB experiments (see Figure 4C), where wildtype ATF6 (ATF6⍺_LDWT) and GFP-ATF6_LD∆Gly were transiently transfected, GFP-ATF6_LD∆Gly was expressed at lower levels than ATF6⍺_LDWT. These lower expression levels might explain why CRT is more prominently immunoprecipitated with ATF6⍺_LDWT and could account for the differences observed between the in vitro and in vivo assays.

      (4) An additional detail is that I found Figure 6A to be difficult to interpret, and that 6B was required in order for me to best evaluate the points being made by the authors in this figure. 

      We have simplified Figure 6A in the revised manuscript to make it more interpretable by focussing the reader’s attention on the transfected population. 

      Overall, I believe that this work will positively impact the field as it provides a list of potential regulators of ATF6 activation and repression that others will be able to use as a launch point for discovering such interactions in cells and tissues of interest beyond CHO cells. However, I agree with the authors that these findings were in CHO cell lines and that it is possible, if not likely, that some of the interactions they found will be cell type/line specific. 

      We accept this point and re-emphasize the qualification that our conclusions cannot be glibly extrapolated to other cell lines.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews: 

      Reviewer #1 (Public Review): 

      The goal of the current study was to evaluate the effect of neuronal activity on blood-brain barrier permeability in the healthy brain, and to determine whether changes in BBB dynamics play a role in cortical plasticity. The authors used a variety of well-validated approaches to first demonstrate that limb stimulation increases BBB permeability. Using in vivo-electrophysiology and pharmacological approaches, the authors demonstrate that albumin is sufficient to induce cortical potentiation and that BBB transporters are necessary for stimulus-induced potentiation. The authors include a transcriptional analysis and differential expression of genes associated with plasticity, TGF-beta signaling, and extracellular matrix were observed following stimulation. Overall, the results obtained in rodents are compelling and support the authors' conclusions that neuronal activity modulates the BBB in the healthy brain and that mechanisms downstream of BBB permeability changes play a role in stimulus-evoked plasticity. These findings were further supported with fMRI and BBB permeability measurements performed in healthy human subjects performing a simple sensorimotor task. There is literature to suggest that there are sex differences in BBB dysfunction in pathophysiological conditions and the authors have acknowledged the use of only males as a minor limitation of the study that should be addressed in the future. Future studies should also test whether the upregulation of OAT3 plays a role in cortical plasticity observed following stimulation. Overall, this study provides novel insights into how neurovascular coupling, BBB permeability, and plasticity interact in the healthy brain. 

      Reviewer #2 (Public Review): 

      Summary: 

      This study builds upon previous work that demonstrated that brain injury results in leakage of albumin across the blood brain barrier, resulting in activation of TGF-beta in astrocytes. Consequently, this leads to decreased glutamate uptake, reduced buffering of extracellular potassium and hyperexcitability. This study asks whether such a process can play a physiological role in cortical plasticity. They first show that stimulation of a forelimb for 30 minutes in a rat results in leakage of the blood brain barrier and extravasation of albumin on the contralateral but not ipsilateral cortex. The authors propose that the leakage is dependent upon neuronal excitability and is associated with an enhancement of excitatory transmission. Inhibiting the transport of albumin or the activation of TGF-beta prevents the enhancement of excitatory transmission. In addition, gene expression associated with TGF-beta activation, synaptic plasticity and extracellular matrix are enhanced on the "stimulated" hemisphere. That this may translate to humans is demonstrated by a break down in the blood brain barrier following activation of brain areas through a motor task. 

      Strengths: 

      This study is novel and the results are potentially important as they demonstrate an unexpected break down of the blood brain barrier with physiological activity and this may serve a physiological purpose, affecting synaptic plasticity. 

      The strengths of the study are: 

      (1) The use of an in vivo model with multiple methods to investigate the blood brain barrier response to a forelimb stimulation. 

      (2) The determination of a potential functional role for the observed leakage of the blood brain barrier from both a genetic and electrophysiological view point 

      (3) The demonstration that inhibiting different points in the putative pathway from activation of the cortex to transport of albumin and activation of the TGF-beta pathway, the effect on synaptic enhancement could be prevented. 

      (4) Preliminary experiments demonstrating a similar observation of activity dependent break down of the blood brain barrier in humans. 

      Weaknesses: 

      The authors adequately addressed most of my points. A few remain: 

      (1) Although the reviewers have addressed the possible effects of anaesthesia on neuro-vascular coupling. They have not mentioned or addressed the possible effects of ketamine (an NMDA receptor antagonist) on synaptic plasticity. Indeed, the low percentage of SEP increase following potentiation (10-20%) could perhaps be explained by partial block of NMDA receptors by ketamine.

      We agree and apologize for this oversight. This important issue is now addressed in the Discussion.

      “Notably, the antagonistic effect of ketamine on NMDA receptors might attenuate the magnitude of SEP potentiation recorded in our experiments (Anis et al., 1983; Salt et al., 1988).”

      (2) The experimental paradigms remain unclear to me. Now, it appears that drugs are applied for 50 minutes and that the stimulation occurs during the "washout period". The more conventional approach would be to have the drug application during the stimulation period to determine if the drugs occlude or enhance the effects of stimulation and then washout the drugs. The problem is that drugs variably washout at different rates depending upon their lipid solubility.

      We agree that the more conventional approach would have been to continue applying the drug throughout the experiment and that differential rates of washout may add variability to our experiments. However, despite this limitation, within each treatment group we found that the SEP response at 50 minutes (immediately after the drug application window) does not differ from SEP response at 80 minutes (after 30 minutes of stimulation and washout) [Figure 3H&G]. This suggests that the drug effects were still present despite terminating drug application and performing potentiation-inducing stimulation. Moreover, our analysis showed that animals within each treatment group (except AP5) had similar SEP responses with little intra-group variability.

      (3) It is still not clear to what extent the experimenters and those doing the analysis were blinded to group. If one or both were blind to group, then please put this in the methods.

      Thank you for this comment. We revised the Methods section to clearly confirm that data was collected and analyzed blindly.  

      Reviewer #3 (Public Review): 

      Summary: 

      This study used prolonged stimulation of a limb to examine possible plasticity in somatosensory evoked potentials induced by the stimulation. They also studied the extent that the blood brain barrier (BBB) was opened by the prolonged stimulation and whether that played a role in the plasticity. They found that there was potentiation of the amplitude and area under the curve of the evoked potential after prolonged stimulation and this was long-lasting (>5 hrs). They also implicated extravasation of serum albumin, caveolae-mediated transcytosis, and TGFb signalling, as well as neuronal activity and upregulation of PSD95. Transcriptomics was done and implicated plasticity related genes in the changes after prolonged stimulation, but not proteins associated with the BBB or inflammation. Next, they address the application to humans using a squeeze ball task. They imaged the brain and suggest that the hand activity led to an increased permeability of the vessels, suggesting modulation of the BBB. 

      Strengths: 

      The strengths of the paper are the novelty of the idea that stimulation of the limb can induce cortical plasticity in a normal condition, and it involves opening of the BBB with albumin entry. In addition, there are many datasets and both rat and human data. 

      Weaknesses: 

      The conclusions are not compelling however because of a lack of explanation of methods.

      In the revised paper, we added a section titled ‘study design’ that presents an overview of the experimental approach.

      The explanation of why prolonged stimulation in the rat was considered relevant to normal conditions should be as clear in the paper as it is in the rebuttal.

      We added a new paragraph to the Discussion section explaining this point as we did in the rebuttal:  

      “Our animal experiments show that a 30 min limb stimulation (at 6Hz and 2mA) increases cross-BBB influx, while a 1 min stimulation (of similar frequency and magnitude) does not. We believe that both types of stimulations fall within the physiological range because our continuous electrophysiological recordings showed no signs of epileptiform or otherwise pathological activity. Moreover, the recorded SEP levels were similar to those reported in previous physiological LTP studies in rats (Eckert & Abraham, 2010; Han et al., 2015; Mégevand et al., 2009) and humans (McGregor et al., 2016). In humans, skill acquisition often involves motor training sessions that last ≥30 minutes (Bengtsson et al., 2005; Classen et al., 1998) and result in physiological plasticity of sensory and motor systems (Classen et al., 1998; Draganski et al., 2004; Sagi et al., 2012). Hence, the experimental task in our human study (30 minutes of repetitive squeezing of an elastic stress-ball) is likely to represent physiological activity, with neuronal activation in primarily motor and sensory areas (Halder et al., 2005). Future human and animal studies are needed to explore the BBB modulating effects of additional stimulation protocols – with varying durations, frequencies, and magnitudes. Such studies may also elucidate the temporal and ultrastructural characteristics that differentiate between physiological and pathological BBB modulation.”

      The authors need to ensure other aspects of the rebuttal are as clear in the paper as in the rebuttal too. 

      Thank you for this comment. This was addressed in the revised paper.

      The only remaining concern that is significant is that it is hard to understand the figures. 

      Thank you for this comment. We revised the figures according to the reviewer’s recommendations. We hope that these changes increase the legibility of the figures. 

      Reviewer #3 (Recommendations For The Authors): 

      The manuscript is improved but there are still suggestions that do not appear to have been addressed. More experiments are not involved in addressing these concerns but one wants the paper to be clarified in terms of what was done. 

      Figures. Please use arrows to point to the effect that the reader should see. Please note what the main point is. 

      Major concerns: 

      Please add explanations, exact p values, and other revisions in the rebuttal to the paper. 

      Rebuttal explanations were added to the paper and p values appear in figure legends.

      Fig 1d shows a seizure-like event which the authors don't think is a seizure because it lacks a depolarization shift. This explanation is not convincing because a LFP would not necessarily show a depolarization shift. Another argument of a discussion of the event as a seizure is warranted. Note that expanding the trace might also show it is unlike a seizure. Regarding the idea that 6Hz 2 mA stimuli for 30 min are physiological, the authors make three arguments which are not clear. First, no epileptiform activity was found, but in Fig. 1 it looks like a seizure occurred. Second, memory and skill acquisition in humans often involve a similar training duration - but what about 6Hz 2 mA?

      Rats are known to rhythmically move their whiskers at frequencies ranging between 5 and 15 Hz (Mégevand et al., 2009). We agree that there is no clear way to justify the similarity between the experimental design in humans and rats. However, we believe that both paradigms (paw stimulation in rats and ball squeeze in humans) represent non-pathological input that we found to modulate barrier permeability. This argument was added to the discussion of the paper:

      “We believe that both types of stimulations fall within the physiological range because in rats, activity between 5 and 15 Hz represents physiological rhythmic whisker movement during environment exploration (Mégevand et al., 2009).” 

      Seizures are typically induced in rats via direct tetanic stimulation of the brain (at 50 Hz and 0.3-2.5mA) or maximal electroshock test to the cornea (at 50 Hz and 150 mA) (Swinyard et al., 1952). We, therefore, assert that the activity we observe represents physiological responses and not seizures. This argument is beyond the scope of the current paper. 

      Please note a limitation is that the high level of serum albumin is unlikely to be physiological but may not have been as high in the animal because of the low diffusion rate and degradation (please add the refs in the rebuttal). 

      Thank you, we added the following to the Results section: 

      “The relatively high concentration of albumin was chosen to account for factors that lower its effective tissue concentration such as its low diffusion rate and its likelihood to encounter a degradation site or a cross-BBB efflux transporter (Tao & Nicholson, 1996; Zhang & Pardridge, 2001).”

      Fig. 1. 

      Please consider a box in b to show where the expanded traces in the lower row came from. 

      Thank you for the suggestion. We added lines indicating where the trace excerpts were taken from.

      c. Please use arrows to point to the parts that the authors want the reader to note. In the legend, explain what t is, and delta HbT.

      Thank you. We implemented this suggestion.

      d. It is not clear what the double-sided arrows are meant to show compared to the arrow without two sides. 

      We replaced the two-headed arrow with two single ones.

      e. Please explain what the upward lines at the top signify. What does the red asterisk mean? 

      Thank you. We implemented this suggestion.

      f. Is the reader supposed to note the yellow area? Please make it with an arrow or circle if so. 

      Thank you, we added a white circle to mark the area of tracer accumulation.

      g. Please explain what the permeability index is or reference the part of the paper that does. 

      Further to this suggestion, we added to the legend a reference to the appropriate methods section.

      h. Please use arrows to point to the area of interest. 

      Thank you. We implemented this suggestion.

      m-n. Please mark areas of interest with arrows.  m. the top right two images are unclear. I suggest making them say ipsi inset and contra inset instead of using asterisks. 

      Thank you. We added the ipsi and contra labels to panels in m. The images in panel n represent a phenomenon with no particular region of interest, but rather peri-vascular tracer accumulation along the entire depicted blood vessel. We clarified that panel n represents a separate experiment from panel m: “n. In an animal injected with both EB and NaFlu post stimulation, fluorescence imaging shows extravascular accumulation of both tracers along a cortical small vessel in the stimulated hemisphere.”

      Figure 2. 

      (2) a. Middle. What are the vertical lines at the top? The rebuttal states that was explained in the revised legends but I don't see it. 

      Our apologies. We have now included an explanation that “an excerpt of the stimulation trace is shown above the middle LFP trace”.

      c and d are very different field potentials in shape and therefore hard to compare. The rebuttal addresses this but the explanation is not in the revised text. 

      We agree that there is variability in SEP responses between animals. We now added a statement acknowledging this in the methods section: “To overcome potential variability in SEP morphology between animals (Mégevand et al., 2009), each animal’s plasticity measures (max amplitude and AUC of post stimulation SEP) were compared to the same measures at baseline.” 

      In d, it is not clear there is potentiation because the traces are not aligned. 

      All panels depicting SEP traces represent raw data with no alignment. The shift observed in panel d exemplifies why we compare post-stimulation parameters of max amplitude and area under curve to baseline in each animal. 
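
      The within-animal comparison described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' analysis code: the function names, the use of the rectified trace for both measures, and the synthetic Gaussian traces are all hypothetical.

```python
import math

def sep_measures(trace, dt):
    """Return (max amplitude, area under the curve) for one averaged SEP trace.

    Assumption: both measures are taken on the rectified (absolute) signal;
    the exact definitions used in the paper are not given in this letter.
    """
    rectified = [abs(v) for v in trace]
    max_amp = max(rectified)
    # Trapezoidal integration over uniformly spaced samples (spacing dt).
    auc = sum((a + b) * 0.5 * dt for a, b in zip(rectified, rectified[1:]))
    return max_amp, auc

def relative_to_baseline(baseline_trace, post_trace, dt):
    """Express post-stimulation measures as ratios to the same animal's baseline."""
    base_amp, base_auc = sep_measures(baseline_trace, dt)
    post_amp, post_auc = sep_measures(post_trace, dt)
    return {"amplitude_ratio": post_amp / base_amp,
            "auc_ratio": post_auc / base_auc}

# Synthetic example: the post-stimulation response is 20% larger and shifted
# by 5 ms, mimicking traces that are not aligned in time.
dt = 0.001  # seconds per sample (hypothetical)
t = [i * dt for i in range(100)]
baseline = [math.exp(-((x - 0.020) / 0.005) ** 2) for x in t]
post = [1.2 * math.exp(-((x - 0.025) / 0.005) ** 2) for x in t]
result = relative_to_baseline(baseline, post, dt)
```

      Because both measures are computed per trace and then divided by the same animal's baseline, a latency shift between traces does not change the ratios in this toy case.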

      Exact P values are said to have been added in the rebuttal but they were not. 

      Exact P values appear in Figure legends.

      (3) b. Use arrows to mark the area of interest. 

      Thank you. We added a white circle to mark the area of tracer accumulation similar to Figure 1f.

      d. Why is there an oscillation superimposed on all traces except CNQX? 

      We agree that this is an interesting question. Future studies should determine the source of this SEP pattern.   

      (4) What does the line and the number 2 mean? How were data normalized? What was counted? What area of cortex?

      The number 2 labels the scale bar line: the length of the bar corresponds to a log fold change of 2. 

      The plot shows the log fold change against the mean count of each gene in the contralateral somatosensory cortex between 1 and 24 hours after stimulation.

      The x axis title was changed to “mean expression” and the legend was modified to:

      “Scatter plot of gene expression from RNA-seq in the contralateral somatosensory cortex 24 vs. 1 h after 30 min stimulation. The y axis represents the log fold change, and the x axis represents the mean expression levels (see methods, RNA Sequencing & Bioinformatics). Blue dots indicate statistically significant differentially expressed genes (DEGs) by Wald Test (n=8 rats per group).”
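
      For readers unfamiliar with this type of plot, the two axes can be illustrated with a minimal sketch. It assumes a base-2 logarithm, a pseudocount of 1, and made-up normalized counts; the function name is hypothetical, and the sketch omits the Wald test used to flag DEGs.

```python
import math

def ma_coordinates(counts_1h, counts_24h, pseudocount=1.0):
    """Return (mean expression, log2 fold change 24 h vs. 1 h) for one gene.

    counts_1h / counts_24h: normalized counts for the gene, one value per rat.
    The pseudocount avoids taking the logarithm of zero (an assumption here).
    """
    mean_1h = sum(counts_1h) / len(counts_1h)
    mean_24h = sum(counts_24h) / len(counts_24h)
    mean_expression = (mean_1h + mean_24h) / 2.0
    log2_fc = math.log2((mean_24h + pseudocount) / (mean_1h + pseudocount))
    return mean_expression, log2_fc

# Hypothetical gene: normalized counts rise from 3 to 15 between 1 h and 24 h.
x, y = ma_coordinates([3.0, 3.0], [15.0, 15.0])
```

      Under the base-2 assumption, a log fold change of 2 (the length of the scale bar) corresponds to a four-fold change in expression.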

      How were the pericytes, smooth muscle cells, etc. distinguished? 

      This was explained under Methods->RNA Sequencing & Bioinformatics: “Analysis of cell-specific and vascular zonation genes was performed as described (Vanlandewijck et al., 2018), using the database provided in (http://betsholtzlab.org/VascularSingleCells/database.html).”

      What were the chi square statistics? If there were cells used instead of rats, please justify. 

      Thank you. The legend was expanded to include the following:

      “The contralateral somatosensory cortex was found to have a significantly higher number of DEGs related to synaptic plasticity than the ipsilateral side (***p<0.001, Chi-square).”     
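
      As an illustration of how such a comparison of counts can be tested, the following sketch computes a Pearson chi-square statistic for a 2×2 table. The table layout (hemisphere by gene category), the absence of a continuity correction, and the counts are assumptions made for this example; they are not the data or the exact procedure from the paper.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Assumed layout for this example:
      row 1 = contralateral: a plasticity-related DEGs, b other genes
      row 2 = ipsilateral:   c plasticity-related DEGs, d other genes
    Returns (chi2, p) with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
chi2, p = chi_square_2x2(30, 10, 10, 30)
```

      With these made-up counts the statistic is 20 and the p value falls below 0.001; the real test would use the DEG counts reported in the manuscript.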

      (5) b. what do the icons mean? 

      We agree that the icons were confusing. We simplified this panel to just show when participants were asked to squeeze the ball (black icon). This explanation was added to the Figure legend.

      Abbreviations? 

      Abbreviations of MRI protocols were added to the figure legend for clarity.

      In c-e what are the units of measure? Fold-change? 

      The units represent t-statistics values for each voxel. The label ‘t-statistic’ was added to the figure.  

      What are the white lines, + and - signs? 

      The white lines point to voxels of highest activation (t-statistic). This was added to the legend.

      The apparent +/- signs are not symbols; they are voxels with significant activation that merely resemble such signs.

      f. Please explain f and g for clarity. 

      Thank you. The explanation was modified for added clarity.

      Supplemental Fig. 4. 

      Original question: If ipsilateral and contralateral showed many changes why do the authors think the effects were only contralateral? 

      The authors replied: Our gene analysis was designed to complement our in vivo and histological findings, by assessing the magnitude of change in differentially expressed genes (DEGs). This analysis showed that: (1) the hemisphere contralateral to the stimulus has significantly more DEGs than the ipsilateral hemisphere; and (2) the DEGs were related to synaptic plasticity and TGF-b signaling. These findings strengthen the hypothesis raised by our in vivo and histological experiments. 

      Could the authors clarify the answer to the question in the text? 

      Thank you. This section was added to the Discussion. 

      Papers referenced in this letter:

      Anis, N. A., Berry, S. C., Burton, N. R., & Lodge, D. (1983). The dissociative anaesthetics, ketamine and phencyclidine, selectively reduce excitation of central mammalian neurones by N-methyl-aspartate. British Journal of Pharmacology, 79(2), 565–575. https://doi.org/10.1111/j.1476-5381.1983.tb11031.x

      Bengtsson, S. L., Nagy, Z., Skare, S., Forsman, L., Forssberg, H., & Ullén, F. (2005). Extensive piano practicing has regionally specific effects on white matter development. Nature Neuroscience, 8(9), 1148–1150. https://doi.org/10.1038/nn1516

      Classen, J., Liepert, J., Wise, S. P., Hallett, M., & Cohen, L. G. (1998). Rapid plasticity of human cortical movement representation induced by practice. Journal of Neurophysiology, 79(2), 1117–1123. https://doi.org/10.1152/JN.1998.79.2.1117

      Draganski, B., Gaser, C., Busch, V., Schuierer, G., Bogdahn, U., & May, A. (2004). Changes in grey matter induced by training. Nature, 427(6972), 311–312. https://doi.org/10.1038/427311a

      Eckert, M. J., & Abraham, W. C. (2010). Physiological effects of enriched environment exposure and LTP induction in the hippocampus in vivo do not transfer faithfully to in vitro slices. Learning and Memory, 17(10), 480–484. https://doi.org/10.1101/lm.1822610

      Halder, P., Sterr, A., Brem, S., Bucher, K., Kollias, S., & Brandeis, D. (2005). Electrophysiological evidence for cortical plasticity with movement repetition. European Journal of Neuroscience, 21(8), 2271–2277. https://doi.org/10.1111/J.1460-9568.2005.04045.X

      Han, Y., Huang, M. De, Sun, M. L., Duan, S., & Yu, Y. Q. (2015). Long-term synaptic plasticity in rat barrel cortex. Cerebral Cortex, 25(9), 2741–2751. https://doi.org/10.1093/cercor/bhu071

      McGregor, H. R., Cashaback, J. G. A., & Gribble, P. L. (2016). Functional Plasticity in Somatosensory Cortex Supports Motor Learning by Observing. Current Biology, 26(7), 921–927. https://doi.org/10.1016/j.cub.2016.01.064

      Mégevand, P., Troncoso, E., Quairiaux, C., Muller, D., Michel, C. M., & Kiss, J. Z. (2009). Long-term plasticity in mouse sensorimotor circuits after rhythmic whisker stimulation. Journal of Neuroscience, 29(16), 5326–5335. https://doi.org/10.1523/JNEUROSCI.5965-08.2009

      Sagi, Y., Tavor, I., Hofstetter, S., Tzur-Moryosef, S., Blumenfeld-Katzir, T., & Assaf, Y. (2012). Learning in the Fast Lane: New Insights into Neuroplasticity. Neuron, 73(6), 1195–1203. https://doi.org/10.1016/j.neuron.2012.01.025

      Salt, T. E., Wilson, D. G., & Prasad, S. K. (1988). Antagonism of N-methylaspartate and synaptic responses of neurones in the rat ventrobasal thalamus by ketamine and MK-801. British Journal of Pharmacology, 94(2), 443–448. https://doi.org/10.1111/j.1476-5381.1988.tb11546.x

      Swinyard, E. A., Brown, W. C., & Goodman, L. S. (1952). Comparative assays of antiepileptic drugs in mice and rats. The Journal of Pharmacology and Experimental Therapeutics, 106(3), 319–330. http://jpet.aspetjournals.org/content/106/3/319.abstract

      Tao, L., & Nicholson, C. (1996). Diffusion of albumins in rat cortical slices and relevance to volume transmission. Neuroscience, 75(3), 839–847. https://doi.org/10.1016/0306-4522(96)00303-X

      Vanlandewijck, M., He, L., Mäe, M. A., Andrae, J., Ando, K., Del Gaudio, F., Nahar, K., Lebouvier, T., Laviña, B., Gouveia, L., Sun, Y., Raschperger, E., Räsänen, M., Zarb, Y., Mochizuki, N., Keller, A., Lendahl, U., & Betsholtz, C. (2018). A molecular atlas of cell types and zonation in the brain vasculature. Nature, 554(7693), 475–480. https://doi.org/10.1038/nature25739

      Zhang, Y., & Pardridge, W. M. (2001). Mediated efflux of IgG molecules from brain to blood across the blood–brain barrier. Journal of Neuroimmunology, 114(1–2), 168–172. https://doi.org/10.1016/S0165-5728(01)00242-9

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review): 

      Wang, He et al. have constructed a comprehensive single-nucleus atlas for the gills of the deep-sea Bathymodioline mussels, which possess intracellular symbionts that provide a key source of carbon and allow them to live in these extreme environments. They provide annotations of the different cell states within the gills, shedding light on how multiple cell types cooperate to give rise to the emergent functions of the composite tissues and the gills as a whole. They pay special attention to characterizing the bacteriocyte cell populations and identifying sets of genes that may play a role in their interaction with the symbiotes. 

      Wang, He et al sample mussels from 3 different environments: animals from their native methane rich environment, animals transplanted to a methane-poor environment to induce starvation and animals that have been starved in the methane-poor environment and then moved back to the methane-rich environment. They demonstrated that starvation had the biggest impact on bacteriocyte transcriptomes. They hypothesize that the up-regulation of genes associated with lysosomal digestion leads to the digestion of the intracellular symbiont during starvation, while the non-starved and reacclimated groups more readily harvest the nutrients from symbiotes without destroying them. Further work exploring the differences in symbiote populations between ecological conditions will further elucidate the dynamic relationship between host and symbiote. This will help disentangle specific changes in transcriptomic state that are due to their changing interactions with the symbiotes from changes associated with other environmental factors. 

      This paper makes available a high quality dataset that is of interest to many disciplines of biology. The unique qualities of this non-model organism and collection of conditions sampled make it of special interest to those studying deep sea adaptation, the impact of environmental perturbation on Bathymodioline mussels populations, and intracellular symbiotes. The authors also use a diverse array of tools to explore and validate their data. 

      Reviewer #2 (Public Review): 

      Wang, He et al. shed insight into the molecular mechanisms of deep-sea chemosymbiosis at the single-cell level. They do so by producing a comprehensive cell atlas of the gill of Gigantidas platifrons, a chemosymbiotic mussel that dominates the deep-sea ecosystem. They uncover novel cell types and find that the gene expression of bacteriocytes, the symbiont-hosting cells, supports two hypotheses of host-symbiont interactions: the "farming" pathway, where symbionts are directly digested, and the "milking" pathway, where nutrients released by the symbionts are used by the host. They perform an in situ transplantation experiment in the deep sea and reveal transitional changes in gene expression that support a model where starvation stress induces bacteriocytes to "farm" their symbionts, while recovery leads to the restoration of the "farming" and "milking" pathways. 

      A major strength of this study includes the successful application of advanced single nucleus techniques to a non-model, deep sea organism that remains challenging to sample. I also applaud the authors for performing an in situ transplantation experiment in a deep sea environment. From gene expression profiles, the authors deftly provide a rich functional description of G. platifrons cell types that is well-contextualized within the unique biology of chemosymbiosis. These findings offer significant insight into the molecular mechanisms of deep-sea host-symbiont ecology, and will serve as a valuable resource for future studies into the striking biology of G. platifrons. 

      The authors' conclusions are generally well-supported by their results. However, I recognize that the difficulty of obtaining deep-sea specimens may have impacted experimental design and no replicates were sampled. 

      It is notable that the Fanmao cells were much more sparsely sampled. It appears that fewer cells were sequenced, resulting in the Starvation and Reconstitution conditions having 2-3x more cells after doublet filtering. These discrepancies also are reflected in the proportion of cells that survived QC, suggesting a distinction in quality or approach. However, the authors provide clear and sufficient evidence via bootstrapping that batch effects between the three samples are negligible. While batch effect does not appear to have affected gene expression profiles, the proportion of cell types may remain sensitive to sampling techniques, and thus interpretation of Fig. S12 must be approached with caution. 

      Reviewer #3 (Public Review): 

      Wang et al. explored the unique biology of the deep-sea mussel Gigantidas platifrons to understand fundamental principles of animal-symbiont relationships. They used single-nucleus RNA sequencing, followed by validation and visualization, to identify many of the important cellular and molecular players that allow these organisms to survive in the deep sea. They demonstrate that a diversity of cell types supports the structure and function of the gill, including bacteriocytes, specialized epithelial cells that host sulfur-oxidizing or methane-oxidizing symbionts, as well as a suite of other cell types including supportive cells, ciliary cells, and smooth muscle cells. By transplanting mussels from a methane-rich habitat to methane-limited environments, the authors showed that starved mussels may consume their endosymbionts, whereas mussels in methane-rich environments upregulated genes involved in glutamate synthesis. These data add to the growing body of literature showing that organisms control their endosymbionts in response to environmental change. 

      The conclusions of the data are well supported. The authors adapted a technique that would have been technically impossible in their field environment by preserving the tissue and then performing nuclear isolation after the fact. The use of single-nucleus sequencing opens the possibility of new cellular and molecular biology that is not possible to study in the field. Additionally, the in-situ data (both WISH and FISH) are high-quality and easy to interpret. The use of cell-type-specific markers along with a symbiont-specific probe was effective. Finally, the SEM and TEM were used convincingly for specific purposes in the case of showing the cilia that may support water movement. 

      The one particular area for future exploration surrounds the concept of a proliferative progenitor population within the gills. The authors recover molecular markers for these putative populations, and future work will uncover whether these are indeed proliferative cells that contribute to symbiont colonization. 

      Overall, the significance of this work is in identifying the relationship between symbionts and bacteriocytes and how these host bacteriocytes modulate their gene expression in response to environmental change. It will be interesting to see how similar or different these data are across animal phyla. For instance, the work on symbiosis in cnidarians may converge on similar principles, or there may be independent ways in which organisms have been able to solve these problems. 

      We extend our sincere gratitude to all the reviewers for their positive comments and kind words. We highly value the substantial efforts they made in helping us improve and enhance our manuscript. Additionally, we thank the reviewers for pointing out the limitations of our current study, which will guide us in improving our future research.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      This study system is so interesting and this is a truly unique and exciting dataset. Most of my suggestions are aimed at improving readability and making it more accessible for a broader audience, since I predict many fields will find it interesting. 

      Line 60: which species of mussel? Is this the same one? 

      We appreciate the comments from the reviewer. The reference here is to deep-sea bathymodiolin mussels, which, in most cases, possess enlarged gill filaments that accommodate symbionts.

      Line 237-230: citation of previous findings missing 

      We appreciate the comments from the reviewer. After carefully reviewing these paragraphs, we believe that all the previous findings have now been properly cited.

      Line 256: it might be a good idea to give a brief description of what slingshot analysis is here 

      We appreciate the comments from the reviewer. We have revised the corresponding part of our manuscript to make it clear.

      This part of the manuscript now reads: “We performed Slingshot analysis, which uses a cluster-based minimum spanning tree (MST) and a smoothed principal curve to determine the developmental path of cell clusters. The result shows that the PEBZCs might be the origin of all gill epithelial cells, including the other two proliferation cells (VEPC and DEPC) and bacteriocytes (Supplementary Fig. S6).” Lines 203-207 of the revised manuscript.
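      To make the first stage of that description concrete, the cluster-based minimum spanning tree can be sketched as follows. This is a simplified illustration rather than the Slingshot implementation: it assumes plain Euclidean distance between cluster centroids in a low-dimensional embedding, grows the tree from a chosen root cluster with Prim's algorithm, and omits the principal-curve smoothing step entirely. The function name `cluster_mst` and the input format are hypothetical.

```python
from math import dist


def cluster_mst(centroids, root):
    """Build a minimum spanning tree over cluster centroids.

    centroids: dict mapping cluster name -> coordinate tuple in the embedding.
    root: name of the cluster assumed to be the lineage origin.
    Returns a list of (parent, child) edges in the order they were added.
    Euclidean distance is a simplifying assumption made for this sketch.
    """
    in_tree = {root}
    edges = []
    while len(in_tree) < len(centroids):
        # Prim's algorithm: pick the shortest edge leaving the current tree.
        _, parent, child = min(
            (dist(centroids[u], centroids[v]), u, v)
            for u in in_tree
            for v in centroids
            if v not in in_tree
        )
        edges.append((parent, child))
        in_tree.add(child)
    return edges
```

      Rooting the tree at a candidate progenitor cluster and reading the edges outward gives the ordering of clusters along each putative lineage, which is the sense in which one cluster can be described as a possible origin of the others.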

      Line 289: Wording is a bit confusing- what is meant by morphological analysis?

      We acknowledge that our wording might be a bit confusing here. We are referring to the TEM ultrastructural analysis. Therefore, we have changed “morphological analysis” to “ultrastructural analysis.” Line 231 in the revised manuscript.

      Line 351-354: how did you calculate distances? How many dimensions were used? 

      We calculated the centroid coordinates for each cell type in each state on the 2-dimensional UMAP plot (Fig. 6A). Then, for each cell type, we determined the Euclidean distance between the centroid coordinates of each pair of states. We have revised the manuscript with this more detailed description. Line 292-295 of revised manuscript.
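      The calculation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' script: the input is assumed to be one record per cell holding its cell type, its state (for example Fanmao, Starvation, or Reconstitution), and its two UMAP coordinates, and the function name `centroid_shifts` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import dist


def centroid_shifts(cells):
    """Euclidean distance between state centroids for each cell type.

    cells: iterable of (cell_type, state, umap_x, umap_y) records.
    Returns a dict mapping (cell_type, state_a, state_b) -> distance,
    with state_a < state_b in sorted order.
    """
    # Accumulate coordinate sums and cell counts per (cell type, state).
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for cell_type, state, x, y in cells:
        acc = sums[(cell_type, state)]
        acc[0] += x
        acc[1] += y
        acc[2] += 1

    # Centroid = mean UMAP coordinate of the cells in that group.
    by_type = defaultdict(dict)
    for (cell_type, state), (sx, sy, n) in sums.items():
        by_type[cell_type][state] = (sx / n, sy / n)

    # Pairwise distances between states, computed within each cell type.
    shifts = {}
    for cell_type, states in by_type.items():
        for a, b in combinations(sorted(states), 2):
            shifts[(cell_type, a, b)] = dist(states[a], states[b])
    return shifts
```

      Because UMAP distances are not metric-preserving, values produced this way are best read as a relative ranking of how far each cell type moves between states rather than as absolute magnitudes.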

      Line 462: identify -> identified 

      We apologize for our mistake and appreciate the reviewer’s kind assistance with proofreading. The typo has been corrected in the new version. Line 396 of the revised manuscript.

      Line 509: what does the size of the dot represent? 

      In this context, the color and intensity of each dot represent a specific gene’s expression level in the single-cell cluster. The dot size is universal and therefore does not convey a specific meaning.

      Fig 3A: What is the blue cluster highlighted? 

      We apologize for our mistake. The label for the teal box was missing. We have corrected this in the revised manuscript.

      Fig 3K: Wording in key is confusing. 

      We have modified our description of Figure 3K in the figure legends. Now it reads: “Schematic of water flow agitated by different ciliary cell types. The color of arrowheads corresponds to water flow potentially influenced by specific types of cilia, as indicated by their color code in Figure 3A.” Lines 462-464 in the revised manuscript.

      Fig 5B: which population of mussels was used to take these images? 

      Mussels from the “Fanmao” (methane-rich) site were used to take these images. We have revised our Materials and Methods to make this clear. Lines 602-603 of the revised manuscript.

      Fig 5E,5G,5H: panels not referenced in text 

      We apologize for our mistake and appreciate the reviewer’s thorough reading. This error has been corrected in the new version of the manuscript. Line 233 of the revised manuscript.

      Reviewer #2 (Recommendations For The Authors): 

      Minor comments: 

      Fig. 3A - the teal box in the legend lacks a label 

      We apologize for our mistake. The label for the teal box was missing. We have corrected this in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors): 

      My enthusiasm for the manuscript remains high and I appreciate the authors care in responding to the various reviewer questions and concerns. 

      In regards to the cell proliferation results, I have modified my public review and look forward to your future work in this area. The data for both pHistone H3 and anti PCNA are compelling! 

      One typo I did catch occurs on line 520. I believe you meant to say "outer" not "otter." 

      We apologize for our mistake and appreciate the reviewer’s kind assistance with proofreading. The typo has been corrected in the new version.

    1. Author response:

      Reviewer #1 (Public Review):

      This study excellently complements the previous one by unveiling the properties of NPRL2 in augmenting the effect of immune checkpoint inhibitors such as pembrolizumab in KRAS mutant lung cancer models.

      The following points should be clarified:

      (1) In KRAS mutant cell lines with LKB1 co-mutations or deletions, such as A549 cells, does treatment with NPRL2 not increase the efficacy of immunotherapy? Is this correct? Similarly, does the delivery of NPRL2 only potentiate the effect of immunotherapy in KRAS mutant cell lines without associated LKB1 mutations?

      NPRL2, when used as a single-agent immunotherapy, induces robust antitumor activity in immunotherapy-resistant (aPD1R) KRAS mutant models, such as A549 tumors (KRASmt/LKB1mt/aPD1R) and LLC2 (KRASmt/aPD1R), where immunotherapy is ineffective regardless of LKB1 co-mutation or deletion status. The antitumor effect of NPRL2 combined with aPD1 immunotherapy was not significantly different from NPRL2 alone in immunotherapy-resistant models but was significantly greater than immunotherapy alone. However, a synergistic antitumor effect was observed with NPRL2 and aPD1 immunotherapy in KRAS wild-type and immunotherapy-moderately-responsive models, such as H1299 (KRASwt/aPD1S).

      (2) Do the authors analyze by western blot if NPRL2 influences or restores STING and LKB1 in the A549 cell line that lacks LKB1 and STING?

      NPRL2 induces antitumor immunity in KRAS mutant, aPD1-resistant models regardless of LKB1 co-mutations or deletions. However, it would be interesting to look into the effect of NPRL2 on the STING pathway in the LKB1-deleted A549 cell line.

      (3) Mechanistically, is there any explanation as to why NPRL2 delivery increases the efficacy of immunotherapy? Is there any effect on FUS or MYC?

      NPRL2 is a multifunctional tumor suppressor gene that is downregulated or absent in many cancers. NPRL2 has been shown to induce apoptosis, inhibit cell proliferation, and cause cell cycle arrest in various cancer types. Compelling evidence highlights the critical role of NPRL2 in causing DNA damage and double-strand breaks, which can trigger dendritic cell (DC) activation, antigen presentation, and priming of tumor-specific CD8+ T cells in the tumor microenvironment (TME). Our data indicate that NPRL2 treatment is associated with the induction of DC activation and maturation.

      The cellular mechanism of NPRL2 suggests that NPRL2-mediated antitumor immunity depends on the presence of CD4+ T cells, CD8+ T cells, and macrophages. Interestingly, the expression of FUS1, another tumor suppressor gene, was mostly absent or severely downregulated in most non-small cell lung cancers (NSCLC) and was unaffected by NPRL2 treatment. While MYC expression was not assessed in this study, it remains an area of interest for future research.

      (4) Is there any way to carry out a clinical study of systematically delivering NPRL2 in KRAS lung cancer patients?

      In this preclinical study, a clinical-grade DOTAP-NPRL2 formulation was prepared, utilizing NPRL2 encapsulated within nanovesicles for delivery. Based on the promising preclinical data, a phase I clinical trial will be initiated to evaluate the safety and efficacy of this formulation.

      Reviewer #2 (Public Review):

      Summary:

      In “NPRL2 gene therapy induces effective antitumor immunity in KRAS/STK11 mutant anti-PD1 resistant metastatic non-small cell lung cancer (NSCLC) in a humanized mouse model”, Meraz et al. investigated the antitumor immune responses to NPRL2 gene therapy in aPD1R/KRAS/STK11mt NSCLC in a humanized mouse model, and found that NPRL2 gene therapy induces antitumor activity on KRAS/STK11mt/aPD1R tumors through DC-mediated antigen presentation and cytotoxic immune cell activation.

      Strengths:

      The novelty of the study.

      Weaknesses:

      (1) The inconsistent effect of NPRL2 combined with pembrolizumab. Figure 2I-K, showed a similar tumor intensity in the NPRL2 group and combination group. However, NPRL2 combined with pembrolizumab was synergistic in the KRASwt/aPD1S H1299 tumors in Figure 4.

      NPRL2, as a single-agent immunogen therapy, induces robust antitumor activity both in immunotherapy-resistant (aPD1R) KRAS mutant models, such as A549 tumors (KRASmt/LKB1mt/aPD1R) and LLC2 (KRASmt/aPD1R), and in immunotherapy-sensitive models such as H1299 (KRASwt/aPD1S), where immunotherapy was ineffective or of limited efficacy. A synergistic antitumor effect of the NPRL2 and pembrolizumab combination was found only in immunotherapy-moderately-responsive models, not in immunotherapy-resistant models, where PD-1/PD-L1 signaling is impaired, as shown in Figure 1A.

      (2) The authors stated that NPRL2 combined with pembrolizumab was not synergistic in the KRAS/STK11mt/aPD1R tumors but was synergistic in the KRASwt/aPD1S H1299 tumors. How was the synergistic effect defined in the study? More details need to be provided here.

      Our biostatistician used generalized linear regression models to study tumor growth over time. Two-way ANOVA with the interaction of treatment group and time point was performed to compare the changes in tumor intensity from baseline between each pair of treatment groups at each time point. The nonparametric Mann-Whitney U test was applied to compare significance between treatment groups. Differences of P < 0.05, P < 0.01, and P < 0.001 were considered statistically significant. When the combination antitumor effect of NPRL2 and pembrolizumab was found to be statistically significant compared to both single-agent effects, synergy was confirmed using the method of Huang et al.

      Huang L, Wang J, Fang B, Meric-Bernstam F, Roth JA, Ha MJ. CombPDX: a unified statistical framework for evaluating drug synergism in patient-derived xenografts. Sci Rep 12(1):12984, 7/2022. e-Pub 7/2022. PMCID: PMC9338066.

      (3) Nearly all of the work was performed pre-clinically. Validation in the clinical setting would provide more strong evidence for the conclusion.

      In this preclinical study, a clinical-grade DOTAP-NPRL2 formulation was prepared, utilizing NPRL2 encapsulated within nanovesicles for delivery. Based on the promising preclinical data, a phase I clinical trial will be initiated to evaluate the safety and efficacy of this formulation.

      (4) Figure 5 and Figure 6 have the same legend. These 2 figures could be merged as a new one.

      Agreed.

      (5) Figure 5B & C, n=9 in the Figure 5B. However, the detail number in Figure 5C was less than 9.

      Between 7 and 9 mice per group are shown in Figure 5C. We will revise accordingly.

      Reviewer #3 (Public Review):

      Summary:

      NPRL2/TUSC4 is a tumor suppressor gene whose expression is reduced in many cancers including NSCLC. This study presents a novel finding on NPRL2 gene therapy, which induces antitumor activity on aPD1-resistant tumors. Since KRAS/STK11 mutant tumors were reported to be less benefited from ICIs, this study has potential clinical application value.

      Strengths:

      This work uncovers the advantage of NPRL2 gene therapy by using humanized models and multiple cell lines. Moreover, via immune cell depletion studies, the mechanism of NPRL2 gene therapy has focused on dendritic cells and CD8+T cells.

      Weaknesses:

      A major concern would be the lack of systematic, and logical rigor. This work did not present a link between apoptosis and antigen presenting induced by NPRL2 restoration. There is no evidence proving that the PI3K/AKT/mTOR signaling pathway is related to antigen presenting, which is the major reason of NPRL2 induced antitumor response. Therefore, the two parts may not support each other logically.

      Thank you for your review and comments. We agree that future studies are necessary to establish a direct link between apoptosis and antigen presentation induced by NPRL2 restoration, as well as between NPRL2-mediated downregulation of PI3K/AKT/mTOR signaling and antigen presentation. That said, NPRL2 restoration directly induced apoptosis in several cell lines, as shown in Figure 1C and Figure 8Q, and significantly increased the number of antigen-presenting DCs in the tumor microenvironment. Similarly, NPRL2 restoration downregulated the PI3K/AKT/mTOR pathway, which was associated with increased antitumor immunity.

    1. Author response:

      We thank the reviewers for their thorough comments on our manuscript. We appreciate their recognition of the strengths in our study, including addressing the significant problem of neonatal sepsis in preterm infants using a preterm piglet model, the robustness of our multi-omics dataset, and our multi-pronged approach to examining the physiological changes under different glucose management regimens.

      This document addresses our initial responses to the main concerns of the 3 reviewers. We will provide more detailed responses to their comments and revise the manuscript at a later date.

      In response to Reviewer #1, we acknowledge the concern about high blood glucose levels in the control group. This work is a follow-up to our previous work (Muk et al., JCI Insight 2022), where we explored different PN glucose regimens. Taken together, our experiments suggest a linear relationship between glucose provision and infection severity, indicating that increased glucose may heighten mortality risk, while radical reduction could reduce mortality due to sepsis but cause hypoglycemia and brain damage. As for the discrepancy in survival rates between Figures 1B and 6B, this is due to a shortened follow-up time in the follow-up experiment. This was done to minimize animal suffering because relevant differences in immune responses were detectable within 12 hours in the primary experiment. As for the relationship between bacterial burdens and glucose, we agree that lower bacterial density in piglets receiving the reduced glucose PN may result from slower bacterial growth. However, we analyzed the relationship between bacterial burdens and mortality and found that they did not correlate within any of the treatment groups. This finding inspired us to further explore the relationship between bacterial burdens and infection responses in our model, which has resulted in our recent preprint: Wu et al. Regulation of host metabolism and defense strategies to survive neonatal infection. BioRxiv 2024.02.23.581534; doi: https://doi.org/10.1101/2024.02.23.581534

      For Reviewer #2, the distinction between early (EOS) and late onset sepsis (LOS) in the time cut-off makes sense clinically because they are likely to be caused by different organisms and origins (EOS with maternal origin and LOS with postnatal origin) and therefore require different empirical antibiotic regimens. However, it is also important to acknowledge that the pathophysiology of “sepsis” may be similar regardless of timing and pathogen and depends on the degree of immune activation. Therefore, even though the infection in our model is initiated on the first day after birth, the organism that we use, Staphylococcus epidermidis (the bacterium most commonly detected in LOS), makes it a better model for LOS. As for neutrophil-specific transcripts, we only collected the whole blood transcript during the experiments, which reflects the transcriptomic profile of all the leucocytes. Since we did not do single-cell RNA sequencing during the experiment, there is no possibility of isolating the neutrophil transcriptome at this time. As for the question of a “safe glucose infusion rate”, there likely is none, as the immune responses to glucose intake do not seem binary but increase with glucose intake. Our reduced glucose PN was chosen as it corresponded with the low end of recommended guidelines for PN glucose intake. However, the reduced glucose intervention still resulted in significant morbidity and a 25% mortality within 22 hours. There is therefore still vast room for improvement, but even though further reduction in PN glucose intake would probably provide further protection, it would entail dangerous hypoglycemia. The findings in this paper have prompted us to explore several alternative strategies to both reduce infection-related mortality and maintain glucose homeostasis. Thus, the optimal PN for infected newborns would probably differ from standard PN in all macronutrient compartments and will require much more preclinical and clinical research.

      For Reviewer #3, we acknowledge the variability in data collected from animals at euthanasia. These endpoints represent snapshots of the animals' states at euthanasia, which is a clear limitation of our method. Therefore, we do not know what metabolic processes precede the development of lethal sepsis, although the increases in plasma lactate suggest a higher rate of glycolysis in animals on high glucose PN. However, we believe the data still heavily imply a causal relationship between energy metabolic processes, especially glycolytic breakdown of glucose, and the pro-inflammatory responses leading to sepsis. In our recent preprint mentioned above we further explored the metabolic responses in pigs that succumbed to sepsis, compared to those that survived and found that survival was strongly associated with increases in mitochondrial metabolism and reduction in glycolysis.

      We hope these clarifications and our commitment to further research address your concerns satisfactorily. Thank you for your valuable feedback.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1 (Public Review):

      “but an obvious influencing factor that the authors could investigate in their own data set is the retinal input. In Fig1b, the authors even show these data in the form of gaze and pupil size. In these example data, by eye, it looks like the pupil size is positively correlated with the run speed. This would of course have large consequences on the activity in V1, but the authors do not do anything with these data. The study would improve substantially if the authors would correlate their run speed traces with other factors that they have recorded too, such as pupil size and gaze.”

      Absolutely. We have added a first level of eye movement (and pupil size) analyses to the revised manuscript, resulting in an additional figure. In short, we found that eye movements are unlikely to play a significant role in our primary results, as the patterns of eye movements differed only slightly between running and stationary periods, and the measured impacts of such eye movements were also quantitatively much smaller than the primary effect sizes.

      In analyzing the eye movements, we also found that pupil size was larger during running than during stationary periods. This is suggestive evidence that running is correlated with increases in arousal. Although more work will be needed to calibrate and quantify how much this factor affects neural responses (and perhaps to dissociate it from running per se), the simple analysis we present suggests that the large differences we observe are unlikely to be explained by a difference between how arousal and running are correlated in the monkey versus the mouse. Instead, it appears that both species have at least qualitatively similar relations between pupil size (a standard proxy for arousal) and running.

      On this issue, we have added extensive discussion of the relevant recent work by Talluri et al. (2023) who attempted a similar cross-species analysis that considered spontaneous body movements and their effect on cortical activity (as well as the possibility that eye movements are a critical mediator in these modulations). Due to delays in revising our manuscript, we regret that our earlier submission had not cited this work, but we now do our best to highlight its importance and the synergy between these two papers. The full citation is listed below:

      Talluri BC, Kang I, Lazere A, Quinn KR, Kaliss N, Yates JL, Butts DA, Nienborg H. Activity in primate visual cortex is minimally driven by spontaneous movements. Nat Neurosci. 2023 Nov;26(11):1953-1959. doi: 10.1038/s41593-023-01459-5.

      There is a finer level of analysis that we hope to do in the future along these lines. It would rely on detailed characterization of each receptive field, building an image-computable model linking those receptive fields to the neural activity, and doing so at a finer time grain that links individual eye movements and changes in the spike train within a stimulus presentation (as opposed to working at the level of spike counts per stimulus presentation). Because these steps need to be accomplished together— and each requires substantial additional work and would go beyond the first-order findings we report in this work— we hope to report on such finer analyses in a standalone paper later. We are working on being able to do this in both marmoset and mouse.

      More generally, we want to emphatically agree that what is missing from this paper is the “why?”! We have done our best to show that a fair comparison reveals quantitatively different phenomena in marmoset and mouse. In the revised discussion, we lay out many lines of work that we hope will gain traction on this deeper mechanistic point. There’s a lot to do, and several of the possibilities are already current topics of exploration in our ongoing work.

      “Looking at the raster plot, however, shows that this strong positive correlation must be due entirely to the lower half of the neurons significantly increasing their firing rate as the mouse starts to run; in fact, the upper 25% or so of the neurons show exactly the opposite (strong suppression of the neurons as the mouse starts running). It would be more balanced if this heterogeneity in the response is at least mentioned somewhere in the text.”

      We are also intrigued by the heterogeneity of effects at the single neuron level. That is why the next section of the paper is dedicated to analyzing effects on a cell-by-cell basis. The fractions of neurons showing either increases or decreases are described separately, to get at this very issue.

      Reviewer 2 (Public Review):

      “For example, it is known that the locomotion gain modulation varies with layer in the mouse visual cortex, with neurons in the infragranular layers expressing a diversity of modulations (Erisken et al. 2014 Current Biology). However, for the marmoset dataset, it was not reported from which cortical layer the neurons are from, leaving this point unanswered.”

      Reviewer 2 called for more consideration of details that have been addressed in the mouse literature, such as the cortical layer of the cells, and related aspects of circuitry. We have greatly re-worked the Discussion to address several of these issues. In short, the manuscript’s set of data were collected without strong traction on layers or cell types, and it will be quite interesting to get a better handle on this using both refinements to our recording procedures as well as new techniques that are now possible in the marmoset for future studies.

      “In this regard, it is worth noting that the authors report an interesting difference between the foveal and peripheral parts of the visual cortex in marmoset. It will be interesting to investigate these differences in more detail in future studies. Likewise, while running might be an important behavioral state for mice, other behavioral states might be more relevant for marmosets and do modulate the activity of the primate visual cortex more profoundly. Future work could leverage the opportunities that the marmoset model system offers to reveal new insights about behavioral-related modulation in the primate brain.”

      Same page! We have expanded the discussion to better emphasize these points and are already deep in follow-up experiments to explore the foveal and peripheral representations.

      Reviewer 3 (Public Review):

      “However, the authors did not take full advantage of the quantity and diversity of the marmoset visual cortex recordings in their analyses. They mention recording and analyzing the activity of peripheral V1 neurons but mainly present results involving foveal V1 neurons. Foveal neurons, with their small receptive fields strongly affected by precise eye position, would seem to be less likely to be comparable to rodent data. If the authors have a reason for not doing so, they should provide an explanation.”

      We agree, and hope the reviewer finds our overall reply, detailed response to Reviewer 1 (who raised a similar issue), and corresponding updates to the manuscript appropriate for this stage of understanding.

      “Given that the marmosets are motivated to run with liquid rewards, the authors should provide more context as to how this may or may not affect marmoset V1 activity. Additionally, the lack of consideration of eye movements or position presents a major absence for the marmoset results, and fails to take advantage of one of the key differences between primate and rodent visual systems - the marmosets have a fovea, and make eye movements that fixate in various locations on the screen during the task.”

      In addition to the response above, we have made edits to the manuscript to speak to issues of arousal and eye movements (also detailed in previous responses). Given the modest decrease in activity we see, the usual concerns about potential increases in neural activity related to eye movements (which we quantify in the revision) and other issues related to motivation are hard to specifically relate to existing literature. But in the revised Discussion we talk more about how future work can/should dissociate these factors, as has been done in the mouse literature.

      “Finally, the model provides a strong basis for comparison at the level of neuronal populations, but some methodological choices are insufficiently described and may have an impact on interpreting the claims.”

      We have also clarified the shared-gain model’s description, which we agree needed additional detail and clarification.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents a useful comparison of the dynamic properties of two RNA-binding domains. The data collection and analysis are solid, making excellent use of a suite of NMR methods. However, evidence to support the proposed model linking dynamic behavior to RNA recognition and binding by the tandem domains remains incomplete. The work will be of interest to biophysicists working on RNA-binding proteins.

      We thank eLife for taking the time and effort to review our manuscript. Evidence from the literature and from our study shows a close correspondence between the dynamic behavior of dsRBDs and their dsRNA recognition and binding, which led us to propose the model. As already mentioned in the manuscript, we are working on the suggested experiments to support our proposed model further.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the manuscript entitled "Differential conformational dynamics in two type-A RNA-binding domains drive the double-stranded RNA recognition and binding," Chugh and co-workers utilize a suite of NMR relaxation methods to probe the dynamic landscape of the TAR RNA binding protein (TRBP) double-stranded RNA-binding domain 2 (dsRBD2) and compare these to their previously published results on TRBP dsRBD1. The authors show that, unlike dsRBD1, dsRBD2 is a rigid protein with minimal ps-ns or us-ms time scale dynamics in the absence of RNA. They then show that dsRBD2 binds to canonical A-form dsRNA with a higher affinity compared to dsRBD1 and does so without much alteration in protein dynamics. Using their previously published data, the authors propose a model whereby dsRBD2 recognizes dsRNA first and brings dsRBD1 into proximity to search for RNA bulge and internal loop structures.

      We thank the Reviewer for sending us an encouraging review. We have combined the findings reported in the literature with new ones that led us to propose the dsRNA-binding model by tandem A-form dsRBDs.

      We propose that dsRBD1 can first recognize a variety of sequential and structurally different dsRNAs. dsRBD2 assists the interaction with a higher affinity, thus fortifying the interaction between TRBP and a possible substrate. This may enable the other associated proteins like Dicer and Ago2 to perform critical biological functions.

      However, we feel that a few statements in the comment above are factually incorrect.

      Statement 1. “They then show that dsRBD2 binds to canonical A-form dsRNA with a higher affinity compared to dsRBD1 and does so without much alteration in protein dynamics.”

      We have explicitly shown the perturbation in dsRBD2 dynamics upon RNA binding.

      Statement 2. “Using their previously published data, the authors propose a model whereby dsRBD2 recognizes dsRNA first and brings dsRBD1 into proximity to search for RNA bulge and internal loop structures.”

      Our previously published data suggests that dsRBD1, owing to its high conformational dynamics in solution, is able to recognize a variety of structurally and sequentially different dsRNAs ([Paithankar et al., 2022]). dsRBDs preferably bind to the double-stranded region (minor-major-minor-groove) of an A-form RNA ([Acevedo et al., 2016]; [Vuković et al., 2014]) and do not search for bulge and internal loop structures as a part of the binding event. Even though dsRBDs preferably bind to the double-stranded region, they can still accommodate perturbation in the A-form helix due to mismatch and bulges with decreased binding affinity ([Acevedo et al., 2015]). However, it is a matter of future research to identify how much of a deviation from the A-form structure can be accommodated by the dsRBDs. The diffusion event observed in the literature ([Koh et al., 2013]) also does not show any direct implication for searching for bulge and internal loop structures.

      Strengths:

      The authors expertly use a variety of NMR techniques to probe protein motions over six orders of magnitude in time. Other NMR titration experiments and ITC data support the RNA-binding model.

      Weaknesses:

      The data collection and analysis are sound. The only weakness in the manuscript is the lack of context with the much broader field of RNA-binding proteins. For example, many studies have shown that RNA recognition motif (RRM) domains have similar dynamic characteristics when binding diverse RNA substrates. Furthermore, there was no discussion about the entropy of binding derived from ITC. It might be interesting to compare with dynamics from NMR.

      We understand the reviewer’s point that this study is focused on a dsRNA-binding mechanism rather than addressing the much broader field of RNA-binding. There are multiple challenges in finding a single mechanism that works for all RNA-binding proteins. For instance, RRM is a single-stranded RNA binding domain that is able to read out the substrate base sequence. RRM behaves entirely differently than the dsRBD in terms of target specificity. Besides, several other RNA-binding domains, like the KH-domain, Puf domains, Zinc finger domains, etc., showcase a unique RNA-binding behavior. Thus, it would be really difficult to draw a single rule of thumb for RNA-recognition behavior for all these diverse domains.

      Thank you for pointing out the entropy of binding from ITC. We have now included the entropy of binding discussion in the main text, page 7.

      Reviewer #2 (Public Review):

      Summary:

      Proteins that bind to double-stranded RNA regulate various cellular processes, including gene expression and viral recognition. Such proteins often contain multiple double-stranded RNA-binding domains (dsRBDs) that play an important role in target search and recognition. In this work, Chugh and colleagues have characterized the backbone dynamics of one of the dsRBDs of a protein called TRBP2, which carries two tandem dsRBDs. Using solution NMR spectroscopy, the authors characterize the backbone motions of dsRBD2 in the absence and presence of dsRNA and compare these with their previously published results on dsRBD1. The authors show that dsRBD2 is comparatively more rigid than dsRBD1 and claim that these differences in backbone motions are important for target recognition.

      Strengths:

      The strengths of this study are multiple solution NMR measurements to characterize the backbone motions of dsRBD2. These include 15N-R1, R2, and HetNOE experiments in the absence and presence of RNA and the analysis of these data using an extended-model-free approach; HARD-15N-experiments and their analysis to characterize the kex. The authors also report differences in binding affinities of dsRBD1 and dsRBD2 using ITC and have performed MD simulations to probe the differential flexibility of these two domains.

      Weaknesses:

      While it may be true that dsRBD2 is more rigid than dsRBD1, the manuscript lacks conclusive and decisive proof that such changes in backbone dynamics are responsible for target search and recognition and the diffusion of TRBP2 along the RNA molecule. To conclusively prove the central claim of this manuscript, the authors could have considered a larger construct that carries both RBDs. With such a construct, authors can probe the characteristics of these two tandem domains (e.g., semi-independent tumbling) and their interactions with the RNA. Additionally, mutational experiments may be carried out where specific residues are altered to change the conformational dynamics of these two domains. The corresponding changes in interactions with RNA will provide additional evidence for the model presented in Figure 8 of the manuscript. Finally, there are inconsistencies in the reported data between different figures and tables.

      We thank the reviewer for the comprehensive and insightful review. A larger construct carrying both RBDs was not used because of the multiple challenges pertaining to dynamics study by NMR spectroscopy (intrinsic R2 rates of the dsRBD1-dsRBD2 construct would be high, resulting in broadened peaks) as per our previous experience ([Paithankar et al., 2022]). There would be additional dynamics in that construct coming from domain-domain relative motions, and it is difficult to deconvolute the dynamics information. Further, the dsRNA needed to bind to this construct will be longer, causing further line broadening in NMR.

      Coming to mutational studies, careful design of domain mutants remains a challenge because the conformational dynamics in both domains are distributed throughout the backbone rather than confined to the RNA-binding residues. The mutational studies would need an exhaustive number of mutations in the protein as well as the RNA to draw a parallel between binding and dynamics. Having said that, we are working on making such mutations in the protein (at several locations, to freeze the dynamics site-specifically) and the RNA (to change the shape of the dsRNA) to study this mechanism systematically, which is beyond the scope of this manuscript.

      The reviewer has rightly pointed out some subtle differences in the reported data between different figures and tables. These differences arise from the context in which we describe the data. For example, in Figure S4, we report the average relaxation rates and nOe values for only the residues that we were able to analyze at both magnetic field strengths, 600 and 800 MHz, whereas in Figure 6 we compare the averages of the core (159-227) dsRBD residues at 600 MHz in the presence and absence of D12RNA. The differences, however, are minute and fall well within the error range.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Suggestions for improved or additional experiments -

      In regards to ITC data, dsRBD1 does not bind canonical A-form RNA with high affinity. What is dsRBD1 and dsRBD2 affinity to the miR-16 RNA?

      We have not performed ITC-based studies with miR-16 RNA for the domains. The study by Acevedo et al. has shown the effect of lengths of Watson-Crick duplex RNAs upon TRBP2 dsRBD binding. In this study, they have compared the ds22 RNA to miRNA/miRNA* duplex. By using EMSA, they show that the Kd,app (μM) for dsRBD1 is 3.5±0.2 and for dsRBD2 is 1.7±0.1, indicating a higher affinity by the latter ([Acevedo et al., 2015]).

      What was the amount of time used for the 1H saturation in the heteronuclear NOE experiment? Based on the average T1 (1/1.44 s-1) = 0.69 s, a recovery delay of >7 s should have been used for this experiment.

      According to Cavanagh et al., the recovery/recycle delay should be greater than 5*1/R1 to ensure that 99% of the 1HN and 15N magnetization is restored ([Protein NMR Spectroscopy, Principles and Practice, 1996]). In our study, we used a relaxation delay of 5 s, which is greater than 7*1/R1avg, thus ensuring that at least 99% of the 1HN and 15N bulk magnetization is recovered.
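
      The arithmetic in this exchange can be checked with a short sketch. It assumes simple mono-exponential recovery of longitudinal magnetization, M(t)/M0 = 1 - exp(-R1*t), and uses only the average 15N R1 of 1.44 s-1 quoted in the reviewer's comment. The function and variable names are illustrative and do not come from the manuscript.

```python
import math

R1_AVG = 1.44  # s^-1, average R1 quoted in the reviewer's comment


def recovered_fraction(delay_s: float, r1: float = R1_AVG) -> float:
    """Fraction of equilibrium magnetization recovered after delay_s.

    Assumes mono-exponential recovery, which is a simplification
    made for this sketch only.
    """
    return 1.0 - math.exp(-r1 * delay_s)


t1 = 1.0 / R1_AVG      # about 0.69 s
five_t1 = 5 * t1       # about 3.47 s
seven_t1 = 7 * t1      # about 4.86 s

print(f"T1 = {t1:.2f} s, 5*T1 = {five_t1:.2f} s, 7*T1 = {seven_t1:.2f} s")
print(f"recovery after 5*T1: {recovered_fraction(five_t1):.4f}")
print(f"recovery after 5 s:  {recovered_fraction(5.0):.4f}")
```

      Under these assumptions, a 5 s delay exceeds 7*T1 and corresponds to more than 99% recovery for a nucleus relaxing at the quoted average rate. The sketch says nothing about residues or nuclei with slower relaxation than that average.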

      Recommendations for improving writing and presentation -

      Figure 3 - The legend in panel C is incomplete.

      Figure 3 (Figure 4 in the revised manuscript) has been updated, and the legend now reads complete.

      Figures 3 E and F - The three views can be combined into one as is done in Figures 4 C and D.

      Thanks for the kind suggestion. We have depicted the kex in the three ranges to highlight the difference between the two domains at each range. Since there are three different exchange regimes with different populations, we believe this gives us an uncomplicated picture while classifying and comparing the dynamics between the two. Combining the three views into one becomes too overwhelming to visualize kex and population distribution in the protein.

      Figure 3 - The residues indicated in the text (e.g., R200, L212, and R224) should be indicated in panels E and F.

      We have marked the residues described in the text in Figure 4C (revised Figure 5C), and thus, they are not mentioned in Figures 3E and 3F (revised Figures 4E and 4F).

      The results and discussion put these findings into minimal context. Most comparisons are made between dsRBD1 and dsRBD2. What about other RNA-binding proteins? There is a wealth of structure/dynamics/functional data about RNA recognition motifs, which do exactly the same thing as described here but are missing.

      We understand the reviewer’s point that this study is focused on a dsRNA-binding mechanism rather than addressing the much broader field of RNA-binding. There are multiple challenges in finding a single mechanism that works for all RNA-binding proteins. For instance, RRM is a single-stranded RNA-recognition motif that can read out the substrate base sequence. RRM behaves entirely differently than the dsRBD in terms of sequence specificity. Besides, several other RNA-binding domains, like the KH-domain, Puf domains, Zinc-finger domains, etc., showcase a unique RNA-binding behaviour. Thus, with the current knowledge, it would not be possible to draw a single rule of thumb for RNA-recognition behaviour for all these diverse domains. Hence, a comparison of the findings of this study with those for other RNA-binding domains is beyond the scope of this study.

      Results, page 8 - I'm not sure that allosteric quenching is appropriately invoked here. The amount of residues showing dynamics in the apo state is small and the number only moderately increases upon RNA binding. The observation that some residues show an increase and a neighboring residue shows a decrease (or vice versa) upon RNA binding could just be random with the small number of observations. This observation would be more convincing if it were happening to larger regions within the protein.

      We agree with the reviewer that the number of residues showing dynamics in the apo state of dsRBD2 is small compared with that of dsRBD1, and that the number increases only moderately upon RNA binding. However, we believe it is important to invoke allosteric quenching because all the residues in which dynamics are newly induced lie in spatial proximity, as also observed in dsRBD1 ([Paithankar et al., 2022]). It serves not only to compare the differences and similarities between the two domains but also to highlight that this phenomenon is common to both type-A dsRBDs of TRBP.

      Minor corrections -

      Introduction, page 2 - The order parameter should be defined for non-NMR experts.

      Thank you for the suggestion. The definition of order parameter has now been included on page 2 of the revised manuscript.

      Introduction, page 2 - TRBP should be defined in the main text the first time used.

      We have now defined TRBP on page 2 of the revised manuscript, where it is used in the main text for the first time.

      Results, page 5 - The reference for the HARD experiment should be given earlier in that paragraph.

      Thank you for the suggestion. We have now referenced the HARD experiment earlier in the last paragraph on page 5 of the revised manuscript.

      Results, page 7 - What is the limiting amount of RNA used for the D12-bound dsRBD2 spin relaxation measurements?

      The limiting amount of RNA used for the D12-bound dsRBD2 spin relaxation measurements is 0.05 equivalent (RNA:Protein = 50 μM:1000 μM). It has now been included on page 7 of the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Throughout the manuscript, NMR datasets are not consistent with one another (a few examples are listed below).

      Figures S4, 6, and Table S4: (a) It is unclear why relaxation data for certain residues are missing in Table S4 (e.g., S156, V168, E177, F192, etc.).

      We thank the reviewer for pointing this out. We have now reanalyzed the data for all the above-mentioned residues and other missing residues. In the revised manuscript, we have added the data for residues such as E177 and R189, and for many more N- and C-terminal residues. Unfortunately, for some residues, namely V168, S184, F192, S209, and L222, we witnessed severe peak broadening while measuring the R2 rates and/or nOe. Hence, data for V168, S184, F192, S209, and L222 are missing in Table S4. We now state explicitly in the table legends that data are missing for these residues.

      (b) The reported values are not consistent. For example, Figure S4 says that the average 15N-R2 rate is 10.85 +/- 0.36 s-1 whereas Figure 6 says the 15N-R2 rate is 11.02 +/- 0.39 s-1 for the same dataset.

      The superficial differences are present because of the context in which we are describing the data (now mentioned in the methods section on page 13). In Figure S4, we are talking about the average relaxation rates and nOe values for only the common residues we could analyze between two magnetic field strengths, 600 and 800 MHz. Whereas in Figure 6 (revised figure 3), we compare the averages of all the analyzed core dsRBD residues at 600 MHz in the presence and absence of D12RNA. The differences, however, are insignificant, falling well within the error range.
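
      One way to make "within the error range" concrete is to express the gap between the two quoted averages in units of their combined uncertainty. The sketch below is only an illustration: it treats the two errors as independent, which is an assumption, since the two averages are computed over overlapping sets of residues. The function name is hypothetical.

```python
import math


def difference_in_sigma(a: float, err_a: float, b: float, err_b: float) -> float:
    """Absolute difference between two means, in units of the combined error.

    The errors are added in quadrature, i.e. treated as independent
    (an assumption of this sketch).
    """
    combined = math.hypot(err_a, err_b)
    return abs(a - b) / combined


# Average 15N-R2 values (s^-1) quoted in the reviewer's comment:
# 10.85 +/- 0.36 (Figure S4) and 11.02 +/- 0.39 (Figure 6).
z = difference_in_sigma(10.85, 0.36, 11.02, 0.39)
print(f"difference = {z:.2f} combined standard errors")
```

      With the quoted values, the difference is well under one combined standard error, which is consistent with the authors' statement that it is insignificant.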

      (c) There is also a discrepancy in reported R2 values (at 600 MHz) in Table S4. It is unclear to me what the reported values are, as most of these are below 1 s-1.

      Thank you very much for pointing out our mistake here. Table S4 had the wrong values for R2 at 600 MHz. However, the raw data submitted to the BMRB as entry 52077 hold the correct information. We have now updated Table S4.

      (d) It is also unclear as to why perfectly resolved residues (e.g., L230, A232, D234, etc.) have been omitted from these data (and other datasets such as 15N-CPMGs shown in Figure S6).

      The residues L230, A232, D234, etc., are the C-terminal residues of TRBP-dsRBD2 beyond the core (159-227 aa) fold of dsRBD. They have now been included in the revised figures S6 and S11 for completeness.

      (e) Figure 6 reports a 15N-R2 of 21 s-1 for one of the residues in the absence of RNA. This data point has been omitted from Figure S4.

      In Figure S4, we are talking about relaxation rates and nOe values only for the common residues we could analyze between the two magnetic field strengths, 600 and 800 MHz. Thus, that 15N-R2 value has been omitted.

      The S2 order parameters reported in Figures S5 and S10 are inconsistent with one another, as additional residues are shown in S10 (e.g., N159).

      Thank you for pointing it out. We have now reanalyzed the data for S2 order parameter and Rex by including more residues (e.g., N159, R189, etc) in the core and have updated both Figures S5 and S10. Please see the revised supplementary information.

      Tables S6 and S7 report values for residue R189. This residue has been omitted in every other dataset. Based on the 1H-15N HSQC spectrum shown in Figure S3, this residue gives a well-resolved crosspeak (which lies adjacent to V228). Can the authors explain why they omit data for this residue in Figures S4, 6, and Table S4?

      The reviewer is correct in pointing out that data for R189 is missing in the fast dynamics data, such as Figure S4, Figure 6 (revised figure 3), and Table S4. We have now reanalyzed our raw data and included data for R189 and other missing residues in our updated manuscript. Please see the revised figures S4 and 6 (revised figure 3) and the revised table S4.  

      Moreover, this residue lies in the loop2 region of this domain. Based on the MD simulations (Figure 2), this region is more flexible compared to the rest of the domain. Does the corresponding 15N-relaxation data support this claim?

      Yes, the apo 15N-relaxation data do strongly support this claim. R189 showed a higher than core average R2 rate (R189 = 15.44 +/- 0.69 s-1; core = 10.92 +/- 0.37 s-1) and a lower than core average nOe (R189 = 0.49 +/- 0.05; core = 0.73 +/- 0.03) which indicate a higher flexibility than the rest of the core (updated Figure 3 and Table S4). Additionally, the S2 order parameter for R189 was found to be 0.52 +/- 0.03, slightly lower than the core average of 0.59 +/- 0.03, indicating a more flexible region than the core (updated Table S14). Moreover, the dynamics parameters extracted from HARD experimental data using the geoHARD method for apo TRBP2-dsRBD2 shown in Table S18 depict a high kex value of 31748.72 +/- 955.20 Hz for R189. This supports the claim that this residue is highly flexible with a high exchange rate.

      Figure S9. I was not able to follow this dataset as the data points are not consistent between different residues.

      In Figure S9, the residue-wise peak intensities plotted against the RNA concentration indicate that line broadening was witnessed for all the core residues (irrespective of the initial peak intensity). Another interesting observation is that the terminal residues do not undergo the same line broadening as seen in the core residues.

      It is also unclear why residue G185 is highlighted.

      It is taken as an example and magnified to show the extent of line broadening. This is now explicitly mentioned in the figure caption in the revised supplementary information.

      It is also not clear exactly what the authors are trying to fit, as I see no chemical shift changes upon the addition of RNA (Fig. S8), and the equation used for data fitting (pg. 11) uses chemical shift changes (and not the changes in intensities).

      The same equation can be used to fit the chemical shift perturbation and peak intensity perturbation as a function of ligand concentration. Here, we have tried to fit the intensity perturbation. We have now modified the statement on page 11 in the revised manuscript.

      Table S2: The ITC analysis reports an n value of ~3. Can authors elaborate as to what this means?

      The stoichiometry of ~3 indicates the number of TRBP2-dsRBD2 molecules that can interact with D12 RNA in a single binding event. The minimum binding register for dsRBDs is known to be >8 bp (12 bp for optimal binding) ([Ramos et al., 2000]), and one single domain only covers one-third of the face of the cylindrical RNA ([Masliah et al., 2018]). Hence, 3 dsRBD2 could interact with a 12-mer RNA in solution.

      The reported Kd values between the main text (page 7) and Figure 5 are not consistent with one another (one lists 1.18 uM while the other says 1.11 uM). Table S2 does not list the parameters for interactions between dsRBD1 and D12.

      Figure 5 (revised Figure 6) shows the data from a single experiment out of a total of three, whereas the main text reports 1.18 μM, the average Kd value (Table S2).

      Figure S4: The red axis should read "211" instead of "111".

      Thank you for your helpful insight. We have now changed it in the revised figure.

      Table S3 lists the structural motifs of the two dsRBDs, which are nearly identical to one another, and yet the manuscript claims that these are different (page 4, paragraph 1).

      We agree with the reviewer that the differences are minute but important, which we have tried to highlight in this paper. In particular, loop 2, critical for dsRNA-binding ([Masliah et al., 2012]), is 1 residue longer in dsRBD2 and may contribute to enhanced substrate binding.

      Figure S8 shows severe signal attenuation for many residues upon the addition of 100 uM RNA. The most notable among these are residues M194, T195, and C196. Can the authors explain how they measure 15N-relaxation rates for these residues in the presence of 50 uM D12?

      First, we measured the 15N-relaxation rates for these residues in the presence of 50 μM D12 (RNA:Protein = 50 μM:1000 μM), corresponding to 0.05 equivalent RNA. This RNA equivalent is lower than that used for the HSQC-based titration shown in Figure S8, 0.1 equivalent RNA (RNA:Protein = 5 μM:50 μM), where we witness line broadening for residues like M194, T195, and C196. Second, we increased the overall protein concentration from 50 μM (used in the HSQC-based titration) to 1000 μM (used in relaxation measurements) to ensure a better signal-to-noise ratio in all the spectra.
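
      The molar equivalents quoted in this response follow directly from the concentration pairs. As a minimal sketch (the helper name is hypothetical, and the concentration unit cancels as long as both values share it):

```python
def rna_equivalents(rna_conc: float, protein_conc: float) -> float:
    """Molar equivalents of RNA relative to protein.

    Both concentrations must be given in the same unit.
    """
    if protein_conc <= 0:
        raise ValueError("protein concentration must be positive")
    return rna_conc / protein_conc


relaxation = rna_equivalents(50, 1000)  # relaxation samples, RNA:Protein = 50:1000
titration = rna_equivalents(5, 50)      # HSQC titration point, RNA:Protein = 5:50

print(f"relaxation: {relaxation:.2f} eq, titration: {titration:.2f} eq")
```

      The relaxation samples therefore carry half the RNA equivalents of the titration point at which line broadening was reported, even though the absolute RNA concentration is higher.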

      Use the same coloring scheme for Figures S7 and S8.

      Thank you for the suggestion. We have now edited Figure S8 accordingly.

      Figures are often listed out-of-order, making it difficult to follow the manuscript.

      Thank you for the suggestion. We have now amended the main text to refer to the figures sequentially. While doing so, we have renumbered Figure 6 as Figure 3, Figure 3 as Figure 4, Figure 4 as Figure 5, and Figure 5 as Figure 6.

      Figure captions for the relaxation data should specify the temperature at which these datasets were collected.

      Thanks for the valuable suggestion. We have now added the temperature wherever applicable.

      References

      Acevedo R, Evans D, Penrod KA, Showalter SA. 2016. Binding by TRBP-dsRBD2 Does Not Induce Bending of Double-Stranded RNA. Biophys J 110:2610–2617. doi:10.1016/j.bpj.2016.05.012

      Acevedo R, Orench-Rivera N, Quarles KA, Showalter SA. 2015. Helical Defects in MicroRNA Influence Protein Binding by TAR RNA Binding Protein. PLoS ONE 10:e0116749. doi:10.1371/journal.pone.0116749

      Koh HR, Kidwell MA, Ragunathan K, Doudna JA, Myong S. 2013. ATP-independent diffusion of double-stranded RNA binding proteins.

      Masliah G, Barraud P, Allain FH-T. 2012. RNA recognition by double-stranded RNA binding domains: a matter of shape and sequence. Cell Mol Life Sci 70:1875–1895. doi:10.1007/s00018-012-1119-x

      Masliah G, Maris C, König SL, Yulikov M, Aeschimann F, Malinowska AL, Mabille J, Weiler J, Holla A, Hunziker J, Meisner‐Kober N, Schuler B, Jeschke G, Allain FH. 2018. Structural basis of siRNA recognition by TRBP double‐stranded RNA binding domains. EMBO J 37:e97089. doi:10.15252/embj.201797089

      Paithankar H, Tarang GS, Parvez F, Marathe A, Joshi M, Chugh J. 2022. Inherent conformational plasticity in dsRBDs enables interaction with topologically distinct RNAs. Biophys J 121:1038–1055. doi:10.1016/j.bpj.2022.02.005

      Protein NMR Spectroscopy, Principles and Practice, John Cavanagh, Wayne J. Fairbrother, Arthur G. Palmer III, and Nicholas J. Skelton. Academic Press, San Diego, 1995, 587 pages, $59.95. ISBN: 0-12-164490-1. 1996. J Magn Reson, Ser B 113:277. doi:10.1006/jmrb.1996.0189

      Ramos A, Grünert S, Adams J, Micklem DR, Proctor MR, Freund S, Bycroft M, Johnston DS, Varani G. 2000. RNA recognition by a Staufen double‐stranded RNA‐binding domain. EMBO J 19:997–1009. doi:10.1093/emboj/19.5.997

      Vuković L, Koh HR, Myong S, Schulten K. 2014. Substrate Recognition and Specificity of Double-Stranded RNA Binding Proteins. Biochemistry 53:3457–3466. doi:10.1021/bi500352s

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The current study aims to quantify associations between the regular use of proton-pump inhibitors (PPI) - defined as using PPI most days of the week during the last 4 weeks at one cross-section in time - with several respiratory outcomes up to several years later in time. There are 6 respiratory outcomes included: risk of influenza, pneumonia, COVID-19, other respiratory tract infections, as well as COVID-19 severity and mortality).

      Strengths:

      Several sensitivity analyses were performed, including i) estimation of the e-value to assess how strong unmeasured confounders should be to explain observed effects, ii) comparison with another drug with a similar indication to potentially reduce (but not eliminate) confounding by indication.

      We are grateful for your pointing out the strengths in our article, particularly the assessment of e-values and the comparison with another medication to mitigate confounding by indication. We extend our sincere gratitude to the reviewer for identifying multiple concerns and offering constructive feedback to help improve our manuscript. We will incorporate these suggestions into our revisions.

      Weaknesses:

      (1) The main exposure of interest seems to be only measured at one time-point in time (at study enrollment) while patients are considered many years at risk afterwards without knowing their exposure status at the time of experiencing the outcome. As indicated by the authors, PPI are sometimes used for only short amounts of time. It seems biologically implausible that an infection was caused by using PPI for a few weeks many years ago.

      We agree with the reviewer that PPIs are sometimes used for only short amounts of time, as indicated in our manuscript. We acknowledge that it is a limitation of the UK Biobank cohort, and we have discussed this in the discussion section as follows:

      “Given that the PPI exposure was mainly assessed at the baseline recruitment, it was possible that a small proportion of PPI users was misclassified during the follow-up due to the medication discontinuation, which may result in an underestimation of potential risk.” (Page 14, Line 8-10)

      In addition, to alleviate these concerns, we have conducted an effect moderation analysis for the subgroup of potential long-term users, defined as participants with indications for PPI use. This information has been included in the discussion section:

      “In addition, no effect moderation was observed in subgroup analyses for the main outcome among PPI users with indications (more likely to regularly use PPIs for a long period) compared to those without indications, indicating the risks remained increased among long-term PPI users.” (Page 14, Line 12-15)

      We hope that in the future, the concerns highlighted by the reviewer can be resolved by utilizing datasets with close follow-up, especially regarding medication use:

      “Since the follow-up prescription data was lacking in our study to precisely identifying the long-term users, further evaluation using cohorts with close follow-up is needed.” (Page 14, Line 15-17)

      (2) Previous studies have shown that by focusing on prevalent users of drugs, one often induces several biases such as collider stratification bias, selection bias through depletion of susceptible, etc.

      Because of the limitations of data from the UK Biobank, such as the absence of details on initiation of medications and regular monitoring, we were restricted to using a prevalent user design to assess the associations between PPI use and respiratory outcomes. We have discussed it in the limitation section:

      “Given that the PPI exposure was mainly assessed at the baseline recruitment, it was possible that a small proportion of PPI users was misclassified during the follow-up due to the medication discontinuation, which may result in an underestimation of potential risk. However, the prevalent user design could underestimate the actual risks of PPI use for respiratory infections, which indicates the real effect might be stronger [38]……Since the follow-up prescription data was lacking in our study to precisely identifying the long-term users, further evaluation using cohorts with close follow-up is needed.” (Page 14, Line 8-17)

      (3) It seems Kaplan Meier curves are not adjusted for confounding through e.g. inverse probability weighting. As such the KM curves are currently not informative (or the authors need to make clearer that curves are actually adjusted for measured confounding).

      Your kind suggestions are greatly appreciated. We have plotted Kaplan Meier curves adjusted for confounding by inverse probability weighting with the measured confounders according to the reviewer’s advice. The methods and results are demonstrated as follows:

      “The event-free probabilities were compared by Kaplan-Meier survival curves with inverse probability weights adjusting for the measured covariates.” (Page 8, Line 13-15)

      “Regular PPI users had lower event-free probabilities for influenza and pneumonia compared to those of non-users (Supplementary Figure 2 A-B).” (Page 9, Line 21-23)

      “PPI users had lower event-free probabilities for COVID-19 severity and mortality, but not COVID-19 positivity compared to those of non-users (Supplementary Figure 2 C-E).” (Page 10, Line 9-10)
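
      The adjustment described above can be illustrated with a weighted Kaplan-Meier estimator. The following is a minimal sketch, not the authors' code: it assumes a propensity score (probability of regular PPI use given the measured covariates) has already been estimated for each participant by a separate model, and all names and data are hypothetical.

```python
from collections import defaultdict


def ipw_weight(treated, propensity):
    """Inverse probability of treatment weight for one participant."""
    return 1.0 / propensity if treated else 1.0 / (1.0 - propensity)


def weighted_km(times, events, weights):
    """Weighted Kaplan-Meier curve as a list of (time, event-free probability)."""
    event_w = defaultdict(float)  # summed weights of events at each time
    exit_w = defaultdict(float)   # summed weights of all exits (events + censorings)
    for t, e, w in zip(times, events, weights):
        exit_w[t] += w
        if e:
            event_w[t] += w
    at_risk = sum(weights)
    surv = 1.0
    curve = []
    for t in sorted(exit_w):
        if event_w[t] > 0:
            surv *= 1.0 - event_w[t] / at_risk
            curve.append((t, surv))
        at_risk -= exit_w[t]  # remove everyone who left the risk set at time t
    return curve


# Hypothetical PPI users: follow-up time (years), event indicator, propensity score.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 0, 1, 1]
scores = [0.5, 0.25, 0.5, 0.8]
weights = [ipw_weight(True, p) for p in scores]
print(weighted_km(times, events, weights))
```

      With all weights equal to 1 the function reduces to the ordinary Kaplan-Meier estimator; the curve for non-users would be obtained the same way using weights of 1/(1 - propensity).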

      (4) Throughout the manuscript the authors seem to misuse the term multivariate (using one model with e.g. correlated error terms to assess multiple outcomes at once) when they seem to mean multivariable.

      We apologize for using the term “multivariate” where “multivariable” was meant in our previous manuscript. We have corrected the misused terms throughout the manuscript:

      “Univariate and multivariable Cox proportional hazards regression models were utilized to assess the association between regular use of PPIs and the selected outcomes.” (Page 7, Line 19-20)

      “The remaining imbalanced covariates (standardized mean difference ≥ 0.1) after propensity score matching were further adjusted by multivariable Cox regression models to calculate HRs and 95% CIs.” (Page 8, Line 23-25)
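
      The balance criterion quoted above (standardized mean difference ≥ 0.1) can be sketched for a continuous covariate as follows. This uses the common pooled-variance definition of the standardized mean difference, which is an assumption here because the response does not state which variant was used; the ages are hypothetical.

```python
import math
import statistics


def standardized_mean_difference(treated, control):
    """SMD for a continuous covariate using the pooled (averaged) sample variance."""
    pooled_sd = math.sqrt(
        (statistics.variance(treated) + statistics.variance(control)) / 2
    )
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd


# Hypothetical ages in matched PPI users and non-users.
age_users = [58, 61, 64, 66, 70]
age_nonusers = [57, 60, 63, 66, 68]
smd = standardized_mean_difference(age_users, age_nonusers)
flag = "imbalanced" if abs(smd) >= 0.1 else "balanced"
print(f"SMD for age = {smd:.3f} ({flag})")
```

      A covariate flagged as imbalanced under this rule would be the kind that is carried forward into the multivariable Cox model.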

      (5) Given multiple outcomes are assessed there is a clear argument for accounting for multiple testing, which following the logic of the authors used in terms of claiming there is no association when results are not significant may change their conclusions. More high-level, the authors should avoid the pitfall of stating there is evidence of absence if there is only an absence of evidence in a better way (no statistically significant association doesn't mean no relationship exists).

      We have revised our interpretation of the results, particularly those without a statistically significant association, based on the reviewer’s advice, and clearly recognize that the conclusions should be interpreted with caution:

      “In contrast, the risk of COVID-19 infection was not significant with regular PPI use…” (Page 2, Line 11-12)

      “PPI users were associated with a higher risk of influenza (HR 1.74, 95%CI 1.19-2.54), but the risks with pneumonia or COVID-19-related outcomes were not evident.” (Page 2, Line 14-16)

      “…while the effects on pneumonia or COVID-19-related outcomes under PPI use were attenuated when compared to the use of H2RAs.” (Page 2, Line 18-19, in the Abstract)

      “…while their association with pneumonia and COVID-19-related outcomes is diminished after comparison with H2RA use and remains to be further explored.” (Page 15, Line 21-22, in the Conclusion)
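
      To make the reviewer's point about multiple testing concrete, the sketch below applies the Holm step-down correction to six p-values, one per respiratory outcome. The response does not state that any correction was applied, and the p-values are hypothetical and chosen only for illustration.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values (NOT taken from the study).
outcomes = [
    "influenza", "pneumonia", "COVID-19 positivity",
    "other respiratory infections", "COVID-19 severity", "COVID-19 mortality",
]
raw = [0.001, 0.004, 0.40, 0.03, 0.02, 0.049]
for name, p, p_adj in zip(outcomes, raw, holm_adjust(raw)):
    print(f"{name}: raw p = {p:.3f}, Holm-adjusted p = {p_adj:.3f}")
```

      In this made-up example several associations that are nominally significant at 0.05 no longer are after adjustment, which is why the choice of correction can change which findings are described as associations.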

      (6) While the authors claim that the quantitative bias analysis does show results are robust to unmeasured confounding, I would disagree with this. The e-values are around 2 and it is clearly not implausible that there are one or more unmeasured risk factors that together or alone would have such an effect size. Furthermore, if one would use the same (significance) criteria as used by the authors for determining whether an association exists, the required effect size for an unmeasured confounder to render effects 'statistically non-significant' would be even smaller.

      We agree with the reviewer that there might still exist one or more unmeasured risk factors that have effect sizes larger than 2. Hence, we cannot affirm that the findings are robust to unmeasured confounding in the current analysis, which is a limitation of our study. We have deleted the previous statement, and added more discussion in the limitation section:

      “Moreover, patients with exacerbations of respiratory disorders (e.g., asthma, COPD) might suffer from a wide range of gastrointestinal symptoms that lead to the use of PPIs [38]. Due to the lack of data for respiratory severity and close follow-up for medication use, residual confounding might still exist due to the observational nature.” (Page 14, Line 23-27)
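
      The reviewer's argument about E-values can be made concrete with the standard VanderWeele-Ding formula, E = RR + sqrt(RR × (RR − 1)). The sketch below is illustrative rather than the authors' analysis: it treats the hazard ratio as an approximation of the risk ratio (reasonable only for rare outcomes) and uses the influenza estimate quoted earlier in this response (HR 1.74, 95% CI 1.19-2.54).

```python
import math


def e_value(rr):
    """E-value for a risk ratio; estimates below 1 are inverted first."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_ci(lower, upper):
    """E-value for the confidence limit closest to the null (1.0 if the CI contains 1)."""
    if lower <= 1.0 <= upper:
        return 1.0
    return e_value(lower if lower > 1.0 else upper)


point = e_value(1.74)
limit = e_value_ci(1.19, 2.54)
print(f"E-value for the point estimate: {point:.2f}")
print(f"E-value for the confidence limit: {limit:.2f}")
```

      The E-value for the confidence limit is always closer to 1 than the one for the point estimate, which is the reviewer's point: a weaker unmeasured confounder is enough to make an association statistically non-significant than to explain it away entirely.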

      (7) Some patients are excluded due to the absence of follow-up, but it is unclear how that is determined. Is there potentially some selection bias underlying this where those who are less healthy stop participating in the UK biobank?

      Thank you for your question. The reasons for the absence of follow-up fall into five categories: (1) death reported to UK Biobank by a relative; (2) NHS records indicate they are lost to follow-up; (3) NHS records indicate they have left the UK; (4) UK Biobank sources report they have left the UK; (5) participant has withdrawn consent for future linkage. According to the data from UK Biobank (https://biobank.ndph.ox.ac.uk/showcase/field.cgi?id=190), the major reason for loss to follow-up is departure from the UK (84.7% of participants who were lost to follow-up). In addition, excluding participants who were less healthy might lead to underestimation of the risk, that is, to lower estimated effects of PPIs on respiratory infections. We have supplemented this in our revised manuscript:

      “Among them, 1,297 participants without follow-up, which were mainly determined by reported death, departure from the UK, or withdrawn consent, had been removed after initial exclusion.” (Page 4, Line 25-27)

      (8) Given that the exposure is based on self-report how certain can we be that patients e.g. do know that their branded over-the-counter drugs are PPI (e.g. guardium tablets)? Some discussion around this potential issue is lacking.

      Thank you for your concerns. In the data collection by the UK Biobank, the participants can enter the generic or trade name of the treatment on the touchscreen to match the medications they used. We have added this important information to the method section:

      “The exposure of interest was regular use of PPIs. The participants could enter the generic or trade name of the treatment on the touchscreen to match the medications they used (Supplementary Table S1).” (Page 5, Line 6-8)

      We acknowledge that specific information on prescribed or over-the-counter use of medications is lacking in the UK Biobank. We have discussed it in the limitation section:

      “Limitations exist in our study. Information on dose and duration of PPI use, discrimination between prescription and over-the-counter use of PPIs, health-seeking behavior, different types of pneumonia, and pneumococcus vaccination is currently not available from the UK Biobank.” (Page 14, Line 5-8)

      (9) Details about the deprivation index are needed in the main text as this is a UK-specific variable that will be unfamiliar to most readers.

      Thank you for your question on the definition of the deprivation index. We have provided the details of the deprivation index in the manuscript:

      “…socioeconomic status (deprivation index, which was defined using national census information on car ownership, household overcrowding, owner occupation, and unemployment combined for postcode areas of residence)…” (Page 6, Line 14-17)

      (10) It is unclear how variables were coded/incorporated from the main text. More details are required, e.g. was age included as a continuous variable and if so was non-linearity considered and how?

      We apologize for not elucidating how variables were incorporated into the main text. Previously, the linearity between continuous variables and outcomes was assessed by Martingale residuals plots, while the variables detected with non-linearity were regarded as categorical variables for further analyses. For example, after evaluation with the Martingale residuals plot, age demonstrated non-linearity, and we incorporated it as a categorical variable for the analysis of COVID-related mortality.

      We have supplemented the information in the method section:

      “The linearity between continuous variables and outcomes was assessed by Martingale residuals plots, while the variables detected with non-linearity were regarded as categorical variables for further analyses.” (Page 6, Line 28 to Page 7, Line 1)

      (11) The authors state that Schoenfeld residuals were tested, but don't report the test statistics. Could they please provide these, e.g. it would already be informative if they report that all p-values are above a certain value.

      We are sorry for not providing the statistics about the Schoenfeld residual in our previous manuscript. We have supplemented the information in our revisions:

      “Schoenfeld residuals tests were used to evaluate the proportional hazards assumptions, while no violation of the assumption was detected (Supplementary Table S3).” (Page 7, Line 27 to Page 8, Line 1)

      (12) The authors would ideally extend their discussion around unmeasured confounding, e.g. using the DAGs provided in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7832226/, in particular (but not limited to) around severity and not just presence/absence of comorbidities.

      Thank you for your insightful suggestion that the discussion about unmeasured confounding should be extended. We agree with the reviewer that, in addition to the comorbidities themselves, their severity could also have an important impact on the use of PPIs. We have added this discussion to the limitation section, citing the article (PMC7832226):

      “Moreover, patients with exacerbations of comorbid disorders (e.g., diabetes, asthma, COPD) might suffer from a wide range of gastrointestinal symptoms that lead to the use of PPIs [38] (Supplementary Figure S4). Due to the lack of data for respiratory severity and close follow-up for medication use, residual confounding might still exist due to the observational nature.” (Page 14, Line 23-27)

      (13) The UK biobank is known to be highly selected for a range of genetic, behavioural, cardiovascular, demographic, and anthropometric traits. The potential problems this might create in terms of collider stratification bias - as highlighted here for example: https://www.nature.com/articles/s41467-020-19478-2 - should be discussed in greater detail and also appreciated more when providing conclusions.

      We acknowledge the reviewer's point about the UK Biobank's highly selective nature potentially leading to collider stratification bias in the evaluation of COVID-19-related outcomes. We have discussed this in detail and are cautious when generating conclusions.

      “Furthermore, the highly selective nature of the UK Biobank might create collider stratification bias for the evaluation of COVID-19-related outcomes, and thus the conclusions should be interpreted with cautions [39].” (Page 15, Line 2-4)

      Reviewer #2 (Public Review):

      Summary:

      Zeng et al investigate in an observational population-based cohort study whether the use of proton pump inhibitors (PPIs) is associated with an increased risk of several respiratory infections among which are influenza, pneumonia, and COVID-19. They conclude that compared to non-users, people regularly taking PPIs have increased susceptibility to influenza, pneumonia, as well as COVID-19 severity and mortality. By performing several different statistical analyses, they try to reduce bias as much as possible, to end up with robust estimates of the association.

      Strengths:

      The study comprehensively adjusts for a variety of critical covariates and by using different statistical analyses, including propensity-score-matched analyses and quantitative bias analysis, the estimates of the associations can be considered robust.

      We are grateful to the reviewer for pointing out the merits of our articles, which include adjusting for a wide range of covariates, employing diverse statistical analyses, and using robust data. We will revise our manuscript further based on the reviewer's suggestions.

      Weaknesses:

      As it is an observational cohort study there still might be bias. Information on the dose or duration of acid suppressant use was not available, but might be of influence on the results. The outcome of interest was obtained from primary care data, suggesting that only infections as diagnosed by a physician are taken into account. Due to the self-limiting nature of the outcome, differences in health-seeking behavior might affect the results.

      Thank you for your questions about the dose and duration of acid suppressant use, the source of diagnosis, and the health-seeking behavior of participants. In the UK Biobank, the dose or duration of acid suppressant use was not available, since this information was not collected at baseline or follow-up. In addition, the outcome of interest was also retrieved from hospital ICD diagnoses. We apologize for not clarifying this in our previous manuscript. Moreover, we agree with the reviewer that health-seeking behavior could have an impact on the analyses, but the relevant data are not available from the UK Biobank. We have discussed these points in the method and limitation sections:

      “Briefly, the first reported occurrences of respiratory system-related conditions within primary care data, and hospital inpatient data defined by the International Classification of Diseases (ICD)-10 codes were categorized by the UK Biobank.” (Page 5, Line 21-25)

      “Limitations exist in our study. Information on dose and duration of PPI use, discrimination between prescription and over-the-counter use of PPIs, health-seeking behavior, different types of pneumonia, and pneumococcus vaccination is currently not available from the UK Biobank.” (Page 14, Line 5-8)

      Reviewer #1 (Recommendations For The Authors):

      Analysis code should be made available.

      Thank you for your question. We have provided the sources of the analysis code used for this study in our revised manuscript:

      “The codes used in this study can be found at: https://epirhandbook.com/en/ and https://cran.r-project.org/doc/contrib/Epicalc_Book.pdf.” (Page 16, Line 21-22)

      Reviewer #2 (Recommendations For The Authors):

      It might be interesting to study whether including self-reported infections changes the results, as people using PPI may more easily consult their GP even for a self-limiting disease such as influenza and therefore are more likely diagnosed/confirmed with such a respiratory infection.

      Thank you for your insightful suggestion to conduct analyses including self-reported infections. Accordingly, we have included the self-reported cases in sensitivity analyses, and the results were not significantly altered, which supports the robustness of our results:

      “Self-reported infections, except for COVID-19-related outcomes due to the lack of data, were also included for the outcomes as sensitivity analyses. The self-reported cases were reported at the baseline or subsequent UK Biobank assessment center visit.” (Page 8, Line 17-19)

      “Inclusion of the self-reported cases did not significantly alter the results (Supplementary Table S4).” (Page 9, Line 17-18)

      Moreover, to address the above-mentioned, sub-analyses differentiating between over-the-counter and prescribed medication might be interesting.

      Thank you for your question on differentiating between over-the-counter and prescribed medication. We have thoroughly searched the data provided by the UK Biobank, but unfortunately this information is not available. We have discussed this in the limitation section:

      “Information on dose and duration of PPI use, discrimination between prescription and over-the-counter use of PPIs, health-seeking behavior, different types of pneumonia, and pneumococcus vaccination is currently not available from the UK Biobank.” (Page 14, Line 5-8)

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1.1) I thought the manuscript was very clear. While I realize the authors included the reference to medulloblastoma in the introduction based on previous reviewer comments, I think this speculation is better left in the discussion.

      Whilst we appreciate the reviewer’s feedback here, we felt it was important to include a reference to medulloblastoma and developmental disorders associated with the cerebellum to put this work into a broader context.

      We removed the sentence “Medulloblastoma can be a consequence of uncontrolled proliferation of granule cell progenitors, with BMP overexpression being a potential therapeutic avenue to inhibit this proliferation” to limit the speculation in this statement.

      (1.2) line 81: It would be better to cite the 2 original papers (Hendrikes et al 2022, Smith et al 2022) rather than the Phoenix commentary article. I'm not sure the Phoenix article needs to be cited at all within this paper.

      We have cited the two suggested papers and removed the citation to Phoenix et al.

      (1.3) line 102: confusing sentence with the unexpected separation of do and not: "the same conditional deletions of BMP pathway elements that fail to block early granule cell specification at the rhombic lip do result not in a larger cerebellum as might be expected, but either have no affect".

      We thank the reviewer for pointing out this error and have corrected the text to “do not result in a larger cerebellum”.

      (1.4) line 133: inconsistent acronyms (for example, W9 vs pcw9).

      This has been corrected to PCW in all occurrences.

      (1.5) line 139: coronal vs transverse? it seems like you show transverse sectioning but refer to it as coronal in the text.

      We thank the reviewer for highlighting this and have corrected the text to “transverse”.

      (1.6) fig 2C: would it be possible to provide a similar inset as 2D?

      We thank the reviewer for this suggestion and have added the insets in 2C. We agree that this is now clearer and more consistent with the rest of the figure.

      (1.7) line 368/369/435/436 missing arrows.

      The arrows have been re-added; it appears that they did not show up in the uploaded PDF.

      (1.8) line 517 missing word: rhombic-lip-derived.

      This typo has been corrected.

      Reviewer #2 (Public Review):

      (2.1) Fig. 3 M Why are there asterisks both above and below the brackets?

      This was a formatting error that has now been corrected.

      (2.2) Fig. 8. The arrows (BMP up and BMP down) are touching the right ")" in the figure, which makes it hard to read.

      This was also a formatting issue which has been corrected.

      (2.3) Fig. 4 and 8 legends. There are spaces in the text which I believe are for arrows to be inserted "(BMP )", but the arrows have been omitted in the PDF that I read.

      This is the same as Reviewer 1’s comment; the arrows have been re-added to the text, and their omission appears to have been an issue with the PDF upload.

      (2.4) Fig. 3 legend gets very hard to read at the end, where it seems some punctuation is missing.

      We have re-worded the legend for Fig. 3 to make it easier to read.

      (2.5) Significant figures in some of the text are probably too much given the accuracy at which they can be measured with.

      We appreciate the reviewer’s concerns here, however these were added in response to the original reviewer’s request to “provide some additional support to otherwise qualitative observations”.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      In my opinion, the three most important controls (hopefully easy):

      (1) Include no ATR controls for optogenetic activation experiments (not all, just one or two, e.g., Figure 4B, C, or D, for the highest activation condition). The concern is that it can be quite hard to use light to both monitor neural responses while also using light to activate the function of other neurons.

      We thank the reviewer for the suggestions. We use a 2-photon 910-nm laser (which does not activate Chrimson) for imaging of GCaMP and a 624-nm LED (which does not activate GFP) for Chrimson activation. Calcium (GCaMP) signals are detected by PMT during Chrimson activation. With this setup, we are able to image GCaMP signals without crosstalk during activation of Chrimson.

      We performed calcium imaging in animals that were not fed ATR and found that SS04185 showed no response to LED stimulation at the strongest intensity (µW/mm) (New Figure 4 – figure supplement 1B).

      (2) Demonstrate that their RNAi constructs do indeed knock down the intended target gene. They showed nicely in Figure 5A that SeIN128 expresses GABA. Presumably, these neurons also express VGAT. Is it possible to check the expression of VGAT after RNAi knockdown? The concern is that using only a single RNAi introduces the possibility of off-target effects. Using multiple RNAi lines for VGAT or other parts of the pathway would also alleviate this (minor concern).

      We thank the reviewer for raising this point. We agree that using only one RNAi line (HMS02355) for VGAT in Figure 5A is a weakness. 

      Accordingly, we have performed additional experiments to quantify the effect of RNAi knockdown of VGAT using HMS02355 in all neurons, followed by immunostaining against GABA or VGAT. We found that both VGAT and GABA were significantly reduced in the neuropil (Figure 5 – figure supplement 1C and D). These data strongly suggest that HMS02355 knocks down VGAT and reduces GABA at axon terminals. We note that HMS02355 has been used previously for knocking down GABA signaling in the following studies:

      (1) Kallman BR, Kim H, Scott K (2015). Excitation and inhibition onto central courtship neurons biases Drosophila mate choice. eLife 4:e11188. https://doi.org/10.7554/eLife.11188

      (2) Zhao W, Zhou P, Gong C et al. (2019). A disinhibitory mechanism biases Drosophila innate light preference. Nat Commun 10, 124. https://doi.org/10.1038/s41467-018-07929-w

      (3) Yamagata N, Ezaki T, Takahashi T, Wu H, Tanimoto H (2021). Presynaptic inhibition of dopamine neurons controls optimistic bias. eLife 10:e64907. https://doi.org/10.7554/eLife.6490

      (3) Include genetic controls for their driver line.

      In Figure 1, it would be nice to see one half or the other half of their split GAL4 line in their manipulations. The concern is that perhaps the phenotype is coming from something unexpected in the genetic background.

      We thank the reviewer for the suggestion. We have added each half of the split-GAL4 line (AD only or DBD only) as controls (New Figure 1 – figure supplement 2). We found that SS04185 reduced rolling, whereas the split controls (AD only or DBD only) did not.

      In the discussion:

      It seems that activation of SS014185 has additional effects beyond what the authors have quantified. Specifically, larvae do not appear to re-initiate rolling in the same manner as Basin activation alone. Also, there appears to be an off-response, turning.

      We appreciate the reviewer’s comments. We have included a section in the discussion to consider the different patterns of rolling observed during joint stimulation of Basins and SS04185 and during stimulation of Basins alone, as well as the increase in turning following the offset of joint stimulation of Basins and SS04185 compared with stimulation of Basins alone (lines 464 to 481). Although the reasons for these differences are beyond the scope of the paper, we have added Figure 2 – figure supplement 1K, which shows that co-activation of SS04185-MB and Basins is sufficient to evoke turning following the offset of stimulation, suggesting that the increased turning may be due to the activation of SS04185-MB neurons and independent of SS04185-DN neurons.

      The labeling of the Figure panels could be improved. In many places, it is not clear that Basins are being stimulated in the background, whereas in nearby panels, it is clearly labeled. This is confusing for the reader.

      We thank the reviewer for the constructive suggestions. We have modified all relevant figures to read “Basins>Chrimson” above the pink line indicating the period of optogenetic activation.

      Reviewer #2 (Recommendations For The Authors):

      Claims, rigorousness, repeatability, and accuracy of terms.

      (1) In line 254, the authors suggest that the slow response of SeIN128 neurons is due to the input they receive from SEZ, but in line 453, they suggest it is due to axo-axonal connections. However, their evidence does not support one factor over the other. Overall, only the axo-axonal connection was strongly suggested in the discussion. The authors could clarify that the delay of SeIN128 activity may also be caused by multisynaptic connections involving SEZ or other neurons in the last section of the Discussion.

      Although SeIN128 primarily receives inputs from the SEZ, it also receives inputs within the VNC from Basin-2 (Figure 4 – figure supplement 2). Specifically, in the VNC, the axons of SeIN128 make inhibitory synaptic contacts onto the axon of Basin-2, which in turn makes reciprocal excitatory contacts onto the axon of SeIN128, thereby forming a feedback loop. However, when we wrote the original discussion, we inadvertently focused on the potential of the negative feedback loop formed by these axo-axonal synapses in the VNC to mediate the slow response of SeIN128, overlooking the possibility that other, as yet unidentified, pathways could convey Basin or A00c activity indirectly to SeIN128 dendrites in the SEZ. Therefore, we have revised the original text, which read “These data suggest that the main synaptic inputs onto SeIN128 neurons in the SEZ mediate the slow responses upon activation of Basins or A00c neurons” to “These data suggest that the delay of SeIN128 activity may be caused by multi-synaptic connections involving the SEZ or a feedback loop involving axo-axonal connections between SeIN128 and Basin-2 or A00c” (revised, Lines 259 and 261). Accordingly, we have also adjusted the relevant discussion section to be consistent with this change (Lines 460 and 466).

      (2) Please clarify the following: How does the algorithm define rolling and crawling? Healthy larvae complete 360° rolls, in each roll they rotate from dorsal up to dorsal up. It is possible that a larva rolls for an incomplete cycle and straightens up. Does the algorithm simply label individual frames as “roll”, “non-roll”, or “unknown”, and defines rolling by the existence of “roll” frames? If so, then larvae that rolled for 90° and straightened would be counted as “rolling” though they failed to complete a full rolling bout. Also, how were “hunch” “turn” and “back” identified? Lastly, is there any manual quality control involved? Address this and related issues in the methods:

      a)  Expand the description of the classifier algorithm.

      b)  How are rolling and non-rolling animals defined in the "rolling%" assay? Were all "rolling" animals able to do at least one 360° roll?

      c)  How are "rolling duration" and "end of 1st rolling" defined? Is the algorithm able to distinguish different rolling bouts? In these two assays, were the animals that rolled for <1 second (in total or in their "first roll") able to complete a 360° roll?

      The Multi-worm Tracker (MWT) records only the contours of animals (no real video image data). Thus, the data fed into the classifier algorithm only includes features based on contour time-series data. The algorithm uses movement perpendicular to the body axis, the characteristic feature of larval rolling, to classify rollers and non-rollers. Although the algorithm cannot determine whether a rolling event involves a rotation of more than 360 degrees, we ensure that rolling events are at least 360 degrees by removing any events that are shorter than 0.2 s (the minimum time to complete a 360-degree roll).

      We have accordingly revised the section of “Behavior detection” relating to the behavior classification algorithm in the methods section as follows (Lines 600 to 620).

      “After extracting behavioral parameters from Choreography, we used an unsupervised machine learning behavior classification algorithm to detect and quantify the following behaviors: hunching (Hunch), headbending (Turn), stopping (Stop), and peristaltic crawling (Crawl) as previously reported (Masson et al., 2020). Escape rolling (Roll) was detected with a classifier developed using the Janelia Automatic Animal Behavior Annotator (JAABA) platform (Kabra et al., 2013; Ohyama et al., 2015). JAABA transforms the MWT tracking data into a collection of ‘per-frame’ behavioral parameters and regenerates 2D dorsal-view videos of the tracked larvae. Based on such videos, we defined rolling as a rotation around the body while the larva maintains a C-shape, which results in a movement perpendicular to larval body axis (Supplementary videos 1 and 2). Using this definition, we trained the algorithm in the JAABA platform by labeling ~10,000 randomly chosen frames as rolling or non-rolling to develop the rolling classifier. If a larva did not curl into a C-shape or move sideways, it was labeled as a “non-roller.” Every animal with at least one rolling event longer than 0.2 s in a given period was labeled as a “roller” (i.e., it was assumed to have rolled at least 360 degrees), based on the observation that when the start and end of rolling events were precisely measured, the algorithm could identify rolling events completed in 0.2 s.

      The rejection of false positives, especially at the beginning and the end of each rolling bout, enhanced accuracy. The algorithm integrated these training labels and parameters generated with Choreography in a time series, such as speed, crabspeed, and body curvature, to generate a score for rolling detection. Above a certain threshold, the classifier labeled the frame as rolling. This classifier, which has false negative and false positive rates of 7.4% and 7.8%, respectively (n = 102), was utilized to detect rolling in this paper.”
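The bout-level filtering described above can be sketched as follows. This is only an illustration, not the authors' JAABA classifier: it assumes that a per-frame rolling score is already available, and the function names, the score threshold and the frame rate are hypothetical. Only the 0.2 s minimum duration comes from the text, and treating a bout of exactly 0.2 s as valid is also an assumption.

```python
from typing import List, Sequence, Tuple

# From the text: rolling events shorter than 0.2 s are discarded.
MIN_ROLL_DURATION_S = 0.2


def find_bouts(is_roll: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return (start, stop) frame indices of contiguous True runs (stop exclusive)."""
    bouts: List[Tuple[int, int]] = []
    start = None
    for i, flag in enumerate(is_roll):
        if flag and start is None:
            start = i                      # a new bout begins
        elif not flag and start is not None:
            bouts.append((start, i))       # the current bout ends
            start = None
    if start is not None:                  # bout still open at the last frame
        bouts.append((start, len(is_roll)))
    return bouts


def rolling_bouts(
    scores: Sequence[float],
    threshold: float,                      # hypothetical score cutoff
    fps: float,                            # hypothetical tracker frame rate
    min_duration_s: float = MIN_ROLL_DURATION_S,
) -> List[Tuple[int, int]]:
    """Threshold per-frame scores, then keep only bouts of sufficient duration."""
    is_roll = [s > threshold for s in scores]
    return [(a, b) for a, b in find_bouts(is_roll) if (b - a) / fps >= min_duration_s]


def is_roller(scores: Sequence[float], threshold: float, fps: float) -> bool:
    """An animal counts as a roller if at least one bout survives the filter."""
    return len(rolling_bouts(scores, threshold, fps)) > 0


# Illustrative values only: one 1-frame blip and one 5-frame bout at 10 frames/s.
example_scores = [0.1, 0.9, 0.1, 0.1, 0.8, 0.9, 0.7, 0.9, 0.8, 0.2]
example_bouts = rolling_bouts(example_scores, threshold=0.5, fps=10)
```

With these made-up scores, the single-frame event (0.1 s) is dropped and only the five-frame event (0.5 s) is retained, so the animal would still be counted as a roller.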

      Readability of text

      (1) I suggest giving the SS04185 line and SeIN128 neuron common names that are easier to remember and follow (after mentioning their full name once).

      We acknowledge the reviewer’s concerns. However, because SS04185 was initially named using the Janelia split-line pipeline, and SeIN128 was named independently in a more recent study (Ohyama et al., 2015), we have retained these designations in the present manuscript.

      Figures and figure legends

      (1) It would help if the authors could put visual representations of rolling and crawling, such as a cartoon larva performing the rolling-crawling switch, and still frames of rolling and crawling of real larvae, especially in Figure 1. Also, please consider including a video of rolling and crawling in real larvae (preferably comparing control and experimental groups).

      We appreciate the reviewer’s suggestion. We have added a cartoon of the behavioral sequence in Figure 1A, as well as a Figure 1 supplement video based on MWT data, which shows rolling followed by crawling. 

      (2) To give the reader a take-home message, it would help if the authors could make a simplified version of Figure 4A and put it at the end of the paper.

      We thank the reviewer for the suggestion. To assist the reader, we have added schematics depicting how the circuit may function in panel I of Figure 8.

      (3) In Figure 1A, add the text "activation " after the neuron names.

      We have added “Chrimson” following “Basins>” to the new Figure 1B (old Figure 1A) and other figures (Figure 1C and D, Figure 5A, Figure 6A, and figure supplements).

      (4) Figure 1G: a data point is misaligned (at the top of the graph). 

      We have aligned the data point accordingly.

      (5) Figure 1B can benefit from a better design. If possible, please separate the crawling speed into an independent graph (or at least use a different line shape to code for crawling speed and indicate it on the in-graph legend). Is the speed of Basin/SS04185 co-activation studied?

      We appreciate the reviewer’s suggestion. We have separated the plots for rolling and crawling speed into different panels (Figure 1C and D). As shown in Figure 1D, the crawling speed observed during coactivation of Basins and SS04185 was similar to that during activation of Basins alone.

      (6) Figure S1 uses a different color-coding scheme from Figure 1. I suggest making the color coding consistent between figures.

      We are grateful for the reviewer’s suggestion. We have adjusted the color-coding scheme accordingly.

      (7) Line 692 (Figure 2 legend), "Killer Zipper" is misspelled as "Kipper Zipper". Out of curiosity, is there a way to remove or reduce SS04185-DN expression in the same manner as SS04185-MB reduction?

      We have corrected the text in the legend for Figure 2. As for the reviewer’s question, we did attempt to reduce or abolish SS04185-DN expression with tsh-LexA and LexAop-Kip+ but found no effect. Other identified LexA constructs with SeIN128 expression, however, all showed SS04185-MB expression. Consequently, we could not use these constructs because they inhibit both SeIN128 and SS04185-DN.

      (8) The color coding of Figure 2 (especially in D) makes it hard to distinguish between the brown and red groups.

      We thank the reviewer for the suggestion. Accordingly, we have changed the color for the brown group to orange.

      (9) In line 926 (Figure S2 legends), the description of F and G seems inverted.

      We thank the reviewer for pointing out the error. We have revised the text from “(F) has only SS04185-MB expression, and (G) has both SS04185-DN and SS04185-MB expression” to “(F) has both SS04185-DN and SS04185-MB expression, and (G) has only SS04185-MB expression.”

      (10) Figure 7B: which line does the top group of asterisks belong to?

      The top group of asterisks indicates that each experimental group differs significantly (p < 0.001) from the control group. We have revised the figure to clarify the comparisons indicated by the asterisks in Figure 7B, as well as the figure legend below (Line 890-894).

      “(B) Cumulative plot of rolling duration. Statistics: Kruskal-Wallis test: H = 69.52, p < 0.001; Bonferroni-corrected Mann-Whitney test, p < 0.001 between control and the GABA-B-R11, GABA-B-R12 and GABA-B-R2 RNAi groups, p < 0.001 between GABA-A-R and all other experimental RNAi groups. Sample size for the colored bars from top (control, black) to bottom (GABA-A-R, red); n = 520, 488, 387, 582, 306.”
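For readers unfamiliar with the correction named in the legend, a minimal sketch of the Bonferroni step alone is given here. It assumes the raw pairwise p-values have already been computed by some Mann-Whitney routine; the comparison names and p-values are invented for illustration and are not results from the study.

```python
from typing import Dict


def bonferroni(raw_p: Dict[str, float]) -> Dict[str, float]:
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(raw_p)  # number of pairwise comparisons performed
    return {name: min(1.0, p * m) for name, p in raw_p.items()}


# Hypothetical raw p-values for three pairwise comparisons (not study data).
raw = {
    "control_vs_groupA": 0.0001,
    "control_vs_groupB": 0.02,
    "groupA_vs_groupB": 0.5,
}
adjusted = bonferroni(raw)
```

A comparison is then called significant only if its adjusted p-value falls below the chosen alpha level, which is equivalent to testing each raw p-value against alpha divided by the number of comparisons.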

      (11) Figure S8 D and F: indicate Basin-2 or Basin-4 activation on graph.

      We have revised Figure 8 – figure supplement D and F accordingly.

      Reviewer #3 (Recommendations For The Authors):

      (1) Lines 86-87: Text needs to be rewritten for clarity. Also, include the genotype in the corresponding figure legend (Figure 1B).

      We thank the reviewer for pointing this out. We have clarified the text accordingly and included the genotype in the figure legend (lines 86 and 87). Specifically, we have revised Figure 1B (New Figure 1C and D) and adjusted the legend accordingly as follows. 

      Lines 86 and 87: Crawling speed during the activation of all Basins following rolling was ~1.5 times that of the crawling speed at baseline (Figure 1D).

      (2) Include the protocol for heat shock-FLP out experiments

      We have added the following paragraph to the Methods section describing the heat shock-FlpOut experiments (lines 537 to 546).

      “Heat shock FlpOut mosaic expression

      First instar Drosophila larvae were exposed to heat shock in a water bath at 37°C for 12 min as previously described (Nern et al., 2015). With precise temporal and thermal control of heat shock, larvae with genotype w+, hs(KDRT.stop)FLP/13xLexAop2-IVS-CsChrimson::tdTomato; R54B01-Gal4.AD/72F11LexA;20xUAS-(FRT.stop)-CsChrimson::mVenus/R46E07-Gal4.DBD showed sporadic CsChrimson::mVenus expression driven by SS04185 split GAL4. As a result, the ratio of the larvae with SS04185-DN and SS04185-MB expression to those with only SS04185-MB expression was 1:1. Each larva was individually examined with optogenetic stimulation and behavior analysis. After behavioral experiments, mVenus expression in CNS was confirmed under the fluorescence microscope.”

      (3) In the immunohistochemistry, the authors exclude the steps for washings. Recommend the authors to cite the previous literature. Similar to the other protocols detailed in the methods.

      We have added a brief description of the steps involved in washing (lines 641 and 648). We have also provided a citation with similar immunohistology protocols (Patel, 1994).

      (4) Keeping the same Y-axis scale for similar graphical representation would be helpful to compare across different experimental conditions and genotypes-for example, 2E and 2H for the start of the first crawl.

      As suggested by the reviewer, we have adjusted the y-axis scales for Figure 2E and H to be identical.

      (5) The color schematics used for the graph make it hard to visualize the data. The author might reconsider the better presentation of the data by avoiding darker colors.

      We thank the reviewer for the constructive suggestion. We have lightened the shading of all violin plots. We have also modified the shading for the middle group in Figure 2C and E from dark brown to orange.

      (6) Co-activation of the SS04185 and Basins in the figures represented as Basins+SS04185 (Figure 1A) and SS04185 (rest of the figures). Authors might reconsider this terminology to define and distinguish the coactivation of SS04185 and Basins neurons from the activation of SS04185 or Basins alone. It needs to be clarified in the figures.

      We have adjusted the terminology by including “Basins>Chrimson” in all panels in which Basin neurons are optogenetically activated to trigger rolling in the background for all groups. Additionally, we have labeled the control group as “Control” and the experimental group as “SS04185”.

      (7) Figure 4A, summarizes the synaptic connection and strength between different neurons - SeIN128, Basins, A00c and mdIV. However, the nature of these synaptic connections - excitatory and inhibitory- is not represented. Based on the previous and current studies, the authors consider providing the schematic for circuit mechanisms of escape behavior sequences in larvae. Also, discussing these findings in light of the downstream output circuit and motor regulation might be informative (See Cooney et al. 2023, PNAS).

      As the reviewer correctly points out, the diagram of the connectome shown in Figure 4A does not indicate whether the connections are excitatory or inhibitory. Accordingly, we have added a new summary panel (Figure 8I) based on the results of examining GABAergic synapses (Figure 5A). The schematics in Figure 8I depict how the joint activity of inhibitory and excitatory synapses (indicated by arrowheads and blunt ends, respectively) may lead to rolling or fast crawling.

      We have also added a section to the Discussion on the premotor circuits for crawling and rolling (Lines 512 to 519).

      (8) Percentage rolling present in figure 5B and 6A correspond to the control larvae 13xLexAop2-IVS-CsChrimson::mVenus; R72F11-lexA/+; HMS02355/+ and 13xLexAop2-IVS- Cs-Chrimson::mVenus; R72F11-lexA/+; UAS-TeTxLC.tnt/+. How does the author interpret the observed variability across the experiments? The author might consider discussing the genetic background effect on the observed behaviors, if any.

      As pointed out by the reviewer, we noticed that rolling probability varied depending on genetic background. We have revised the text accordingly (Lines 277 to 280).

      (9) Recheck the arrowheads in Figure 5A.

      We have confirmed the positions of the arrowheads in Figure 5A and modified the figures by outlining the cells with dotted lines.

      (10) Lines 295-298: Data presented in the supplementary figure and p-values in the text (p=0.11) suggest that the first crawl's onset is comparable to controls. Rewrite this text for clarity and include the statistical values in the supplemental figure 6.

      We have revised the text as follows (Lines 302 to 305).

      “Although the duration of each rolling bout, time to onset of the first rolling bout, and time to onset of the first crawling bout did not differ from those of controls (Figure 6–figure supplement 1D, E and G), the time to offset of the first rolling bout was delayed relative to controls (p = 0.013 for Figure 6–figure supplement 1F).”

      (11) Lines 263-264: Data provide evidence for SS04185 receiving inputs Basin-2 and A00c neurons. SS04185, which provides inputs to other neurons, specifically A00c neurons, but still needs clarification.

      We have revised the text as follows (Lines 264 to 266).

      The results thus far indicate that activation of SeIN128 neurons inhibits rolling (Figure 1A–C); SeIN128 neurons receive functional inputs from Basin-2 and A00c (Figure 4A-C); and SeIN128 neurons make anatomical connections onto Basin-2 and A00c (Figure 4A).

      (12) In the table that lists the genotypes, instead of '-' or the blank space in the label column, the author might consider using 'control,' consistent with the figures.

      In accord with the reviewer’s suggestion, we have revised the notation of ‘-’ or the blank space, to ‘control’ for all figures.

      (13) Check the typographical errors throughout the manuscript. Some below:

      We have revised the text accordingly as suggested below.

      a.  Lines 100, 142: SS4185 should be SS04185

      b.  Line 230: A00C should be A00c

      c.  Line 180: Expand VNC

      d.  10xUAS-IVS-mry::GFP should be 10xUAS-IVS-myr::GFP

      e.  Lines 444, 449: drosophila should be Drosophila

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      Horn and colleagues present data suggesting that the targeting of GREM1 has little impact on a mouse model of metabolic dysfunction-associated steatohepatitis. Importantly, they also challenge existing data on the detection of GREM1 by ELISA in serum or plasma by demonstrating that high-affinity binding of GREM1 to heparin would lead to localisation of GREM1 in the ECM or at the plasma membrane of cells.

      Strengths:

      This is an impressive tour-de-force study around the potential of targeting GREM1 in MASH.

      This paper will challenge many existing papers in the field around our ability to detect GREM1 in circulation, at least using antibody-mediated detection.

      Well-controlled, detailed studies like this are critically important in order to challenge less vigorous studies in the literature.

      The impressive volume of high-level, well-controlled data using an impressive range of in vitro biochemical techniques, rodent models, and human liver slices.

      We thank the reviewer for their time in assessing our manuscript and are very grateful for the positive response. Below, we give a point-by-point response to the reviewer’s comments and indicate where we plan to adjust the manuscript.

      Weaknesses: only minor.

      (1) The authors clearly show that heparin can limit the diffusion of GREM1 into the circulation; however, in a setting where GREM1 is produced in excess (e.g. cancer), could this "saturate" the available heparin and allow GREM1 to "escape" into the circulation?

      We thank the reviewer for their question. Indeed theoretically, if the production of Gremlin-1 exceeds the capacity of heparin to immobilise Gremlin-1, the protein may be released into solution and thus may enter the circulation. Whilst we have not addressed this possibility in our studies, we agree that it may be a mechanism worthwhile exploring in future studies.

      (2) Secondly, have the authors considered that GREM1 may be circulating bound to a chaperone protein like albumin, which would reduce its reactivity with GREM1 detection antibodies?

      We have considered the possibility that Gremlin-1 binds other proteins, such as BMPs, and thereby masks assay-antibody epitopes. To minimise this possibility, we used antibody pairs which bind different epitopes. We also used LC-MS for Gremlin-1 detection (data not shown in the manuscript), a method that is not affected by epitope masking. With the LC-MS analysis we did not pick up any Gremlin-1 signal in plasma. We will mention the LC-MS data in the updated manuscript.

      Also, we were able to detect circulating Gremlin-1 after treatment with anti-Gremlin-1 antibodies. As these were the same antibodies that were used in our assays, we should not have been able to detect Gremlin-1 if there had been a masking interaction with highly abundant circulating plasma proteins such as albumin.

      Finally, we believe that the assay antibodies would outcompete binding of any other proteins because of their high affinity and very high concentrations used in the assays.

      In summary, we are very confident that Gremlin-1 is not present in circulation. We will, however, make some minor adjustments to the manuscript in order to stress this important point.

      (3) Statistics: there is no mention of blinding of samples; I assume this was done prior to analysis?

      All reported results were derived from hard quantitative readouts obtained through assays that are not liable to subjective interpretation. This also applies to immunohistochemistry and RNAscope histologic quantification, using Visiopharm Integrator System software ver. 8.4 or HALO v3.5.3577 (Area Quantification v2.4.2 module), respectively. Therefore, no blinding was necessary prior to analysis.

      (4) Line 211: I suggest adding the Figure reference at the end of this sentence to direct the reader to the relevant data.

      We thank the reviewer for the suggestion and will add a reference to Figure 1F here.

      (5) Figure 1E Y-axis units are a little hard to interpret; can integers be used?

      As the y axis in Figure 1E is on the logarithmic scale, integer numbers would be very hard to read because of the large range of numbers. As we acknowledge that the notation used may be difficult to read, we will change it to superscript scientific notation.

      (6) Did the authors attempt to detect GREM1 protein by IHC? There are published methods for this using the R&D Systems mouse antibody (PMID 31384391).

      In parallel to the work described in PMID 31384391 (Dutton et al., Oncotarget, 10: 4630-4639, 2019), we tested a whole range of commercial and in-house Gremlin-1 antibodies. We independently arrived at the same conclusion as Dutton et al., namely that the goat anti-gremlin antibody R&D Systems AF956 can stain the mouse or rat intestine in the muscularis layer and in the crypts/lower part of the villi, using FFPE sections. As per Dutton et al., we also corroborated this IHC staining by RNAscope: the mRNA was restricted to the muscularis and the connective tissue just below the crypts, suggesting that Gremlin-1 partially diffuses away from the cells that produce it. In contrast, none of the other commercial or in-house gremlin antibodies that we tested provided any useful staining on FFPE sections.

      We also used the R&D Systems AF956 antibody on several rat MASH liver samples. We saw little or no staining in livers from chow-fed rats, with only occasional weak staining around portal areas. Depending on the rat model, staining ranged from little or none to at most weak staining in portal areas and fibrotic areas. Among the various models tested, we observed the strongest staining in the rat CDAA-HFD+cholesterol model, in line with the ISH data.

      However, we were unable to establish IHC on human MASH liver samples using the R&D Systems AF956 antibody (or any other antibody) despite 98% sequence identity at the amino acid level between human and rat gremlin-1. Considering the results in Dutton et al. on rodent intestines, we tested the antibody on some human intestine samples, but the results on the available samples (inflamed appendices) were inconclusive.

      We will include representative IHC staining images for Gremlin-1 protein on rat livers as a Supplementary Figure and mention in the manuscript that IHC for human Gremlin-1 did not work with the available antibodies.

      (7) Did the authors ever observe GREM1 internalisation using their Atto-532 labelled GREM1?

      The Atto-532 Gremlin-1 cell association assay was mainly intended to visualise the association of Gremlin-1 with cell surface proteoglycans and how this interaction is affected by heparin-displacing and non-displacing antibodies. We observed a possible, but inconclusive intracellular association of Atto-532 Gremlin-1. However, this assay was not specifically designed for this purpose, and we did not follow up on this. Therefore, we cannot draw any conclusions on whether cell surface bound Gremlin-1 can be internalised. However, we appreciate that internalisation of Gremlin-1 would be an interesting biological mechanism worth following up in future studies.

      (8) Did the authors complete GREM1 ISH in the rat CDAA-HFD model? Was GREM1 upregulated, and if so, where?

      We have performed Grem1 ISH in the rat CDAA-HFD model and representative images of this are shown in Figure 1F. In chow-fed animals, Grem1 was expressed in a few cells in the portal tract, whereas after CDAA-HFD, Grem1 positive cells became more abundant in the portal tract and were also detectable in the fibrotic septa, as described in the respective results section. However, we performed no co-staining with other markers as we did for human liver samples.

      (9) Supplementary Figure 4C - why does the GFP level decrease in the GREM1 transgenic compared to the GFP control mouse? No such change is observed in Supplementary Figure 4E.

      In Supplementary Figure 4C we show expression of GFP mRNA and GREM1 mRNA in lysates of GFP-control and GREM1-GFP overexpressing LX-2 cells. The x-axis labels indicate the different lentiviruses. Therefore, the right panel in Supplementary Figure 4C shows that GREM1 overexpressing LX-2 cells expressed more GREM1 compared to GFP-control transduced LX-2, while GFP mRNA expression was comparable between the two.

      The results in Supplementary Figure 4E look different because – as can also be seen from the % of GFP+ cells in Supplementary Figure 4D – the GREM1 lentivirus here was more effective in transducing the cells, which is why both GFP and GREM1 mRNA were increased with GREM1 lentivirus compared to the GFP-only control. Unlike LX-2, the lentivirally transduced HHSC were not sorted on GFP positive cells prior to qPCR, which may explain the differences in GFP mRNA expression pattern between the two cell types.

      We acknowledge that the figure may be difficult to interpret and will adjust the figure annotation to improve on this.

      Reviewer #2 (Public Review):

      It is controversial whether liver gremlin-1 expression correlates with liver fibrosis in metabolic dysfunction-associated steatohepatitis (MASH). Horn et al. developed an anti-Gremlin-1 antibody in-house and tested its ability to neutralize gremlin-1 and treat liver fibrosis. This article has the advantage of testing its hypothesis with different animal and human liver fibrosis models and using a variety of research methodologies.

      The experimental design and results support the conclusion that the anti-gremlin-1 antibody had no therapeutic effect on treating liver fibrosis, so there are no other suggestions for new experiments:

      (1) The authors used RNAscope in situ hybridization to establish the correlation between Gremlin-1 expression and MASH livers or cell lines.

      (2) A luminescent oxygen channelling immunoassay was used to measure circulating Gremlin-1 concentration. They found that Gremlin-1 binds to heparin very efficiently, preventing Gremlin-1 from entering circulation, and restricting Gremlin-1's ability to mediate organ cross-communication.

      (3) The authors developed a suitable MASH rat model, the choline-deficient, L-amino acid defined high fat 1% cholesterol diet (CDAA-HFD) fed rat, and created a selective, heparin-displacing anti-Gremlin-1 antibody (0030:HD). They also used human cirrhotic precision-cut liver slices to test their hypotheses. They demonstrated that neutralization of Gremlin-1 activity with monoclonal therapeutic antibodies does not reduce liver inflammation or liver fibrosis.

      One concern is that several reagents and assays are made in-house without external validation. Also, will those in-house reagents and assays be available to the science community?

      Overall this manuscript provides useful information that gremlin-1 has a limited role in liver fibrosis pathogenesis and treatment.

      We thank the reviewer for their time in assessing our manuscript and are very grateful for the positive response. We acknowledge that most of our results were derived from assays using in-house generated reagents, which will therefore be hard to reproduce externally. Whilst for legal reasons we cannot share the sequences of the monoclonal antibodies, we will be able to share aliquots with fellow scientists upon request. We will add a sentence to this effect to the data availability statement.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      TMC7 knockout mice were generated by the authors and the phenotype was analyzed. They found that Tmc7 is localized to Golgi and is needed for acrosome biogenesis.

      Strengths:

      The phenotype of infertility is clear, and the results of TMC7 localization and the failed acrosome formation are highly reliable. In this respect, they made a significant discovery regarding spermatogenesis.

      In the original version, I pointed out the gap between their pH/calcium imaging data and the hypothesis of ion channel function of TMC7 in the Golgi. Now the author agrees and has changed the description to be reasonable. Additional experiments were also performed, and I can say that they have answered my concern adequately.

      I would say it is good to add any presumed mechanism for the observed changes in pH and calcium concentration in the cytoplasm this time.

      We appreciate your positive comments on our revised manuscript.

      Reviewer #2 (Public Review):

      Summary:

      This study presents a significant finding that enhances our understanding of spermatogenesis. TMC7 belongs to a family of transmembrane channel-like proteins (TMC1-8), primarily known for their role in the ear. Mutations to TMC1/2 are linked to deafness in humans and mice and were originally characterized as auditory mechanosensitive ion channels. However, the function of the other TMC family members remains poorly characterized. In this study, the authors begin to elucidate the function of TMC7 in acrosome biogenesis during spermatogenesis. Through analysis of transcriptomics datasets, they identify TMC7 as a transmembrane channel-like protein with elevated transcript levels in round spermatids in both mouse and human testis. They then generate Tmc7-/- mice and find that male mice exhibit smaller testes and complete infertility. Examination of different developmental stages reveals spermatogenesis defects, including reduced sperm count, elongated spermatids, and large vacuoles. Additionally, abnormal acrosome morphology is observed beginning at the early-stage Golgi phase, indicating TMC7's involvement in proacrosomal vesicle trafficking and fusion. They observed localization of TMC7 in the cis-Golgi and suggest that its presence is required for maintaining Golgi integrity, with Tmc7-/- leading to reduced intracellular Ca2+, elevated pH, and increased ROS levels, likely resulting in spermatid apoptosis. Overall, the work delineates a new function of TMC7 in spermatogenesis and the authors suggest that its ion channel activity is likely important for Golgi homeostasis. This work is of significant interest to the community and is of high quality.

      Strengths:

      The biggest strength of the paper is the phenotypic characterization of the TMC7-/- mouse model, which has clear acrosome biogenesis/spermatogenesis defects. This is the main claim of the paper and it is supported by the data that are presented.

      Weaknesses:

      The claim is that TMC7 functions as an ion channel. It is reasonable to assume this given what has been previously published on the more well-characterized TMCs (TMC1/2), but the data supporting this is preliminary here, and more needs to be done to solidify this hypothesis. The authors are careful in their interpretation and present this merely as a hypothesis supporting this idea.

      We appreciate this constructive suggestion.

      Reviewer #3 (Public Review):

      Summary:

      In this study, Wang et al. have demonstrated that TMC7, a testis-enriched multipass transmembrane protein, is essential for male reproduction in mice. Tmc7 KO male mice are sterile due to reduced sperm count and abnormal sperm morphology. TMC7 co-localizes with GM130, a cis-Golgi marker, in round spermatids. The absence of TMC7 results in reduced levels of Golgi proteins, elevated abundance of ER stress markers, as well as changes of Ca2+ and pH levels in the KO testis. However, further confirmation is required because the analyses were performed with whole testis samples in spite of the differences in the germ cell composition in WT and KO testis. In addition, the causal relationships between the reported anomalies await thorough interrogation.

      Strengths:

      By using PD21 testes, the revised assays have consolidated that depletion of TMC7 leads to a reduced level of Ca2+ and an elevated level of ROS in the male germ cells. The immunohistochemistry analyses have clearly indicated the reduced abundance of GM130, P115, and GRASP65 in the knockout testis.

      Weaknesses:

      The Discussion section contains sentences reiterating the Introduction and Results of this manuscript (e.g., Lines 79-85 and 231-236; Lines 175-179 and 259-263). These passages read as repetitive and can be removed.

      We thank the reviewer for this important comment. We have modified the text according to your suggestion.

      Future studies are required to decipher how TMC7 stabilizes Golgi structure, coordinates vesicle transport, and maintains the germ cell homeostasis.

      Thanks. We appreciate this constructive suggestion. We fully agree with the reviewer that future studies are required to decipher how TMC7 stabilizes Golgi structure, coordinates vesicle transport, and maintains germ cell homeostasis.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      1. In Fig S6d, the bar of Tmc7-/- is broken in the middle for P-EIF2.

      Thanks. We have remade Fig S6d according to your suggestion in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      None. The reviewers have adequately answered my points. Many thanks!

      We thank the reviewer for accepting our revisions as sufficient.

      Reviewer #3 (Recommendations For The Authors):

      In the revised manuscript, the authors have addressed most of my concerns.

      We are pleased that we were able to adequately address the reviewer’s concerns. We appreciate your suggestions to further improve our study.

    1. Author response:

      Generals:

      We deeply appreciate the efforts of the Senior and Reviewing Editors, and also thank the three reviewers for their careful reading of the MS and their constructive comments, which are very helpful for improving our MS. We agree that we should extend our efforts to elaborate the pharmacological analyses, including clarification of the penetrance of the gap junction inhibitor(s) and of the effectiveness and specificity of the drugs. We plan to test at least the L-type calcium channel blocker nifedipine. Concerning the reproducibility of the phenotypes, we did repeat the experiments multiple times for each of the analyses. In the current version we showed a series of representative data for simplicity, with an explanation in the text that the experiments were conducted multiple times; in a revised version we will improve the presentation so that readers and reviewers can be convinced of the reproducibility of the data. We will also try to test other markers to look into the cell types constituting the gut contractile organoid.

      Specifics:

      Our provisional responses to “The weakness” raised by the reviewers are as follows:

      Reviewer #1:

      Please see the responses shown above (“Generals”).

      Reviewer #2:

      In addition to the responses in “Generals”, our response also includes the following: We will look into the wavelength between contractions/rhythm of the organoid. We agree that our organoids derived from embryonic hindgut (E15) might not necessarily recapitulate cell function in the adult. However, it is well accepted in the field of developmental biology that studies with embryonic tissue/cells make a huge contribution to unveiling how complicated physiological cell function is underpinned. Nevertheless, we will take care in the revised version that the MS does not send misleading messages. Recent advances have also shown that 3D organoids can somehow “replace/substitute for” a complicated in vivo specimen when a particular cellular function is the focus of study.

      Reviewer #3:

      We appreciate the strong support of our findings.

      (1) We plan to perform positive control experiments, for example, to test whether the drugs we use interfere with cardiac muscle function.

      (2) We plan to do a wash-out experiment to confirm that 10 µM blebbistatin does not kill the cells. Thank you for this suggestion.

      (3) We plan to conduct tetrodotoxin treatment. Since experiments with such toxic reagents are not encouraged by our institute, we will perform the experiments with the minimum amount necessary.

      (4) We plan to address this point properly.

      (5) It is readily predictable that blebbistatin would stop gut movement in an explanted hindgut, and it is also well established that gut contractions (movements) are concomitant with Ca2+ transients. It would indeed be interesting to see how GJ inhibitors affect such in vivo gut movement. However, since, as all the reviewers and the Reviewing Editor pointed out, the sensitivity (concentration) and penetrance of the drug are an important point of concern, we think that the in vivo analyses will be a next step in the near future.

      (6) We have indeed noticed that contraction frequency is reduced after organoidal fusion. It seems as if cells communicate with each other to decide which rhythm they need to adjust to. Furthermore, contraction frequency tends to slow down when the organoid becomes larger in size. This might be attributed to a delay in conductance between cells over a growing distance. We plan to either quantify these potentially interesting phenomena or offer a concise speculation in the revised version.

      (7)-(10) Thank you for these comments. We will fix them.

    1. Author response:

      Public Reviews: 

      Reviewer #1 (Public Review): 

      In the article by Dearlove et al., the authors present evidence in strong support of nucleotide ubiquitylation by DTX3L, suggesting it is a promiscuous E3 ligase with capacity to ubiquitylate ADP ribose and nucleotides. The authors include data to identify the likely site of attachment and the requirements for nucleotide modification. 

      While this discovery potentially reveals a whole new mechanism by which nucleotide function can be regulated in cells, there are some weaknesses that should be considered. Is there any evidence of nucleotide ubiquitylation occurring in cells? It seems possible, but evidence in support of this would strengthen the manuscript. The NMR data could also be strengthened, as the binding interface is not reported or mapped onto the structure/model; this seems of considerable interest given that highly related proteins do have the same activity.

      The paper is for the most part well-written and is potentially highly significant, but it could be strengthened as follows:

      (1) The authors start out by showing DTX3L binding to nucleotides and ubiquitylation of ssRNA/DNA. While ubiquitylation is subsequently dissected and ascribed to the RD domains, the binding data is not followed up. Does the RD protein alone bind to the nucleotides? Further analysis of nucleotide binding is also relevant to the Discussion where the role of the KH domains is considered, but the binding properties of these alone have not been analysed. 

      We thank the reviewer for the suggestion. We have tested DTX3L RD for ssDNA binding using NMR (see Figure 4A and Figure S2), which showed that DTX3L RD binds ssDNA. We also tested the DTX3L KH domains for RNA/ssDNA binding using an FP experiment. However, the FP experiment did not show significant changes upon titrating RNA/ssDNA. It seems that the KH domains alone are not sufficient to bind RNA/ssDNA and that both the KH and RD domains are required for binding. Understanding how DTX3L binds RNA/ssDNA is the subject of ongoing research in the lab. We will revise the Discussion on the KH domains.

      (2) With regard to the E3 ligase activity, can the authors account for the apparent decreased ubiquitylation activity of the 232-C protein in Figure 1/S1 compared to FL and RD? 

      We will address this question in the revision.

      (3) Was it possible to positively identify the link between Ub and ssDNA/RNA using mass spectrometry? This would overcome issues associated with labels blocking binding rather than modification. 

      We have tried to use mass spectrometry to detect the linkage between Ub and ssDNA/RNA, but were unable to do so. We suspect that the oxyester linkage might be labile, posing a challenge for mass spectrometry techniques. Similarly, a recent preprint from the Ahel lab, which utilises LC-MS, detects the Ub-NMP product rather than the linkage (https://www.biorxiv.org/content/10.1101/2024.04.19.590267v1.full.pdf).

      (4) Furthermore, can a targeted MS approach be used to show that nucleotides are ubiquitylated in cells? 

      This will require future development and improvement of the MS approach, specifically the isolation of labile oxyester-linked products from cells and the optimisation of the MS detection method.

      (5) Do the authors have the assignments (even partial?) for DTX3L RD? In Figure 4 it would be helpful to identify the peaks that correspond to the residues at the proposed binding site. Also do the shifts map to a defined surface or do they suggest an extended site, particularly for the ssDNA.

      We only collected HSQC spectra, which were insufficient for assignments. We have performed a competition experiment using ADPr and labelled ssDNA, showing that ADPr competes against the ubiquitination of ssDNA (Figure 4D). We will provide an additional experiment showing that ssDNA with a blocked 3’-OH can compete against ubiquitination of ADPr. These data, together with our NMR analysis, will further strengthen the evidence that ssDNA and ADPr compete for the same binding pocket in DTX3L RD. Understanding how DTX3L RD binds ssDNA/RNA is the subject of ongoing research in the lab.

      (6) Does sequence analysis help explain the specificity of activity for the family of proteins? 

      We will perform a sequence alignment of the RD domains of the DTX proteins and discuss this point in the revision.

      (7) While including a summary mechanism (Figure 5I) is helpful, the schematic included does not necessarily make it easier for the reader to appreciate the key findings of the manuscript or to account for the specificity of activity observed. While this figure could be modified, it might also be helpful to highlight the range of substrates that DTX3L can modify - nucleotide, ADPr, ADPr on nucleotides etc. 

      We will modify this Figure as suggested.

      Reviewer #2 (Public Review): 

      Summary: 

      The manuscript by Dearlove et al. entitled "DTX3L ubiquitin ligase ubiquitinates single-stranded nucleic acids" reports a novel activity of a DELTEX E3 ligase family member, DTX3L, which can conjugate ubiquitin to the 3' hydroxyl of single-stranded oligonucleotides via an ester linkage. The findings that unmodified oligonucleotides can act as substrates for direct ubiquitylation and the identification of DTX3 as the enzyme capable of performing such oligonucleotide modification are novel, intriguing, and impactful because they represent a significant expansion of our view of the ubiquitin biology. The authors perform a detailed and diligent biochemical characterization of this novel activity, and key claims made in the article are well supported by experimental data. However, the studies leave room for some healthy skepticism about the physiological significance of the unique activity of DTX3 and DTX3L described by the authors because DTX3/DTX3L can also robustly attach ubiquitin to the ADP ribose moiety of NAD or ADP-ribosylated substrates. The study could be strengthened by a more direct and quantitative comparison between ubiquitylation of unmodified oligonucleotides by DTX3/DTX3L with the ubiquitylation of ADP-ribose, the activity that DTX3 and DTX3L share with the other members of the DELTEX family. 

      Strengths: 

      The manuscript reports a novel and exciting observation that ubiquitin can be directly attached to the 3' hydroxyl of unmodified, single-stranded oligonucleotides by DTX3L. The study builds on the extensive expertise and the impactful previous studies by the Huang laboratory of the DELTEX family of E3 ubiquitin ligases. The authors perform a detailed and diligent biochemical characterization of this novel activity, and all claims made in the article are well supported by experimental data. The manuscript is clearly written and easy to read, which further elevates the overall quality of submitted work. The findings are impactful and will help illuminate multiple avenues for future follow-up investigations that may help establish how this novel biochemical activity observed in vitro may contribute to the biological function of DTX3L. The authors demonstrate that the activity is unique to the DTX3/DTX3L members of the DELTEX family and show that the enzyme requires at least two single-stranded nucleotides at the 3' end of the oligonucleotide substrate and that the adenine nucleotide is preferred in the 3' position. Most notably, the authors describe a chimeric construct containing RING domain of DTX3L fused to the DTC domain DTX2, which displays robust NAD ubiquitylation, but lacks the ability to ubiquitylate unmodified oligonucleotides. This construct will be invaluable in the future cell-based studies of DTX3L biology that may help establish the physiological relevance of 3' ubiquitylation of nucleic acids. 

      Weaknesses: 

      The main weakness of the study is in the lack of direct evidence that the ubiquitylation of unmodified oligonucleotides reported by the authors plays any role in the biological function of DTX3L. The study leaves plenty of room for natural skepticism regarding the physiological relevance of the reported activity, because, akin to other DELTEX family members, DTX3 and DTX3L can also catalyze attachment of ubiquitin to NAD, ADP ribose and ADP-ribosylated substrates. Unfortunately, the study does not offer any quantitative comparison of the two distinct activities of the enzyme, which leaves plenty of room for doubt. One is left wondering, whether ubiquitylation of unmodified oligonucleotides is just a minor and artifactual side activity owing to the high concentration of the oligonucleotide substrates and E2~Ub conjugates present in the in-vitro conditions and the somewhat lower specificity of the DTX3 and DTX3L DTC domains (compared to DTX2 and other DELTEX family members) for ADP ribose over other adenine-containing substrates such as unmodified oligonucleotides, ADP/ATP/dADP/dATP, etc. The intriguing coincidence that DTX3L, which is the only DTX protein capable of ubiquitylating unmodified oligonucleotides, is also the only family member that contains nucleic acid interacting domains in the N-terminus, is suggestive but not compelling. A recently published DTX3L study by a competing laboratory (PMID: 38000390), which is not cited in the manuscript, suggests that ADP-ribose-modified nucleic acids could be the physiologically relevant substrates of DTX3L. That competing hypothesis appears more convincing than ubiquitylation of unmodified oligonucleotides because experiments in that study demonstrate that ubiquitylation of ADP-ribosylated oligos is quite robust in comparison to ubiquitylation of unmodified oligos, which is undetectable. 
      It is possible that the unmodified oligonucleotides in the competing study did not have adenine in the 3' position, which may explain the apparent discrepancy between the two studies. In summary, a quantitative comparison of ubiquitylation of ADP ribose vs. unmodified oligonucleotides could strengthen the study.

      We thank the reviewer for the constructive feedback. We agree that evidence for the biological function is lacking. While we have tried to detect Ub-ssDNA/RNA from cells, we found that isolating and detecting labile oxyester-linked Ub-ssDNA/RNA products remains challenging due to (1) low levels of Ub-ssDNA/RNA products, (2) the presence of DUBs and nucleases that rapidly remove the products during the experiments, and (3) our lack of a suitable MS approach to detect the product. For these reasons, we feel that discovering the biological function will require future effort and expertise and is beyond the scope of our current manuscript.

      In the manuscript (PMID: 38000390), the authors used PARP10 to catalyse ADP-ribosylation onto 5’-phosphorylated ssDNA/RNA. They used the following sequences, which lack a 3’-adenosine; this could explain the lack of ubiquitination.

      E15_5′P_RNA [Phos]GUGGCGCGGAGACUU

      E15_5′P_DNA [Phos]GTGGCGCGGAGACTT

      We will perform the experiment using these sequences to verify this. We have cited this manuscript, but for some reason PubMed has updated its publication date from mid-2023 to Jan 2024. We will update the EndNote entry in the revised manuscript.
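
      As a quick check of the point about the 3’ end, the short sketch below reports the 3’-terminal base of each quoted oligonucleotide. The function names and the dictionary layout are illustrative assumptions and come from neither study; the `[Phos]` prefix is treated only as a 5’ modification tag and is stripped before the last base is read.

```python
import re

# Sequences exactly as quoted in the response above.
# "[Phos]" denotes the 5' phosphate and is not part of the base sequence.
OLIGOS = {
    "E15_5′P_RNA": "[Phos]GUGGCGCGGAGACUU",
    "E15_5′P_DNA": "[Phos]GTGGCGCGGAGACTT",
}


def three_prime_base(oligo: str) -> str:
    """Return the 3'-terminal base after removing bracketed modification tags."""
    bases = re.sub(r"\[[^\]]*\]", "", oligo).strip().upper()
    if not bases:
        raise ValueError("no bases found in oligo string")
    return bases[-1]


def ends_in_adenine(oligo: str) -> bool:
    """True if the oligonucleotide carries A at its 3' end (hypothetical helper)."""
    return three_prime_base(oligo) == "A"


if __name__ == "__main__":
    for name, seq in OLIGOS.items():
        print(name, three_prime_base(seq), ends_in_adenine(seq))
```

      Both sequences end in U or T rather than A, consistent with the statement that they lack a 3’-adenosine. Whether this accounts for the lack of ubiquitination remains to be tested experimentally.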

      We agree that it is crucial to compare ubiquitination of oligonucleotides and ADPr by DTX3L to find its preferred substrate. We have challenged oligonucleotide ubiquitination by adding excess ADPr and found that ADPr efficiently competes with oligonucleotide (Figure 4D). We will perform more thorough competition experiments by titrating with increasing molar excess of either ADPr or ssDNA to examine the effect on the ubiquitination of ssDNA and ADPr, respectively.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Recommendations For The Authors):

      Inclusion of other catalase, peroxidase or superoxide dismutase gene promoters (with ChIP-seq screenshots) and whether they contain sntB binding sites is important to provide other potential downstream pathways controlling oxidative stress mediated regulation of development and aflatoxin metabolism. This can be presented as supplementary material.

      or

      Some more examples of ChIP-seq peaks in the promoters of nsdC, nsdD, sclR, steA, wetA, veA, fluG, sod2, catA, catC would strengthen the paper for the reliability of the ChIP-seq data. Currently, visualisation of the ChIP-seq data is limited to the catC gene promoter, where background ChIP-seq signals are very high (Figure 5F).

      The binding regions and motifs of SntB on the catA, catB, sod1, and sod2 genes are shown in Figure S7 and described in lines 531-536 and 881-884. The background of the ChIP-seq signals is high, but the enrichment level in the ip-sntB-HA samples is significant compared to IP-WT.

      Figure 5F, letters are too small, and difficult to read. The same is true for Figure 4. Letters should be enlarged for the readers to read it without problem.

      Thanks. We have revised Figure 5F and Figure 4. Please see these Figures.

      Reviewer #2 (Recommendations For The Authors):

      The authors fully addressed my concerns and made appropriate changes in the manuscript. The quality of the manuscript is now improved.

      Thanks. We would like to express our sincere gratitude for your affirmation and thoughtful feedback. Your positive comments have been extremely encouraging and have strengthened our confidence in our work. Your time and effort in reviewing our submission are greatly appreciated.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) Only one PITAR siRNA was tested in the majority of the experiments, which compromises the validity of the results.

      We thank the reviewer for this comment. We have now used two siRNAs to demonstrate PITAR functions in various assays. In the revised manuscript, we carried out additional experiments with two siRNAs, and the results are presented in Figures 2C, D, F, G, H, I, and J; Figures 5A, B, Supplementary Figure 2B, C, D, E, and F.

      (2) Some results are inconsistent. For example, Fig 2G indicates that PITAR siRNA caused G1 arrest. However, PITAR overexpression in the same cell line did not show any effect on cell cycle progression in Fig 5I.

      PITAR silencing showed a robust G1 arrest, unlike PITAR overexpression, for the following reason. Since glioma cells overexpress PITAR (which keeps p53 suppressed), silencing PITAR (which elevates p53 levels) in glioma cells produces a robust phenotype in the cell cycle profile (in the form of increased G1 arrest). In contrast, overexpression of PITAR in glioma cells fails to produce robust changes in the cell cycle profile because glioma cells already have high levels of PITAR.

      (3) The conclusion that PITAR inactivates p53 through regulating TRIM28, which is highlighted in the title of the manuscript, is not supported by convincing results. Although the authors showed that a PITAR siRNA increased while PITAR overexpression decreased p53 level, the siRNA only marginally increased the stability of p53 (Fig 5E). The p53 ubiquitination level was barely affected by PITAR overexpression in Fig 5F.

      We disagree that PITAR silencing only marginally increased the stability of p53. In the cycloheximide experiment in Figure 5E, the half-life of p53 is increased by 60 % (50 mins to 120 mins), which is quite significant in altering the DNA damage response by p53. Further, we also want to point out that the other arm of p53 degradation, by Mdm2, remains intact under these conditions. We also provide an improved p53 ubiquitination western blot in the revised version (Figure 5F).

      (4) To convincingly demonstrate that PITAR regulates p53 through TRIM28, the authors need to show that this regulation is impaired/compromised in TRIM28-knockout conditions. The authors only showed that TRIM28 overexpression suppressed PITAR siRNA-induced increase of p53, which is not sufficient.

      We thank the reviewer. In the revised manuscript, we demonstrate that PITAR overexpression fails to inhibit p53 in TRIM28 silenced cells (Supplementary Figure 5G; Figure 5K, L, M, N).

      (5) Note that only one cell line was investigated in Fig 5.

      In the revised manuscript, the impact of PITAR silencing and PITAR overexpression on p53 functions is demonstrated for one more glioma cell line (Supplementary Figure 5B, C, D, and E).

      (6) Another major weakness of this manuscript is that the authors did not provide any evidence indicating that the glioblastoma-promoting activities of PITAR were mediated by its regulation of p53 or TRIM28 (Fig 6 and Fig 7). Thus, the regulation of glioblastoma growth and the regulation of TRIM28/p53 appear to be disconnected.

      We would like to respectfully disagree with the reviewer on this particular point.  We have indeed provided the following evidence in the first version of the manuscript: glioblastoma-promoting activities of PITAR were mediated by its regulation of p53 or TRIM28.

      (1) To show the importance of p53:

      We show that PITAR silencing failed to inhibit the colony growth of p53-silenced U87 glioma cells (U87/shp53#1). We also show that while PITAR silencing decreased TRIM28 RNA levels in U87/shNT and U87/shp53#1 glioma cells, it failed to increase CDKN1A and MDM2 (p53 targets) at the RNA level in U87/shp53#1 cells unlike in U87/siNT cells (Supplementary Figure 6 Panels A, B, C, and D). 

      (2) To show the importance of TRIM28 and p53:

      The importance of p53 is also demonstrated in the context of patient-derived GSC lines. We demonstrate that PITAR silencing-induced reduction in the neurosphere growth (WT p53 containing patient-derived GSC line) is accompanied by a reduction in TRIM28 RNA and an increase in the CDKN1A RNA without a change in p53 RNA levels (Supplementary Figure 7 Panels A, B, C, D, and E). We also demonstrate that PITAR overexpression-induced neurosphere growth is accompanied by an increase in the TRIM28 RNA, and a decrease in CDKN1A RNA without a change in p53 RNA levels (Supplementary Figure 7 Panels F, G, H, and I). However, PITAR silencing failed to decrease neurosphere growth in mutant p53 containing GSC line (MGG8) (Supplementary Figure 7 Panels J, K, L, M, N, and F).

      (3) We show that the TRIM28 protein level is drastically reduced in small tumors formed by U87/siPITAR cells (Supplementary Figure 7 Panel E).

      (4) We show that glioma tumors formed by U87/PITAR OE cells express high levels of TRIM28 protein but reduced levels of p21 protein (Supplementary Figure 7 Panel B).

      Further, we did additional experiments to prove the importance of TRIM28.

      In the revised manuscript, we have carried out an additional experiment to prove the requirement of TRIM28 for tumor-promoting functions of PITAR overexpression. Earlier, we have shown that exogenous overexpression of PITAR promotes glioma tumor growth and imparts resistance to Temozolomide chemotherapy (Figure 7F and G; Supplementary Figure 9A and B). In the revised manuscript, we show that the tumor growth-promoting function of PITAR overexpression requires TRIM28. U87-Luc/PITAR OE cells formed a larger tumor compared to U87-Luc/VC cells (Figure 7H, and I; compare red line with blue line). U87-Luc/shTRIM28 cells formed very small-sized tumors (Figure 7H, and I; compare green line with blue line). Further, PITAR overexpression (U87-Luc/PITAR OE) was less efficient in promoting glioma tumor growth in TRIM28 silenced cells (Figure 7H, and I; compare pink line with red line). Thus, we prove that, as a whole, TRIM28 mediates the tumor growth-promoting functions of PITAR.

      (7) It is not clear what kind of message the authors tried to deliver in Fig 7F/G. Based on the authors' hypothesis, DNA-damaging agents like TMZ would induce PITAR to inactivate p53, which would compromise TMZ's anti-cancer activity. However, the data show that TMZ was very effective in the inhibition of U87 growth. The authors may need to test whether PITAR downregulation, which would increase p53 activity, have any effects on TMZ-insensitive tumors. Such results are more therapeutically relevant.

      Reviewer #1 rightly pointed out that TMZ induces PITAR expression, which should compromise TMZ's anti-cancer activity.

      We demonstrate this as follows:

      Figure 7F and G demonstrate the following two facts: (1) PITAR overexpression increases glioma tumor growth (Figure 7G, compare the red line with the blue line); (2) PITAR-overexpressing glioma tumors are resistant to TMZ chemotherapy (Figure 7G, compare the pink line with the green line).

      In addition, Figure 7F and G also demonstrate that TMZ treatment of tumors formed by U87/VC glioma cells inhibited tumor growth but did not eliminate it completely (compare the pink line with the blue line). We believe that the inability of TMZ to eliminate tumor growth completely is due to the chemoresistance imparted by DNA damage-induced PITAR.

      Further, in Figure 2I, we indeed show that PITAR-silenced cells are more sensitive to TMZ and Adriamycin chemotherapy.

      (8) Lastly, the model presented in Fig 7H is confusing. It is not clear what the exact role of PITAR in the DNA damage response is based on this model. If DNA damage would induce PITAR expression, this would lead to inactivation of p53 as revealed by this manuscript. However, DNA damage is known to activate p53. Do the authors want to imply that PITAR induction by DNA damage would help to bring down the p53 level at the end of DNA damage response? The presented data do not support this role unfortunately.

      We respect the views and questions raised by the reviewer.

      We would like to explain the importance of our model below.

      Yes, it is true that DNA damage induces p53. We show here that DNA damage also induces PITAR in a p53-independent manner, which, in turn, inhibits p53. Here is our explanation. Even though DNA damage activates p53, there exists an autoregulatory negative feedback loop that controls the extent and duration of the p53 response to DNA damage (Wu et al., 1993; Haupt et al., 1997; Kubbutat, Jones and Vousden, 1997; Zhang et al., 2009). It is proposed that the p53-Mdm2 feedback loop generates a “digital clock” that releases well-timed quanta of p53 until the damage is repaired or the cell dies (Lahav et al., 2004). In addition, it has also been shown that TRIM28, through its association with Mdm2, also contributes to p53 inactivation (Wang et al., 2005b; Czerwińska, Mazurek, and Wiznerowicz, 2017).

      Based on the above reports and our current work, we propose that DNA damage-induced PITAR, through its ability to increase TRIM28 levels, contributes to the control of the DNA damage response of p53 along with Mdm2. The difference is as follows: since Mdm2 is also a transcriptional target of p53, the p53-Mdm2 axis is an autoregulatory negative feedback loop that controls the DNA damage response by p53. In contrast, PITAR is not a transcriptional target of p53, and DNA damage-induced activation of PITAR is p53-independent. Hence, the PITAR-TRIM28 axis in controlling the DNA damage response of p53 creates an incoherent feedforward regulatory network. The experimental evidence provided in the revised manuscript is as follows: 1) We have already (in the first version of the manuscript) shown that exogenous overexpression of PITAR significantly inhibits DNA damage-induced p53 (Figures 6A, B, C, and D). 2) In the revised manuscript, we show that the DNA damage response of p53 (duration and extent of p53 activation after a pulse of ionizing radiation) in PITAR-silenced cells follows similar kinetics in terms of duration, but the extent of p53 activation is much stronger (Supplementary Figures 8H, I, J, and K). This is because the TRIM28 component of the TRIM28/Mdm2 axis is compromised, as PITAR silencing reduces TRIM28 levels. 3) We also demonstrate that DNA damage-induced TRIM28 is dependent on PITAR (Figure 6K; Supplementary Figure 5G).

      Reviewer #1 (Recommendations For The Authors):

      (1) Fig 7A, what is the explanation for the observation that tumors disappeared in most of the mice in the siPITAR group? Did the authors check if apoptosis was induced here?

      We agree that the lack of tumor growth in the siPITAR group is likely due to the induction of apoptosis. We would like to point out that our in vitro experiments do demonstrate that PITAR silencing induces apoptosis (Figure 2H and Supplementary Figure 2F).

      (2) The authors need to explain why Fig 6 used a cell line different from other experiments. It would be better to check other cell lines.

      The purpose of RG5 and MGG8 is as follows. 1) We wanted to establish the growth-promoting functions of PITAR in patient-derived GSC lines. 2) We also wanted to show the importance of WT p53 for the growth-promoting functions of PITAR.

      However, in the revised manuscript we moved this portion under the subsection “PITAR inhibits p53 protein levels by its association with TRIM28 mRNA”.

      Further, the experiments related to the DNA damage-induced activation of PITAR in a p53-independent manner and its impact on the DNA damage response by p53 have been moved to a new section entitled “PITAR is induced by DNA damage in a p53-independent manner, which in turn diminishes the DNA damage response by p53”.

      (3) It would be more convincing if the authors could test more p53 target genes in addition to p21.

      We thank the reviewer for this comment and the specific suggestions for checking additional p53 targets. In the revised manuscript, we have checked the MDM2 transcript levels in Supplementary Figure 6D. 

      Reviewer #2 (Recommendations For The Authors):

      (1) In the text, they mentioned " Figure 4J". There is no Figure 4J in Figure 4. It may be Figure 4K.

      We thank reviewer #2. We corrected this information in the revised manuscript.

      (2) The molecular weight markers in Western blots were missed in several Figure panels, including Figure 4J, Figure 5K, and Supple. Figure 3B, Supple. Figure 5G, H, Supple. Figures 6A and 7A.

      We thank reviewer #2, and we have included the molecular weight markers in all the mentioned figures.

    1. We would like to thank you and the reviewers for your thoughtful comments, which helped us improve the manuscript. We carefully followed the reviewers’ recommendations and provide a detailed point-by-point account of our responses to the comments.

      Please find below the important changes in the updated manuscript.

      (1) We changed the title according to the comments provided by reviewer #1.

      (2) We edited the introduction, results, and discussion to improve the link between the objectives of the study, the findings, and their discussion, as reviewer #2 recommended.

      (3) We clarified the link between camouflage and fitness, which is now presented as a hypothesis, as reviewer #1 suggested.

      (4) We added new analyses and figures in the main text and in the supplementary materials to better emphasize sex differences in landing force, foraging strategies and hunting success, following reviewer #1's suggestion.

      (5) Following reviewer #2's comments, we edited the results, adding key information about methods to help the reader understand the findings without reading the Methods section.

      (6) We added important details about the model selection approach along with a discussion of the low R-square values reported in our analyses on hunting success, as reviewer #2 suggested.

      eLife assessment 

      This fundamental work substantially advances our understanding of animals' foraging behaviour, by monitoring the movement and body posture of barn owls in high resolution, in addition to assessing their foraging success. With a large dataset, the evidence supporting the main conclusions is convincing. This work provides new evidence for motion-induced sound camouflage and has broad implications for understanding predator-prey interactions. 

      Public Reviews: 

      Reviewer #1 (Public Review): 

      In this paper, Schalcher et al. examined how barn owls' landing force affects their hunting success during two hunting strategies: strike hunting and sit-and-wait hunting. They tracked tens of barn owls that raised their nestlings in nest boxes and utilized high-resolution GPS and acceleration loggers to monitor their movements. In addition, camcorders were placed near their nest boxes and used to record the prey they brought to the nest, thus measuring their foraging success. 

      This study generated a unique dataset and provided new insights into the foraging behavior of barn owls. The researchers discovered that the landing force during hunting strikes was significantly higher compared to the sit-and-wait strategy. Additionally, they found a positive relationship between landing force and foraging success during hunting strikes, whereas, during the sit-and-wait strategy, there was a negative relationship between the two. This suggests that barn owls avoid detection by generating a lower landing force and producing less noise. Furthermore, the researchers observed that environmental characteristics affect barn owls' landing force during sit-and-wait hunting. They found a greater landing force when landing on buildings, a lower landing force when landing on trees, and the lowest landing force when landing on poles. The landing force also decreased as the time to the next hunting attempt decreased. These findings collectively suggest that barn owls reduce their landing force as an acoustic camouflage to avoid detection by their prey. 

      The main strength of this work is the researchers' comprehensive approach, examining different aspects of foraging behavior, including high-resolution movement, foraging success, and the influence of the environment on this behavior, supported by impressive data collection. The weakness of this study is that the results only present a partial biological story contained within the data. The focus is on acoustic camouflage without addressing other aspects of barn owls' foraging strategy, leaving the reader with many unanswered questions. These include individual differences, direct measurements of owls' fitness, a detailed analysis of the foraging strategy of males and females, and the collective effort per nest box. However, it is possible that these data will be published in a separate paper. 

      We greatly appreciate your recognition of the comprehensive approach and extensive data collection. Our primary objective was to study the role of acoustic camouflage. Nonetheless, the manuscript now includes a detailed analysis of the foraging strategy and hunting success of males and females (lines 164-225).

      The results presented support the authors' conclusion that lower landing force during sit-and-wait hunting increases hunting success, likely due to a decreased probability of detection by their prey, resulting in acoustic camouflage. The authors also argue that hunting success is crucial for survival, and thus, acoustic camouflage has a direct link to fitness. While this statement is reasonable, it should be presented as a hypothesis, as no direct evidence has been provided here.

      Thank you for the comment. We agree and thus have edited the language accordingly.  

      However, since information about nestling survival is typically monitored when studying behavior during the breeding period, the authors' knowledge of the effect of acoustic camouflage on owls' fitness can probably be provided. Furthermore, it will be interesting to further examine the foraging strategies used by different individuals during foraging, the joint foraging success of both males and females within each nest box, and the link between landing force and foraging success if the data are available.

      We are currently writing a manuscript on these topics. We are aware that several scientific questions regarding the foraging ecology of the barn owl still need our attention. Regarding the link between landing force and foraging success, we believe that our revised manuscript addresses this specific topic; please see the specific responses below.

      However, even without this additional analysis on survival, this paper provides an unprecedented dataset and the first measurement of landing force during hunting in the wild. It is likely to inspire many other researchers currently studying animal foraging behavior to explore how animals' movements affect foraging success.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors provide new evidence for motion-induced sound camouflage and can link the hunting approach to hunting success (detailing the adaptation and inferring a fitness consequence). 

      Strengths: 

      Strong evidence by combining high-resolution accelerometer data with a ground-truthed data set on prey provisioning at nest boxes. A good set of co-variates to control for some of the noise in the data provides some additional insights into owl hunting attempts. 

      Weaknesses: 

      There is a disconnect between the hypotheses tested and the results presented, and insufficient detail is provided on the statistical approach. R2 values of the presented models are very small compared to the significance of the effect presented. Without more detail, it is impossible to assess the strength of the evidence.

      In the revised manuscript, we changed the way results are presented and we improved the link between the hypotheses and the results. The R2 values are indeed small. It is however important to keep in mind that we are assessing the outcome of one specific behavior (i.e. landing force during sit-and-wait hunts) on hunting success in a wild environment, where many complex ecological interactions likely influence hunting success. Nonetheless, the coefficients (as reported in the results) show that for every 1 N increase in landing force, there is a 15% reduction in hunting success, which is substantial. In the discussion we also note that 50 Hz is a relatively low sampling frequency for estimating the peak ground reaction force. We have gone back over the presentation of our results and made our discussion more nuanced to acknowledge this aspect. 

      We have also added a detailed description about our model selection process in the methods section and provide a model selection table for each analysis in the supplementary materials.

      The authors seem to overcome persisting challenges associated with the validation and calibration of accelerometer data by ground-truthing on-board measures with direct observations in captivity, but here the methods are not described any further and sample sizes (2 owls - how many different loggers were deployed?) might be too small to achieve robust behavioural classifications.

      Thank you for the comment. Details of our methods of behavioural identification are provided in lines 385 – 429. There are two reasons why our results should not be limited by the sample size. First, we used the temporal sequence of changes in acceleration, and rates of change in acceleration data, which make the methods robust to individual differences in acceleration values. Second, our methods for behavioural identification were not based on machine learning. Instead, we used a Boolean-based approach (as described in Wilson et al. 2018. MEE), which is more robust to small differences in absolute values that might occur, e.g., in relation to slight changes in device position.
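To make the idea of a Boolean rule on acceleration data concrete, here is a minimal Python sketch. It is not the authors' pipeline: the function names, the running-mean window of 100 samples, and the minimum of three peaks are hypothetical choices for illustration. Only the VeDBA band (above 0.2 g and below 0.9 g) is taken from the self-feeding description quoted in this response.

```python
import math


def running_mean(values, window):
    """Centered running mean, used here as a crude estimate of static acceleration."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def vedba(surge, sway, heave, window=100):
    """VeDBA in g: vector norm of raw minus static acceleration on each axis.

    The window length (100 samples) is an assumption, not a value from the paper.
    """
    axes = (surge, sway, heave)
    static = [running_mean(axis, window) for axis in axes]
    return [
        math.sqrt(sum((axis[i] - st[i]) ** 2 for axis, st in zip(axes, static)))
        for i in range(len(surge))
    ]


def count_peaks_in_band(signal, low, high):
    """Count local maxima whose value lies strictly between low and high."""
    return sum(
        1
        for i in range(1, len(signal) - 1)
        if signal[i - 1] < signal[i] >= signal[i + 1] and low < signal[i] < high
    )


def looks_like_self_feeding(vedba_values, low=0.2, high=0.9, min_peaks=3):
    """Boolean rule: several moderate VeDBA peaks and no sample at or above `high`.

    `min_peaks` is a hypothetical parameter chosen for this sketch.
    """
    enough_peaks = count_peaks_in_band(vedba_values, low, high) >= min_peaks
    no_large_peak = max(vedba_values) < high
    return enough_peaks and no_large_peak
```

Because the rule combines explicit conditions rather than fitted parameters, it does not need a large training sample, which is the property the response appeals to.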

      Recommendation for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Comment 1. This study provides new insights into animals' foraging behavior and will probably inspire other researchers to examine foraging behavior in such high resolution.

      We hope so, thank you.

      Comment 2. However, it is necessary to describe better the measured landing force and the hunting strike and perching behavior so the readers can understand these methods when reading the results (and without reading the Methods).

      We have now changed the text in the “Results” to help the reader understand the key methods while reading the results.

      Comment 3. In addition, make sure you use the same terminology for hunting strategies during the entire paper and especially in all figures and corresponding result descriptions.

      We now use consistent terminology throughout the text and figures. We hope that this is now clear in the revised manuscript.

      Comment 4. In addition, although I find your statement about the link between acoustic camouflage and fitness reasonable, it should be described as a hypothesis or examined if you want to keep the direct link statement. I believe showing a direct link can add an additional outstanding aspect to this paper, but I also understand that it can be addressed in a separate paper.

      We agree that the relationship between hunting success and barn owl fitness is an important topic, but it necessitates a consideration of both hunting strategies, including hunting on the wing, which extends beyond the limits of our current study. Indeed, our primary objective was to conduct a detailed examination of the interplay between acoustic camouflage and the success of the sit-and-wait technique.

      However, we have edited the manuscript to explicitly describe the link between acoustic camouflage and fitness as a hypothesis. We believe this adjustment provides a more accurate representation of our approach. We hope this clarifies the specific emphasis of our work and its contribution to the understanding of barn owl hunting behavior.

      Here are my detailed comments about the paper: 

      Comment 5. Title: Consider changing the title to "Acoustic camouflage predicts hunting success in a wild predator." 

      Thank you for this suggestion. However, we opted for a different title, which is now “Landing force reveals new form of motion-induced sound camouflage in a wild predator”.

      Comment 6. Line 91-93: Please provide additional information about the collected dataset, including: 

      Description of the total period of observations, an average and standard deviation of perching and hunting attempt events per individual per night, number of foraging trips per individual per night, details about the geographic location and characteristics of the habitat, season, and reproductive state. 

      The revised manuscript now includes detailed information about the collected dataset (e.g. study area, reproductive state). “We used GPS loggers and accelerometers to record high resolution movement data during two consecutive breeding seasons (May to August in 2019 and 2020) from 163 wild barn owls (79 males and 84 females) breeding in nest boxes across a 1,000 km² intensive agricultural landscape in the western Swiss plateau.” Results section, lines 79 – 82

      Details about the number of foraging trips per individual and per night are now presented in the results: “Sexual dimorphism in body mass was marked among our sampled individuals. Males were lighter than females (84 females, average body mass: 322 ± 22.6 g; 79 males, average body mass 281 ± 16.5 g, Fig S6) and provided almost three times more prey per night than females (males: 8 ± 5 prey per night; females: 3 ± 3 prey per night; Fig.S7). Males also displayed higher nightly hunting effort than females (Males: 46 ± 16 hunting attempts per night, n= 79; Females: 25 ± 11 hunting attempts per nights, n=84; Fig. 3A, Fig S8). However, females were more likely to use a sit and wait strategy than males (females: 24% ± 15%, males: 13% ± 10%, Fig.S9). As a result, the number of perching events per night was similar between males and females (Females: 76 ± 23 perching events per nights; Males: 69 ± 20 perching events per night; Fig S8).” (lines 165 – 174) 

      Comment 7. In addition, state if the information describes breeding pairs of males and females and provides statistics on the number of tracked pairs and the number of nest boxes.

      The revised manuscript now includes a description of the number of tracked breeding pairs and the number of nest boxes. “Of these individuals, 142 belonged to pairs for which data were recovered from both partners (71 pairs in total, 40 in 2019, 31 in 2020). The remaining 21 individuals belonged to pairs with data from one partner (11 females and 1 male in 2019; 4 females and 5 males in 2020).” (lines 82 – 85.)

      Comment 8. Line 93: Briefly define the term "landing force" and explain how it was measured (and let the reader know that there is a detailed description in the Methods).

      We now include a brief definition of the “landing force” along with a brief explanation of how it was measured in the results section. “We extracted the peak vectoral sum of the raw acceleration during each landing and converted this to ground reaction force (hereafter “landing force”, in Newtons) using measurements of individual body mass (see methods for detailed description).” (lines 92 – 95).
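The conversion described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: it assumes acceleration is recorded in units of g on three axes, uses standard gravity (9.81 m/s²) as the conversion factor, and applies F = m·a to the peak of the vector sum. Whether and how the static (gravitational) component was handled in the study is not stated here, so the sketch leaves the raw values untouched. All function and variable names are hypothetical.

```python
import math

STANDARD_GRAVITY = 9.81  # m/s^2; assumed factor for converting g to m/s^2


def peak_vector_sum(surge, sway, heave):
    """Peak of the vector sum (Euclidean norm) of raw tri-axial acceleration, in g."""
    return max(
        math.sqrt(x * x + y * y + z * z) for x, y, z in zip(surge, sway, heave)
    )


def landing_force_newtons(surge, sway, heave, body_mass_g):
    """Peak acceleration during a landing converted to force in newtons (F = m * a)."""
    peak_g = peak_vector_sum(surge, sway, heave)
    body_mass_kg = body_mass_g / 1000.0
    return body_mass_kg * peak_g * STANDARD_GRAVITY
```

For example, a hypothetical 300 g bird with a peak vector sum of 2 g would yield 0.3 × 2 × 9.81 ≈ 5.9 N under these assumptions.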

      Comment 9. Line 94: All definitions, including "pre-hunting force," need to be better described in the Results section.

      Thank you for this suggestion. We now provide a better description of those key definitions directly in the results section:

      Measurement of landing force: “Barn owls employing a sit-and-wait strategy land on multiple perches before initiating an attack, with successive landings reducing the distance to the target prey (Fig. 2C). 

      We used the acceleration data to identify 84,855 landings. These were further categorized into perching events (n = 56,874) and hunting strikes (n = 27,981), depending whether barn owls were landing on a perch or attempting to strike prey on the ground (Fig. 1A and B, see methods for specific details on behavioral classification).” (lines 88 – 95)

      Pre-hunt perching force predicts hunting success: “Finally, we analyzed whether the landing force in the last perching event before each hunting attempt (i.e. pre-hunt perching force) predicted variation in hunting success” (lines 229 – 230)

      Comment 10. Line 102: Remove "Our analysis of 27,981 hunting strikes showed that" and add "n = 27,981" after the statistics. You have already stated your sample size earlier. There is no need to emphasize it again, although your sample size is impressive.

      We modified the text in the results section as suggested.

      Comment 11. Line 104: The results so far suggest that the difference in landing force between males and females is an outcome of their different body masses. However, it is not clear what is the reason for the difference in the number of hunting strike attempts between males and females (Lines 104-106). Can you compare the difference in landing force between males and females with similar body mass (females from the lower part of the distribution and males from the upper part)? Is there still a difference?

      Thank you. Following your comment, we ran new analyses that clarify how the landing force involved in perching and hunting strike events differs between the sexes. First, however, we would like to clarify why the number of hunting attempts differs between males and females. During the breeding season, females typically perform most of the incubation, brooding, and feeding of nestlings in the nest, while the male primarily hunts food for the female and chicks. The female supports the male in providing food only irregularly, and the extent of this support varies from pair to pair (paper in prep.). The difference in the number of hunting attempts between males and females reflects this asymmetry in food provisioning between sexes during this specific period. We specified this in the revised version of the manuscript (lines 164 – 174).

      We also provide a new analysis to investigate sex differences in mass-specific landing force (force/body mass). We found that males and females produce similar force per unit of body mass during perching events. This demonstrates that the overall higher perching force in females (see Fig. 4C in the manuscript) is driven by their higher body mass (lines 194 – 199).
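The arithmetic behind this argument can be illustrated with a short Python sketch. The mean body masses (322 g for females, 281 g for males) are those reported in this response; the force per unit mass of 20 N/kg is a hypothetical value chosen only to show the effect, and the function names are likewise illustrative.

```python
def mass_specific_force(force_n, body_mass_g):
    """Landing force divided by body mass, in newtons per kilogram."""
    return force_n / (body_mass_g / 1000.0)


def absolute_force(force_per_kg, body_mass_g):
    """Absolute landing force in newtons for a given force per unit mass."""
    return force_per_kg * (body_mass_g / 1000.0)


# Mean body masses reported in the response (grams).
FEMALE_MASS_G = 322
MALE_MASS_G = 281

# Hypothetical, identical force per unit mass for both sexes (N/kg).
SHARED_FORCE_PER_KG = 20.0

female_force = absolute_force(SHARED_FORCE_PER_KG, FEMALE_MASS_G)
male_force = absolute_force(SHARED_FORCE_PER_KG, MALE_MASS_G)

# With equal mass-specific force, the ratio of absolute forces equals the mass ratio.
force_ratio = female_force / male_force
```

Under this assumption the female-to-male ratio of absolute force is simply 322/281, roughly 1.15, regardless of the value chosen for the shared force per unit mass.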

      Comment 12. Line 154: I believe Boonman et al. (2018) is relevant to this part of the discussion. Boonman, Arjan, et al. found that barn owl noise during landing and taking off is worth considering. ["The sounds of silence: barn owl noise in landing and taking off." Behavioral Processes 157 (2018): 484-488.]

      We now cite this paper in the discussion.

      Comment 13. Line 164: Your results do not directly demonstrate a link to fitness, although they potentially serve as a proxy for fitness (add a reference). However, you might have information regarding nestlings' survival - that will provide a direct link for fitness. Change your statement or add the relevant data.

      We appreciate your feedback and have adjusted the language accordingly.

      Comment 14. Line 213: If the poles are closer to the ground - is it possible that the higher trees and buildings serve for resting and gathering environmental information over greater distances? For example, identifying prey at farther distances or navigating to the next pole?

      Yes, this is indeed the most likely explanation for the fact that owls land more on buildings and trees than on poles until the last period (about 6 minutes) before hunting. In these last minutes, barn owls preferentially use poles, as we showed in figure 2B. The revised manuscript now includes this explanation in the discussion (lines 269 – 284).

      Comment 15. Line 250: The product "AXY-Trek loggers" does not appear on the Technosmart website (there are similar names, but not an exact match). Are you sure this is the correct name of the tracking device you used? 

      Thank you for pointing out this detail that we missed. The device we used is now called "AXY-Trek Mini" (https://www.technosmart.eu/axy-trek-mini/). We have corrected this error directly in the revised manuscript.

      Comment 16. Line 256: Please explain how the devices were recovered. Did you recapture the animals? If so, how? Additionally, replace "after approximately 15 days" with the exact average and standard deviation. Furthermore, since you have these data, please state the difference in body mass between the two measurements before and after tagging.

      The birds were recaptured to recover the devices. Adult barn owls were recaptured at their nest sites, again using automatic sliding traps that are activated when birds enter the nest box. The statement "after approximately 15 days" was replaced by the exact mean and standard deviation, which were 10.47 ± 2.27 days. Those numbers exclude five of the 163 individuals included in this study. They could not be recaptured in the appropriate time window but were re-encountered when they initiated a second clutch later in the season (4 individuals) or a new clutch the year after (1 individual).

      We integrated this previously missing information in the revised manuscript (lines 370 – 372).

      Comment 17. Line 259: What was the resolution of the camera? What were the recording methods and schedule? How did you analyze these data? 

      The resolution was set to 3.1 megapixels. Motion-sensitive camera traps were installed at the entrance to each nest box throughout the period when the barn owls were wearing data loggers, and each movement detected triggered a burst of three photos. The photos were not analyzed as such for this study, but were used to confirm each prey delivery that had previously been detected from the accelerometer data. We added these details in the revised manuscript (lines 377 – 380).

      Comment 18_1. Figure 1: 

      Panel A) Include the sex of the described individual. 

      The sex of the described individual is now included in the figure caption.

      Comment 18_2. It would be interesting to show these data for both males and females from the same nest box (choose another example if you don't have the data for this specific nest box). 

      Although we agree that showing tracks of males and females from the same nest is very interesting, the purpose of this figure was to illustrate our data annotation process and we believe that adding too many details on this figure will make it appear messy. However, the revised manuscript now includes a new figure (Fig. 3A) which shows simultaneous GPS tracks of a male and a female during a complete night, with detailed information about perching and hunting behaviors.

      Comment 18_3. Add the symbol of the nest box to the legend. 

      Done

      Comment 18_4. Provide information about the total time of the foraging trip in the text below. 

      The duration of the illustrated foraging trip has been included in the figure caption.

      Comment 18_5. To enhance the figure’s information on foraging behavior, consider color coding the trajectory based on time and adding a background representing the landscape. Since this paper may be of interest to researchers unfamiliar with barn owl foraging behavior, it could answer some common questions. 

      For reasons similar to those explained in our answer above (Comment 18_2), we would rather keep this figure as clean as possible. However, we followed your recommendations and included these details in the new Figure 3 described above. In this new figure, GPS tracks are color coded according to the foraging trip number, and a background represents the landscape. To provide even more detail about the landscape, we added another figure in the supplementary materials (Fig. S2), which illustrates barn owl foraging grounds and nest sites and which we think might be of interest to people unfamiliar with barn owls.

      Comment 18_6. Inset panels) provide a detailed description of the acceleration insert panels. 

      Done

      Comment 18_7. Color code the acceleration data with different colors for each axis, add x and y axes with labels, and ensure the time frame on the x-axis is clear. How was the self-feeding behavior verified (should be described in the methods section)? 

      We kept both inset panels as simple as possible since they serve here as examples, but a complete representation of these behaviors (with time frame, different colors and labels) is provided in the supplementary materials (figure S3). We included this statement in the figure caption and added a reference to the full representations from the supplementary materials: 

      In the Figure caption: “Inset panels show an example of the pattern of the tri-axial acceleration corresponding to both nest-box return and self-feeding behaviors (but see Fig S3 for a detailed representation of the acceleration pattern corresponding to each behavior).” 

      In the Methods section: “Self-feeding was evident from multiple and regular acceleration peaks in the surge and heave axes (resulting in peaks in VeDBA values > 0.2 g and < 0.9 g, Fig.S3D), with each peak corresponding to the movement of the head as the prey was swallowed whole.”.

      Comment 18_8. Panel B) Note in the caption that you refer to the acceleration z-axis.

      We believe that keeping the statement “the heave acceleration…” in the figure caption is more informative than referring to the “z-axis” as it describes the real dimension to which we are referring. The use of the x, y and z axes can be misleading as they can be interchanged depending on the type and setting of recorders used.

      Comment 18_9. Present the same time scale for both hunting strategies to facilitate comparison. You can achieve this by showing only part of the flight phase before perching. 

      Done

      Comment 18_10. Panel C) Presenting the data for both hunting strategy and sex would provide more comprehensive information about the results and would be relatively easy to implement. 

      We agree with your comment. We present the differences in landing force for both landing contexts and sexes in the new Figure 3 as well as in the supplementary materials (Figure S10) of this revised manuscript.

      Comment 19. Figure 2: Please provide an explanation of the meaning of the circles in the figure caption.  

      Done

      Comment 20. Figure 3: 

      Panel A) It is unclear how the owl illustration is relevant to this specific figure, unlike the previous figures where it is clear. Also, suggest removing the upper black line from the edge of the figure or add a line on the right side. 

      Done (now in Figure 2).

      Panel B) "Density" should be capitalized. 

      Done

      Panel C) Add a scale in meters, and it would be helpful to include an indication of time before hunting for each data point. 

      Done

      Comment 21. Figure S1: Mark the locations of the nest boxes and ensure that trajectories of different individuals and sexes can be identified. 

      The purpose of this figure was to show the spatial distribution of the data. We think that adding nest locations and coloring the paths according to individuals and/or sex will make the figure less clear. However, the new Figure 3 highlights those details.

      Comment 22. Figure S2: Show the pitch angle similarly to how you showed the acceleration axes, and explain what "VeDBA" stands for. Provide a description of the perching behavior, clearly indicating it on the figure. Add axes (x, y, z) to the illustration of the acceleration explanation. 

      We edited this figure (now figure S3) to show the pitch angle and provide an explanation of what “VeDBA” stands for in the figure caption. The figure caption now also provides a better description of the perching behavior. For the axes (i.e. X, Y, Z), we prefer to refer to the heave, surge, and sway as this is more informative and refers to what is usually reported in studies working with tri-axial accelerometers.

      Comment 23. Table S1: Improve the explanation in the caption and titles of the table. 

      Done

      Reviewer #2 (Recommendations For The Authors): 

      Comment 1. From the public review and my assessment there, the authors can be assured that I thoroughly enjoyed the read and am looking forward to seeing a revised and improved version of this paper. 

      We thank the reviewer for this comment. We revised the manuscript according to their comments.

      Comment 2. In addition to my major points stated above, I would like to add the following recommendations: 

      The manuscript is overall well written, but it uses a very pictorial language (a little as if we were in a David Attenborough documentary) that I find inappropriate for a research paper (especially in the abstract and introduction, "remarkable" (2x), "sophisticated" (are there any unsophisticated adaptations? We are referring to something under selection after all) etc.

      We appreciate that you found the paper well written overall, and we understand the comment about pictorial language. We therefore slightly changed the text to make sure that the adjectives used to describe adaptive strategies are not over-emphasized.

      Comment 3. Abstract 

      "While the theoretical benefits of predator camouflage are well established, no study has yet been able to quantify its consequences for hunting success." - This claim is actually not fully true: 

      Nebel Carina, Sumasgutner Petra, Pajot Adrien and Amar Arjun 2019: Response time of an avian prey to a simulated hawk attack is slower in darker conditions, but is independent of hawk colour morph. Soc. open sci.6:190677 

      We edited our claim to specify that the consequences of predator camouflage for hunting success have never been quantified in natural conditions, and we cited the reference in the introduction.

      Comment 4. Line 23. Rephrase to: "We used high-resolution movement data to quantify how barn owls (Tyto alba) conceal their approach when using a sit-and-wait strategy, as well as the power exerted during strikes." 

      We edited this sentence in the abstract, as suggested.

      Comment 5. Results 

      There is a disconnect between the objectives outlined at the end of the introduction and the following results that should be improved. 

      The authors state: "Using high-frequency GPS and accelerometer data from wild barn owls (Tyto alba), we quantify the landing dynamics of this sit-and-wait strategy to (i) examine how birds adjust their landing force with the behavioral and environmental context and (ii) test the extent to which the magnitude of the predator cue affects hunting success." But one of the first results presented are sex differences. 

      This is a fair point. We have now changed our statement at the end of the introduction as well as the order of the results to improve the link between the objectives outlined in the introduction and the way results are presented.

      Comment 6. At this stage, the reader does not even know yet that we are presented with a size-dimorphic species that also has very different parental roles during the breeding season. This should be better streamlined, with an extra paragraph in the introduction. And these sex differences are then not even discussed, so why bring them up in the first place (and not just state "sex has been fitted as additional co-variate to account for the size-dimorphism in the species" without further details). 

      We edited the way the objectives are outlined in the introduction to cover the size dimorphism (lines 70 – 76). We also completely changed the way the sex differences are presented in the results, including a new analysis that we believe provides a more comprehensive understanding of barn owl foraging behavior (lines 164 – 206). Finally, we added a new paragraph in the discussion to consider those results (lines 319 – 339).

      Comment 7. It is not clear to me where and how high-resolution GPS data were used? The results seem to concentrate on ACC – why GPS was used and how it features should be foreshadowed in a few lines in the introduction. I definitively prefer having the methods at the end of a manuscript, but with this structure, it is crucial to give the reader some help to understand the storyline. 

      GPS data were used to validate some behavioral classifications (prey provisioning, for example), but most importantly they were used to link each landing event with a perch type. We edited the text in the results section to clarify where GPS and/or ACC data were used.

      Comment 8. Discussion 

      Move the orca example further down, where more detail can be provided to understand the evidence. 

      After our extensive edits in the discussion, we felt this example was interrupting the flow. We now cite this study in the introduction. 

      Comment 9. Size dimorphism and evident sex differences are not discussed. 

      The revised manuscript now includes a new paragraph in the discussion in which sex differences are discussed (lines 319 – 339).

      Comment 10. Be more precise in the terminology used (for example, land use seems to be interchangeable with habitat characteristics?). 

      We replaced “land use” with “habitat data” in the revised manuscript.

      Comment 11. Methods 

      Please provide a justification for the very high weight limit (5%; line 256). This limit is outdated and does not fulfill the international standard of 3% body weight. I assume the ethics clearance went through because of the short nature of the study (i.e., the birds were not burdened for life with the excess weight? But a line is needed here or under the ethics considerations to clarify this). 

      The 5% weight limit was considered acceptable due to the short deployment period, and we have now edited the ethics statement to emphasize this point. However, it is important to note that there is no real international standard, with both 3% and 5% weight limits being commonly used. Both limits are arbitrary, and the impact of a fixed mass on a bird varies with species and flight style. All owls survived and bred similarly to the non-tagged individuals in the population (lines 373 – 376 & lines 558 – 561).

      EDITORIAL COMMENT: We strongly encourage you to provide further context and clarification on this issue, as suggested by the Reviewer. On a related point, the ethics statement refers to GPS loggers, rather than GPS and ACC devices; we encourage you to clarify wording here.

      Thank you for highlighting this point, which indeed needed clarification.

      Although we used the terminology "GPS recorders", the authorization granted by the Swiss authorities for this study effectively covers the entire tracking system, which combines both GPS and ACC recorders in the same device. We have therefore changed the wording used in the ethics statement to avoid any misunderstanding (lines 373 – 376 & lines 558 – 561).

      Comment 12. Please provide more information on the model selection approach, what does "Non-significant terms were dropped via model simplification by comparing model AIC with and without terms." mean? Did the authors use a stepwise backward elimination procedure (drop1 function)? Or did they apply a complete comparison of several candidate models? I think a model comparison approach rather than stepwise selection would be more informative, as several rather than only one model could be equally probable. This might also improve model weights or might require a model averaging procedure - current reported R² values are very small and do not seem to support the results well. 

      We apologize for the lack of details about this important aspect of the statistical analysis. We applied an automated model selection using the dredge function from the R package “MuMIn”, therefore applying a complete comparison of several candidate models. The final models were chosen as the best models since the number of candidate models within ∆AIC<2 was relatively low in each analysis, and thus model averaging was not appropriate here. We edited the methods section to ensure clarity, and added model selection tables for each analysis, ranked according to AICc scores, in the supplementary materials (lines 532 – 552).
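
      To illustrate the kind of comparison described above, the following is a minimal Python sketch (not the authors' R/MuMIn code) that ranks candidate models by AICc, computes ΔAICc and Akaike weights, and lists the models within ΔAICc < 2. The candidate model names, log-likelihoods, parameter counts and sample size are hypothetical values chosen only for illustration.

```python
import math


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC: -2*logL + 2k + 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def rank_models(candidates, n):
    """Rank models by AICc.

    candidates maps a model name to (log_likelihood, number_of_parameters).
    Returns rows of (name, AICc, delta_AICc, Akaike_weight), best model first.
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in candidates.items()}
    best = min(scores.values())
    deltas = {name: score - best for name, score in scores.items()}
    rel_lik = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
    total = sum(rel_lik.values())
    table = [(name, scores[name], deltas[name], rel_lik[name] / total)
             for name in scores]
    table.sort(key=lambda row: row[1])
    return table


def within_delta(table, threshold=2.0):
    """Names of the models whose delta AICc is below the threshold."""
    return [name for name, _, delta, _ in table if delta < threshold]


# Hypothetical candidate set (values invented for illustration only).
candidates = {
    "force + sex + perch": (-510.0, 5),
    "force + sex": (-510.8, 4),
    "force": (-515.0, 3),
    "null": (-520.0, 2),
}
table = rank_models(candidates, n=400)
for name, score, delta, weight in table:
    print(f"{name:22s} AICc={score:9.3f} dAICc={delta:6.3f} w={weight:.3f}")
print("Models within dAICc < 2:", within_delta(table))
```

      When only a few models fall below the threshold, as in this invented example, reporting the top-ranked model alongside the full ranking table is one common choice; when many do, model averaging is usually considered instead.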

      In addition, we agree that the reported R-squared values in our analyses are quite low, specifically regarding the influence of pre-hunt perching force on hunting success (cond R2 = 0.04). Nonetheless, landing impact still has a notable effect size (an increase of 1 N reduces hunting success by 15%). The reported values are indicative of the inherent complexity of studying hunting behavior in a wild setting, where numerous variables come into play. We specifically investigated the hypothesis that the force involved during pre-hunt landings, and consequently the emitted noise, influences the success of the next hunting attempt in wild barn owls. Factors such as prey behavior and micro-habitat characteristics surrounding prey (such as substrate type and vegetation height) are most likely to be influential but hard, or nearly impossible, to model. We now cover this in a more nuanced way in the discussion (lines 266 – 268).

      Comment 13. Please explain why BirdID was nested in NightID - this is not clear to me.

      There may be a misunderstanding here: we wrote that NightID was nested in BirdID (and not BirdID in NightID).

      Comment 14. I hope the final graphs and legends will be larger, they are almost impossible to read. 

      We enlarged the graphs and legends as much as possible to improve readability. However, looking at the graphs in the published version they seem clear and readable.

      Comment 15. Figure S1: Does "representation" mean the tracks don't show all of the 163 owls? If so, be precise and tell us how many are illustrated in the figure. 

      Figure S1 represents the tracks of each of the 163 barn owls used in the study. We changed the terminology used in the figure caption to avoid any misunderstanding.

      Comment 16. Figure S4: Please adjust the y-axis to a readable format. 

      Done

    1. Author response:

      The following is the authors’ response to the original reviews.

      Response to Reviewer #1 comments:

      (1) SY1 aggregation enhances (in terms of number of aggregates) when Sphingolipid biosynthesis is blocked.

      a. Line no 132-133: I agree that there is circumstantial evidence that the maturation pathway of SY1 IB is perturbed by knocking down sphingolipid biosynthesis. However, to prove this formally, a time course of IB maturation needs to be reported in the knock-down strains.

      Please see Figure 2-figure supplement 1 for the time course of SY1 IB maturation in the knock-down strains. We have added the result to the manuscript; please see lines 129-131 on page 5 in the revised version.

      b. It will be good to have formal evidence that sphingolipids are indeed downregulated when these genes are downregulated (knocked down).

      This has been clearly demonstrated in previous reports, and we have added the appropriate references to the main text. For example, Huang et al. (https://doi.org/10.1371/journal.pgen.1002493) showed that down-regulation of LCB1 or SPT in yeast decreased sphingolipid levels. According to the report from Tafesse FG, et al. (https://doi.org/10.1371/journal.ppat.1005188), in mammalian cells in which Sptlc2 was knocked down by CRISPR/Cas9, sphingolipid and glucosylceramide production is almost completely blocked. In addition, the levels of sphingosine, sphingomyelin, and ceramide were significantly lower compared to control cells. Please see lines 143-144 on page 6 and lines 232-233 on page 9 in the revised version.

      (2) In a normal cell (where sphingolipid biosynthesis is not hampered), the aggregate of SY1 (primarily the Class I aggregate) is localized only on the mitochondrial endomembrane system. These results have been published for other aggregation-prone proteins and are partly explained in the literature. However, their role in the context of maturation is relatively unclear. The authors however provide no strong evidence to show if mitochondria are preferentially involved in any of the stages of IB maturation. Specifically:

      a. Line 166-167: It is not clear from Figure 4B that this is indeed the case. Only the large IB seems to colocalize in all three panels (Class I, 2, 3) with Mitotracker. The smaller IBs in 2 and 3 do not show any obvious co-localization. It is also possible that they do co-localize, but it is not clear from the images. I would appreciate it if the authors either provide stronger evidence (better image) or revise this statement. This point is crucial in some claims made later in the manuscript. (pls see comment #5A).

      Based on the reviewer's suggestion, we replaced the images in Figure 4B. In addition, we added the 3D reconstruction results of the interrelationship between Class 3 and Mitotracker in Figure 4-figure supplement 1B, to further show their relationship.

      (3) The localization is due to the association of SY1 (aggregates) with mitochondrial proteins like Tom70, Tim44 etc. There are some critical points (that can strengthen the manuscript) that are not addressed here. Primarily, the important role of mitochondria in the context of toxicity is neglected. Although the authors have mentioned in the discussion that it was not their main focus, I believe that this is the novel part of the manuscript and this part is potentially a beautiful addition to literature. The questions I found unanswered are:

      a. Is the localization completely lost upon deleting these genes? I see only a partial loss in shape/localization. This is not properly explained in the manuscript. The shape of the IB seems to remain intact while the localization is slightly altered. This indicates that even when sphingolipid is present, SY1 localization is dictated by the (lipid-raft embedded) proteins. Interestingly, it shows that even in the absence of mitochondrial localization the shape of the aggregates is not altered in these deletion strains! How do the authors explain this if mitochondrial surface sphingolipids are important for IB maturation? (the primary screen found that sphingolipid biosynthesis promotes the formation of Class I IBs).

      We agree that mutating a single mitochondrial binding protein causes only a partial loss in shape/localization, and we have replaced “association” with “surrounding” in the manuscript. Please see lines 163-166 on page 6 in the revised version. In the deletion mutants of proteins that interact with SY1, we counted the proportion of Class 3 aggregates formed by SY1 and found that it was increased compared to controls. Thus, partial loss of the interaction between SY1 and mitochondria does affect the shape of aggregates, as detailed in line 184 on page 7 and Figure 4-figure supplement 1D. We think that SY1 interactions with mitochondrial proteins are important for the localization of SY1 IB in mitochondria, whereas sphingolipids play an important role in facilitating the formation of Class 1 IBs from Class 3 aggregates.

      b. What happens to the toxicity when the aggregates are not localized on mitochondria?

      We thank the reviewer for the comment. Because a single mutant only partially affects the phenotype, addressing this question may require constructing combinations of mutants in different genes, which we will pursue in future studies. What we want to show in this work is that SY1 binds to mitochondria by interacting with these mitochondrial proteins.

      c. It is important to note that sphingolipids may affect the whole process indirectly by altering pathways involved in protein quality control or UPR. UPR may regulate the maturation of IBs. It is therefore important to test if any of the effects seen could be of direct consequence.

      We agree with the reviewer's comments, but there was no significant enrichment for protein quality control or UPR-related pathways in our genome-wide screen, so it is unlikely that sphingolipids indirectly cause maturation of IBs by affecting these two pathways. We addressed this issue in our discussion. Please see lines 325-328 on page 12 in the revised version.

      d. In Figure 4D, the authors find SY1 when they pull down Tom70, Tom37 or Tim44. Tim44 is a protein found in the mitochondrial matrix, how do the authors explain that this protein is interacting with a protein outside the mitochondrial outer membrane?

      This interaction could be due to some of the soluble SY1 entering the mitochondrial matrix and interacting with Tim44.

      e. Is it possible that the authors are immunoprecipitating SY1 since IBs have some amount of unimported mitochondrial proteins in aggregates formed during proteotoxic stress (https://doi.org/10.1073/pnas.2300475120) (Liu et al. 2023).

      Our Co-IP experiments were performed on the soluble supernatant fraction, so mitochondrial proteins in aggregates were not detected.

      f. Line 261 (Discussion): Does deletion of Tom70 or one of the anchors increase Class III aggregation and increase toxicity? Without this, it is hard to say if mitochondria are involved in detoxification.

      We thank the reviewer for the comments, please see our response to comment 3b.

      (4) This fuels the loss of mitochondrial function.

      a. Line 218-219: Although the change is significant, the percentage change is very slight. Is this difference enough to be of physiological relevance in mitochondrial function? In our hands, the DCF fluorescence is much more variable.

      We agree with the reviewer that the difference is small (but significant). The extent to which such a difference is of physiological relevance for mitochondrial function needs to be investigated further.

      b. Is SY1-induced loss of mitochondrial function less in knockouts of Tom70 or the other ones found to be important for localizing the SY1 aggregate to mitochondria?

      We examined mitochondrial membrane potential (indicated by Rho 123 fluorescence intensity) in tom70Δ, tom37Δ and control his3Δ strains and found that knocking out Tom70 or Tom37 reduced the mitochondrial toxicity caused by SY1 expression. Please see lines 212-214 on page 8 in the revised version, and Figure 5-figure supplement 2.

      (5) Mitochondrial function is further abrogated when there is a block in sphingolipid biosynthesis.

      a. Myriocin acted like the deletion strains that showed less structured aggregates. There were more aggregates (Class 3) but visually they seemed to be spread apart. The first comment (#2A) on aggregate classes and their interaction with mitochondria may become relevant here.

      According to a recent review article (https://doi.org/10.3389/fcell.2023.1302472), sphingolipids are present in the mitochondrial membrane, bind to many mitochondrial proteins and have emerged as key regulators of mitochondrial morphology, distribution and function. Dysregulation of sphingolipid metabolism in mitochondria disrupts many mitochondrial processes, leading to mitochondrial fragmentation, impaired bioenergetics and impaired cellular function. Myriocin treatment, which affects sphingolipid metabolism, causes mitochondria to become more fragmented, which may explain why the aggregates appear visually spread apart. Regarding the interaction with mitochondria, we counted the proportion of SY1 aggregates surrounded by mitochondria after treatment with myriocin, and the results were not significantly different compared to the control. Please see lines 168-169 on page 6 in the revised version, and Figure 4-figure supplement 1C.

      (6) A similar phenomenon is conserved in mammalian cell lines.

      a. Line 225-226: Did the authors confirm that this was the only alteration in the genome? Or did they complement the phenotype, genetically?

      We performed SPTLC2 gene complementation experiments in knockout cell lines and found that SPTLC2 gene complementation was able to reduce the number of cells forming IBs and the percentage of dispersed irregular IBs compared to controls. Please see lines 240-242 on page 9 in the revised version, and Figure 6-figure supplement 2B.

      b. Line 241-245: One of the significant phenotypes observed by downregulating sphingolipid biosynthesis in yeast and mammalian cells was the increase in the number of aggregates. This is not shown in myriocin treatment in mammalian cells. This needs to be shown to maintain concordance with the original screen and the data presented with the KO mammalian cell line.

      Please see Figure 7-figure supplement 1A for the data on the proportion of cells forming SY1 IBs after myriocin treatment in mammalian cells. The effect of myriocin treatment in mammalian cells was the same as that observed in the KO mammalian cell line.

      Minor Comments:

      Line 273-275: How is this statement connected to the previous statement? Was it observed that aggregate fusion was advantageous to the cells?

      Yes, aggregate/oligomer fusion is advantageous to the cells, and we have modified the previous statement. Please see line 280 on page 10 in the revised version.

      Line 293-294: I am not sure I understand this statement.

      We have modified this statement. Please see lines 302-303 on page 11 in the revised version.

      Line 295-296: But the authors have commented at multiple places that mitochondria detoxify the cell from SY1 aggregates. I find this link fascinating and worth investigating. Most of the current work has some known links in literature (not everything). The mitochondrial connection being the most fascinating one.

      We have removed this sentence. We have added a validation experiment for the role of mitochondrial activity in SY1 IB maturation in the revised version.

      Line 318: Do the authors mean: The open question is...

      We thank the reviewer; we have corrected it.

      Response to Reviewer #2 comments:

      I recommend considering live cell microscopy to analyze whether sphingolipid-dependent formation of SY1 IB takes place at the mitochondrial outer membrane. The IBs could also be produced at other membranes and then transported to the mitochondrial outer membrane for storage.

      As shown in Figure 4A, SY1 IB primarily interacts with mitochondria.

      I recommend analyzing whether mitochondrial activity is needed for sphingolipid-dependent SY1 IB formation. Are these IBs localized to mitochondrial membrane solely as scaffold or are these organelles needed to provide the energy for driving IB formation in concert with sphingolipids? This point could be addressed with rho0 strains lacking mitochondrial DNA.

      We thank the reviewer for this recommendation. We expressed SY1 protein in the BY4741 rho0 strain as suggested and found that neither the maturation of SY1 IBs nor their being surrounded by mitochondria was affected by mitochondrial activity. Please see lines 185-187 on page 7 in the revised version, and Figure 4-figure supplement 1E and 1F.

      The authors should be more precise in the statistical methods used in their study (method, pre-/post-tests, number of replicates...).

      We thank the reviewer for the comment and we have provided a more precise description of the statistical methods. Please see lines 531-534 on page 19 and figure legends in the revised version.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is an interesting study that utilizes a novel epigenome profiling technology (single molecule imaging) in order to demonstrate its utility as a readout of therapeutic response in multiple DIPG cell lines. Two different drugs were evaluated, singly and in combination. Sulfopin, an inhibitor of a component upstream of the MYC pathway, and Vorinostat, an HDAC inhibitor. Both drugs sensitized DIPG cells, but high (>10 micromolar) concentrations were needed to achieve half-maximal effects. The combination seemed to have some efficacy in vivo, but also produced debilitating side-effects that precluded the measurement of any survival benefit.

      We thank the reviewer for deeply evaluating our work and acknowledging the use of multiple experimental strategies to explore the effect of combination therapy on DMG cells. Of note, all mice in our experiment experienced deterioration (including the control mice and those treated with single agents). Thus, it is not the combination of drugs that led to the debilitating side-effects; the mice deteriorated due to the extremely aggressive tumor cells, forming relatively large tumors prior to the treatment onset, calling for further optimization of the therapeutic regime.

      We modified the text in the results section to clarify this point (lines 238-241): “This rapid deterioration is likely a result of the aggressiveness of the transplanted tumors and does not represent side effects of the treatment, as mice from all groups, including the non-treated mice, showed similar signs of deterioration”.  

      We also elaborate on this in the discussion (lines 272-276): “Notably, despite a significant reduction in tumor size in-vivo, the combined treatment did not increase mice survival. This is perhaps due to the relatively large tumors already formed at the onset of treatment, leading to rapid deterioration of mice in all experimental groups. Thus, further optimization of the modeling system and therapeutic regime is needed.” We truly hope that further studies will allow better assessment of this drug combination in various models.

      Strengths:

      Interesting use of a novel epigenome profiling technology (single molecule imaging).

      Weaknesses:

      The use of this novel imaging technology ultimately makes up only a minor part of the study. The rest of the results, i.e. DIPG sensitivity to HDAC and MYC pathway inhibition, have already been demonstrated by others (Grasso Monje 2015; Pajovic Hawkins 2020, among others). The drugs have some interesting opposing effects at the level of the epigenome, demonstrated through CUT&RUN, but this is not unexpected in any way. The drugs evaluated here also didn't have higher efficacy, or efficacy at especially low concentrations, than inhibitors used in previous reports. The combination therapy attempted here also caused severe side effects in mice (dehydration/deterioration), such that an effect on survival could not be determined. I'm not sure this study advances knowledge of targeted therapy approaches in DIPGs, or if it iterates on previous findings to deliver new, or more efficient, mechanistic or therapeutic/pharmacologic insights. It is a translational report evaluating two drugs singly and in combination, finding that although they sensitise cells in vitro, efficacy in vivo is limited at best, as this particular combination cannot progress to human translation.

      We thank the reviewer for pointing out the strengths and weaknesses of our work. As far as we know, while many studies demonstrated upregulation of the MYC pathway in DIPG, this is the first study that shows inhibition of this pathway (via PIN1) as a therapeutic strategy. While it is clear from the literature that MYC inhibition may pose therapeutic benefit, the development of potent MYC inhibitors is highly challenging due to its structure and cellular localization. Of note, in the 2020 paper, Pajovic and colleagues inhibited MYC by transfecting the cells with a plasmid expressing a specific inhibitory MYC peptide (Omomyc); while this strategy works well for cell cultures, the clinical translation requires different delivery strategies. Sulfopin is a small molecule inhibitor that can be used in-vivo and potentially in clinical studies. Thus, we believe that our study offers a novel strategy, as well as mechanistic insights, regarding the potential use of Sulfopin and Vorinostat to treat DIPG.

      As noted above, the deterioration of the mice was caused by the aggressiveness of the tumors rather than by side effects of the combination therapy. We did not notice specific toxicity in the mice treated with Sulfopin alone or with the combined treatment. Furthermore, Dubiella et al. extensively examined toxicity issues and did not observe adverse effects or weight loss when administering Sulfopin at a dose of 40 mg kg–1.

      Optimization of the model and treatment regime (# of cells injected, treatment starting point, etc.) may have allowed us to reveal survival benefits. Yet, these are highly complicated and expensive experiments; unfortunately, we did not have the resources to perform them within the scope of this revision. Importantly, within the current manuscript, we show the effect of this drug combination in reducing the growth of DMG cells in-vitro and in-vivo, laying the framework for follow-up exploration in future studies. Furthermore, the epigenetic and transcriptomic profiling shed light on the molecular mechanisms that drive these aggressive tumors.

      Reviewer #2 (Public Review):

      Summary:

      The study by Algranati et al. introduces an exciting and promising therapeutic approach for the treatment of H3-K27M pediatric gliomas, a particularly aggressive brain cancer predominantly affecting children. By exploring the dual targeting of histone deacetylases (HDACs) and MYC activation, the research presents a novel strategy that significantly reduces cell viability and tumor growth in patient-derived glioma cells and xenograft mouse models. This approach, supported by transcriptomic and epigenomic profiling, unveils the potential of combining Sulfopin and Vorinostat to downregulate oncogenic pathways, including the mTOR signaling pathway. While the study offers valuable insights, it would benefit from additional clarification on several points, such as the rationale behind the dosing decisions for the compounds tested, the specific contributions of MYC amplification and H3K27me3 alterations to the observed therapeutic effects, and the details of the treatment protocols employed in both in-vitro and in-vivo experiments.

      We thank the reviewer for evaluating our work and recognizing its potential for the DMG research field. We address in detail below the important comments regarding the treatment protocols and dosing decisions.

      Clarification is needed on how doses were selected for the compounds in Figure S2A and throughout the study. Understanding the basis for these choices is crucial for interpreting the results and their potential clinical relevance. IC50s are calculated for specific patient derived lines, but it is not clear how these are used for selecting the dose.

      We thank the reviewer for these important comments. For the epigenetic drugs shown in Figure S2A, we followed published experimental setups; for EPZ6438, GSKJ4, Vorinostat and MM-102 we chose the treatment concentrations according to Mohammad et al. 2017, Grasso et al. 2015 and Furth et al. 2022, accordingly. For Sulfopin, we conducted a dedicated dose curve analysis (shown in Figure 1E), indicating only a mild effect on viability and relatively high IC50 values as a single agent. Since we aimed to test the ability of a combined treatment to additively reduce viability, we used a sub-IC50 concentration of Sulfopin in these experiments. We added this information in lines 123 and 131-132.

      Finally, following the results obtained in the experiment shown in Figure S2A, we conducted a full dose-curve analysis of the combined treatment in multiple DMG patient-derived cells (figure 2B and S2C), to identify a combination of concentrations that provides an additive effect (as indicated by BLISS index in figure 2C and S2E). Of note, for downstream analysis of the molecular mechanisms underlying the treatment response (RNAseq and Cut&Run), we intentionally used concentrations that provide an additive BLISS index, but do not completely abolish the culture, to allow for cellular analysis (i.e. 10 µM Sulfopin and 1 µM Vorinostat).
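
      For readers unfamiliar with the Bliss independence model, the following minimal Python sketch shows one common way to score a drug combination: the expected combined effect under independence is E_A + E_B − E_A·E_B, and the excess over that expectation is the observed effect minus the expected effect. This is an illustration under stated assumptions and not necessarily the exact index or scaling used in the manuscript; the viability values in the example are hypothetical.

```python
def bliss_expected(effect_a, effect_b):
    """Expected fractional effect of two independently acting drugs.

    Effects are fractions of inhibition in [0, 1],
    e.g. 0.3 means a 30% reduction in viability.
    """
    for effect in (effect_a, effect_b):
        if not 0.0 <= effect <= 1.0:
            raise ValueError("effects must be fractions between 0 and 1")
    return effect_a + effect_b - effect_a * effect_b


def bliss_excess(viability_a, viability_b, viability_combo):
    """Observed minus expected effect, computed from fractional viabilities.

    Viabilities are relative to the untreated control (1.0 = no effect).
    A value near 0 is consistent with independent action, a positive value
    means the combination did more than expected, a negative value less.
    """
    effect_a = 1.0 - viability_a
    effect_b = 1.0 - viability_b
    observed = 1.0 - viability_combo
    return observed - bliss_expected(effect_a, effect_b)


# Hypothetical viabilities (fractions of untreated control), for illustration only.
expected = bliss_expected(0.2, 0.3)
excess = bliss_excess(viability_a=0.8, viability_b=0.7, viability_combo=0.5)
print(f"expected effect: {expected:.2f}, excess over Bliss: {excess:.2f}")
```

      In a dose matrix, the same calculation would be repeated for every pair of concentrations, giving one score per combination.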

      The introduction mentions MYC amplification in high-grade gliomas. It would be beneficial if the authors could delineate whether the models used exhibit varying degrees of MYC amplification and how this factor, alongside differences in H3K27me3, contributes to the observed effects of the treatment.

      The reviewer highlights an important part of our study relating to the MYC-dependent sensitivity of the proposed treatment combination. Since high expression of MYC can be mediated by different molecular mechanisms and not only genomic amplification, we directly quantified mRNA levels of MYC by qPCR (shown in figure S2G) in order to explore its relationship with cellular response to Sulfopin and Vorinostat. Indeed, cultures that express high levels of MYC mRNA were more sensitive to Sulfopin treatment alone (figure S1P) and to the combined treatment (figure 2D-E). We also address these findings in lines 103-106 and 142-147 of the results section. Importantly, in cultures that express high levels of MYC (SU-DIPG13 as an example), we see downregulation of MYC targets upon the combined treatment, supporting the notion that this treatment affects viability by attenuation of MYC signaling.

      In Figure 2A, the authors outline an optimal treatment timing for their in vitro models, which appears to be used throughout the figure. It would be helpful to know how this treatment timing was selected and also why Sulfopin is dosed first (and twice) before the vorinostat. Was this optimized?

      As PIN1 regulates the G2/M transition, its inhibition by Sulfopin delays cell cycle progression (Yeh et al. 2007). Thus, in order to observe a strong viability difference in culture, a prolonged treatment period of 8-9 days is required (Dubiella et al., 2021). To maintain an active concentration of the drug during this long time period, we added a Sulfopin pulse (2nd dose) to achieve a stronger effect on cell viability. We and others noticed that, unlike Sulfopin, the effect of Vorinostat on viability is rapid and can be clearly seen after 2-3 days of treatment. Thus, we added this drug only after the 2nd dose of Sulfopin. We now describe the mode of action of Sulfopin in lines 79-81.

      It should be clarified whether the dosing timeline for the combination drug experiments in Figure 3 aligns with that of Figure 2. This information is also important for interpreting the epigenetic and transcriptional profiling and the timing should be discussed if they are administered sequentially (also shown in Figure 2A). I have the same question for the mouse experiments in Figure 4.

      The reviewer is correct that this information is critical for evaluating the results. In order to link the expression changes to the epigenetic changes, we kept the same experimental conditions in both the Cut&Run and RNA-seq experiments (shown in figures 2-3). We added this information to the text in line 184.

      For the in-vivo studies of HDAC inhibition (Figure 4), we followed published protocols (Ehteda et al. 2021). In these experiments both drugs were administered simultaneously every day. We added this information to the text in lines 231-232. Changing the administration regime may improve the efficacy of the drug combination, which remains to be tested in future studies.

      The authors mention that the mice all had severe dehydration and deterioration after 18 days. It would be helpful to know if there were differences in the side effects for different treatment groups? I would expect the combination to be the most severe. This is important in considering the combination treatment.

      As noted in our response to Reviewer #1, all mice in our experiment experienced deterioration, including the control mice and those treated with single agents; we could not observe any differences between the groups. This is due to the extremely aggressive tumor cells, which formed relatively large tumors prior to treatment onset, and calls for further optimization of the system and therapeutic regimen (number of cells injected, treatment starting point, etc.). Unfortunately, this model is very challenging (especially the injection of cells into the pons of the mouse brain, which requires unique expertise and is associated with mortality of some of the mice). Thus, these are highly complicated and expensive experiments, and we did not have the resources to repeat and optimize the treatment protocol within the scope of this revision. Of note, Dubiella et al. extensively examined toxicity issues and did not observe adverse effects or weight loss when administering Sulfopin at a dose of 40 mg kg–1. In our model, the side effects were caused by the tumors rather than the drugs.

      Minor Points:

      (1) For Figure 1F, reorganizing the bars to directly compare the K27M and KO cell lines at each dose would improve readability of this figure.

      We have changed figure 1F as the reviewer suggested.

      (2) In Figure 4D, it would be helpful to know how many cells were included (or a minimum included) to calculate the percentages.

      We added the number of H3-K27M positive cells detected per FOV to the figure legend and method section (n=13-198 cells per FOV). Of note, while we analyzed similar-sized FOVs, the number of tumor cells varied between the groups, with the treated group presenting a lower number of H3-K27M cells (due to the effect of the treatment on tumor growth). To account for this difference, we calculated the proportion of mTOR-positive cells out of the tumor cells.
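
      The normalization described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name and the cell counts are hypothetical, chosen only to span the reported range of 13-198 tumor cells per FOV.

```python
def mtor_positive_fraction(n_tumor_cells, n_mtor_positive):
    """Fraction of H3-K27M-positive (tumor) cells in one FOV that are also mTOR-positive."""
    if n_tumor_cells <= 0:
        raise ValueError("a FOV needs at least one tumor cell to compute a fraction")
    if not 0 <= n_mtor_positive <= n_tumor_cells:
        raise ValueError("mTOR-positive count must lie between 0 and the tumor cell count")
    return n_mtor_positive / n_tumor_cells


# Hypothetical (tumor cells, mTOR-positive tumor cells) counts for two FOVs.
fovs = [(198, 99), (13, 2)]
fractions = [mtor_positive_fraction(total, positive) for total, positive in fovs]
```

      Dividing by the tumor cell count in each FOV keeps FOVs with few tumor cells comparable to FOVs with many, which is the point of the normalization.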

      Reviewer #3 (Public Review):

      Summary:

      The authors use in vitro grown cells and mouse xenografts to show that a combination of drugs, Sulfopin and Vorinostat, can impact the growth of cells derived from Diffuse midline gliomas, in particular the ones carrying the H3 K27M-mutations (clinically classified as DMG, H3 K27M-mutant). The authors use gene expression studies, and chromatin profiling to attempt to better understand how these drugs exert an effect on genome regulation. Their main findings are that the drugs reduce cell growth in vitro and in mouse xenografts of patient tumours, that DMG, H3 K27M-mutant tumours are particularly sensitive, identify potential markers of gene expression underlying this sensitivity, and broadly characterize the correlations between chromatin modification changes and gene expression upon treatment, identifying putative pathways that may be affected and underlie the sensitivity (and thus how the drugs may affect the tumour cell biology).

      Strengths:

      It is a neat, mostly to-the-point work without exploring too many options and possibilities. The authors do a good job not overinterpreting data and speculating too much about the mechanisms, which is a very good thing since the causes and consequences of perturbing such broad epigenetic landscapes of chromatin may be very hard to disentangle. Instead, the authors go straight after testing the performance of the drugs, identifying potential markers and characterizing consequences.

      Weaknesses:

      If anything, the experiments done on Figure 3 could benefit from an additional replicate.

      We thank the reviewer for evaluating our work, and for the positive and insightful comments.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Perhaps a more substantial drug screen, or CRISPR screen, that utilises single molecule imaging as a readout would identify pharmacologic candidates that are either more effective, or novel.

      While out of scope for the current study, this is a highly interesting suggestion, which will be considered in future studies. Here, we focused on the potential use of the novel MYC inhibitor, Sulfopin. While the dependency of DMG cells on MYC signaling has been documented, to the best of our knowledge, pharmacological inhibition of MYC has not been tested for this disease due to the severe lack of potent MYC inhibitors. We show preliminary evidence for the use of this inhibitor, in combination with HDAC inhibition, to attenuate DMG growth in-vitro and in-vivo.  

      Reviewer #2 (Recommendations For The Authors):

      In Figure 1B, it is hard to tell if there are error bars for HSP90 and E2F2. Is there a potential error here? Seems unlikely to not have an error with a RT-qPCR?

      We thank the reviewer for the careful evaluation of the figures. We included error bars for all genes shown in Figure 1B. We have now increased the line width with the hope of making this information more accessible. As stated in the figure legend, these error bars represent the standard deviation of two technical repeats.

      I noticed that many experiments only had technical replicates. Incorporating biological (independent) replicates, where feasible, would strengthen the study's findings.

      We agree with the reviewer regarding the importance of biological replicates. While some of the panels present error estimates based on technical repeats, the main results were repeated independently with complementary approaches or various biological systems for validation.

      The RNAseq analysis presented in figure 1 was conducted in triplicate and then independently validated by qPCR (Figure 1A-B). Similarly, the transcriptomic analysis presented in figures 2G-I was verified by both western blot (figure 2J) and qPCR (figure S2O). Of note, this latter validation was conducted for two different DMG-patient derived cultures.

      To verify the robust effects on cellular viability, we analyzed the response to each drug and the combination on eight different DMG-patient-derived cultures, each representing a completely independent experiment. We show very similar trends in response to treatment between cultures that share the same H3-K27M variant. Thus, while for each culture technical repeats are shown, we provide multiple, independent repeats by examining the different cultures. Similarly, in figure 1F we examined the dependency of Sulfopin treatment on the expression of the H3-K27M oncohistone in two independent isogenic systems.

      Reviewer #3 (Recommendations For The Authors):

      A few questions and suggestions:

      (1) To avoid confusion, it is important to state whether or not the cells used in each experiment are K27M mutants (e.g. SU-DIPG13 on line 63).

      We thank the reviewer for pointing this out and have now added this information when appropriate across the manuscript.

      (2) Line 72 - confirming epigenetic silencing of these genes upon PIN1 inhibition (Fig. 1C, S1D)

      Considering that the mechanism of down regulation of MYC targets is likely H3K27me3-independent if it is also happening in DMG H3 K27M-mutants (high H3K27me3 here may rather be a consequence of less MYC binding?), I would strike this sentence out and just point out the correlation between lower expression and higher H3K27me3.

      We agree with the reviewer that the exact molecular mechanism mediating the silencing is yet to be characterized. We have modified the text in line 72 accordingly.

      (3) (line 78) Are MYC targets also down regulated in Sulfopin treated DMG, H3 K27M-mutant lines? Any qPCR or previously done RNA-seq data to use?

      In addition to the extensive analysis done on SU-DIPG13 cells (Figure 1 and S1), in light of the reviewer's comment we examined specific MYC targets in an additional H3-K27M mutant DMG culture (SU-DIPG6) treated with Sulfopin, followed by qPCR. We observed a mild reduction in two prominent targets, E2F2 and mTOR (new figure S1D). Unfortunately, within this study, we only conducted full RNA-sequencing analysis on SU-DIPG13 cells treated with Sulfopin, and thus, we could not examine the global effect of Sulfopin on the transcriptome of other DMG cultures. This will, of course, be of high interest for future studies.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript aims at a quantitative model of how visual stimuli, given as time-dependent light intensity signals, are transduced into electrical currents in photoreceptors of macaque and mouse retina. Based on prior knowledge of the fundamental biophysical steps of the transduction cascade and a relatively small number of free parameters, the resulting model is found to fairly accurately capture measured photoreceptor currents under a range of diverse visual stimuli and with parameters that are (mostly) identical for photoreceptors of the same type.

      Furthermore, as the model is invertible, the authors show that it can be used to derive visual stimuli that result in a desired, predetermined photoreceptor response. As demonstrated with several examples, this can be used to probe how the dynamics of phototransduction affect downstream signals in retinal ganglion cells, for example, by manipulating the visual stimuli in such a way that photoreceptor signals are linear or have reduced or altered adaptation. This innovative approach had already previously been used by the same lab to probe the contribution of photoreceptor adaptation to differences between On and Off parasol cells (Yu et al, eLife 2022), but the present paper extends this by describing and testing the photoreceptor model more generally and in both macaque and mouse as well as for both rods and cones.

      Strengths:

      The presentation of the model is thorough and convincing, and the ability to capture responses to stimuli as different as white noise with varying mean intensity and flashes with a common set of model parameters across cells is impressive. Also, the suggested approach of applying the model to modify visual stimuli that effectively alter photoreceptor signal processing is thought-provoking and should be a powerful tool for future investigations of retinal circuit function. The examples of how this approach can be applied are convincing and corroborate, for example, previous findings that adaptation to ambient light in the primate retina, as measured by responses to light flashes, mostly originates in photoreceptors.

      Weaknesses:

      In the current form of the presentation, it doesn't become fully clear how easily the approach is applicable at different mean light levels and where exactly the limits for the model inversion are at high frequency. Also, accessibility and applicability by others could be strengthened by including more details about how parameters are fixed and what consensus values are selected.

      Thank you - indeed a central goal of writing this paper was to provide a tool that could be easily used by other laboratories. We have clarified and expanded four points in this regard: (1) we have stated more clearly that mean light levels are naturally part of the inversion process, and hence the approach can be applied across a broad range of light levels (lines 292-297); (2) we have expanded our analysis of the high frequency limits to the inversion and added that expanded figure to the main text (new Fig 5); (3) we have included additional detail about our calibration procedures, including our calibration code, to facilitate transfer to other labs; and (4) we have detailed the procedure for identification of consensus parameters (lines 172-182, 191-199 and Methods section starting on line 831).

      Reviewer #2 (Public Review):

      Summary:

      This manuscript proposes a modeling approach to capture nonlinear processes of photocurrents in mammalian (mouse, primate) rod and cone photoreceptors. The ultimate goal is to separate these nonlinearities at the level of photocurrent from subsequent nonlinear processing that occurs in retinal circuitry. The authors devised a strategy to generate stimuli that cancel the major nonlinearities in photocurrents. For example, modified stimuli would generate genuine sinusoidal modulation of the photocurrent, whereas a sinusoidal stimulus would not (i.e., because of asymmetries in the photocurrent to light vs. dark changes); and modified stimuli that could cancel the effects of light adaptation at the photocurrent level. Using these modified stimuli, one could record downstream neurons, knowing that any nonlinearities that emerge must happen post-photocurrent. This could be a useful method for separating nonlinear mechanisms across different stages of retinal processing, although there are some apparent limitations to the overall strategy.

      Strengths:

      (1) This is a very quantitative and thoughtful approach and addresses a long-standing problem in the field: determining the location of nonlinearities within a complex circuit, including asymmetric responses to different polarities of contrast, adaptation, etc.

      (2) The study presents data for two primary models of mammalian retina, mouse, and primate, and shows that the basic strategy works in each case.

      (3) Ideally, the present results would generalize to the work in other labs and possibly other sensory systems. How easy would this be? Would one lab have to be able to record both receptor and post-receptor neurons? Would in vitro recordings be useful for interpreting in vivo studies? It would be useful to comment on how well the current strategy could be generalized.

      We agree that generalization to work in other laboratories is important, and indeed that was a motivation for writing this as a methods paper. The key issue in such generalization is calibration. We have expanded our discussion of our calibration procedures and included that code as part of the github repository associated with the paper. Figure 10 (previously Figure 9) was added to illustrate generalization. We believe that the approach we introduce here should generalize to in vivo conditions. We have expanded the text on these issues in the Discussion (sections starting on line 689 and 757).

      Weaknesses:

      (1) The model is limited to describing photoreceptor responses at the level of photocurrents, as opposed to the output of the cell, which takes into account voltage-dependent mechanisms, horizontal cell feedback, etc., as the authors acknowledge. How would one distinguish nonlinearities that emerge at the level of post-photocurrent processing within the photoreceptor as opposed to downstream mechanisms? It would seem as if one is back to the earlier approach, recording at multiple levels of the circuit (e.g., Dunn et al., 2006, 2007).

      Indeed the current model is limited to a description of rod and cone photocurrents. Nonetheless, the transformation of light inputs to photocurrents can be strongly nonlinear, and such nonlinearities can be difficult to untangle from those occurring late in visual processing. Hence, we feel that the ability to capture and manipulate nonlinearities in the photocurrents is an important step. We have expanded Figure 10 to show an additional example of how manipulation of nonlinearities in phototransduction can give insight into downstream responses. We have also noted in text that an important next step would be to include inner segment mechanisms (section starting on line 661); doing so will require not only characterization of the current-to-voltage transformation, but also horizontal cell feedback and properties of the cone output synapse.

      (2) It would have been nice to see additional confirmations of the approach beyond what is presented in Figure 9. This is limited by the sample (n = 1 horizontal cell) and the number of conditions (1). It would have been interesting to at least see the same test at a dimmer light level, where the major adaptation mechanisms are supposed to occur beyond the photoreceptors (Dunn et al., 2007).

      We have added an additional experiment to this figure (now Figure 10) which we feel nicely exemplifies the approach. The approach that we introduce here really only makes sense at light levels where the photoreceptors are adapting; at lower light levels the photoreceptors respond near-linearly, so our “modified” and “original” stimuli as in Figure 10 (previously Figure 9) would be very similar (and post-phototransduction nonlinearities are naturally isolated at these light levels).

      Reviewer #3 (Public Review):

      Summary:

      The authors propose to invert a mechanistic model of phototransduction in mouse and primate photoreceptors to derive stimuli that compensate for nonlinearities in these cells. They fit the model to a large set of photoreceptor recordings and show in additional data that the compensation works. This can allow the exclusion of photoreceptors as a source of nonlinear computation in the retina, as desired to pinpoint nonlinearities in retinal computation. Overall, the recordings made by the authors are impressive and I appreciate the simplicity and elegance of the idea. The data support the authors' conclusions but the presentation can be improved.

      Strengths:

      -  The authors collected an impressive set of recordings from mouse and primate photoreceptors, which is very challenging to obtain.

      -  The authors propose to exploit mechanistic mathematical models of well-understood phototransduction to design light stimuli that compensate for nonlinearities.

      -  The authors demonstrate through additional experiments that their proposed approach works.

      Weaknesses:

      -  The authors use numerical optimization for fitting the parameters of the photoreceptor model to the data. Recently, the field of simulation-based inference has developed methods to do so, including quantification of the uncertainty of the resulting estimates. Since the authors state that two different procedures were used due to the different amounts of data collected from different cells, it may be worthwhile to rather test these methods, as implemented e.g. in the SBI toolbox (https://joss.theoj.org/papers/10.21105/joss.02505). This would also allow them to directly identify dependencies between parameters, and obtain associated uncertainty estimates. This would also make the discussion of how well constrained the parameters are by the data or how much they vary more principled because the SBI uncertainty estimates could be used.

      Thank you - we have improved how we describe and report parameter values in several ways. First, the previous text erroneously stated that we used different fitting procedures for different cell types - but the real difference was in the amount of data and range of stimuli we had available between rods and cones. The fitting procedure itself was the same for all cell types. We have clarified this along with other details of the model fitting both in the main text (lines 121-130) and in the Methods (section starting on line 832). We also collected parameter values and estimates of allowed ranges in two tables. Finally, we used sloppy modeling to identify parameters that could covary with relatively small impact on model performance; we added a description of this analysis to the Methods (section starting on line 903).

      -  In several places, the authors refer the reader to look up specific values e.g. of parameters in the associated MATLAB code. I don't think this is appropriate, important values/findings/facts should be in the paper (lines 142, 114, 168). I would even find the precise values that the authors measure interesting, so I think the authors should show them in a figure/table. In general, I would like to see also the average variance explained by different models summarized in a table and precise mean/median values for all important quantities (like the response amplitude ratios in Figures 6/9).

      We have added two tables with these parameter values and estimates of allowable ranges. We also added points to show the mean (and SD) across cells to the population figures and added those numerical values to the figure legends throughout.

      -  If the proposed model is supposed to model photoreceptor adaptation on a longer time scale, I fail to see why this can be an invertible model. Could the authors explain this better? I suspect that the model is mainly about nonlinearities as the authors also discuss in lines 360ff.

      For the stimuli that we use we see little or no contribution of slow adaptation in phototransduction. We have expanded the description of this point in the text and referred to Angueyra et al (2022) which looks at this issue in more detail for primate cones (paragraph starting on line 280).

      -  The important Figures 6-8 are very hard to read, as it is not easy to see what the stimulus is, the modified stimulus, the response with and without modification, what the desired output looks like, and what is measured for part B. Reworking these figures would be highly recommended.

      We have reworked all of the figures to make the traces clearer.

      -  If I understand Figure 6 correctly, part B is about quantifying the size of the response to the little second flash relative to the little first flash. While the response amplitude of the second flash is clearly only 50% of that of the first flash in primate rods and cones in the original condition, the modified stimulus seems to overcompensate and result in a 130% response for the second flash. How do the authors explain this? A similar effect occurs in Figure 9, which the authors should also discuss.

      Indeed, in those instances the modified stimulus does appear to overcompensate. We suspect this is due to differences in sensitivity between the specific cells probed for these experiments and those used in the model construction. We now describe this limitation in more detail (lines 524-526). A similar point comes up for those experiments in which we speed the photoreceptor responses (new Figure 9B), and we similarly note that the cells used to test those manipulations differed systematically from those used to fit the model (lines 558-560).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I only have a few minor questions and suggestions for clarification.

      It hasn't become fully clear to me how general the model is when different mean light levels (on long-time scales) are considered. Are there slow adaptation processes not captured in the model that affect model performance? And how should one go about setting the mean light level when, for example, probing ganglion cells with a stimulus obtained through model inversion? Should it work to add an appropriate DC component to the current that is provided as input to the inverted model? (Presumably, deriving a stimulus and then just adding background illumination should not work, or could this be a good approximation, given a steady state that is adapted to the background?)

      We have clarified in the main text that slow adaptation does not contribute substantially to responses to the range of stimuli we explored (lines 281-289). We have also clarified that the stimulus in the model inversion is specified in isomerizations per second - so the mean value of the stimulus is automatically included in the model inversion (lines 293-298).

      Furthermore, a caveat for the model inversion seems to be the potential amplification of high-frequency noise. The suggested application of a cutoff temporal frequency seems appropriate, but data are shown only for a few example cells. Is this consistent across cells? (Given that performance between, e.g., mouse cones can vary considerably according to Fig. 4B?) I would also like to suggest moving the corresponding Supplemental Figure (4.1) into the main part of the manuscript, as it seems quite important.

      We have added population analysis to the new Figure 5 (which was Figure 4 - Figure Supplement 1). We have also clarified that the amplification of high frequency noise is an issue only when we try to apply model inversion to measured stimuli. When we use model inversion to identify stimuli that elicit desired responses, the target responses are computed from a linear model that has no noise, so this is not a concern in applications like those in Figures 6-10.

      Also, could the authors explain more clearly what the effect of the normalization of the estimated stimulus by the power of the true stimulus is? Does this simply reduce power at high frequency or also affect frequencies below the suggested cutoff (where the stimulus reconstruction should presumably be accurate even without normalization)?

      Indeed this normalization reduces high frequency power and has little impact on low frequencies where the inversion is accurate; this is now noted in the text (line 363). As for amplification of high frequency noise (previous comment), the normalization by the stimulus power is only needed when inverting measured responses (i.e. responses with noise) and is omitted when we are identifying stimuli that elicit desired responses (e.g. in Figures 6-10).
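
      To make the idea of suppressing high-frequency power concrete, here is a minimal sketch of a hard temporal-frequency cutoff applied to a reconstructed stimulus. It is an assumption-laden illustration, not the authors' procedure: their normalization by stimulus power is not reproduced, the function names (`dft`, `idft`, `lowpass`) and the test signal are hypothetical, and a naive discrete Fourier transform is used only to keep the example dependency-free.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for short traces."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part, assuming a conjugate-symmetric spectrum."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def lowpass(signal, sample_rate_hz, cutoff_hz):
    """Zero every Fourier component above cutoff_hz; the mean (0 Hz) is kept."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        # Bins in the upper half of the spectrum correspond to negative frequencies.
        freq_hz = min(k, n - k) * sample_rate_hz / n
        if freq_hz > cutoff_hz:
            spectrum[k] = 0
    return idft(spectrum)


# Hypothetical estimate: mean level 10, a 5 Hz component, and 40 Hz "noise".
rate_hz, n_samples = 100, 100
estimate = [10 + math.sin(2 * math.pi * 5 * t / rate_hz)
            + 0.5 * math.sin(2 * math.pi * 40 * t / rate_hz)
            for t in range(n_samples)]
filtered = lowpass(estimate, rate_hz, cutoff_hz=20)
```

      In this toy case the 40 Hz component is removed while the mean level and the 5 Hz component pass through unchanged, mirroring the statement that low frequencies are little affected.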

      While the overall performance of the model to predict photoreceptor currents is impressive, it seems that particular misses occur for flashes right after a step in background illumination and for the white-noise responses at low background illumination (e.g. Figure 1B). Is that systematic, and if so what might be missing in the model?

      Indeed the model (at least with fixed parameters across stimuli) appears to systematically miss a few aspects of the photoreceptor responses. These include the latency of the response to a bright flash and the early flashes in the step + flash protocol in Figure 1B. Model errors for the variable mean noise stimulus (Figure 2) showed little dependence on time even when responses were sorted by mean light level and by previous mean level. Model errors did not show a clear systematic dependence on light level; this likely reflects, at least in part, the use of mean-square-error to identify model parameters. We have expanded our discussion of these systematic errors in the text (lines 164-166).

      I was also wondering whether this is related to the fact that in Figure 9B, the gain in the modified condition is actually systematically higher when there is more background light. Do the authors think that this could be a real effect or rather an overcompensation from the model? (By the way, is it specified what "Delta-gain" really is, i.e., ratio or normalized difference?)

      We suspect this is an issue with the sensitivity of the specific cells for which we did these experiments (i.e. variability in the gamma parameter between cells). This sensitivity varies between cells, and such variations are likely to place the strongest limitation on our ability to use this approach to manipulate responses in different retinas. We now note those issues in the Results (lines 523-526, 557-559 and 591-593) with reference to Figures 9 (previously Figure 8) and 10 (previously Figure 9), and describe this limitation more generally in the Discussion (section starting on line 649). We have also changed delta-gain to response ratio, which seemed more intuitive.

      Maybe I missed this, but it seems that the parameter gamma is fitted in a cell-type-specific fashion (e.g. line 163), but then needs to be fixed for held-out cells. How was this done? Is there much variability of gamma between cells?

      There is variability in gamma between cells, and this likely explains some of the systematic differences between data and model (see above and Methods, lines 902-903). For the consensus models in Figure 2B, gamma was allowed to vary for each cell while the remaining consensus model parameters were fixed. Gamma was set equal to the mean value across cells for model inversion (i.e. for all of the analyses in Figures 4-10). We have described the fitting procedure in considerably more detail in the revised Methods (starting on line 832).

      For completeness, it would be nice to have the applied consensus model parameters in the manuscript rather than just in the Matlab code (especially since the code has not been part of the submission). Also, some notes on how the numerical integration of the differential equations was done would be nice (time step size?).

      We have added tables with consensus parameters and estimates of the sensitivity of model predictions to each parameter. We have also added additional details about the numerical approaches (including the time step) to Methods.

      Similarly, it would be nice to explicitly see the relationships that are used to fix certain model parameters (lines 705ff). And can the constants k and n (lines 709-710) be assumed identical for different species and receptor types?

      We have added more details about the model fitting to the Methods, including the use of steady-state conditions to hold certain parameters fixed (lines 862 and 866). We are not aware of any direct comparisons of k and n across species and receptor types. We have noted that model performance was not improved by modest changes in these parameters (due to compensation by other model parameters). More generally, we have explained how some parameters trade for others and hence the logic of fixing some even when exact values were not available.

      For the previous measurements of m and beta (lines 712-713), is there a reference or source?

      We have added references for these values.

      Did the authors check for differences in the model parameters between cone types (e.g., S vs. M)?

      We did not include S cones here. They are harder to record from and collecting a fairly large data set across a range of stimuli would be challenging. Our previous work shows that S cones have slower responses than L and M cones, and this would certainly be reflected in differences in model parameters. We have noted this in the text (Methods, lines 808-810).

      For the stated flash responses time-to-peak (lines 183-184), is this for a particular light intensity with no background illumination?

      Those are flashes from darkness - now noted in the text.

      Figure 2 - Supplement 1 doesn't have panel labels A and B, unlike the legend.

      Fixed - thank you.

      Reviewer #2 (Recommendations For The Authors):

      (1) Fig. 2B - for some cells, the consensus model seems to fit better than the individual model. How is this possible?

      This was mostly an error on our part (we inadvertently included responses to more stimuli in fitting the individual models, which slightly hampered their performance). Even with this correction, however, a few cells remain for which the consensus model outperforms the individual model. We believe this is because there is more data to constrain model parameters for the consensus models (since they are fit to all cells at the same time), and that can compensate for improvements associated with customizing parameters to specific cells.

      (2) Fig. 2 Supplement 1, it would be useful to see a blow-up of the data in an inset, as in Fig. 2B.

      Thanks - added.

      (3) Line 400 - this paragraph could include additional quantification and statistics to back up claims re 'substantially reduced', 'considerably lower'.

      We quantify that in the next sentence by computing the mean-square-error between responses and sinusoidal fits (also in Figure 7B, which now includes statistics as well). We have made that connection more direct in the text.
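
      A measure of this kind can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function name `sinusoid_fit_mse` and the example traces are hypothetical, and the fit assumes that the stimulus frequency is known and that the trace spans a whole number of cycles, so the least-squares sinusoid reduces to simple projections onto sine and cosine.

```python
import math


def sinusoid_fit_mse(response, cycles):
    """Mean-square error between a trace and its best-fit sinusoid.

    Assumes uniform sampling over exactly `cycles` periods (0 < cycles < n/2),
    so the sine, cosine, and constant terms are mutually orthogonal.
    """
    n = len(response)
    phase = [2 * math.pi * cycles * i / n for i in range(n)]
    offset = sum(response) / n
    a = 2 / n * sum(r * math.sin(p) for r, p in zip(response, phase))
    b = 2 / n * sum(r * math.cos(p) for r, p in zip(response, phase))
    fit = [offset + a * math.sin(p) + b * math.cos(p) for p in phase]
    return sum((r - f) ** 2 for r, f in zip(response, fit)) / n


# Hypothetical traces: a pure sinusoid and one distorted by a second harmonic.
n, cycles = 200, 2
pure = [math.sin(2 * math.pi * cycles * i / n) for i in range(n)]
distorted = [math.sin(2 * math.pi * cycles * i / n)
             + 0.3 * math.sin(2 * math.pi * 2 * cycles * i / n)
             for i in range(n)]
mse_pure = sinusoid_fit_mse(pure, cycles)
mse_distorted = sinusoid_fit_mse(distorted, cycles)
```

      A response that is closer to sinusoidal yields a smaller error, so comparing this value between the original and modified stimuli gives a single number per cell for the size of the distortion.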

      (4) Maybe a supplement to Fig. 8 could show the changes to the stimulus required to alter the kinetics in both directions - to give more insight into part B., especially.

      Good suggestion - we have added the stimuli to all of the panels of the figure (now Figure 9).

      (5) Fig. 8B - in 'Speed response up' condition - there seems to be error in the model for the decay time of the response - especially for the 'original' condition, which is not quantified in 8C. Was it generally difficult to predict responses to flashes?

      That seems largely to reflect that the cells used for those experiments had faster initial kinetics than the average cells (responses to the control traces are also faster than model predictions in these cells - black traces in Figure 9B). We have added this to the text.

      (6) Line 678, possibly note that 405 nm equally activates S and M photopigments in mice, since most of the cones co-express the two photopigments (Rohlich et al., 1994; Applebury et al., 2000; Wang et al., 2011).

      Thanks - we have added this (lines 827-829).

      (7) The discussion could include a broader description of the various approaches to identifying nonlinearities within retinal circuitry, which include (incomplete list): recording at multiple levels of the circuit (e.g., Kim and Rieke 2001; Rieke, 2001; Baccus and Meister, 2002; Dunn et al., 2006; 2007; Beaudoin et al., 2007; Baccus et al., 2008); recording currents vs. spiking responses in a ganglion cell (e.g., Kim and Rieke, 2001; Zaghloul et al., 2005; Cui et al., 2016); neural network modeling approaches (e.g., Maheswaranathan et al., 2023); optogenetic approaches to studying filtering/nonlinear behavior at synapses (e.g., Pottackal et al., 2020; 2021).

      Good suggestion - we have added this to the final paragraph of the Discussion.

      Reviewer #3 (Recommendations For The Authors):

      -  I am personally not a fan of the style: "... as Figure 4A shows..." or comparable and much prefer a direct "We observe that X is the case (Figure 4A)". If the authors agree, they may want to revise their paper in this way.

      We have revised the text to avoid the “... as Figure xx shows” construction. We have retained multiple instances which follow a “Figure xx shows that …” construction (which is both active rather than passive and does not use a personal pronoun).

      -  I am not a fan of the title. Light-adaption clamp caters only to a very specialized audience.

      We have changed the title to “Predictably manipulating photoreceptor light responses to reveal their role in downstream visual responses.”

      -  The parameter fitting procedure should not only be described in Matlab code, but in the paper.

      Thanks - we have expanded this in the Methods considerably (section starting on line 832).

      -  The authors should elaborate on why different fitting procedures were used.

      We did not describe that issue clearly. The fitting procedures used across cells were identical, but we had different data available for different cell types due to experimental limitations. We have substantially revised that part of the main text to clarify this issue (paragraph starting on line 121).

      -  The authors state in line 126 that the input stimulus is supposed to mimic eye movements: mouse, monkey, or human? Please clarify.

      Thanks - we have changed this sentence to “abrupt and frequent changes in intensity that characterize natural vision.”

      -  Please improve the figure style. For example, labels should be in consistent capitalization and ideally use complete words (e.g. Figure 2B, 4B, and others).

      We have made numerous small changes in the figures to make them more consistent.

      -  Is the fraction of variance calculated on held-out-data? Linear models should be added to Figure 2B.

      The fraction of variance explained was not calculated on held out data because of limitations in the duration of our recordings. Given the small number of free parameters, and the ability of the model to capture held out cells, we believe that the model generalizes well. We have added a supplemental figure with linear model performance (Figure 2 - Figure Supplement 2).

      -  Fig. 9A is lacking bipolar cell and amacrine cell labels. Currently, it looks like HC is next to the BC in the schematic.

      Thanks - we have updated that figure (now Figure 10A).

      -  Maybe I am misunderstanding something, but it seems like the linear model prediction shown in Figure 2A for the rod could be easily improved by scaling it appropriately. Is this impression correct or why not?

      We have clarified how the linear model is constructed (by fitting the linear model to low contrast responses of the full model at the mean stimulus intensity). We also added a supplemental figure, following the suggestion above, showing the linear model performance when a free scaling factor is included for each cell.
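      To make the two quantities discussed above concrete, the sketch below computes the fraction of variance explained by a prediction and the least-squares scale factor that a free per-cell scaling would apply. The definitions used here (one minus residual sum of squares over total sum of squares, and a single multiplicative gain) are common conventions that we assume for illustration; the function names are hypothetical and the authors' Matlab code may differ.

```python
def fraction_variance_explained(measured, predicted):
    """1 - (residual sum of squares) / (total sum of squares about the mean)."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def best_scale(measured, predicted):
    """Gain g minimizing sum((measured - g * predicted) ** 2)."""
    num = sum(m * p for m, p in zip(measured, predicted))
    den = sum(p * p for p in predicted)
    return num / den


# Toy data: the prediction has the right shape but half the amplitude.
measured = [0.0, 2.0, 4.0, 2.0, 0.0, -2.0, -4.0, -2.0]
predicted = [m / 2.0 for m in measured]

g = best_scale(measured, predicted)
scaled = [g * p for p in predicted]
print(fraction_variance_explained(measured, predicted))  # unscaled prediction
print(fraction_variance_explained(measured, scaled))     # with free scale factor
```

      In this toy case a scale factor alone recovers the response fully. A prediction whose shape or kinetics is wrong would not improve in the same way, which is why reporting both numbers is informative.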

      -  The verification experiment in Fig. 5 is only anecdotal and is elaborated only in Figure 6. If I am not mistaken, this does not necessitate its own figure/section but could rather be merged.

      We have kept this figure separate (now Figure 6) as we felt that it was important to highlight the approach in general in a figure before getting into quantification of how well it works.

      -  Figure 5 right is lacking labels. What is red and grey?

      Thanks for catching that - labels are added now.

      -  The end of the Discussion is slightly unusual. Did some text go missing?

      Thanks - we have rearranged the Discussion so as not to end on Limitations.

      -  There is a bonus figure at the end which seems also not to belong in the manuscript.

      Thanks - the bonus figure is removed now.

      -  The methods should also describe briefly what kind of routines were used in the Matlab code, e.g. gradient descent with what optimizer?

      We’ve added that information as well.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Our revised version of the manuscript addresses all the comments and suggestions raised, as clarified in our point-by-point answer to the reviewers. We have performed additional experiments regarding the effects on proliferation and differentiation of additional cell types in the muscle, such as myogenic and mesenchymal progenitors, as well as chondrogenesis in parental hMSCs that did not express exogenous ACVR1. Moreover, as suggested by reviewer #2, we performed all the chondrogenic experiments with addition of TGFβ in the differentiation media and analyzed chondrogenesis by both Alcian blue staining and qPCR analysis of gene markers (Sox9, Acan, Col2a1 and Mmp3). We also extended our RNA-seq analysis and included new data using hMSCs expressing either wild type or R206H ACVR1 receptor, with or without different ACVR1 ligands (BMP6 and Activin A) and treated or not with the inhibitor BYL719. The new data suggest that BYL719 is able to inhibit the expression of genes involved in ossification and osteoblast differentiation irrespective of the presence of the mutation. We also discuss the effect of BYL719 on mTOR signaling and address all the minor comments suggested by both reviewers.

      We addressed the specific comments of the reviewers as follows:

      Reviewer # 1:

      Specific points:

      Point #1 and #2. The authors showed that BYL719 inhibited HO in FOP model mice. Did they have HO not only in the muscle but also in the bone marrow? The progenitor cells of chondrocytes and osteoblasts may differ between the muscle and bone marrow. The authors should examine the effects of BYL719 on some other types of cells in the muscle, such as myoblasts and fibro-adipogenic cells, in addition to the bone marrow-derived MSCs. Furthermore, it was unclear whether they were human or murine MSCs in the text.

      The inhibitory effect of BYL719 on HO in FOP mice was clear, but the molecular mechanisms or target cells were still unclear because BYL719 affected multiple types of cells and molecules. The authors are encouraged to show more clearly the mechanisms and target cells critical for the inhibition of HO. Again, this reviewer believes that in vivo and in vitro experiments using muscle and bone marrow and cells prepared from them should provide additional critical information.

      As detailed in the introduction, it is known that Heterotopic Ossification develops in the skeletal muscle and connective tissues. Consistent with the current knowledge of the field, none of the mice showed HO in the bone marrow. Additionally, since activation of the mutant allele is achieved by injection of CRE-expressing adenovirus and cardiotoxin in the muscle hindlimb, it is unlikely that mesenchymal progenitors in the bone marrow would be strongly affected. Interestingly, single-cell RNA sequencing from multiple mouse tissues identified a very strong transcriptional similarity between FAPs and non-muscle mesenchymal progenitors (PMID: 37599828). As suggested, we examined the effects of BYL719 on proliferation and differentiation in additional cell types such as muscle progenitors. In this new version of the manuscript, we show that BYL719 reduces the proliferation of muscle and mesenchymal progenitors while it blocks myoblast differentiation in vitro (Figure 7, Figure Supplement 1). The MSCs used in the experiments shown in Figure 3 were murine, whereas those used in the assays shown in Figures 5 and 6 were of human origin. We have further clarified this in the respective Figure legends.

      All the data generated strongly suggest that there is not a single mechanism supporting all the effects of BYL719 in HO. Instead, BYL719 affects multiple cell types involved in efficient HO (e.g. reduction in proliferation and osteochondrogenic specification of mesenchymal precursors (MPs), reduction in proliferation, migration, and inflammatory gene expression in monocytes, etc.). Interestingly, our data suggest that BYL719 is able to inhibit these effects on MPs and monocytes irrespective of the presence of the ACVR1-R206H mutation (Figures 5, 6 and 7). Additionally, there are several signaling mechanisms affected. BYL719 reduces SMAD1/5, p38, AKT and mTOR signaling in parental MPs or in MPs with mutations in ACVR1 (Figure 3 and our previous publication PMID: 31373426), and all of these pathways are required for efficient osteochondrogenic specification of MPs. We consider that the different detailed mechanisms by which BYL719 inhibits osteochondrogenic specification enhance the robustness of the findings in this study.

      Point #3. In FOP model mice, ACVR1 was mutated as Q207D. However, R206H was used in in vitro experiments. Do they have the same characteristics? This reviewer would like to recommend examining the effect of BYL719 on wild-type ACVR1, R206H, and Q207D simultaneously in each experiment.

      We already performed these experiments, assaying ACVR1-WT, ACVR1-Q207D and ACVR1-R206H in parallel for the transcriptional responses of MPs, in our previous work (PMID: 31373426). Both mutations produced similar responses, with ACVR1-Q207D being stronger than ACVR1-R206H, as has been shown in vivo in mouse models of HO (PMID: 34633114). In any case, BYL719 inhibits the transcriptional responses induced by both mutant alleles.

      Point #4. Figure 5: What was the effect of BYL719 on the differentiation of parental cells that did not express exogenous ACVR1?

      We performed new assays of chondrogenic differentiation of hMSCs that are shown in the new Figure 5. BYL719 inhibits chondrogenic differentiation of parental hMSCs and also inhibits chondrogenic specification irrespective of the expression of either wild type or mutant ACVR1.

      Point #5. Figure 6: In this experiment, gene expression was examined in pretreated MSCs-ALK2 (ACVR1?) R206H with and without BYL719. It was unclear whether suppression of gene expression by BYL719 was specifically caused in cells expressing R206H. What were the effects of BYL719 on parental cells that did not express exogenous ACVR1?

      To be consistent, we relabeled ALK2 to ACVR1 in the figure. We expanded the conditions analyzed in the RNA-sequencing. We included conditions where we activate ACVR1 (either WT or R206H) with its known physiological ligand BMP6. In both human MSCs expressing ACVR1-R206H and human MSCs expressing wild type ACVR1, we observed a downregulation of differentially expressed genes upon addition of BYL719, irrespective of ligand (BMP6 or Activin A) or receptor (RH or WT) (added new Figure 6: B and C).

      Point #6. Figure 7: BYL719 suppressed cell proliferation of all cells examined partially at 2 uM and almost completely at 10 uM, respectively. There is a possibility that BYL719 inhibits HO by inhibiting osteochondroprogenitor proliferation. The authors are encouraged to show data on the effect of BYL719 on the proliferation of other types of cells, such as myoblasts, fibro-adipogenic cells, or bone marrow cells.

      We examined the effects of BYL719 on proliferation in additional cell types such as muscle and mesenchymal progenitors. BYL719 slightly reduced the proliferation of myoblasts and mesenchymal cells in vitro (Figure 7, Figure Supplement 1). However, the reduction in the proliferation of myoblasts or MPs did not reach the extent observed in monocytes or macrophages (Figure 7).

      Point #7. Figure 8: How was the effect of BYL719 on muscle regeneration in wild-type? It was reported that mTOR signaling is important in HO in FOP. The authors are encouraged to show the effect of BYL719 on mTOR signaling.

      Muscle regeneration in wild-type mice has also been shown in our previous results (PMID: 31373426). In addition, we included images of muscle regeneration after 23 days of treatment with BYL719 in ACVR1Q207D mice with or without PI3Kα deletion after induction of HO in the new Figure 2, Figure Supplement 2. These mice showed full muscle regeneration or, at most, small calcifications surrounded by muscle. The effects of PI3Kα inhibitors, either BYL719 or A66, on mTOR signaling had been previously shown by our group (PMID: 31373426). Both inhibitors strongly reduced mTOR signaling, visualized by activation of p70 S6-kinase, a surrogate marker of mTOR activity.

      Minor points:

      (9) SMAD 1/5 should be SMAD1/5.

      (10) The source of human MSCs should be indicated in the text.

      (11) ALK2 should be ACVR1 in Figure 6A.

      (12) The protein levels of each receptor should be examined in Fig. 4.

      We introduced the suggested changes in the manuscript and Figure 6 and indicated the source of human MSCs in Materials and Methods. We also examined the levels of each receptor that are shown in the new Figure 4, Figure Supplement 1.

      Reviewer # 2:

      Specific points:

      Point #1. Because the involvement of PI3K in HO of FOP was already reported by the authors' group and also others (Hino et al, Clin Invest, 2017), the main purpose of this study was to disclose the mechanism of how PI3K was activated in FOP cells. In the published study (Hino et al, Clin Invest, 2017), PI3K was activated by the ENPP2-LPA-LPR cascade. Unfortunately, there were no new data for this important issue.

      The main purpose of this study is to demonstrate that the pharmacological and genetic inhibition of PI3Kα in HO progenitors at injury sites reduces HO in vivo, to extend the insights into the molecular and cellular mechanisms responsible for the therapeutic effect of PI3K inhibition, and to optimize the timing of the administration of BYL719. Class I PI3Ks are heterodimers of a p110 catalytic subunit in complex with a regulatory subunit. They engage in signaling downstream of tyrosine kinases, G protein-coupled receptors and monomeric small GTPases. Therefore, a plethora of growth factors, cytokines, inflammatory agents, hormones and additional external and internal stimuli are able to activate PI3Kα (PMID: 31110302). In fact, TGF-β family members, including activin A, are able to activate PI3K and mediate some of their non-canonical responses (PMID: 19114990). Multiple factors with known increased expression in the ossifying niche in HO and FOP (e.g. activin A, TGF-β, inflammatory agents such as TNFα, IL6, IL3, etc.) are known activators of PI3K (PMID: 30429363). Interestingly, in our RNA-seq analysis in hMSCs we did not observe increased expression levels of Enpp2 when comparing wild type and R206H mutated cells treated with activin A.

      Point #2. The HO formation of ACVR1/Q207D model mice in this study is extremely unstable (Figure 1B, DMSO). Even the bone volume of some red symbols, which indicate the presence of HO, is located on the base (0.00) line. I would examine carefully the credibility of the data. Also, it is well known that the molecular behavior of mice Acvr1/Q207D and human ACVR1/R206H was different.

      We agree with the reviewer that induction of HO is variable between mice showing variations in penetrance and intensity of the ossifying lesions. This variability is a known common trend that appears in all the models of HO published so far (e.g. PMID: 28758906, PMID: 26333933). Accordingly, we did not exclude any animal that has been injected with CRE-expressing adenovirus plus cardiotoxin in the μCT analysis. Regarding the behavior of mice Acvr1/Q207D and human ACVR1/R206H, it is well known that Q207D produces more robust and stronger responses in terms of signaling and formation of heterotopic ossification (PMID: 34633114). Therefore, reduction of HO by BYL719 would be more stringent in the Acvr1/Q207D model.

      Point #3. The experimental design of Figure 5 experiments is confusing. Although the authors mentioned that the data in Figure 5A were taken seven days after chondrogenic induction, I am skeptical whether the chondrogenic induction was successful. Based on the description of Material and Methods, the authors did not include TGFβ in their "Differentiation Medium", which is an essential growth factor to induce chondrogenic differentiation of human MSC. Why did the ALP activity increase after chondrogenic induction? The authors should demonstrate the evidence of successful chondrogenic induction by showing the expression of key chondrogenic genes such as SOX9, ACAN, or COL2A1. The data in Figure 5B-E are also confusing. The addition of Activin A showed no difference between ACVR1/WT and ACVR1/R206H cells, suggesting that these cells did not reproduce the situation of FOP.

      We performed new assays of chondrogenic differentiation of hMSCs that are shown in the new Figure 5. We included TGFβ1 in the differentiation medium and also included the parental cell line in the analysis. In addition to being a marker of osteoblast differentiation, alkaline phosphatase (ALPL) has also been shown to be induced during chondroblast differentiation in vitro (PMID: 19855136; PMID: 9457080; PMID: 18377198; PMID: 23388029). Moreover, expression data of SOX9, COL2A1, ACAN and MMP13 of cells after chondrogenic differentiation are included in the new Figure 5. Expression of some markers (e.g. ACAN) is increased by the expression of ACVR1R206H; however, we did not observe significant differences in chondroblast differentiation gene expression between ACVR1wt and ACVR1R206H expressing cells. In any case, BYL719 could inhibit chondrogenic differentiation of parental hMSCs and also the chondrogenic specification irrespective of the expression of either wild type or mutant ACVR1.

      Point #4. The experimental design and data analyses of RNA-seq were inappropriate and insufficient, which is disappointing for the reviewer because this will be a key experiment in this study. Because the most important point is to identify the signal for PI3Kα induced by Activin A via ACVR1/R206H, they should also use hMSC-ACVR1/WT for this experiment. Because the authors clearly demonstrated that TGFBR were not targets of BYL719, they should compare the expression profiles between MSC-ACVR1/WT and MSC-ACVR1/WT with BYL719 to identify the targets of BYL719 unrelated to Activin A signal. Then the expression profiles of ACVR1/R206H cells treated with Activin A and Activin A plus BYL719 were compared. Among down-regulated signals by BYL719, those found also in MSC-ACVR1/WT should be discarded. It is important to investigate whether the GO term of ossification or osteoblast differentiation is found also in MSC-ACVR1/WT. If it is so, the effect of BYL719 is not specific for FOP cells.

      We extended our RNA sequencing analysis with additional experimental conditions and comparisons. In new Figure 6, we now compare hMSCs expressing wild type or R206H receptors, with or without BYL719 inhibition, and with or without different ligand activations (BMP6 or Activin A) (New Figure 6A). New Figure 6B shows the Gene Ontology analysis of the differentially expressed genes between cells expressing WT and RH receptors under control conditions. Ossification (GO:0001503) and osteoblast differentiation (GO:0001649) were detected within the top 10 significantly differentially regulated biological processes between these conditions. Therefore, we analyzed these relevant GO terms in 5 different comparisons upon GO enrichment analysis (Figure 6C). In addition to the comparison between cells expressing WT and RH receptors under control conditions explained above, we also compared cells expressing WT or RH receptor, with different ACVR1 ligands (BMP6 and Activin A), and with or without BYL719 inhibitor. The addition of BYL719 resulted in a downregulation of the GO terms “ossification” and “osteoblast differentiation” (new Figure 6C). These results confirm the inhibitory effect of BYL719 on the ossification and osteoblast differentiation biological processes, and indicate that this inhibitory effect remains consistent upon BMP6 or Activin A ligand activation, and with ACVR1 WT and RH expression.

      Point #5. The data in Figure 7 were not related to the aim of this study because cell lines used in these experiments did not have ACVR1/R206H mutations. It is not appropriate to extrapolate these data in the FOP situation.

      We utilized immune cell lines where we could activate ACVR1 with its known physiological ligand BMP6. Mutated ACVR1 gains a response to activin A while maintaining the physiological response to BMP6 seen with the wild type form. Therefore, in these assays we interrogated in vitro, with addition of BMP6, the effects of BYL719 on growth, migration and inflammatory gene expression under conditions of activated ACVR1 receptor downstream signaling. We consider that understanding the effects of PI3Kα inhibition on the regulation of proliferation, migration and inflammatory cytokine expression in monocytes, macrophages and mast cells is essential to better define the potential outcome of BYL719 treatment for heterotopic ossifications.

      Minor comments:

      (1) The legends for Figure 1C were those for Figure 1D, and there were no descriptions for Figure 1C in the legends and methods section. The reviewer was unable to understand the meaning of BV/TV. What is TV?

      (2) “However, in PI3Kα deficient mice ACVR1Q207D expression only led to minor ectopic calcifications that were already surrounded by fully regenerated muscle tissue on the 23rd day after injury (Figure 2D, Figure 2-Figure Supplement 1B)”: There were no histological data showing muscle tissue in either Figure 2D or Figure 2-Figure Supplement 1B.

      (3) "The overexpression of Acvr1R206H increased basal and activin dependent expression of canonical (Id1 and Sp7) and non-canonical (Ptgs2) BMP target genes (Figure 3C),": There was no increase of Ptgs2 gene in basal level.

      (4) Materials and Methods. Production of human fetal mesenchymal stem cells expressing ACVR1.: Is it derived from a fetus?

      (5) Figure 6C: There was no description of the meaning of each column. What does AA mean and what is the number?

      We introduced the missing information in the manuscript, Figure legends and Materials and Methods section for points #1, 4 and 5. AA stands for Activin A, and the number is the number of replicates. This has been detailed in the figure legend. We included images of the muscle regeneration after 23 days of treatment with BYL719 in mice after induction of HO in the new Figure 2, Figure Supplement 2 (point #2). We corrected the mistake in the manuscript and no longer suggest an increase of Ptgs2 gene expression by ACVR1-R206H at the basal level (point #3).

    1. Author response:

      Reviewer #1 (Public Review):

      Weaknesses:

      There are some minor weaknesses.

      Notably, there are not a lot of new insights coming from this paper. The structural comparisons between MCC and PCC have already been described in the literature and there were not a lot of significant changes (outside of the exo- to endo- transition) in the presence vs. absence of substrate analogues.

      We agree that the structures of the human MCC and PCC holoenzymes are similar to their bacterial homologs. That is due to the conserved sequences and functions of MCC and PCC across different species.

      There is not a great deal of depth of analysis in the discussion. For example, no new insights were gained with respect to the factors contributing to substrate selectivity (the factors contributing to selectivity for propionyl-CoA vs. acetyl-CoA in PCC). The authors state that the longer acyl group in propionyl-CoA may mediate stronger hydrophobic interactions that stabilize the alpha carbon of the acyl group at the proper position. This is not a particularly deep analysis and doesn't really require a cryo-EM structure to invoke. The authors did not take the opportunity to describe the specific interactions that may be responsible for the stronger hydrophobic interaction nor do they offer any plausible explanation for how these might account for an astounding difference in the selectivity for propionyl-CoA vs. acetyl-CoA. This suggests, perhaps, that these structures do not yet fully capture the proper conformational states.

      We appreciate this comment. Unfortunately, in the cryo-EM maps of the PCC holoenzymes, the acyl groups were not resolved (fig. S6), so we were unable to analyze the specific interactions between the acyl-CoAs and PCC. We will discuss this limitation in our revised manuscript.

      The authors also need to be careful with their over-interpretation of structure to invoke mechanisms of conformational change. A snapshot of the starting state (apo) and final state (ligand-bound) is insufficient to conclude *how* the enzyme transitioned between conformational states. I am constantly frustrated by structural reports in the biotin-dependent enzymes that invoke "induced conformational changes" with absolutely no experimental evidence to support such statements. Conformational changes that accompany ligand binding may occur through an induced conformational change or through conformational selection and structural snapshots of the starting point and the end point cannot offer any valid insight into which of these mechanisms is at play.

      Point accepted. We will revise our manuscript to use "conformational differences" instead of "conformational changes" to describe the differences between the apo and ligand-bound states.

      Reviewer #2 (Public Review):

      Comments and questions to the manuscripts:

      I'm quite impressed with the protein purification and structure determination, but I think some functional characterization of the purified proteins should be included in the manuscript. The activity of enzymes should be the foundation of all structures and other speculations based on structures.

      We appreciate this comment. However, since we purified the endogenous BDCs and the sample we obtained was a mixture of four BDCs, the enzymatic activity of this mixture cannot accurately reflect the catalytic activity of PCC or MCC holoenzyme. We will acknowledge this limitation in the discussion section of our revised manuscript.

      In Figure 1B, the structure of MCC is shown as two layers of beta units and two layers of alpha units, while there is only one layer of alpha units resolved in the density maps. I suggest the authors show the structures resolved based on the density maps and show the complete structure with the docked layer in the supplementary figure.

      We appreciate this comment. We have shown the cryo-EM maps of the PCC and MCC holoenzymes in fig. S8 to indicate the unresolved regions in these structures. The BC domains in one layer of MCCα in the MCC-apo structure were not resolved. However, we think it would be better to show a complete structure in Fig. 1 to provide an overall view of the MCC holoenzyme. We will revise Fig. 1B and the figure legend to clearly point out which domains were not resolved in the cryo-EM map and were built in the structure through docking.

      In the introduction, I suggest the author provide more information about the previous studies about the structure and reaction mechanisms of BDCs, what is the knowledge gap, and what problem you will resolve with a higher resolution structure. For example, you mentioned in line 52 that G437 and A438 are catalytic residues, are these residues reported as catalytic residues or this is based on your structures? Has the catalytic mechanism been reported before? Has the role of biotin in catalytic reactions revealed in previous studies?

      Point accepted. It was reported that G419 and A420 in S. coelicolor PCC, corresponding to G437 and A438 in human PCC, were the catalytic residues (PMID: 15518551). The same study also reported the catalytic mechanism of the carboxyl transfer reaction. The role of biotin in the BDC-catalyzed carboxylation reactions has been extensively studied (PMIDs: 22869039, 28683917). We will include this information in the introduction section of our revised manuscript.

      In the discussion, the authors indicate that the movement of biotin could be related to the recognition of acyl-CoA in BDCs, however, they didn't observe a change in the propionyl-CoA bound MCC structure, which is contradictory to their speculation. What could be the explanation for the exception in the MCC structure?

      We appreciate this comment. We do not have a good explanation for why we did not observe a change in the propionyl-CoA bound MCC structure. It is noteworthy that neither acetyl-CoA nor propionyl-CoA is the natural substrate of MCC. Recently, a cryo-EM structure of the human MCC holoenzyme in complex with its natural substrate, 3-methylcrotonyl-CoA, has been resolved (PDB code: 8J4Z). In this structure, the binding site of biotin and the conformation of the CT domain closely resemble those in our acetyl-CoA-bound MCC structure. Therefore, the movement of biotin induced by acetyl-CoA binding mimics that induced by the binding of MCC's natural substrate, 3-methylcrotonyl-CoA, indicating that, in comparison with propionyl-CoA, acetyl-CoA is closer to 3-methylcrotonyl-CoA regarding its ability to bind to MCC. We will discuss this possibility in our revised manuscript.

      In the discussion, the authors indicate that the selectivity of PCC to different acyl-CoA is determined by the recognition of the acyl chain. However, there are no figures or descriptions about the recognition of the acyl chain by PCC and MCC. It will be more informative if they can show more details about substrate recognition in Figures 3 and 4.

      We appreciate this comment. Unfortunately, in the cryo-EM maps of the PCC holoenzymes, the acyl groups were not resolved (fig. S6), so we were unable to analyze the specific interactions between the acyl-CoAs and PCC. We will discuss this limitation in our revised manuscript.

      How are the solved structures compared with the latest Alphafold3 prediction?

      Since AlphaFold3 was not released when our manuscript was submitted, we did not compare the solved structures with the AlphaFold3 predictions. We have now carried out the predictions using AlphaFold3. Due to the token limitation of the AlphaFold3 server, we can only include two α and six β subunits of human PCC or MCC in the prediction. The overall assembly patterns of the AlphaFold3-predicted structures are similar to those of the cryo-EM structures. The RMSDs between PCCα, PCCβ, MCCα, and MCCβ in the apo cryo-EM structures and those in the AlphaFold3-predicted structures are 7.490 Å, 0.857 Å, 7.869 Å, and 1.845 Å, respectively. The PCCα and MCCα subunits adopt an open conformation in the cryo-EM structures but adopt a closed conformation in the AlphaFold3-predicted structures, resulting in large RMSDs.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This study presents an important contribution to cardiac arrhythmia research by demonstrating that long noncoding RNA Dachshund homolog 1 (lncDACH1) tunes sodium channel functional expression and affects cardiac action potential conduction and rhythms. The evidence supporting the major claims is solid. The work will be of broad interest to cell biologists and cardiac electrophysiologists.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors show that a long non-coding RNA, lncDACH1, inhibits sodium currents in cardiomyocytes by binding to and altering the localization of dystrophin. The authors use a number of methodologies to demonstrate that lncDACH1 binds to dystrophin and disrupts its localization to the membrane, which in turn downregulates NaV1.5 currents. Knockdown of lncDACH1 upregulates NaV1.5 currents. Furthermore, in heart failure, lncDACH1 is shown to be upregulated, which suggests that this mechanism may have pathophysiological relevance.

      Strengths:

      (1) This study presents a novel mechanism of Na channel regulation which may be pathophysiologically important.

      (2) The experiments are comprehensive and systematically evaluate the physiological importance of lncDACH1.

      Reviewer #2 (Public Review):

      This manuscript by Xue et al. describes the effects of a long noncoding RNA, lncDACH1, on the localization of Nav channel expression, the magnitude of INa, and arrhythmia susceptibility in the mouse heart. Because lncDACH1 was previously reported to bind and disrupt membrane expression of dystrophin, which in turn is required for proper Nav1.5 localization, much of the findings are inferred through the lens of dystrophin alterations.

      The results report that cardiomyocyte-specific transgenic overexpression of lncDACH1 reduces INa in isolated cardiomyocytes; measurements in whole heart show a corresponding reduction in conduction velocity and enhanced susceptibility to arrhythmia. The effect on INa was confirmed in isolated WT mouse cardiomyocytes infected with a lncDACH1 adenoviral construct. Importantly, reducing lncDACH1 expression via either a cardiomyocyte-specific knockout or using shRNA had the opposite effect: INa was increased in isolated cells, as was conduction velocity in heart. Experiments were also conducted with a fragment of lncDACH1 identified by its conservation with other mammalian species. Overexpression of this fragment resulted in reduced INa and greater proarrhythmic behavior. Alteration of expression was confirmed by qPCR.

      The mechanism by which lncDACH1 exerts its effects on INa was explored by measuring protein levels from cell fractions and immunofluorescence localization in cells. In general, overexpression was reported to reduce Nav1.5 and dystrophin levels and knockout or knockdown increased them.

      The strengths of this manuscript include convincing evidence of a link between lncDACH1 and Na channel function. The identification of a lncDACH1 segment conserved among mammalian species is compelling. The observation that lncDACH1 is increased in a heart failure model provides a plausible hypothesis for disease mechanism.

      One limitation of the fractionation approach is the uncertain disposition of Na channel protein deemed "cytoplasmic." It seems likely that the membrane fraction includes ER membrane. The signal may reasonably be attributed to Na channel protein in stalled transport vesicles, or alternatively in stress granules, but this was not directly addressed.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, the authors report the first evidence of Nav1.5 regulation by a long noncoding RNA, LncRNA-DACH1, and suggest its implication in the reduction in sodium current observed in heart failure. Since no direct interaction is observed between Nav1.5 and the LncRNA, they propose that the regulation is via dystrophin and targeting of Nav1.5 to the plasma membrane.

      Strengths:

      (1) First evidence of Nav1.5 regulation by a long noncoding RNA.

      (2) Implication of LncRNA-DACH1 in heart failure and mechanisms of arrhythmias.

      (3) Demonstration of LncRNA-DACH1 binding to dystrophin.

      (4) Potential rescuing of dystrophin and Nav1.5 strategy.

      Weaknesses:

      (1) The fact that the total Nav1.5 protein is reduced by 50%, which is similar to the reduction in the membrane fraction, questions the main conclusion of the authors implicating dystrophin in the reduced Nav1.5 targeting. The reduction in membrane Nav1.5 could simply be due to the reduction in total protein.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Weaknesses:

      (1) What is indicated by the cytoplasmic level of NaV1.5, a transmembrane protein?

      This is still confusing. Since Nav1.5 is an integral membrane protein, I am not sure what is really meant here by cytosolic fraction. From the workflow, it seems a separate organelle fraction is also collected. Is the amount of Nav1.5 in this fraction (which I assume includes for e.g. lysosome) also increased with lncDACH1? I recommend the authors to refer to the Nav channels not at the plasma membrane as 'intracellular' rather than 'cytoplasmic'.

      Thanks for the insightful comment. We completely agree. Accordingly, we have changed “cytoplasmic” to “intracellular”.

      Line 226. "In consistent with the results" Perhaps unnecessary to have "in"

      Thank you for the insightful comment. We have corrected it.

      Line 228. Is it optimal or optical?

      Sorry for the mistake, it should be optical. We have corrected it.

      Reviewer #3 (Recommendations For The Authors):

      I still have an issue with the total reduction in Nav1.5 which is about the same as the reduction in membrane and currents. The authors argue that there is an increase in cytoplasmic Nav1.5. However the controls that they provide for membrane and cytoplasmic fractions are not convincing.

      Thank you for the insightful comment. We cannot rule out the possibility that the reduction in membrane Nav1.5 may be due to the reduction in total protein. Our data indicate that the membrane and total protein levels of Nav1.5 were reduced by 50%. However, the intracellular Nav1.5 was not decreased but increased in the hearts of lncDACH1-TG mice compared with WT controls, which indicates that the intracellular Nav1.5 failed to traffic to the membrane.

    1. Author response:

      First we thank the reviewers for a thorough reading of our paper and some useful comments. A recurrent remark of the reviewers concerns the appearance of kRas-expressing cells (labelled by a nuclear blue fluorescent marker) which we attribute to the progeny of the initially induced cell. The reviewers suggest that these cells may have been obtained through activation of the Cre-recombinase in other cells by cyclofen released from light scattering, via diffusion, leakiness, etc. These remarks are perfectly reasonable from people not familiar with the cyclofen uncaging approach that we are using but are unwarranted as we shall show below.

      We have been using cyclofen uncaging with subsequent activation of a Cre-recombinase (or some other proteins) since 2010 (see ref.34, Sinha et al., Zebrafish 7, 199-204 (2010) and our 2018 review (ref.35, Zhang et al., ChemBioChem 19,1-8 (2018)). In our experiments, the embryos are incubated in the dark in 6M caged cyclofen (cCyc) and washed in E3 medium (or transferred to a new medium with no cCyc). In these conditions, over many years we never observed activation of the recombinase, i.e. the appearance of the associated fluorescent label in cells of embryos grown in E3 medium. Hence leakiness can be ruled out (in presence of cCyc or in its absence).

      Following transfer of the embryos to new E3 medium we illuminate the embryos locally with light at 405nm. In these conditions, cCyc is only partially uncaged, which activates Cre-recombinase in only a few cells (1, 2, 3, …) within the illuminated region. The kRas-associated nuclear blue fluorescent label therefore usually appears in one cell (and sometimes in a few more; data and statistics will be incorporated in a revised manuscript). In the absence of any further treatment (e.g. activation of a reprogramming factor) these fluorescently labelled cells disappear within a few days (either via shut-down of their promoter, apoptosis or some other mechanism). The crucial point here is that we see fewer, not more, kRas-expressing cells (i.e. with nuclear blue fluorescence). This observation rules out activation of Cre-recombinase in other cells days after illumination due to leakiness, cyclofen released by light or diffusing from the illumination spot.

      To observe many more fluorescent cells days after activation of the initial cell, one needs to transiently activate VentX-GR by overnight incubation in dexamethasone (DEX). (Injecting the embryos at the 1-cell stage with VentX-GR or incubating them in DEX does not result in the appearance of more blue fluorescent cells.) Following activation of VentX-GR, the fluorescent cells observed a couple of days after initiation are visualized in E3 medium (i.e. in the absence of cyclofen) and are localized to the vicinity of the otic vesicle (the region where the initial cell was activated). In a revised manuscript we will present images of these fluorescent cells taken a few days apart from the same embryo in which a single cell was initially activated. Hence, we attribute these cells to the progeny of the activated cell. Obviously, single cell tracking via time-lapse microscopy would nail down this issue and provide fascinating insight into the initial stages of tumor growth. Unfortunately, immobilization of embryos in the usual medium (e.g. MS222, tricaine) over 5-6 days to track the division and motion of single cells is not possible. We are considering some other possibilities (immobilization in bungarotoxin or via photo-activation of anionic channels), but these challenging experiments are for a future paper.

      Reviewer #1 (Public Review):

      The authors then performed allotransplantations of allegedly single fluorescent TICs in recipient larvae and found a large number of fluorescent cells in distant locations, claiming that these cells have all originated from the single transplanted TIC and migrated away. The number of fluorescent cells shown in the recipient larvae after just two days is not compatible with a normal cell cycle length and more likely represents the progeny of more than one transplanted cell.

      As mentioned in the manuscript, we measure the density of cells/nl and inject in the yolk of 2dpf Nacre embryos a volume containing about 1 cell, following published protocols (S.Nicoli and M.Presta, Nat.Prot. 2,2918 (2007)). We further image the injected cell(s) by fluorescence microscopy immediately following injection, as shown in Fig.4A and Fig.S8B. We might miss a few cells but not many. With a typical cell cycle of ~10h the images of tumors in larvae at 3dpt (and not 2dpt as misunderstood by this reviewer) correspond to ~100 cells. In any case the purpose of this experiment was not to study tumorigenesis upon transplantation but to show that the progeny of the initially induced cells is capable of developing into a tumor in a naïve fish, which is the operational definition of cancer that we adopted here.

      The ability to migrate from the injection site should be documented by time-lapse microscopy.

      As stated above our purpose here is not to study tumor formation from transplanted cell(s) but to use that assay as an operational test of cancer. Besides as mentioned earlier single cell tracking in larvae over 3-4dpt is not a trivial task.

      Then, the authors conclude that "By allowing for specific and reproducible single cell malignant transformation in vivo, their optogenetic approach opens the way for a quantitative study of the initial stages of cancer at the single cell level". However, the evidence for these claims are weak and further characterization should be performed to:

      (1) show that they are actually activating the oncogene in a single cell (the magnification is too low and it is difficult to distinguish a single nucleus, labelling of the cell membrane may help to demonstrate that they are effectively activating the oncogene in, or transplanting, a single cell)

      In a revised manuscript we will provide larger magnification of the initial induced cell and show examples of oncogene activation in more than one cell.

      (2) the expression of the genes used as markers of tumorigenesis is performed in whole larvae, with only a few transformed cells in them. Changes should be confirmed in FACS sorted fluorescent cells

      When the oncogene is activated in a whole larva, all cells are fluorescent and thus FACS is of no use for cell sorting. Sorting could be done in larvae where single cells are activated, but then the efficiency of FACS is not good enough to isolate the few fluorescent cells among the many more non-fluorescent ones. We agree that the change in expression of the genes used as markers of tumorigenesis is an underestimate of their true change, but our goal at this time is not to precisely measure the change in expression level, but to show that the pattern of change is different from the controls and corresponds to what is expected in tumorigenesis.

      (3) the histology of the so called "tumor masses" is not showing malignant transformation, but at the most just hyperplasia.

      The histology of the hyperplastic tissues displays cellular proliferation with a higher density of nuclear material, which is characteristic of tumors (Fig.S4C). Besides, the increased expression of pERK in these tissues (Fig.S4A,B) is also a hallmark of cancer.

      In the brain, the sections are not perfectly symmetrical and the increase of cellularity on one side of the optic tectum is compatible with this asymmetry.

      The expected T-shape formed by the sections of the tegmentum and hypothalamus is compatible with the symmetric sections shown in Fig.2D. The asymmetry in the optic tectum is a result of the hyperplastic growth.

      (4) The number of fluorescent cells found dispersed in the larvae transplanted with one single TIC after 48 hours will require a very fast cell cycle to generate over 50 cells. Do we have an idea of the cell cycle features of the transplanted TICs?

      As answered above, the transplanted larvae are shown at 3dpt (and not 2dpt as misunderstood by this reviewer). With a cell cycle of about 10h, a single cell can give rise to about 100 cells in that time lapse.
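
      The doubling arithmetic behind this exchange can be checked with a short sketch. It assumes idealized, uninterrupted exponential doubling from a single cell with no cell death or lag, which is a simplification introduced here and not a model taken from the manuscript; the function names are illustrative.

```python
import math


def cells_after(hours, cycle_hours):
    """Clone size from one cell, assuming uninterrupted exponential doubling."""
    return 2 ** (hours / cycle_hours)


def cycle_needed(hours, target_cells):
    """Cell-cycle length (hours) needed to reach target_cells from one cell."""
    return hours / math.log2(target_cells)


# Compare the two time points discussed: 2 dpt (48 h) and 3 dpt (72 h).
for dpt in (2, 3):
    hours = 24 * dpt
    print(f"{dpt} dpt: ~{cells_after(hours, 10):.0f} cells with a 10 h cycle")

# Cycle length that the reviewer's reading (over 50 cells in 48 h) would require.
print(f"cycle needed for 50 cells in 48 h: {cycle_needed(48, 50):.1f} h")
```

      Under these assumptions a 10 h cycle gives roughly 28 cells at 2 dpt and roughly 147 cells at 3 dpt, the same order of magnitude as the "about 100 cells" quoted above, whereas reaching 50 cells in 48 h would need a cycle of about 8.5 h.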

      Reviewer #2 (Public Review):

      Summary:

      This paper describes a genetically tractable and modifiable system …which could be used to study an array of combinations and temporal relationships of these cancer drivers/modifiers.

      We thank this referee for their positive comments. We would also like to point out that our approach provides the first quantitative means to estimate the probability of tumorigenesis from a single cell, an estimate which is crucial in any assessment of cancer malignancy and the effectiveness of prophylactics.

      Weaknesses:

      There is minimal quantitation of … the efficiency of activation of the Ras-TFP fusion (Fig 1) in, purportedly, a single cell. …, such information seems essential.

      In a revised manuscript we will add more images of induction of a single (or a few cells) and a table where the efficiency of RAS activation is detailed.

      The authors indicate that a single cell is "initiated" (Fig 2) using the laser optogenetic technique, but without definitive genetic lineage tracing, it is not possible to conclude that cells expressing TFP distant from the target site near the ear are daughter cells of the claimed single "initiated" cell. A plausible alternative explanation is 1) that the optogenetic targeting is more diffuse (i.e. some of the light of the appropriate wavelength hits other cells nearby due to reflection/diffraction), so these adjacent cells are additional independent "initiated" cells or 2) that the uncaged tamoxifen analogue can diffuse to nearby cells and allow for CreER activation and recombination.

      We have addressed this point in our general comments to the reviewers’ remarks. The possibilities mentioned by this reviewer would result in cells expressing TFP in absence of VentX activation, which is not the case. Cells expressing TFP away from the initial site are observed days after activation of the oncogene (and TFP) in a single cell and only upon activation of VentX.

      In Fig 2B, the claim is made that "the activated cell has divided, giving rise to two cells" - unless continuously imaged or genetically traced, this is unproven.

      We have addressed this remark previously. Tracking of larvae over many days is not possible with the usual protocol using tricaine to immobilize the larvae. Nonetheless, in a revised version we will present images of an embryo imaged at various times post activation where proliferation of the cells can be observed. We are pursuing other alternatives for time-lapse microscopy over many days since, besides convincing the sceptics, a single cell tracking experiment (possibly coupled with in-situ spatial transcriptomics) will shed a new and fascinating light on the initial stages of tumor growth.

      In addition, it appears that Figures S3 and S4 are showing that hyperplasia can arise in many different tissues (including intestine, pancreas, and liver, S4C) with broad Ras + Ventx activation …. This should be clarified in the manuscript).

      This is true and will be clarified in the new version.

      In Fig S7 where single cell activation and potential metastasis is discussed, similar gut tissues have TFP+ cells that are called metastatic, but this seems consistent with the possibility that multiple independent sites of initiation are occurring even when focal activation is attempted.

      As mentioned previously this is ruled out by the fact that these cells are observed days after cyclofen uncaging (and TFP activation) and if and only if VentX is activated.

      Although the hyperplastic cells are transplantable (Fig 4), the use of the term "cells of origin of cancer" or metastatic cells should be viewed with care in the experiments showing TFP+ cells (Fig 1, 2, 3) in embryos with targeted activation for the reasons noted above.

      The purpose of this transplantation experiment was to show that cells in which both kRas and VentX have been activated possess the capacity to metastasize and develop a tumor mass when transplanted in a naïve zebrafish. This - to the best of our knowledge - is the operational definition of a malignant tumor.

      Reviewer #3 (Public Review):

      Summary:

      This study employs an optogenetics approach … to examine tumourigenesis probabilities under altered tissue environments.

      We thank this reviewer for this remark, since we believe that the opportunity to assess the probability of tumorigenesis from a single cell is possibly the most significant contribution of this work. To the best of our knowledge this has never been done before.

      Weaknesses:

      Lack of Methodological Clarity: The manuscript lacks detailed descriptions of methodologies,

      In a revised manuscript we will include additional detail of our methodology.

      Sub-optimal Data Presentation and Quality:

      Lack of quantitative data and control condition data obtained from images of higher magnification limits the ability to robustly support the conclusions.

      In a revised version we will include more images at higher magnification and quantitative data to support the main report of targeted single cell induction.

      Here are some details:

      Authors might want to provide more evidence to support their claim on the single cell KRAS activation.

      More images and data on activation of single or few cells in the illumination field will be provided in a revised version.

      · Stability of cCYC: The manuscript does not provide information on the half-life and stability of cCYC. Understanding these properties is crucial for evaluating the system's reliability and the likelihood of leakiness, which could significantly influence the study's outcomes.

      We have been using the cCyc system for about 14 years. We refer the reader to our previous papers and reviews on this methodology (e.g. ref. 34,35). Briefly, cCyc is stable when not illuminated with light around 375nm. Typically, we incubate our embryos in the dark for about 1h before transferring them into E3 medium and illuminating them. Assessing the leakiness of the system is easy as expression of the fluorescent marker is permanently turned on. We have observed none in the conditions of our experiment.

      · Metastatic Dissemination claim: However, the absence of a supportive cellular compartment within the fin-fold tissue makes the presence of mTFP-positive metastatic cells there particularly puzzling. This distribution raises concerns about the spatial specificity of the optogenetic activation protocol … The unexpected locations of these signals suggest potential ectopic activation of the KRAS oncogene,

      We have addressed this remark in the introduction and above. Specifically, metastatic and proliferative mTFP-positive cells are observed if and only if VentX is also activated concomitant with activation of kRAS in a single cell. No proliferative cells are observed in absence of VentX activation, or in presence of VentX or Dex alone, or if kRAS has not been activated by cyclofen uncaging.

      · Image Resolution Concerns: The cells depicted in Figure 3C β, which appear to be near the surface of the yolk sac and not within the digestive system as suggested in the MS, underscore the necessity for higher-resolution imaging. Without clearer images, it is challenging to ascertain the exact locations and states of these cells, thus complicating the assessment of experimental results.

      Better images will be provided in the revised version.

      · The cell transplantation experiment is lacking protocol details:

      Details will be provided in the revised version. We have followed regular protocols for transplantation: S.Nicoli and M.Presta, Nat.Prot. 2,2918 (2007).

      • If the cells are obtained from whole larvae with induced RAS + VX expression, it is notable and somewhat surprising that the larvae survived up to six days post-induction (6dpi) before cells were harvested for transplantation. This survival rate and the subsequent ability to obtain single cell suspensions raise questions about the heterogeneity of the RAS + VX expressing cells that transplanted.

      From Fig.S4D, about 50% of the embryos survive at 6dpi. Though an interesting question by itself we have not (yet) addressed the important issue of the heterogeneity of the outgrowth obtained from a single cell. Our purpose here was just to show that cells in which both kRAS and VentX have been activated possess the capacity to metastasize and develop a tumor mass when transplanted in a naïve zebrafish. This - to the best of our knowledge - is the operational definition of a malignant tumor.

      · Unclear Experimental Conditions in Figure S3B: …It is not specified whether the activation of KRAS was targeted to specific cells or involved whole-body exposure.

      This was whole body (global) illumination and will be specified in the revised version.

      · Contrasting Data in Figure S3C compared to literature: The graph in Figure S3C indicates that KRAS or KRAS + DEX induction did not result in any form of hyperplastic growth. The authors should provide detailed descriptions of the conditions under which the experiments were conducted in Figure S3B and clarifying the reasons for the discrepancies observed in Figure S3C are crucial. The authors should discuss potential reasons for the deviation from previous reports.

      This discrepancy will be discussed in the revised version. First, the previous reports consider the development of tumors over a longer time-span (4-5 weeks), which we have not studied here. Second, the expression of the oncogene in these reports might be stronger than in ours. Third, the stochastic appearance of tumors in these reports suggests that some other mechanism (transient stress-induced reprogramming?) might have activated the oncogene in the initial cell.

      Further comments:

      Throughout the study, KRAS-activated cell expansion and metastasis are two key phenotypes discussed that Ventx is promoting. However, the authors did not perform any experiments to directly show that KRAS+ cells proliferate only in Ventx-activated conditions.

      Yes, we did. See Fig. S1 and compare with Fig.S3B, or Fig.S8A in comparison with Fig.2A,B.

      The authors also did not show any morphological features or time-lapse videos demonstrating that KRAS+ cells are motile, even though zebrafish is an excellent model for in vivo live imaging. This seems to be a missed opportunity for providing convincing evidence to support the authors' conclusions.

      Performing single cell time-lapse microscopy on larvae over many (4-5) days is not possible with the regular tricaine protocol for immobilization. We are definitely planning such experiments, but they will require some other protocol, perhaps using bungarotoxin or some optogenetic inhibitory channels. Nonetheless, in the revised version we will show images of the same embryos at various times post single cell induction displaying proliferation of cells.

      There were minimal experimental details provided for the qPCR data presented in the supplementary figures S5 and S6; therefore, it is hard to evaluate the results obtained.

      More details will be given in the revised version.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Tutak et al use a combination of pulldowns, analyzed by mass spectrometry, reporter assays, and fluorescence experiments to decipher the mechanism of protein translation in fragile X-related diseases. The topic is interesting and important.

      Although a role for Rps26-deficient ribosomes in toxic protein translation is plausible based on already available data, the authors' data are not carefully controlled and thus do not support the conclusions of the paper.

      Strengths:

      The topic is interesting and important.

      Weaknesses:

      In particular, there is very little data to support the notion that Rps26-deficient ribosomes are even produced under the circumstances. And no data that indicate that they are involved in the RAN translation. Essential controls (for ribosome numbers) are lacking, no information is presented on the viability of the cells (Rps26 is an essential protein), and the differences in protein levels could well arise from block in protein synthesis, and cell division coupled to differential stability of the proteins.

      We agree that the presented data could benefit from the addition of the suggested experiments. We will address the ribosome content, global translation rate and cell viability upon RPS26 depletion. We are also planning to apply polysome profiling to determine if RPS26-depleted ribosomes are translationally active.

      Specific points:

      (1) Analysis of the mass spec data in Supplemental Table S3 indicates that for many of the proteins that are differentially enriched in one sample, a single peptide is identified. So the difference is between 1 peptide and 0. I don't understand how one can do a statistical analysis on that, or how it would give out anything of significance. I certainly do not think it is significant. This is exacerbated by the fact that the contaminants in the assay (keratins) are many, many-fold more abundant, and so are proteins that are known to be mitochondrial or nuclear, and therefore likely not actual targets (e.g. MCCC1, PC, NPM1; this includes many proteins "of significance" in Table S1, including Rrp1B, NAF1, Top1, TCEPB, DHX16, etc...).

      The data in Table S6/Figure 3A suffer from the same problem.

      Tables S3 and S6 show the mass spectrometry output data from MaxQuant analysis without any filtering. Certain identifications, i.e. those denoted as contaminants (such as keratins), were removed during statistical analysis in Perseus software. Regarding the data presented in Table S6 (SILAC data), we argue that these data are of very good quality. More than 2000 proteins were identified in a 125min gradient, with over 80% of proteins identified with at least 2 unique peptides. However, we acknowledge that the description of Tables S3 and S6 may lead to misunderstanding, thus we will clarify their explanation.
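
      The kind of filtering at issue here (dropping flagged contaminants, then asking what share of the remaining identifications rest on at least two unique peptides) can be sketched as follows. This is a generic illustration and not the authors' MaxQuant/Perseus workflow; the records, field names and threshold are hypothetical placeholders and are not values from Tables S3 or S6.

```python
# Hypothetical protein-group records (placeholders, not data from the study).
groups = [
    {"protein": "P1", "unique_peptides": 5, "contaminant": False},
    {"protein": "P2", "unique_peptides": 1, "contaminant": False},
    {"protein": "P3", "unique_peptides": 12, "contaminant": True},  # e.g. a keratin
    {"protein": "P4", "unique_peptides": 2, "contaminant": False},
    {"protein": "P5", "unique_peptides": 3, "contaminant": False},
]


def drop_contaminants(records):
    """Remove entries flagged as contaminants."""
    return [r for r in records if not r["contaminant"]]


def fraction_multi_peptide(records, min_unique=2):
    """Share of non-contaminant proteins supported by >= min_unique unique peptides."""
    kept = drop_contaminants(records)
    supported = [r for r in kept if r["unique_peptides"] >= min_unique]
    return len(supported) / len(kept)


def confident_hits(records, min_unique=2):
    """Names of non-contaminant proteins passing the unique-peptide threshold."""
    return [r["protein"] for r in drop_contaminants(records)
            if r["unique_peptides"] >= min_unique]


print(confident_hits(groups))
print(f"{fraction_multi_peptide(groups):.0%} of proteins have >= 2 unique peptides")
```

      Reporting both the filtered list and the supported fraction separates single-peptide identifications, which are the reviewer's concern, from the overall dataset quality that the response refers to.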

      I am not convinced that the mass spec data is reliable.

      (2) The mass-spec data however claims to identify Rps26 as a factor binding the toxic RNA specifically. The rest of the paper seeks to develop a story of how Rps26-deficient ribosomes play a role in the translation of this RNA. I do not consider that this makes sense.

      Indeed, we identified RPS26 as a protein co-precipitated with FMR1 RNA containing expanded CGG repeats. However, we do not claim that they interact directly. Downregulation of FMRpolyG biosynthesis could be an outcome of altered ribosomal assembly, changes in the efficiency and fidelity of PIC scanning, impeded elongation or, more likely, a combination of some of these processes. We will provide a better explanation of these issues in the revised version of the manuscript.

      (3) Rps26 is an essential gene, I am sure the same is true for DHX15. What happens to cell viability? Protein synthesis? The yeast experiments were carefully carried out under experiments where Rps26 was reduced, not fully depleted to give small growth defects.

      We agree with Reviewer 1 that RPS26 is an essential protein. Previously, it was shown that viability is decreased in cells carrying a C-terminal deletion of RPS26 (Havkin-Solomon T, Nucleic Acids Res 2023). We will address the question regarding the suppression of FMRpolyG in models with partial RPS26 knock-down.

      (4) Knockdown efficiency for all tested genes must be shown to evaluate knockdown efficiency.

      Missing experiments showing efficiency of knock-down will be included in the revised version of the manuscript.

      (5) The data in Figure 1E have just one mock control, but two cell types (control si and Rps26 depletion).

      We will clarify this ambiguity in the revised version of the manuscript.

      (6) The authors' data indicate that the effects are not specific to Rps26 but indeed also observed upon Rps25 knockdown. This suggests strongly that the effects are from reduced ribosome content or blocked protein synthesis. Additional controls should deplete a core RP to ascertain this conclusion.

      We agree that the observed effect may stem partially from reduced ribosome content; however, we argue that this is not the only explanation. In the publication concerning RPS25 regulation of G4C2-related RAN translation (Yamada SB, 2019, Nat Neurosci), it was shown that RPS25 KO does not affect global translation. Our experiments (SUnSET assay, unpublished) indicated that RPS26 KD also did not significantly reduce the global translation rate. We will present these data in the revised version of the manuscript.

      (7) Supplemental Figure S3 demonstrates that the depletion of S26 does not affect the selection of the start codon context. Any other claim must be deleted. All the 5'-UTR logos are essentially identical, indicating that "picking" happens by abundance (background).

      The results shown in Fig. S3 do not imply that RPS26 has no effect at all on the selection of start codon context. We tested only a few hypotheses. We decided to test the -4 position because it was indicated as the most sensitive to RPS26 regulation in yeast (Ferretti M, 2017, Nat Struct Mol Biol). Regarding the WebLOGO analysis, we wrote in the manuscript that we did not identify any specific motif or enrichment within the analysed transcripts in comparison to background. We will clarify this ambiguity in the revised version of the manuscript.

      (8) Mechanism is lacking entirely. There are many ways in which ribosomes could have mRNA-specific effects. The authors tried to find an effect from the Kozak sequence, unsuccessfully (however, they also did not do the experiment correctly, as they failed to recognize that the Kozak sequence differs between yeast, where it is A-rich, and mammalian cells, where it is GGCGCC). Collisions could be another mechanism.

      As in (7).

      Reviewer #2 (Public Review):

      Summary:

      Translation of CGG repeats leads to the accumulation of poly G, which is associated with neurological disorders. This is a valuable paper in which the authors sought out proteins that modulate RAN translation. They determined which proteins in HeLa cells bound to CGG repeats and affected levels of polyG encoded in the 5'UTR of the FMR1 mRNA. They then showed that siRNA depletion of ribosomal protein RPS26 results in less production of FMR1polyG than in control. There are data supporting the claim that RPS26 depletion modulates RAN translation in this RNA, although for some results, the Western results are not strong. The data to support increased aggregation by polyG expression upon S26 KD are incomplete.

      Strengths:

      The authors have proteomics data that show the enrichment of a set of proteins on FMR1 RNA but not a related RNA.

      Weaknesses:

      - It is insinuated that RPS26 binds the RNA to enhance CGG-containing protein expression. However, RPS26 reduction was also shown previously to affect ribosome levels, and reduced ribosome levels can result in ribosomes translating very different RNA pools.

      We agree that the presented data could benefit from additional experiments. Therefore, we will address questions regarding the ribosome content, global translation rate and cell viability upon RPS26 depletion. We are also planning to apply polysome profiling to determine if RPS26-depleted ribosomes are translationally active. However, we did not state that RPS26 binds directly to RNA with expanded CGG repeats or that this interaction is crucial for translational regulation of the studied RNA. We only tested such hypotheses. We will improve the narrative in the revised version of the manuscript to make the major conclusions clearer.

      - A significant claim is that RPS26 KD alleviates the effects of FMRpolyG expression, but those data aren't presented well.

      We thank Reviewer 2 for this comment. We will show the data derived from a few different cell models that we have already obtained. Moreover, we will include results of experiments with luminescence readout for FMRpolyG fused with luciferase upon RPS26 KD.

      Reviewer #3 (Public Review):

      Tutak et al provide interesting data showing that RPS26 and relevant proteins such as TSR2 and RPS25 affect RAN translation from CGG repeat RNA in fragile X-associated conditions. They identified RPS26 as a potential regulator of RAN translation by RNA-tagging system and mass spectrometry-based screening for proteins binding to CGG repeat RNA and confirmed its regulatory effects on RAN translation by siRNA-based knockdown experiments in multiple cellular disease models and patient-derived fibroblasts. Quantitative mass spectrometry analysis found that the expressions of some ribosomal proteins are sensitive to RPS26 depletion while approximately 80% of proteins including FMRP were not influenced. Since the roles of ribosomal proteins in RAN translation regulation have not been fully examined, this study provides novel insights into this research field. However, some data presented in this manuscript are limited and preliminary, and their conclusions are not fully supported.

      (1) While the authors emphasized the importance of ribosomal composition for RAN translation regulation in the title and the article body, the association between RAN translation and ribosomal composition is apparently not evaluated in this work. They found that specific ribosomal proteins (RPS26 and RPS25) can have regulatory effects on RAN translation (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B), and that the expression levels of some ribosomal proteins can be changed by RPS26 knockdown (Figure 3B); however, the change of the ribosome compositions involved in the actual translation has not been elucidated. Therefore, their conclusive statement, that is, "ribosome composition affects RAN translation" is not fully supported by the presented data and is misleading.

      We thank Reviewer 3 for critical comments and suggestions. We agree that the proposed title may be misleading and the presented data does not fully support the aforementioned statement regarding ribosomal composition affecting FMRpolyG synthesis. Hence, we will change the title together with a narrative regarding these unfortunate statements that go beyond the presented results.

      (2) The study provides insufficient data on the mechanisms of how RPS26 regulates RAN translation. Although authors speculate that RPS26 may affect initiation codon fidelity and regulate RAN translation in a CGG repeat sequence-independent manner (Page 9 and Page 11), what they really have shown is just identification of this protein by the screening for proteins binding to CGG repeat RNA (Figure 1A, 1B), and effects of this protein on CGG repeat-RAN translation. It is essential to clarify whether the regulatory effect of RPS26 on RAN translation is dependent on CGG repeat sequence or near-cognate initiation codons like ACG and GUG in the 5' upstream sequence of the repeat. It would be better to validate the effects of RPS26 on translation from control constructs, such as one composed of the 5' upstream sequence of FMR1 with no CGG repeat, and one with an ATG substitution in the 5' upstream sequence of FMR1 instead of near-cognate initiation codons.

      We will address the question regarding the influence of the content of CGG repeats and START codon selection (including different near-cognate start codons) on RPS26-sensitive translation, and present these data in the revised version of the manuscript.

      (3) The regulatory effects of RPS26 and other molecules on RAN translation have all been investigated as effects on the expression levels of FMRpolyG-GFP proteins in cellular models expressing CGG repeat sequences (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B). In these cellular experiments, there are multiple confounding factors affecting the expression levels of FMRpolyG-GFP proteins other than RAN translation, including template RNA expression, template RNA distribution, and FMRpolyG-GFP protein degradation. Although the authors evaluated the effect on the expression levels of template CGG repeat RNA, it would be better to confirm the effect of these regulators on RAN translation by other experiments such as in vitro translation assay that can directly evaluate RAN translation.

      We agree that multiple factors, including the aforementioned processes, affect the final translation of the investigated mRNA. We evaluated the level of FMR1 mRNA, which turned out not to be affected upon RPS26 depletion (Figure 2B&C); however, we will address the other possibilities as well.

      (4) While the authors state that RPS26 modulated the FMRpolyG-mediated toxicity, they presented limited data on apoptotic markers, not cellular viability (Figure 1E), not fully supporting this conclusion. Since previous work showed that FMRpolyG protein reduces cellular viability (Hoem G et al., Front Genet 2019), additional evaluations for cellular viability would strengthen this conclusion.

      We thank Reviewer 3 for this suggestion. We addressed the effect of RPS26 KD on the apoptotic process induced by FMRpolyG. We will also perform experiments addressing other aspects of FMRpolyG-mediated cell toxicity.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Review:

      This article is a direct follow-up to the paper published last year in eLife by the same group. In the previous article, the authors discovered a zinc finger protein, Kipferl, capable of guiding the HP1 protein Rhino towards certain genomic regions enriched in GRGGN motifs and packaged in heterochromatin marked by H3K9me3. Unlike other HP1 proteins, Rhino recruitment activates the transcription of heterochromatic regions, which are then converted into piRNA source loci. The molecular mechanism by which Kipferl interacts specifically with Rhino (via its chromodomain) and not with other HP1 proteins remained enigmatic. 

      In this latest article, the authors go a step further by elucidating the molecular mechanisms important for the specific interaction of Rhino and not other HP1 proteins with Kipferl. A phylogenetic study carried out between the HP1 proteins of 5 Drosophila species led them to study the importance of an AA Glycine at position 31 located in the Rhino chromodomain, an AA different from the AA (aspartic acid) found at the same position in the other HP1 proteins. The authors then demonstrate, through a series of structure predictions, biochemical, and genetic experiments, that this specific AA in the Rhino-specific chromodomain explains the difference in the chromatin binding pattern between Rhino and the other Drosophila HP1 proteins. Importantly, the G31D conversion of the Rhino protein prevents interaction between Rhino and Kipferl, phenocopying a Kipferl mutant. 

      Strengths: 

      The authors' effective use of phylogenetic analyses and protein structure predictions to identify a substitution in the chromodomain that allows Rhino's specific interaction with Kipferl is very elegant. Both genetic and biochemical approaches are applied to rigorously probe the proposed explanation. They used a point mutation in the endogenous locus that replaces the Rhino-specific residue with the aspartic acid residue present in all other HP1 family members. This novel allele largely phenocopies the defects in hatch rate, chromatin organization, and piRNA production associated with kipferl mutants, and does not support Kipferl localization to clusters. The data are of high quality, the presentation is clear and concise, and the conclusions are generally well-supported.

      Weaknesses: 

      The reviewers identified potential ways to further strengthen the manuscript.

      (1) The one significant omission is RNAseq on the rhino point mutant, which would allow direct comparison to cluster, transposon, and repeat expression in kipferl mutants. 

      In this eLife Advances submission, we aim to elucidate the molecular interaction between Rhino and the zinc finger protein Kipferl and how it evolved. Using various assays, of which piRNA sequencing is the most relevant and comprehensive, we show that the rhino[G31D] mutation phenocopies a rhino loss-of-function situation for Kipferl and a kipferl loss-of-function situation for Rhino. Further confirmation of this statement by additional RNA-seq experiments to probe the extent of selective TE de-repression would indeed be a possibility. We decided to test for TE de-repression phenotypes using sensitive RNA-FISH experiments of a handful of TEs that are deregulated in kipferl loss-of-function flies (Baumgartner et al. 2022). This showed that the same TEs are also deregulated in rhino[G31D] flies, further confirming the similarity of the two genotypes. We have added these data to the text and to Figure 5-figure supplement 2, which shows representative RNA-FISH images.

      (2) The manuscript would benefit from adding more evolutionary comparisons. The following or similar analyses would help put the finding into a broader evolutionary perspective:

      i) Is Kipferl's surface interacting with Rhino also conserved in Kipferl orthologs? In other words, are the Rhino-interacting amino acids of Kipferl under any pressure to be conserved?

      We performed an analysis of the Kipferl interface that interacts with the Rhino chromodomain in those species where Kipferl could be unambiguously identified. This showed that the residues involved in the Rhino interaction are generally conserved. We have added this analysis to Figure 1-figure supplement 4.

      ii) The remarkable conservation of Rhino's G31 is at odds with the arms race that is proposed to be happening between the fly's piRNA pathway proteins and transposons. Does this mean that Rhino's chromodomain is "untouchable" for such positive selection? 

      We agree that the conservation of the G31 residue argues against this binding interface being under positive selection in Rhino. Without understanding the pressures acting on Rhino that underlie the previously published positive selection, we find it difficult to draw firm conclusions. Mutating G31 in fly species that lack Kipferl would be an interesting experiment.

      Recommendations for the authors:

      (1) RNAseq is important to the full characterization of the phenotype and should be included. It's now clear that the major piRNA clusters are not required for fertility, so I would also include an analysis of piRNA production and Rhino binding to regions flanking isolated insertions. 

      See our response to weakness #1 above. Briefly, we have now added an analysis of TE de-repression based on RNA-FISH experiments (Figure 5-figure supplement 2). Regarding the proposed analysis of piRNA production and Rhino binding to regions flanking isolated TE insertions: this is an important issue that we carefully analysed in our previous work characterising the kipferl mutant (Baumgartner et al. 2022). In the present work, we focused on generating a rhino mutant that uncouples Rhino from Kipferl.

      (2) The authors do not provide direct biochemical evidence that the chromodomain substitution blocks Rhino binding to Kipferl. However, Rhino protein is very low abundance, making analysis of the endogenous protein very difficult.

      Based on our previous work (Baumgartner et al. 2022), the Rhino chromodomain interacts directly with the fourth zinc finger of Kipferl. Mutation of a single residue in the predicted interface (Rhino[G31D]) phenocopies a kipferl mutant, strongly suggesting that this mutation disrupts the Rhino-Kipferl interaction. Definitive evidence will have to await the reconstitution of this interaction using recombinant proteins. Our attempts to purify recombinant Kipferl (expressed in bacteria or in insect cells) or the protein fragments relevant to the interaction have been unsuccessful so far. While we obtained soluble fractions of the first ZnF array, there was always a high level of co-purifying nucleic acids that we were not able to remove.

      (3) Even if the Rhino G31D mutant retains its ability to interact with H3K9me3 it does not localize correctly on the chromatin preventing certain regions such as locus 80F from being converted into piRNA source loci. However other regions such as satellite regions attract the Rhino mutant protein converting them into super piRNA source loci, phenocopying the effects observed in a Kipferl mutant. Why Rhino when not bound to Kipferl concentrates in satellite regions is a question that remains unanswered.

      This is a very interesting question indeed. We have not been able to elucidate the molecular basis of how Rhino is recruited to satellite repeats in Kipferl mutants. For example, we performed a proximity biotinylation experiment with GFP-Rhino in Kipferl mutant ovaries, but this experiment did not reveal any protein that would explain the observed accumulation of Rhino at the complex satellite repeats.

      (4) In the phylogenetic analysis the authors identified two residues as Rhino-specific and conserved sequence alterations, the D31G mutation and the G62 insertion. However, the authors limit their study to the D31G mutation, and nothing is performed on the G62 insertion. It would have been interesting to know the impact of this insertion on Rhino's biology.

      The role, if any, of the Rhino-specific G62 insertion and its effect on Rhino localisation or function is an interesting topic for further study. We have not investigated the G62 residue experimentally. In the current manuscript, we limited our efforts to the analysis of the G31D mutation, as the goal was to identify the mode of interaction with Kipferl, and the G62 residue is not predicted to contact Kipferl according to AlphaFold.

      (5) The authors report that the G31D mutation of Rhino phenocopies the Kipferl mutant. Rhino is wrongly localized in the nucleus, and Rhino G31D recruitment in certain Kipferl-enriched regions is affected, as at the 80F locus, which correlates with a strong drop in piRNA production from this locus. To go a step further in demonstrating that G31D phenocopies the Kipferl mutant, it would have been informative to analyse how much TE piRNAs are affected and whether TEs are deregulated.

      See our response to similar comments above. We have added RNA-FISH experiments to illustrate that the TE de-repression phenotypes are comparable between rhino[G31D] and kipferl loss of function ovaries (Figure 5-figure supplement 2). Analyses of TE-mapping piRNAs also show well correlated phenotypes (Figure 5-figure supplement 1).

      (6) Figure 3A: To homogenize with the immunostaining presented in Figure 3B, can the authors add on the bar graph depicting female fertility the results obtained with kipferl-/- and rhino-/- genotype? 

      rhino mutants are completely (100%) sterile and the fertility of kipferl mutants was previously measured to range between 15% and 40% (Baumgartner et al. 2022).

      (7) Figure 4A: It would have been interesting to show Venn diagrams showing the overlap of genomic regions enriched for Kipferl versus regions enriched for Rhi in a WT and in a Rhi G31D mutant. 

      We consider the analysis presented in Figure 4 to be more meaningful, as a Venn diagram would require binary cut-offs.

      (8) Figure 1B: In the phylogenetic analysis for Rhino/HP1d two D. simulans lines are presented. Can the authors clarify this point?

      There are two Rhino paralogs in D. simulans: one paralog (NCBI: AAY34025.1) is more similar to D. melanogaster Rhino, contains one intron and is located on chromosome 2R (assembly Apr. 2005, WUGSC mosaic 1.0/droSim1: 12256895-12258668). The second paralog (XP_002106478.1) is located on chromosome X (6734493-6735248) and does not contain an intron. We have added a clarifying statement to the corresponding figure legend.

      (9) To determine whether Rhino G31D point mutation affects the overall function of Rhino, the authors analysed Kipferl-independent piRNA source loci by looking at Responder and 1.688 family satellites. I'm not sure that these loci can be classified as Kipferl-independent piRNA source loci since a strong increase of piRNA production from these loci in Kipferl mutant is observed. In my point of view, the 42AB and 38C are real Kipferl-independent piRNA source loci as piRNA production from these loci is not affected by Kipferl KD.

      Indeed, the Rsp and 1.688 family satellites are not completely independent of Kipferl, as their expression and Rhino occupancy differ between wild-type and kipferl loss-of-function phenotypes (including rhino[G31D]). However, we believe that this increase is due to a strong dependence on different sequestration mechanisms and is not mediated by a direct function of Kipferl at these sites. Similarly, we observe slight differences in piRNA production for the peripheral parts of cluster 42AB, as well as differences in Rhino occupancy despite an unaltered piRNA profile at cluster 38C (Baumgartner et al. 2022). Thus, different flavours of Kipferl-independence exist, with the only truly Kipferl-independent piRNA sources likely to be the piRNA clusters in the testis. A clear classification is further complicated by previously observed compensatory effects in the piRNA pathway, leading us to adopt the current definition of "requiring Kipferl for Rhino recruitment" to distinguish Kipferl-dependent from Kipferl-independent sites.

      (10) The authors report that the G31D mutation of Rhino phenocopies the Kipferl mutant. Rhino is wrongly localized in the nucleus, and Rhino G31D recruitment in certain Kipferl-enriched regions is affected, as at 80F locus, which correlates with a strong drop in piRNA production from this locus. To go a step further in demonstrating that G31D phenocopies the Kipferl mutant, it would have been interesting to look at how much TE piRNAs are affected and whether TEs (and which class of TE) are deregulated by RNAseq and/or in situ hybridization. 

      See our response to similar comments above. Our new RNA-FISH experiments and TE-mapping piRNA analysis extend the comparison of phenotypes between kipferl mutants and rhino[G31D] mutants and are consistent with our previous conclusions (Figure 5-figure supplements 1 and 2).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, Jellinger et al. performed engram-specific sequencing and identified genes that were selectively regulated in positive/negative engram populations. In addition, they performed chronic activation of the negative engram population over 3 months and observed several effects on fear/anxiety behavior and cellular events such as upregulation of glial cells and decreased GABA levels.

      Strengths:

      They provide useful engram-specific GSEA data and the main concept of the study, linking negative valence/memory encoding to cellular level outcomes including upregulation of glial cells, is interesting and valuable.

      Weaknesses:

      A number of experimental shortcomings make the conclusion of the study largely unsupported. In addition, the observed differences in behavioral experiments are rather small, inconsistent, and the interpretation of the differences is not compelling.

      Major points for improvement:

      (1) Lack of essential control experiments

      With the current set of experiments, it is not certain that the DREADD system they used was potent and stable throughout the 3 months of manipulations. Basic confirmatory experiments (e.g., slice physiology at 1m vs. 3m) to show that the DREADD effects on these vHP are stable would be an essential bottom line to make these manipulation experiments convincing.

      In previous work from our lab performing long-term activation of Gq DREADD receptors in the vHPC, we quantified Gq receptor expression at 3-, 6- and 9-month timepoints and showed that there is no decrease in receptor expression, as measured via fluorescence intensity (Suthard et al., 2023). In this study, we also address the possibility that, even if our manipulation is only working for 1 month rather than 3 months, we are observing the long-term effects of this shorter-term stimulation. This is still relevant, and only changes how we interpret these findings, as shorter-term stimulation or disruption of neuronal activity can still have detrimental effects on behavior.

      Furthermore, although the authors use the mCherry vector as a control, they did not have a vehicle/saline control for the hM3Dq AAV. Thus, the long-term effects such as the increase in glial cells could simply be due to the toxicity of DREADD expression, rather than an induced activity of these cells.

      For chemogenetic studies, our experimental rationale utilized a standard approach in the field, which includes one of two control options: 1) active receptor vs. control vector, both with ligand, or 2) active receptor with ligand vs. active receptor with saline. We chose the first option, as this more properly controls for the potential off-target effects of the ligand itself, as shown in other previous work (Xia et al., 2017). This is particularly important for studies using CNO, as many off-target effects have been noted as a limitation (Manvich et al., 2018). We chose to use DCZ as it is closely related to CNO and newer ligands, but comes with added benefits of high specificity, low off-target effects, high potency and brain penetrance (Nagai et al., 2020), although any potential off-target effects of DCZ have yet to be fully investigated, as this ligand is very new.

      Evidence of DREADD toxicity has been shown at high titer levels of AAV2/7-CamKIIα-hM4D(Gi)-mCherry in the hippocampus at 5 weeks, as the reviewer pointed out in their above comment (Goossens et al., 2021). Our viral strategy is targeted to a much smaller number of cells using AAV9-DIO-Flex-hM3Dq-mCherry at a lower titer, unlike the expression within a much larger population of CaMKII+ excitatory neurons in that study. Additionally, visual comparison of their viral load and expression with ours shows much more intense expression that spans a larger area of the hippocampus (Goossens et al, 2021; Figure 1D), whereas ours is isolated to a smaller region of vHPC (see Figure 1B).

      Further, we attempted to quantify a decrease in neuronal health (Yousef et al., 2017) resulting from DREADD expression via NeuN counts within multiple hippocampal subregions for the 6- and 14-month groups across active Gq receptor and mCherry conditions and did not observe significant decreases in NeuN as a result (Supplemental Figure 1). However, immunohistochemistry of an individual marker may not be sufficient to capture the entire health profile of an individual neuron and future work should consider other markers of cell death or inflammation, which we have added to the Limitations & Future Work section of our Discussion.

      (2) Figure 1 and the rest of the study are disconnected

      The authors used the cFos-tTA system to label positive/negative engram populations, while the TRAP2 system was used for the chronic activation experiments. Although both genetic tools are based on the same IEG Fos, the sensitivity of the tools needs to be validated. In particular, the sensitivity of the TRAP2 system can be arbitrarily altered by the amount of tamoxifen (or 4OHT) and the administration protocols. The authors should at least compare and show the percentage of labeled cells in both methods and discuss that the two experiments target (at least slightly) different populations. In addition, the use of TRAP2 for vHP is relatively new; the authors should confirm that this method actually captures negative engram populations by checking for reactivation of these cells during recall by overlap analysis of Fos staining or by artificial activation.

      We thank the reviewer for their comments and the opportunity to discuss the marked differences between the TRAP2 and DOX systems. In particular, we agree that while both systems rely on the Fos promoter to drive an effector of interest, their efficacy and temporal resolution vary substantially depending on genetic cell-type, brain region, temporal parameters of Dox or 4-OHT delivery, subject-by-subject metabolic variability, and threshold to Fos induction given the promoter sequences inherent to each system. For example, recent studies have reported the following:

      - The TRAP2 line labels a subset of endogenously active CA1 pyramidal cells (e.g. 5-18%) while the DOX system labels 20-40% of CA1 pyramidal cells (DeNardo et al, 2019; Monasterio et al, BioRxiv 2024).

      - The temporal windows for each range from hours in TRAP2 to 24-48 hours for DOX (DeNardo et al, 2019; Denny et al, 2014; Liu & Ramirez et al, 2012).

      - The efficacy of “tagging” a population of cells with TRAP2 vs with DOX will constrain the number of possible cells that may overlap with cFos upon re-exposure to a given experience (e.g. see the observed overlaps in vCA1-BLA circuits (Kim & Cho, 2020), compared to vCA1 in general (Ortega-de San Luis et al, 2023) and valence-specific vCA1 populations (Shpokayte et al, 2022)).

      - Tagging vCA1 cells with either the TRAP2 or the DOX system is nonetheless sufficient to drive corresponding behaviors (e.g. vCA1 terminal stimulation drives behavioral changes with the DOX and TRAP2 systems (Shpokayte et al, 2022) and vCA1 stimulation of an updated fear-linked ensemble drives light-induced freezing in a neutral context utilizing the TRAP2 and DOX systems (Ortega-de San Luis et al, 2023)).

      Finally, and promisingly, as more studies continue to link the in vivo physiological dynamics of the cell populations tagged using each system (e.g. compare Pettit et al, 2022 with Tanaka et al, 2018) and to correlate their activity with behavioral phenotypes, our field is in a prime position to uncover deeper principles governing hippocampus-mediated engrams in the brain. Together, we believe a more comprehensive understanding of these systems is fully warranted, especially in the service of further cataloging cellular similarities and differences within such tagged populations.

      (3) Interpretation of the behavior data

      In Figures 3a and b, the authors show that the experimental group showed higher anxiety based on time spent in the center/open area. However, there were no differences in distance traveled and center entries, which are often reduced in highly anxious mice. Thus, it is not clear what the exact effect of the manipulation is. The authors may want to visualize the trajectories of the mice's locomotion instead of just showing bar graphs.

      Our findings show that our experimental group displays higher levels of anxiety-like behaviors as measured via time spent in center/open area, while there are no differences in distance traveled or center entries. For distance traveled, our interpretation is in line with complementary research (Jimenez et al, 2018; Kheirbek et al, 2013) that shows no changes in distance traveled/distance traveled in the center coupled with changes in anxiety levels as a result of manipulation within anxiety-related circuits. More broadly, any locomotion-related deficit could cause a change in distance traveled that is unrelated to anxiety-like behaviors alone. For example, a reduction in distance traveled could be coupled with a decrease in time spent in the center, but could also result only from motor or exploratory deficits. We hope that this explanation clarifies our interpretation of the open field and elevated plus maze findings in light of other literature.

      In addition, the data shown in Figure 4b is somewhat surprising - the 14MO control showed more freezing than the 6MO control, which can be interpreted as "better memory in old". As this is highly counterintuitive, the authors may want to discuss this point. The authors stated that "Mice typically display increased freezing behavior as they age, so these effects during remote recall are expected" without any reference. This is nonsense, as just above in Figure 4a, older mice actually show less freezing than young mice. Overall, the behavioral effects are rather small and random. I would suggest that these data be interpreted more carefully.

      In Figure 4B, we present our findings from remote recall and observe increased freezing levels in control mice with age, as mentioned by the reviewer, indicating increased memory. This is in line with previous work from Shoji & Miyakawa, 2019, which has been added as a reference for the quotation described above; we thank the reviewer for pointing out this error. As the reviewer has noted, in Figure 4A above we measured freezing levels across all groups during contextual fear conditioning before the start of chronic stimulation, as this was the session in which we ‘tagged’ a negative memory. Although it appears that there may be slightly lower levels of freezing in older (14-month-old) mice, our analysis did not detect a statistically significant difference between age groups, only effects of time and subject, which are expected because freezing increases within the session and animals display high variability in freezing levels across many experiments (Figure 4A i-iii). We also find in previous work that control mice receiving 3, 6, and 9 months of chronic DCZ stimulation in the vHPC with empty vector (mCherry) receptor show an increase in freezing with age (Suthard et al, 2023; Figure 2A ii).

      (4) Lack of citation and discussion of relevant study

      Khalaf et al. 2018 from Gräff lab showed that experimental activation of recall-induced populations leads to fear attenuation. Despite the differences in experimental details, the conceptual discrepancy should be discussed.

      As mentioned by the reviewer, Khalaf et al. 2018 showed that experimental activation of recall-induced populations in the dentate gyrus leads to fear attenuation. Specifically, they propose that this fear attenuation occurs in these ensembles through updating or unlearning of the original memory trace via the engagement, rather than suppression, of an original traumatic experience. Despite the differences in experimental details between our current study and this work, we agree that the conceptual discrepancy should be discussed. First, one major difference is that we are reactivating an ensemble that was tagged during fear memory encoding, while Khalaf et al. are activating a remote recall-induced ensemble that was tagged one month after encoding. Although there is high overlap between the encoding and recall ensembles when mice are exposed to the conditioning context, these ensembles are not identical and may result in different behavioral phenotypes when chronically reactivated. Further, Khalaf et al. rely on reactivation of the recall-induced ensemble during extinction to facilitate rapid fear attenuation. This differs from our current work, as their reactivation occurs during the extinction process in the previously conditioned context, while we are reactivating chronically in the animal’s home cage over the course of a longer time period. It may be necessary that the memory is first reactivated, and thus, more liable to re-contextualization, in the original context compared to an unrelated homecage environment where there are presumably no related cues present. Importantly, this previous work tests the attenuation of fear shortly after an extinction process, while we are not traditionally extinguishing the context with aid of the memory reactivation. Finally, we are testing remote recall (3 months post-conditioning), while they are testing at a shorter time interval (28 days). In line with these ideas, future work may seek to tease out the mechanistic differences between recent and remote memory extinction both in terms of natural memory recall and chronically manipulated memory-bearing cells.

      Reviewer #2 (Public Review):

      Summary:

      Jellinger, Suthard, et al. investigated the transcriptome of positive and negative valence engram cells in the ventral hippocampus, revealing anti- and pro-inflammatory signatures of these respective valences. The authors further reactivated the negative valence engram ensembles to assay the effects of chronic negative memory reactivation in young and old mice. This chronic re-activation resulted in differences in aspects of working memory, and fear memory, and caused morphological changes in glia. Such reactivation-associated changes are putatively linked to GABA changes and behavioral rumination.

      Strengths:

      Much of the content of this manuscript is of benefit to the community, such as the discovery of differential engram transcriptomes dependent on memory valence. The chronic activation of neurons, and the resultant effects on glial cells and behavior, also provide the community with important data. Laudable points of this manuscript include the comprehensiveness of behavioral experiments, as well as the cross-disciplinary approach.

      Weaknesses:

      There are several key claims made that are unsubstantiated by the data, particularly regarding the anthropomorphic framing of "rumination" on a mouse model and the role of GABA. The conclusions and inferences in these areas need to be carefully considered.

      (1) There are many issues regarding the arguments for the behavioural data's human translation as "rumination." There is no definition of rumination provided in the manuscript, nor how rumination is similar/different to intrusive thoughts (which are psychologically distinct but used relatively interchangeably in the manuscript), nor how rumination could be modelled in the rodent. The authors mention that they are attempting to model rumination behaviours by chronically reactivating the negative engram ("To understand if our experimental model of negative rumination..."), but this occurs almost at the very end of the results section, and no concrete evidence from the literature is provided to attempt to link the behavioural results (decreased working memory, increased fear extinction times) to rumination-like behaviours. The arguments in the final paragraph of the Discussion section about human rumination appear to be unrelated to the data presented in the manuscript and contain some uncited statements. Finally, the rumination claims seem to be based largely upon a single data figure that needs to be further developed (Figure 6, see also point 2 below).

      (2) The staining and analysis in Figure 6 are challenging to interpret, and require more evidence to substantiate the conclusions of these results. The histological images are zoomed out, and at this resolution, it appears that only the pyramidal cell layer is being stained. A GABA stain should also label the many sparsely spaced inhibitory interneurons existing across all hippocampal layers, yet this is not apparent here. Moreover, both example images in the treatment group appear to have lower overall fluorescence intensity in both DAPI and GABA. The analysis is also unclear: the authors mention "ROIs" used to measure normalized fluorescence intensity but do not specify what the ROI encapsulates. Presumably, the authors have segmented each DAPI-positive cell body and assessed fluorescence; however, this is neither explained nor demonstrated, making the results difficult to interpret.

      Based on the collective discussion from all reviewers on the completeness of our GABA quantification and its implications, we have decided to remove this figure and perform more substantive analysis of this E/I imbalance in future work.

      (3) A smaller point, but more specific detail is needed for how genes were selected for GSEA analysis. As GSEA relies on genes to be specified a priori, to avoid a circular analysis, these genes need to be selected in a blind/unbiased manner to avoid biasing downstream results and conclusions. It's likely the authors have done this, but explicitly noting how genes were selected is an important context for this analysis.

      As mentioned in our Methods section, gene sets were selected based on pre-existing biology and understanding of genes canonically involved in “neurodegeneration” such as those related to apoptotic pathways and neuroinflammation or “neuroprotection” such as brain-derived neurotrophic factor, to name a few. A limitation of this method is that we must avoid making strong claims about the actual function of these up- or down-regulated genes without performing proper knock-in or knock-out studies, but we hope that this provides an unbiased inventory for future experiments to perform causal manipulations.

      Reviewer #3 (Public Review):

      Summary:

      The authors note that negative ruminations can lead to pathological brain states and mood/anxiety dysregulation. They test this idea by using mouse engram-tagging technology to label dentate gyrus ensembles activated during a negative experience (fear conditioning). They show that chronic chemogenetic activation of these ensembles leads to behavioral (increased anxiety, increased fear generalization, reduced fear extinction) and neural (increases in neuroinflammation, microglia, and astrocytes) changes.

      Strengths:

      The question the authors ask here is an intriguing one, and the engram activation approach is a powerful way to address the question. Examination of a wide range of neural and behavioral dependent measures is also a strength.

      Weaknesses:

      The major weakness is that the authors have found a range of changes that are correlates of chronic negative engram reactivation. However, they do not manipulate these outcomes to test whether microglia, astrocytes, or neuroinflammation are causally linked to the dysregulated behaviors.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      - Figure 2c should include Month0, the BW before the start of the manipulation.

      Regrettably, we do not have access to the Month 0 body weights at this time as this project changed hands over the course of the past year or so. This is an inherent limitation that we missed during analysis and we pose this as a limitation in the Results section after describing this finding. Therefore, it is possible that over the first month of stimulation (Month 0-1), there may have been a drop in body weight that rebounded by the first measurement at Month 1 that continued to increase normally through Months 2-3, as shown in our Figure 1. Thank you for this note.

      - Figure 6a looks confusing - the background signal in the green channel is very different between control and experimental groups. Were representative images taken with different microscope settings?

      The representative images were taken with the same microscope power settings, but were adjusted in brightness/contrast within FIJI for clarity in the Figure – we apologize that this was misleading in any way and thank the reviewer for their feedback. Further, based on the collective discussion from all reviewers on the completeness of our GABA quantification and its implications, we have decided to remove this figure and perform more substantive analysis of this E/I imbalance in future work.

      - Typo mChe;try

      This typo was fixed.

      - "During this contextual... mice in the 6- and 14- month groups..." Isn't it 3- and 11- month respectively at the time of fear conditioning? Throughout the manuscript, this point was written very confusingly.

      Yes, we thank the reviewer for pointing this out. It has been corrected to 3- and 11-month old mice at the timing of fear conditioning and clarified throughout the manuscript where applicable.

      - "GABAergic eYFP fluorescence" Where does the eYFP come from? The methods state that GABA quantification is based on IHC staining.

      Based on the collective discussion from all reviewers on the completeness of our GABA quantification and its implications, we have decided to remove this figure and perform more substantive analysis of this E/I imbalance in future work. We discuss this E/I balance not being directly assessed in the Limitations & Future Directions section of our Discussion, noting the importance of detailed quantification of both excitatory and inhibitory markers within the hippocampus.

      Reviewer #2 (Recommendations For The Authors):

      (1) There is a full methods section ("Analysis of RNA-seq data") that mostly describes RNA-seq analysis that seemingly does not appear in the paper. This section should be reviewed.

      We have included this portion of the methods because it explains the previous workflow from Shpokayte et al., 2022, where this dataset was generated; this has been noted in the “Analysis of RNA-seq data” section of the methods.

      (2) Figure 6: GABA staining should be more critically analyzed, as discussed above, and validated with another GABA antibody for rigor. From the representative images provided in Figure 6, it looks possibly as though the hM3Dq images were simply not fully in the focal plane when being imaged or were over-washed, as DAPI staining also appears to be lower in these images.

      Based on the collective discussion from all reviewers on the completeness of our GABA quantification and its implications, we have decided to remove this figure and perform more substantive analysis of this E/I imbalance in future work. Specifically, it will be necessary to rigorously investigate both excitatory and inhibitory markers within this region to ensure these claims are substantiated. Thank you for this suggestion.

      (3) The first claim that human GABAergic interneurons cause rumination is uncited. (Page 19, first sentence beginning with: "Evidence from human studies suggests...").

      Based on the collective discussion from all reviewers on the completeness of our GABA quantification and its implications, we have decided to remove this figure and perform more substantive analysis of this E/I imbalance in future work. Apologies for the lack of an in-text citation; the proper citation for this finding is Schmitz et al, 2017.

      (4) Gene names throughout the manuscript and figure are written in the wrong format for mice (eg: Page 13, second line: SPP1, TTR, and C1QB1 instead of Spp1, Ttr, C1qb1).

      This was corrected throughout the manuscript.

      (5) Tense on Page 15 third sentence of the second paragraph: "...spatial working memory was assessed...".

      This was corrected throughout the manuscript.

      (6) Supplemental Figure 1 would benefit from normalization of the NeuN+ cell counts. The inclusion of an excitatory and inhibitory neuron marker in this figure might benefit the argument that there is a change in the excitation/inhibition of the hippocampus - as the numbers of excitatory neurons outweigh the numbers of inhibitory neurons that would be assayed here.

      In an effort to normalize the NeuN+ cell counts, for each of our ROIs (6-8 single tiles for each brain region (DG, vCA1, vSub) x 3-5 coronal slices = ~18 single tiles per mouse x 3-4 mice) we captured a 300 x 300 micrometer, single-tile z-stack at 20x magnification. These ROIs were matched for dimensions and brain regions across all groups for each hippocampal subregion quantified. We initially proposed to normalize these NeuN counts over DAPI, but because DAPI includes all nuclei (microglia, oligodendrocytes, astrocytes and neurons), we weren’t sure this was the most optimal tool. We do agree that further quantification of excitatory and inhibitory cell markers would be vital to more concrete interpretation of our findings and we have added this to our Limitations & Future Work section of the Discussion.

      Reviewer #3 (Recommendations For The Authors):

      (1) The DOX tagging window lacks temporal precision. I suggest the authors note this as a limitation.

      We thank the reviewer for noting this, and we have added this limitation to the Methods section with the context of the 24-48 hour DOX window being longer than other methods like TRAP.

      (2) Is there a homeostatic response to chronic engram stimulation? That is, is DCZ as effective in increasing neuronal excitability on day 90 as it is on day 1. This could be addressed with electrophysiology, or with IEG induction. Alternatively, the authors could refer to previous literature-- for example, Xia et al (2017) eLife-- that examined whether there was any blunting of the effects of DREADD ligands after sustained delivery via drinking water. There, of course, may be other papers as well.

      As noted by the reviewer, it is important to determine if DCZ maintains its effects on neuronal excitability throughout the 3 month administration period. To address this, previous work has shown that CNO administration in drinking water over one month consistently inhibited hM4Di+ neurons without altering baseline neuronal excitability as measured by firing rate and potassium currents (Xia et al, 2017). Although this is only for one month, it is administered via the same oral route as our DCZ protocol and suggests that at least for that amount of time we are likely producing consistent effects. In our reply above to Reviewer #1’s comment, we also note that even if DCZ is only having an effect for one month, rather than 3 months, we are still observing enduring changes that resulted from this short-term disturbance.

      (3) Please double check there is no group effect on weight in 6-month-old mice in Figure 2C.

      Two-way RM ANOVA showed no main effect of Group within the 6-month-old control and hM3Dq groups.

      Group: F(1,17) = 1.361, p=0.2594.
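
      As a quick arithmetic check on the reported statistic, the p-value can be recovered from the F ratio alone. The sketch below is illustrative only and is not the software used for the analysis. It assumes a numerator degree of freedom of 1, so that F equals a squared t statistic and the p-value is the two-tailed tail area of Student's t distribution with the error degrees of freedom. The function names `t_pdf` and `p_value_from_f` are hypothetical, and the tail area is obtained by simple numerical integration rather than a statistics library.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def p_value_from_f(f_stat, df_error, steps=2000):
    """Approximate p-value for F(1, df_error).

    Assumption: numerator df is 1, so F = t**2 and the p-value is the
    two-tailed t probability. `steps` must be even (composite Simpson's rule).
    """
    t = math.sqrt(f_stat)
    h = t / steps
    # Integrate the density from 0 to t with composite Simpson's rule.
    total = t_pdf(0.0, df_error) + t_pdf(t, df_error)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_error)
    area = total * h / 3
    # The density is symmetric, so each tail holds 0.5 - area.
    return 2 * (0.5 - area)


# Values reported above: F(1,17) = 1.361
print(round(p_value_from_f(1.361, 17), 4))
```

      Under these assumptions, the result agrees with the reported p = 0.2594 to within rounding.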

      (4) The shock intensity is much higher than is typical for fear conditioning studies in mice. Why was this the case?

      Yes, we do agree that this shock intensity is on the higher side of typical paradigms in mice, however, our lab has utilized 0.75mA to 1.5mA intensity foot shocks for contextual fear conditioning in the past (Suthard & Senne et al, 2023; 2024; Dorst & Senne et al, 2023; Grella et al., 2022; Finkelstein et al., 2022) and we maintained this protocol for internal consistency. However, it would be interesting to systematically investigate how differing intensities of foot shock, subsequent tagging of this ensemble and reactivation would uniquely impact behavioral state acutely and chronically in mice.

      (5) Remote freezing is very low. The authors should comment on this-- perhaps repeated testing has led to some extinction?

      A reviewer above suggested that a similar phenomenon may be occurring, specifically fear attenuation as a result of chronic stimulation. They referenced previous work from Khalaf et al. 2018, in which the authors reactivated a recall-induced ensemble, whereas we reactivated an ensemble tagged during encoding. We expand upon this work in light of our findings within the Limitations & Future Work section of our Discussion. However, we do appreciate the lower levels of freezing observed in remote recall and sought out other literature to understand the typical range of remote freezing levels. One thing that we note is that our remote recall is occurring 3 months after conditioning, which is much longer than typical 14-28 day protocols. However, we find that contextual freezing levels at remote timepoints of 21-45 days range from approximately 20-50% (Kol et al., 2020), and from approximately 40-75% in a variety of 28-day remote recall experiments (Lee et al., 2023). This information, together with our current experimental protocol, demonstrates a wide range of remote freezing levels that may depend heavily on the foot shock intensity, the number of days after conditioning, and animal variability.

      (6) "mice display increased freezing with age": please add a reference.

      Apologies, we missed the citation for that claim and it has been added in-text and in the references list (Shoji & Miyakawa, 2019).

      (7) Related to the low freezing levels for remote memory, why is generalization minimal? Many studies have shown that there is a time-dependent emergence of generalized fear, yet here this is not seen. Is it linked to extinction (as above)? Or genetic background?

      Previous work has shown that rats receiving multiple foot shocks during conditioning displayed a time-dependent generalization of context memory, while those receiving fewer shocks did not (Poulos et al., 2016), as the reviewer noted in their comment. In our current study, we observe low levels of generalization in all of our groups compared to freezing levels displayed in the conditioned context at the remote timepoint, in opposition to this time-dependent enhancement of generalization. It is possible that the genetic background of our C57BL/6J mice compared to the Long-Evans rat strain in this previous work accounts for some of this difference. In addition, it is possible that our longer interval (3 months), compared to their remote timepoint (28 days), resulted in a time-dependent decrease in generalization as more time elapses from the original conditioning. As noted above, it is indeed plausible that the reactivation of a contextual fear ensemble over time is attenuating freezing levels for both the original and similar contexts (Khalaf et al, 2018). We discuss the differences between our study and this 2018 work more comprehensively above.

      (8) Morphological phenotypes of astrocytes/microglia. Would be great to do some transcriptomic profiling of microglia/astrocytes to couple with the morphological characterization (but appreciate this is beyond the scope of current work).

      We thank the reviewer for this suggestion; we agree that this would be an incredibly informative future experiment and have added it to our Limitations & Future Experiments section of the Discussion.

      (9) The authors could consider including a limitations section in their discussion which discusses potential future directions for this work:

      - causal experiments.

      - E/I balance is not assessed directly (interestingly, in this regard, expanded engrams are linked to increased generalization [e.g., Ramsaran et al 2023]).

      Thank you for this suggestion, we have added a Limitations & Future Directions section to our Discussion and have expanded upon these suggested points.

      (10) For Figure 10, consider adding an experimental design/timeline.

      We are making the assumption that the reviewer meant Figure 1 instead of Figure 10 here, but note that there is a description of the viral expression duration (D0-D10), followed by an off-Dox period of 48 hours (D10-D12), with subsequent engram tagging of a negative (foot shock) or positive (male-to-female exposure) experience on D12. In our experiments (Shpokayte et al., 2022), Dox was administered for 24 hours (D12-D13), after which the animal was sacrificed for cell suspension and sequencing of the positive and negative engram populations. This figure also shows the viral strategy for the Tet-tag system (Figure 1A), as well as representative viral expression in vHPC (Figure 1B). We are happy to add any additional experimental design/timeline information to this figure that would be helpful to the reviewer.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The mechanisms by which axonal projections find their correct target require the interplay of signalling pathways and cell adhesion that act over short and long distances. The current study aims to use the small ventral lateral clock neurons (s-LNvs) of the Drosophila clock circuit as a model to study axon projections. These neurons are born during embryonic stages and are part of the core of the clock circuit in the larval brain. Moreover, these neurons are maintained through metamorphosis and become part of the adult clock circuit. The authors use anti-Pdf antibody staining or Pdf>GFP as a read-out for axonal length. Using ablation of the MB, the overall target region of the s-LNvs, the authors find defects in the projections. Next, by using Dscam mutants or knock-down they observe defects in the projections. Manipulations by the DNs, another group of clock neurons, can induce defects in the s-LNvs axonal form, suggesting an active role of these neurons in the morphology of the s-LNvs.

      Strengths:

      The use of Drosophila genetics and a specific neural type allows targeted manipulations with high precision.

      Proposing a new model for a small group of neurons for axonal projections allows us to explore the mechanism with high precision.

      Weaknesses:

      It is unclear how far the proposed model can be seen as developmental.

      The study of changes in fully differentiated and functioning neurons may affect the interpretation of the findings.

      We appreciate the reviewer's feedback on the strengths and weaknesses of our study.

      We acknowledge the strengths of our research, particularly the precision afforded by using Drosophila genetics and a specific neural type for targeted manipulations, as well as the proposal of a new model for studying axonal projections in a small group of neurons.

      We understand the concerns about the developmental aspects of our proposed model and the use of Pdf-GAL4 >GFP as a read-out for the axonal length (revised manuscript Figure 1--figure supplement 1). However, even with the use of Clk856-GAL4 that began to be expressed at the embryonic stage (revised manuscript Figure 3--figure supplement 1) to suppress Dscam expression, the initial segment of the dorsal projection of s-LNvs (the vertical part) remained unaffected. Instead, the projection distance is severely shortened towards the midline, and this defect persists until the adult stage. It is for this reason that we delineate the dorsal projections of s-LNvs into two distinct phases: the vertical and horizontal parts, rather than a mere expansion in correspondence with the development of the larval brain.

      Thank you for your valuable feedback, and we have incorporated these considerations into our revised manuscript to enhance the clarity and depth of our research.

      Reviewer #2 (Public Review):

      Summary:

      The paper from Li et al shows a mechanism by which axons can change direction during development. They use the sLNv neurons as a model. They find that the appearance of a new group of neurons (DNs) during post-embryonic proliferation secretes netrins and repels horizontally towards the midline, the axonal tip of the LNvs.

      Strengths:

      The experiments are well done and the results are conclusive.

      Weaknesses:

      The novelty of the study is overstated, and the background is understated. Both things need to be revised.

      We appreciate your acknowledgment that the experiments were well-executed and the results conclusive. This validation reinforces the robustness of our findings.

      We take note of your feedback regarding the novelty of the study being overstated and the background being understated. Axonal projections that navigate without distinct landmarks, such as the midline or layers, columns, and segments, pose more challenges and uncertainties. As highlighted, our key contribution lies in elucidating how axonal projections without clear landmarks are guided, with our research demonstrating how a newly formed cluster of cells at a specific time and location provides the necessary guidance cues for axons.

      We value your insights, and we have carefully addressed these points in our manuscript revision to improve the overall quality and presentation of our research.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      The overall idea of using the s-LNvs as a model is indeed intriguing. There are genetic tools available to tackle these cells with great precision.

      However, the stage at which these cells are investigated raises some issues that I feel are critical to address.

      These neurons develop their axonal projections during embryogenesis and are fully functioning when the larvae hatch, thus to investigate axonal pathfinding one would have to address embryonic development.

      The larval brain indeed continues to grow during larval life, however extensive work from the Hartenstein lab, Truman lab, and others have shown that the secondary (larval born) neurons do not yet wire into the brain, but stall their axonal projections.

      It is thus quite unclear, what the authors are actually studying.

      One interpretation could be that the authors observe changes in axon length due to morphological changes in the brain. Indeed, as the MB expands, the anatomy of the surrounding neuropil changes too.

      Moreover, it is unclear when exactly the Pdf-Gal4 (and other drivers) are active, thus how far (embryonic) development of s-LNvs is affected, or if it's all happening in the differentiated, functioning neuron. (Gal4 temporal delay and dynamics during embryonic development may further complicate the issue). As far as I am aware the MB drivers might already be active during embryonic stages.

      Since the raised issue is quite fundamental, I am not sure what might be the best and most productive fashion to address this.

      Eg. either to completely re-focus the topic on "neural morphology maintenance" or to study the actual development of these cells.

      We thank the reviewer for the detailed and insightful feedback on our study. We have tested whether Pdf-Gal4 could effectively label s-LNvs, and tracked the s-LNv projection in the early stage after larval hatching. We did not observe the PDF antibody staining signal or the GFP signal driven by Pdf-GAL4 when the larvae were newly hatched. At 2-4 hours ALH, PDF signals were primarily concentrated at the end of axons, while GFP signals were mainly concentrated at the cell body. Helfrich-Förster initially detected immunoreactivity for PDF in the brains approximately 4-5 hours ALH. The GFP signal driven by the Pdf-GAL4 driver is therefore delayed. However, at 8 hours ALH, the GFP signal strongly co-localized with the PDF signal within the axons (see revised manuscript lines 98-101) (Figure 1—figure supplement 1).

      Based on previous research findings and our staining of Clk856-GAL4 >GFP, it is indeed confirmed that the dorsal projection of s-LNvs in Drosophila is formed during the embryonic stage (Figure 3—figure supplement 1). The s-LNvs in first-instar larval Drosophila are capable of detecting signal output and may play a role in regulating certain behaviors. Our selection of tools for characterizing the projection pattern of s-LNv was not optimal, leading us to overlook the crucial detail that the projection had already formed during its embryonic stage.

      However, even when employing Clk856-GAL4 to suppress Dscam expression from the embryonic stage, the initial segment of the dorsal projection of s-LNvs (the vertical part) remains unaffected. Instead, the projection distance is severely shortened towards the midline, and this defect persists until the adult stage. It is for this reason that we delineate the dorsal projections of s-LNvs into two distinct phases: the vertical and horizontal parts, rather than a mere expansion in correspondence with the development of the larval brain.

      From the results searched in the Virtual Fly Brain (VFB) database (https://www.virtualflybrain.org/), it is clear that the neurons that form synaptic connections with s-LNvs at the adult stage are essentially completely different from the neurons that are associated with them at the L1 larval stage. Thus, most neurons that form synapses with s-LNvs in the early larvae either cease to exist after metamorphosis or assume other roles in the adult stage. Similar to the scenario where Cajal-Retzius cells and GABAergic interneurons establish transient synaptic connections with entorhinal axons and commissural axons, respectively, these cells form a transient circuit with presynaptic targets and subsequently undergo cell death during development. In our model, the neurons that synapse with s-LNvs in early development serve as "placeholders," offering positive or negative cues to guide the axonal targeting of s-LNvs towards their ultimate destination.

      Thank you again for your valuable feedback, and we have incorporated these considerations into our revised manuscript to enhance the clarity and depth of our research.

      Reviewer #2 (Recommendations For The Authors):

      Major:

      In the introduction too many revisions are cited and very few actual research papers. This should be corrected and the most significant papers in the field should be cited. For example, there is no reference to the pioneering work from the Christine Holt lab or the first paper looking at axon guidance and guideposts by Klose and Bentley, Isbister et al 1999.

      The introduction should encapsulate the actual knowledge based on actual research papers.

      We acknowledge your concern regarding the citation of review papers rather than primary research papers in the introduction. Following your suggestion, we have revised the introduction section to incorporate references to relevant research papers.

      In the introduction and discussion: The authors cite revisions where the signals that guide axons across different regions including turning are shown and they end up saying: "However, how the axons change their projection direction without well-defined landmarks is still unclear." I think the sentence should be changed. Many things are still not clear but this is not a good phrasing. Maybe they could focus on their temporal finding?

      We appreciate the reviewer's feedback and insightful suggestions. We agree that emphasizing the temporal aspect is crucial in our study. However, we also recognize the significance of understanding the origin of signals that guide axonal reorientation at specific locations. While axonal projections navigating without distinct landmarks pose more challenges and uncertainties compared to those guided by prominent landmarks like the midline, our research demonstrates the crucial role of a specific cell population near turning points in providing accurate guidance cues to ensure precise axonal reorientation. We have revised our phrasing in the introduction and discussion to better reflect these key points (see revised manuscript lines 69-71 and 350-354). Thank you for highlighting the significance of focusing on our temporal findings and the complexities involved in studying axonal projection.

      Many rather old papers have looked into the effect of repulsive guideposts to guide axon projections. In particular, I can think of the paper from Isbister et al. 1999 (DOI: 10.1242/dev.126.9.2007) that not only shows how semaphorin guides Ti axon projection but also shows how the pattern of expression of sema 2a changes during development to guide the correct projection. I really think that the novelty of the paper should be revised in light of the actual knowledge in the field.

      We appreciate the reviewer's reference to the seminal work by Isbister et al. (1999) and the importance of guidepost cells in axon projection guidance, which we have already cited in our revised manuscript. It is crucial to recognize that segmented patterns such as the limb segment traversed by Ti1 neuron projections or neural circuits formed in a layer- or column-specific manner also serve as intrinsic "guideposts," offering valuable insights into axonal pathfinding processes. In our model, explicit guidance cues are lacking. As highlighted, our key contribution lies in elucidating how axonal projections without clear landmarks are guided, with our research demonstrating how a newly formed cluster of cells at a specific time and location provides the necessary guidance cues for axons (see revised manuscript lines 350-354). We have ensured that our revised manuscript reflects these insights and emphasizes the significance of studying axonal guidance in the absence of distinct guideposts. Thank you for underscoring these essential points, which enhance our understanding of axonal projection dynamics.

      Minors:

      Line 54, the authors start talking about floorplate at the end of a section on Drosophila. Please use “In vertebrates”, or “in invertebrates” or “in Drosophila” etc.. when needed to put things in context.

      We thank the reviewer for this suggestion and have modified this sentence. Please refer to lines 62-63 of the revised manuscript.

      Line 69: many factors change the axonal outgrowth. The authors are missing the paper from Fernandez et al. 2020, who have shown that unc5 the receptor of netrin induces the stalling for sLNvs projections before the turn. https://doi.org/10.1016/j.cub.2020.04.025

      We thank the reviewer for this suggestion and have added this research article. Please refer to line 79 of the revised manuscript.

      Line 99: "precisely at the pivotal juncture". It is hard to see how it was done in the figures shown. Can the authors add a small panel with neuronal staining showing this (please no HRP)?

      For all figures, the magenta is too strong and it is really hard to see the sLNvs projections. Can this be sorted, please?

      We have depicted the pivotal juncture in the schematic diagram on the left side of Figure 1C. Additionally, we have included a separate column of images without HRP in Figure 1A. Moreover, we have modified the pseudo-color of HRP from magenta to blue to enhance the visualization of the s-LNv projection. The figure legends have also been correspondingly modified.

      Line 407: Spatial position relationship between calyx and s-LNvs. OK107-GAL4 labels ... calyx and s-LNvs labeled by, which which.

      We have modified it according to your suggestion. Please refer to lines 430-432 of the revised manuscript.

      Line 137 typo RPRC

      We thank the reviewer for noticing this mistake, which has now been corrected. Please refer to lines 148-149 of the revised manuscript.

      Section 158-164. the paper from Zhang et al 2019 needs to be cited since they have found the same effect of decreasing Dscam even if they didn't think about horizontal projection.

      Thanks to the suggestion, we have included in the manuscript the phenotype observed by Zhang et al. (2019) upon knocking down Dscam1-L in adults. Please refer to lines 170-172 of the revised manuscript.

      Line 176: typo senses (instead of sensor).

      Thank you for pointing out our mistake. We have modified it according to your suggestion. Please refer to line 189 of the revised manuscript.

      Line 193: more than Interesting it is Notable. Add "ubiquitous" knockdown.

      Thank you for the suggestion. We have included the word "ubiquitous" to enhance the precision of the narrative. Please refer to line 206 of the revised manuscript.

      Line 224: the pattern of expression of the crz cells is not visible where the projections of sLNvs are located. Are they in that region? Or further away?

      We've changed the pseudo-color of HRP, and in the updated Figure 5—figure supplement 1, you can see the projection pattern of crz+ cells, positioned close to the end of the s-LNv axon terminal.

      Line 243: applied? Do you mean "used"

      Thank you for the suggestion. We have revised it at line 256.

      Figure 5 Sup1: the schematic shows DNs proliferation that is not visible on the GFP image. Please comment.

      We have modified Figure 5—figure supplement 1 for the 120 h per-GAL4, Pdf-GAL80 >GFP expression pattern. Due to the strong GFP intensity in some DN neurons, there was a loss of GFP signal. Additionally, in Figure 6—figure supplement 1, we have added co-localization images of DN and s-LNv at 72 h and 96 h. To better illustrate the co-localization information, we have shown only a portion of the layers in the right panel. We hope these additions address your concerns.

      Line 251: cite Fernandez et al. 2020 with Purohit et al 2012.

      We have modified it according to your suggestion. Please refer to line 264 of the revised manuscript.

      Line 272: you have not shown synergistic effects because you have not modulated both pathways at the same time. You should talk about complementary.

      We have modified it according to your suggestion at lines 25, 285, 439.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      (1) Point for more elaborate discussion: Apparently the timescale of negative feedback signals is conserved between endothelial cell migration in vitro (with human cells) and endothelial migration during the formation of ISVs in zebrafish. What do you think might be an explanation for such conserved timescales? Are there certain processes within cytoskeletal tension build up that require this quantity of time to establish? Or does it relate to the time that is needed to begin to express the YAP/TAZ target genes that mediate feedback?

      The underlying mechanisms responsible for the conserved timescale are a major direction that we continue to explore. Localization of YAP/TAZ to the nucleus is likely not rate-limiting. We showed previously that acute RhoA activation produced significant YAP/TAZ nuclear localization within minutes, while subsequent co-transcriptional activity aligned with the gene expression dynamics observed here (Berlew et al., 2021). We hypothesize that the dynamics of YAP/TAZ-dependent transcription and the translation of those target genes are rate-limiting for initial feedback loop completion (tic = 4 hours). This is supported by work from us and others in a variety of cell lines showing that YAP/TAZ transcriptional responses take place during the first few hours after activation (Franklin et al., 2020; Mason et al., 2019; Plouffe et al., 2018). While our data identify mediators of initial feedback loop completion, the molecular effectors that determine the timescale of new cytoskeletal equilibrium establishment (teq = 8 hours) remain unclear.

      (2) Do you expect different timescales for slower endothelial migratory processes (e.g. for instance during fin vascular regeneration which takes days)?

      We selected the ISV development model because it exhibits similar migratory kinetics to our previously-explored human ECFC migration in vitro. The comparable kinetics allowed us to study dynamics of the feedback loop in vivo on similar time scales, but we have not explored models featuring either slower or faster dynamics. 

      It would be interesting to test how feedback dynamics are impacted in distinct endothelial migratory processes. Our data suggest that the feedback loop is necessary for persistent migration; however, YAP and TAZ respond to a diversity of upstream regulators in addition to mechanical signals, which might depend on the process of vascular morphogenesis. For example, after fin amputation, inflammation and tissue regeneration may impact the biochemical and mechanical environment experienced by the endothelium. Additionally, cells display different migratory behaviors in ISV morphogenesis compared to fin regeneration. During ISV formation, sprouting tip cells migrate dorsally through avascular tissue, followed by stalk cells. (Ellertsdóttir et al., 2010) In contrast, the fin vasculature regenerates by forming an intermediate vascular plexus, where some venous-derived endothelial cells migrate towards the sprouting front, while others migrate against it. (Xu et al., 2014) We are excited to study the role of this feedback loop in these different modes of neovessel formation in future studies.

      (3) Is the ~4hrs and 8hrs feedback time window a general property or does it differ between specific endothelial cell types? In the veins the endothelial cells generate less stress fibers and adhesions compared to in the arteries. Does this mean that there might be a difference in the feedback time window, or does that mean that certain endothelial cell types may not have such a YAP/TAZ-controlled feedback system?

      Recent studies suggest that venous endothelial cells are the primary endothelial subtype responsible for blood vessel morphogenesis. (Lee et al., 2022, 2021; Xu et al., 2014) They are highly motile and mechanosensitive, migrating against blood flow. (Lee et al., 2022) The Huveneers group has shown that the actin cytoskeleton is differently organized in adult arteries and veins in response to biomechanical properties of its extracellular matrix, rather than intrinsic differences between arterial and venous cells. (van Geemen et al., 2014) This suggests that arterial and venous cells have distinct cytoskeletal setpoints due to mechanical cues in their environment (Price et al., 2021). We expect this to impact the degree of cytoskeletal remodeling and cell migration at equilibrium, rather than the kinetics of the feedback loop per se, though we have not yet tested this hypothesis. Testing these predictions on cytoskeletal setpoint stability and adaptation is a major direction that we continue to explore. 

      (4) The experiments are based on perturbations to prove that transcriptional feedback is needed for endothelial migration. What would happen if the feedback system is always switched on? An experiment to add might be to analyse the responsiveness of endothelial cells expressing constitutively active YAP/TAZ.

      This is a problem that we are actively pursuing. Though the feedback system forms a coherent loop, we anticipate that the identity of the node of the loop selected for constitutive activation will influence the outcome, depending on whether that node is rate-limiting for feedback kinetics and the extent of intersection of that node with other signaling events in the cell. For example, we have observed that constitutive YAP activation drives profound changes to the transcriptional landscape including, but not limited to, RhoA signaling (Jones et al., 2023). We further anticipate that constitutive activation of feedback loop nodes may alter feedback dynamics, while dynamic or acute perturbation will be required to dissect these contributions in real time. For these reasons, ongoing work in the lab is pursuing these questions using optogenetic tools that enable precise spatial and temporal control (Berlew et al., 2021).   

      (5) To investigate the role of YAP-mediated transcription in an accurate time-dependent manner the authors may consider using the recently developed optogenetic YAP translocation tool: https://doi.org/10.15252/embr.202154401

      We are enthusiastic about the power of optogenetics to interrogate the nodes and timescales of this feedback system, and we are now funded to pursue this line of research. 

      Reviewer #2:

      The idea is intriguing, but it is not clear how the feedback actually works, so it is difficult to determine if the events needed could occur within 4 hrs. Specifically, it is not clear what gene changes initiated by YAP/TAZ translocation eventually lead to changes in Rho signaling and contractility. Much of the evidence to support the model is preliminary. Some of the data is consistent with the model, but alternative explanations of the data are not excluded. The fish washout data is quite interesting and does support the model. It is unclear how some of the in vitro data supports the model and excludes alternatives.

      Major strengths:

      The combination of in vitro and in vivo assessment provides evidence for timing in physiologically relevant contexts, and a rigorous quantification of outputs is provided. The idea of defining temporal aspects of the system is quite interesting.

      Major weaknesses:

      The evidence for a "loop" is not strong; rather, most of the data can also be interpreted as a linear increase in effect with time once a threshold is reached. Washout experiments are key to setting up a time window, yet these experiments are presented only for the fish model. A major technical challenge is that siRNA experiments take time to achieve depletion status, making precise timing of events on short time scales problematic. Also, Actinomycin D blocks most transcription so exposure for hours likely leads to secondary and tertiary effects and perhaps effects on viability. No RNA profiling is presented to validate proposed transcriptional changes.

      We thank the reviewer for these helpful suggestions. We have expanded our explanation of the history and known mediators of the feedback loop in the introduction. We and, independently, the Huveneers group recently reported that human endothelial cells maintain cytoskeletal equilibrium for persistent motility through a YAP/TAZ-mediated feedback loop that modulates cytoskeletal tension. (Mason et al., 2019; van der Stoel et al., 2020) Because YAP and TAZ are activated by tension of the cytoskeleton (Dupont et al., 2011), suppression of cytoskeletal tension by YAP/TAZ transcriptional target genes constitutes a negative feedback loop (Fig. 1A). We described key components of this cell-intrinsic feedback loop, which acts as a control system to maintain cytoskeletal homeostasis for persistent motility via modulation of Rho-ROCK-myosin II activity. (Mason et al., 2019) Both we and the Huveneers group found that perturbation of genes and pathways regulated by YAP/TAZ mechanoactivation can functionally rescue motility in YAP/TAZ-depleted cells (e.g., RhoA/ROCK/myosin II, NUAK2, DLC1). (Mason et al., 2019; van der Stoel et al., 2020) We further showed previously that both YAP/TAZ depletion and acute YAP/TAZ-TEAD inhibition consistently increased stress fiber and FA maturation and arrested cell motility, accounting for these limitations of siRNA. (Mason et al., 2019)

      Enduring limitations to the temporal, spatial, and cell-specific control of the genetic and pharmacologic methods have inspired us to initiate alternative approaches, which are the subject of ongoing efforts. Further research will be necessary in the zebrafish to determine the extent to which the observed migratory dynamics are driven by cytoskeletal arrest. 

      To identify early YAP/TAZ-regulated transcriptional changes, we have added RNA profiling of control and YAP/TAZ depleted cells cultured on stiff matrices for four hours. Genes upregulated by YAP/TAZ depletion were enriched for Gene Ontology (GO) terms associated with Rho protein signal transduction, vascular development, cellular response to vascular endothelial growth factor (VEGF) stimulus, and endothelial cell migration (Fig. 9B). These data support a role for YAP and TAZ as negative feedback mediators that maintain cytoskeletal homeostasis for endothelial cell migration and vascular morphogenesis.  

      Reviewer #3:

      The authors used ECFC - endothelial colony forming cells (circulating endothelial cells that activate in response to vascular injury).

      Q: Did the authors characterize these cells and make sure that they are truly endothelial cells - for example examine specific endothelial markers, arterial-venous identity markers & Notch signalling status, overall morphology etc prior to the start of the experiment. How were ECFC isolated from human individuals, are these "healthy" volunteers - any underlying CVD risk factors, cells from one patient or from pooled samples, what injury were these humans exposed to trigger the release of the ECFCs into the circulation, etc. The materials & methods on ECFC should be expanded.

      Human umbilical cord blood-derived ECFCs were isolated at Indiana University School of Medicine and kindly provided by Dr Mervin Yoder. Cells were cultured as described by the Yoder group (Rapp et al., 2011) and our prior paper (Mason et al., 2019). We have expanded the materials and methods section to describe the source and characterization of these cells.

      The authors suggest that loss of YAP/TAZ phenocopies actinomycin-D inhibition - "both transcription inhibition and YAP/TAZ depletion impaired polarization, and induced robust ventral stress fiber formation and peripheral focal adhesion maturation". However, the cell size of actinomycin-D treated cells (Fig. 1B, top right panel), differs from the endothelial cell size upon siYAP/TAZ (Fig. 1E, top right panel) - and vinculin staining seems more pronounced in actinomycin-D treated cells (B, bottom right) when compared to siYAP/TAZ group. Cell shape is defined by acto-myosin tension.

      Q: Besides Fraction of focal adhesion >1um; focal adhesion number did the authors measure additional parameters related to cytoskeleton remodelling / focal adhesions that can substantiate their statement on similarity between loss of YAP/TAZ and actinomycin-D treatment. Would it be possible to make a more specific genetic intervention (besides YAP/TAZ) interfering with the focal adhesion pathway as opposed to the broad spectrum inhibitor actinomycin-D.

      Our previous paper (Mason et al., 2019) delineated the mechanistic relationships between YAP/TAZ signaling, focal adhesion turnover, actomyosin polymerization, and the intervening mechanisms of myosin regulation. Specifically, we demonstrated that YAP/TAZ regulate the myosin phosphatase kinase, NUAK2, and ARHGAP genes to mediate this feedback. Expanding on this work, the current study aimed to define the temporal kinetics of the cytoskeletal mechanotransductive feedback in vitro and in vivo. We used actinomycin-D and YAP/TAZ depletion to interrogate the role of transcriptional regulation and YAP/TAZ signaling, respectively. In this revision, we have added RNA profiling that identifies early YAP/TAZ-regulated transcriptional changes and further points to other molecular mediators of focal adhesions (e.g. TRIO, RHOB, THBS1) that will be the subjects of future studies.    

      Q: Does the actinomycin-D treatment affect responsiveness to Vegf? induce apoptosis or reduce survival of the ECFC?

      We have not looked specifically at the effect of actinomycin-D treatment on responsiveness to VEGF. However, actinomycin-D has been reported to reduce transcription of VEGF receptors (E et al., 2012). In contrast, we found that YAP/TAZ depletion upregulated GO terms associated with endothelial cell migration and response to VEGF stimulus (Fig. 9B), as well as receptors to angiogenic growth factors, including KDR and FLT4 (Fig. 9E). These results suggest YAP/TAZ depleted cells may be more sensitive to VEGF stimulation but remain nonmotile due to cytoskeletal arrest.

      We showed previously that long-term treatment with actinomycin-D reduces ECFC survival (Mason et al., 2019).

      Q: Which mechanism links ECM stiffness with endothelial surface area in the authors' scenario. In zebrafish, activity of endothelial guanine exchange factor Trio specifically at endothelial cell junctions (Klems, Nat Comms, 2020) and endoglin in response to hemodynamic factors (Siekmann, Nat Cell Biol 2017) have been shown to control EC shape/surface area - do these factors play a role in the scenario proposed by the authors.

      Our new transcriptional profiling indicates both Trio and endoglin are regulated through YAP and TAZ in human ECFCs. We plan to follow up on these findings.

      Q: The authors report that EC migrate faster on stiff substrate, and concomitantly these cells have a larger surface area. What is the physiological rationale behind these observations. Did the authors observe such behaviors in their zebrafish ISV model? How do these observations integrate with the tip - stalk cell shuffling model (Jakobsson & Gerhardt, Nat Cell Biol, 2011) and Notch activity in developing ISVs.

      This question raises important distinctions between the mode of migration in ISV morphogenesis and endothelial cells adherent to substrates. Cells behave and respond to mechanical cues differently in 2D vs. 3D matrices. (LaValley and Reinhart-King, 2014) Additionally, the microenvironment in vivo is much more complex, combining numerous biochemical signals and changing mechanical properties. (Whisler et al., 2023) We are actively investigating the downstream targets of YAP/TAZ mechanotransduction and how that integrates with other pathways known to regulate vascular morphogenesis, such as Notch signaling. 

      The authors examined the formation of arterial intersegmental vessels in the trunk of developing zebrafish embryos in vivo. They used a variety of pharmacological inhibitors of transcription and acto-myosin remodelling and linked the observed morphological changes in ISV morphogenesis with changes in endothelial cell motility.

      Q: Reduced formation and dorsal extension of ISVs may have several reasons, including reduced EC migration and proliferation. The Tg(fli1a:EGFP) reporter however is not the most suitable line to monitor migration of individual endothelial cells. Can the authors repeat the experiments in Tg(fli1a:nEGFP); Tg(kdrl:HRAS-mCherry) double transgenics to visualize movement-migration of the individual endothelial cells and EC proliferation events, in the different treatment regimes.

      So far, we have not tracked individual endothelial cells during ISV morphogenesis. We agree this is the best approach and are pursuing a similar technique for these experiments.

      ISV formation is furthermore affected by Notch signalling status and a series of (repulsive) guidance cues.

      Q: Does de novo blockade of gene expression with Actinomycin D affect Notch signalling status, expression of PlexinD - sFlt1, netrin1 or arterial-venous identity genes.

      While we have not performed gene expression analysis under the Actinomycin D condition, Actinomycin D functions as a broad transcription inhibitor. We are currently pursuing the downstream targets of YAP/TAZ mechanotransduction in both ECFCs and zebrafish.

      Remark: The authors may want to consider using the Tg(fli1:LIFEACT-GFP) reporter for in vivo imaging of actin remodelling events.

      We thank the reviewer for their helpful suggestion.

      Remark: the authors report "As with broad transcription inhibition, in situ depletion of YAP and TAZ by RNAi arrested cell motility, illustrated here by live-migration sparklines over 10 hours: siControl: , siYAP/TAZ: (25 μm scale-bar: -)". Can the authors make a separate figure panel for this, how many cells were measured?

      Please refer to our previous publication for the complete details on this data (Mason et al., 2019). We have added the citation in the text.

      Remark: in the wash-out experiments, exposure to the inhibitors is not the same in the different scenarios - could it be that the longer exposure time induces "toxic" side effect that cannot be "washed out" when compared to the short treatment regimes?

      This is a possible limitation of the pharmacological approach, and we have included it in the discussion section. We are currently exploring alternative approaches to interrogate the timescale of the feedback loop more precisely.

      References

      Berlew EE, Kuznetsov IA, Yamada K, Bugaj LJ, Boerckel JD, Chow BY. 2021. Single-Component Optogenetic Tools for Inducible RhoA GTPase Signaling. Advanced Biology 5:2100810. doi:10.1002/adbi.202100810

      Dupont S, Morsut L, Aragona M, Enzo E, Giulitti S, Cordenonsi M, Zanconato F, Le Digabel J, Forcato M, Bicciato S, Elvassore N, Piccolo S. 2011. Role of YAP/TAZ in mechanotransduction. Nature 474:179–183. doi:10.1038/nature10137

      E G, Cao Y, Bhattacharya S, Dutta S, Wang E, Mukhopadhyay D. 2012. Endogenous Vascular Endothelial Growth Factor-A (VEGF-A) Maintains Endothelial Cell Homeostasis by Regulating VEGF Receptor-2 Transcription. J Biol Chem 287:3029–3041. doi:10.1074/jbc.M111.293985

      Ellertsdóttir E, Lenard A, Blum Y, Krudewig A, Herwig L, Affolter M, Belting H-G. 2010. Vascular morphogenesis in the zebrafish embryo. Developmental Biology, Special Section: Morphogenesis 341:56–65. doi:10.1016/j.ydbio.2009.10.035

      Franklin JM, Ghosh RP, Shi Q, Reddick MP, Liphardt JT. 2020. Concerted localization-resets precede YAP-dependent transcription. Nat Commun 11:4581. doi:10.1038/s41467-020-18368-x

      Jones DL, Hallström GF, Jiang X, Locke RC, Evans MK, Bonnevie ED, Srikumar A, Leahy TP, Nijsure MP, Boerckel JD, Mauck RL, Dyment NA. 2023. Mechanoepigenetic regulation of extracellular matrix homeostasis via Yap and Taz. Proceedings of the National Academy of Sciences 120:e2211947120. doi:10.1073/pnas.2211947120

      LaValley DJ, Reinhart-King CA. 2014. Matrix stiffening in the formation of blood vessels. Advances in Regenerative Biology 1:25247. doi:10.3402/arb.v1.25247

      Lee H-W, Shin JH, Simons M. 2022. Flow goes forward and cells step backward: endothelial migration. Exp Mol Med 54:711–719. doi:10.1038/s12276-022-00785-1

      Lee H-W, Xu Y, He L, Choi W, Gonzalez D, Jin S-W, Simons M. 2021. Role of Venous Endothelial Cells in Developmental and Pathologic Angiogenesis. Circulation 144:1308–1322. doi:10.1161/CIRCULATIONAHA.121.054071

      Mason DE, Collins JM, Dawahare JH, Nguyen TD, Lin Y, Voytik-Harbin SL, Zorlutuna P, Yoder MC, Boerckel JD. 2019. YAP and TAZ limit cytoskeletal and focal adhesion maturation to enable persistent cell motility. Journal of Cell Biology 218:1369–1389. doi:10.1083/jcb.201806065

      Plouffe SW, Lin KC, Moore JL, Tan FE, Ma S, Ye Z, Qiu Y, Ren B, Guan K-L. 2018. The Hippo pathway effector proteins YAP and TAZ have both distinct and overlapping functions in the cell. J Biol Chem 293:11230–11240. doi:10.1074/jbc.RA118.002715

      Price CC, Mathur J, Boerckel JD, Pathak A, Shenoy VB. 2021. Dynamic self-reinforcement of gene expression determines acquisition of cellular mechanical memory. Biophysical Journal 120:5074–5089. doi:10.1016/j.bpj.2021.10.006

      Rapp BM, Saadatzedeh MR, Ofstein RH, Bhavsar JR, Tempel ZS, Moreno O, Morone P, Booth DA, Traktuev DO, Dalsing MC, Ingram DA, Yoder MC, March KL, Murphy MP. 2011. Resident Endothelial Progenitor Cells From Human Placenta Have Greater Vasculogenic Potential Than Circulating Endothelial Progenitor Cells From Umbilical Cord Blood. Cell Med 2:85–96. doi:10.3727/215517911X617888

      Tammela T, Zarkada G, Nurmi H, Jakobsson L, Heinolainen K, Tvorogov D, Zheng W, Franco CA, Murtomäki A, Aranda E, Miura N, Ylä-Herttuala S, Fruttiger M, Mäkinen T, Eichmann A, Pollard JW, Gerhardt H, Alitalo K. 2011. VEGFR-3 controls tip to stalk conversion at vessel fusion sites by reinforcing Notch signalling. Nat Cell Biol 13:1202–1213. doi:10.1038/ncb2331

      van der Stoel M, Schimmel L, Nawaz K, van Stalborch A-M, de Haan A, Klaus-Bergmann A, Valent ET, Koenis DS, van Nieuw Amerongen GP, de Vries CJ, de Waard V, Gloerich M, van Buul JD, Huveneers S. 2020. DLC1 is a direct target of activated YAP/TAZ that drives collective migration and sprouting angiogenesis. Journal of Cell Science 133:jcs239947. doi:10.1242/jcs.239947

      van Geemen D, Smeets MWJ, van Stalborch A-MD, Woerdeman LAE, Daemen MJAP, Hordijk PL, Huveneers S. 2014. F-Actin–Anchored Focal Adhesions Distinguish Endothelial Phenotypes of Human Arteries and Veins. Arteriosclerosis, Thrombosis, and Vascular Biology 34:2059–2067. doi:10.1161/ATVBAHA.114.304180

      Whisler J, Shahreza S, Schlegelmilch K, Ege N, Javanmardi Y, Malandrino A, Agrawal A, Fantin A, Serwinski B, Azizgolshani H, Park C, Shone V, Demuren OO, Del Rosario A, Butty VL, Holroyd N, Domart M-C, Hooper S, Szita N, Boyer LA, Walker-Samuel S, Djordjevic B, Sheridan GK, Collinson L, Calvo F, Ruhrberg C, Sahai E, Kamm R, Moeendarbary E. 2023. Emergent mechanical control of vascular morphogenesis. Science Advances 9:eadg9781. doi:10.1126/sciadv.adg9781

      Xu C, Hasan SS, Schmidt I, Rocha SF, Pitulescu ME, Bussmann J, Meyen D, Raz E, Adams RH, Siekmann AF. 2014. Arteries are formed by vein-derived endothelial tip cells. Nat Commun 5:5758. doi:10.1038/ncomms6758

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Recommendations For The Authors): 

      Major points about revised manuscript 

      (1) While I acknowledge that the Laccase2 vector is probably the best available in terms of its clean circRNA-expression potential, the authors still lack an estimation of the circRNA overexpression efficiency, specifically the circular-to-linear expression ratio. In their second rebuttal letter, the authors argue that they do not have the option to use another probe and that they are limited by the Backsplicing junction (BSJ)-specific one. I assume they mean that such a probe might only partially hybridize with the linear form and therefore give a poor or no signal in the Northern blot. However, in this referee's opinion, it is precisely because of this limitation that the authors should have used another probe against both the linear and circular RNAs to simultaneously and quantitatively detect both isoforms. This would have allowed them to reliably estimate a circular-to-linear ratio. Perhaps the linear isoform is indeed not expressed or is very low for this circRNA overexpression vector, but the probe used by the authors does not prove it. I think that this addition to the manuscript is not strictly necessary at this stage, but it would certainly improve the results.  

      We fully agree with this recommendation. Our efforts to show this using northern blotting were unfortunately unsuccessful due to background signal. To address the question about the circular-to-linear ratio, we instead used an RT-qPCR strategy to measure linear vs. circRNA expression derived from the Laccase-circHIPK3 expression constructs/cell lines. To be able to compare results obtained from different amplicons, we measured the primer efficiencies (using amplification standard curves; not shown) of two amplicons for the linear Laccase version and of our divergent primers targeting circHIPK3, which were found to be directly comparable. Using these primer sets in RT-qPCR on the same RNA preparation (total cellular RNA) used for the northern blot (Supplementary Figure S5H) revealed ~4-fold higher expression of circHIPK3 compared to the linear precursor RNA (Supplementary Figure S5I).

      This demonstrates that the Laccase vector system efficiently produces circHIPK3 RNA as expected. 
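
      As a minimal sketch of the arithmetic behind such a comparison (not the authors' actual analysis), the circular-to-linear ratio can be estimated from Ct values once the amplification efficiency of each primer pair is known from its standard curve. The function names and Ct values below are hypothetical and chosen only for illustration; the sketch assumes both amplicons are read at the same fluorescence threshold from the same cDNA.

```python
def efficiency_from_slope(slope: float) -> float:
    """Per-cycle amplification factor from a standard-curve slope.

    The slope is that of Ct plotted against log10(template amount);
    a slope of about -3.32 corresponds to perfect doubling (factor 2).
    """
    return 10 ** (-1.0 / slope)


def circ_to_linear_ratio(ct_circ, ct_linear, e_circ=2.0, e_linear=2.0):
    """Relative starting amount of circular vs. linear RNA.

    Starting amount is proportional to E ** (-Ct), so the ratio is
    E_linear ** Ct_linear / E_circ ** Ct_circ.
    """
    return (e_linear ** ct_linear) / (e_circ ** ct_circ)


# Hypothetical Ct values: the circRNA amplicon crosses threshold 2 cycles earlier.
ratio = circ_to_linear_ratio(ct_circ=22.0, ct_linear=24.0)
print(f"circ/linear ratio: {ratio:.1f}")
```

      With equal efficiencies, a 2-cycle difference corresponds to a 4-fold difference in starting amount, which is why comparable primer efficiencies matter when results from different amplicons are compared.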

      The few changes to the manuscript (results section text and reference to Supplementary Figure S5I) have been highlighted in yellow. The materials and methods section and Table S1 have been modified to include a description of the RT-qPCR and the specific primers.

  3. Jun 2024
    1. Author response:

      The following is the authors’ response to the current reviews.

      Joint Public Review:

      Xie et al. propose that the asymmetric segregation of the NuRD complex is regulated in a V-ATPase-dependent manner, and plays a crucial role in determining the differential expression of the apoptosis activator egl-1 and thus critical for the life/death fate decision.

      Remaining concerns are the following:

      The authors should provide the point-by-point response to the following issues. In particular, authors should provide clear reasoning as to why they did not address some of the following comments in the previous revisions. The next response should be directly answering to the following concerns.

      (1) Discussion should be added regarding the criticism that NuRD asymmetric segregation is simply a result of daughter cell size asymmetry. It is perfectly fine that the NuRD asymmetry is due to the daughter cell size difference (still the nucleus within the bigger daughter would have more NuRD, which can determine the fate of daughter cells). Once the authors add this clarification, some criticisms about 'control' may become irrelevant.

      We thank the reviewer for this suggestion. We will add the following text in the revised discussion on page 14, line 26:

      “…We cannot rule out the possibility that NuRD asymmetric segregation results from daughter cell size asymmetry. According to this perspective, the nucleus in the larger daughter cell could possess more NuRD, potentially influencing the fate of the daughter cells. However, it is important to note that the nuclear protein histone or the MYST family histone acetyltransferase is equally segregated in daughter cells of different sizes.….”

      (2) ZEN-4 is a kinesin that predominantly associates with the midzone microtubules and a midbody during mitosis. Given that midbodies can be asymmetrically inherited during cell division, ZEN-4 is not a good control for monitoring the inheritance of cytoplasmic proteins during asymmetric cell division. Other control proteins, such as a transcriptional factor that predominantly localizes in the cytoplasm during mitosis and enters into nucleus during interphase, are needed to clarify the concern.

      We clarified the issue of ZEN-4 below:

      The critique assumes that "midbodies can be asymmetrically inherited during cell division." However, this assumption does not apply to our study of Q cell asymmetric divisions. In our earlier research, we demonstrated that midbodies in Q cells are released post-division and subsequently engulfed by surrounding epithelial cells (Chai et al., Journal of Cell Biology, 2012). Moreover, we have shown that midbodies from the first cell division in C. elegans embryos are also released and engulfed by the P1 cell (Ou et al., Cell Research, 2013). Therefore, the notion of midbody asymmetric inheritance is irrelevant to this manuscript. Additionally, our manuscript already presents the example of the MYST family histone acetyltransferase, illustrating a nuclear protein that predominantly localizes in the cytoplasm during mitosis and symmetrically enters the nucleus during interphase.

      As for the pHluorin experiments, symmetric inheritance of GFP and mCherry is not appropriate evidence for estimating the level of pHluorin during asymmetric Q cell division. This issue remains unresolved.

      We acknowledge the limitation of pHluorin in measuring the pH level in a living cell. Future studies could be performed to measure the dynamics of pH levels when advanced tools are available.

      (3) A Q-Q plot (quantile-quantile plot), as in Figure S10, can be used for visually checking the normality of the data, but it does not guarantee that the distribution of each sample is normal and has a standard deviation comparable to that of the other samples. I recommend that the authors show the actual P-values of the statistical comparisons for each case. The authors also need to show the number of replicate experiments for each figure panel.

      We thank the reviewer for pointing this out. We will provide P-values for each case and the number of replicate experiments in the revised Figure 5-figure supplement 1 (corresponding to Figure S10) and the figure legend.

      The authors left inappropriate graphs in the revised manuscript. In Figure 3E, some error bars are disconnected and others are stuck in the bars. In Figure S4C, the LIN-53 in QR.a/p graph shows lines disconnected from the error bars.

      We thank the reviewer for pointing this out. We will correct these error bars.

      I am a bit confused by the error bars in Figure 2B. Each dot represents a fluorescent intensity ratio of either HDA-1 or LIN-53 between the two daughter cells in a single animal. Plots are shown with mean and SEM, but several samples (for example, the left end) exhibit SEM error bars very close to the range between min and max. I may be misunderstanding this graph but am concerned that Figure 2B may contain some errors in representing these data sets. I would like to ask the authors to provide all values in a table format so that the reviewers can verify the statistical tests and graph representation.

      We thank the reviewer for pointing this out. We apologize for the typo in the Figure 2B legend. We will correct SEM to SD.
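
      The distinction matters for how the plot reads. A minimal sketch, using made-up intensity ratios rather than the authors' data, shows that SD describes the spread of the individual points, whereas SEM shrinks with the square root of the sample size and so normally sits well inside the min-max range.

```python
import math
import statistics

# Hypothetical daughter-cell intensity ratios (illustration only).
ratios = [1.2, 1.5, 1.7, 1.9, 2.1, 2.4, 2.6, 3.0]

mean = statistics.mean(ratios)
sd = statistics.stdev(ratios)          # sample standard deviation (n - 1)
sem = sd / math.sqrt(len(ratios))      # standard error of the mean

print(f"mean = {mean:.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")
print(f"range = {min(ratios):.1f} to {max(ratios):.1f}")
```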

      (4) The authors still do not provide evidence that the increase in sAnxV::GFP and Pegl-1gfp or the increase in H3K27ac at the egl-1 gene in hda-1(RNAi) and lin-53(RNAi) animals is not a consequence of global effects on development. Indeed, the images provided in Figure S7B demonstrate that there are global effects in these animals. No causal interactions have been demonstrated.

      We cannot exclude the global effects and have discussed this issue in our previous manuscript on page 9, line 26:

      “...Considering the pleiotropic phenotypes caused by loss of HDA-1, we cannot exclude the possibility that ectopic cell death might result from global changes in development, even though HDA-1 may directly contribute to the life-versus-death fate determination.”

      (5) Figure 4: Due to the lack of appropriate controls for the co-IP experiment (Fig. 4), I remain unconvinced of the claim that the NuRD complex and V-ATPase specifically interact. Concerning the co-IP, the authors now mention that the co-IP was performed three times: "Assay was performed using three biological replicates. Three independent biological replicates of the experiment were conducted with similar results." However, the authors did not use ACT-4::GFP or GFP alone as controls for their co-IP as previously suggested. This is critical considering that the evidence for a specific HDA-1::GFP - V-ATPase interaction is rather weak (compare interactions between HDA-1::GFP and V-ATPase subunits in Fig 4B with those of HDA-1::GFP and subunits of NuRD in Fig S8B).

      We conducted GFP pull-down experiments and mass spectrometric analysis for HDA-1::GFP and ACT-4::GFP using identical protocols, yielding consistent results. We agree with the reviewer that in our Western blot, inclusion of ACT-4::GFP is a more effective negative control than empty beads.

      (6) Based on Fig 5E, it appears that Bafilomycin treatment causes pleiotropic effects on animals (see differences in HDA-1::GFP signal in the three rows). The authors now state: "Although BafA1-mediated disruption of lysosomal pH homeostasis is recognized to elicit a wide array of intracellular abnormalities, we found no evidence of such pleiotropic effects at the organismal level with the dosage and duration of treatment employed in this study". However, the 'evidence' mentioned is not shown. It is critical that the authors provide this evidence.

      We thank the Reviewer for pointing out this issue. At the BafA1 dosage and treatment duration used, we only checked the viability of L1 larvae and the morphology of animals at the organismal level, and did not notice any animal death or apparent morphological abnormality (N > 20 for each treatment). However, as the reviewer pointed out, there may be abnormalities at the cellular level. We thus revised the above description as follows, on page 11, line 27:

      “…Although BafA1-mediated disruption of lysosomal pH homeostasis is recognized to elicit a wide array of intracellular abnormalities, we did not observe any larval deaths or apparent abnormalities in morphology at the organismal level (N > 20 for each treatment) at the dose and duration of treatment employed in this study...”


      The following is the authors’ response to the previous reviews.

      eLife assessment

      The authors propose that the asymmetric segregation of the NuRD complex in C. elegans is regulated in a V-ATPase-dependent manner, that this plays a crucial role in determining the differential expression of the apoptosis activator egl-1, and that it is therefore critical for the life/death fate decision in this species. If proven, the proposed model of the V-ATPase-NuRD-EGL-1-Apoptosis cascade would shed light onto the mechanisms underlying the regulation of apoptosis fate during asymmetric cell division, and stimulate further investigation into the intricate interplay between V-ATPase, NuRD, and epigenetic modifications. However, the strength of evidence for this is currently incomplete.

      Public Review:

      Xie et al. propose that the asymmetric segregation of the NuRD complex is regulated in a V-ATPase-dependent manner, and plays a crucial role in determining the differential expression of the apoptosis activator egl-1 and thus critical for the life/death fate decision.

      While the model is very intriguing, the reviewers raised concerns regarding the rigor of the method. One issue is with statistics (either insufficient information or inadequate use of statistics), and second is the concern that the asymmetry observed may be caused by one cell dying (resulting in protein degradation, RNA degradation etc). We recommend that the authors address these issues.

      We extend our sincere thanks to the Editors and Reviewers for their insightful comments on this study.

      Major #1:

      There are still many misleading statements/conclusions that are not rigorously tested or that are logically flawed. These issues must be thoroughly addressed for this manuscript to be solid.

      (1) Asymmetry detected by scRNA seq vs. imaging may not represent the same phenomenon, thus should not be discussed as two supporting pieces of evidence for the authors' model, and importantly each method has its own flaw. First, for scRNA seq, when cells become already egl-1 positive, those cells may be already dying, and thus NuRD complex's transcripts' asymmetry may not have any significance. The data presented in FigS1D, E show that there are lots of genes (6487 out of 8624) that are decreased in dying cells. Thus, it is not convincing to claim that NuRD asymmetry is regulated by differential RNA amount.

      We agree with the reviewer's comment. Indeed, scRNA-seq reveals phenomena different from those observed in protein imaging, and NuRD asymmetry may not be regulated by differential RNA levels. Seven years ago, when we started this project, NuRD asymmetry during asymmetric neuroblast division was unknown. We first found NuRD mRNA asymmetry using scRNA-seq and then NuRD protein asymmetry using fluorescence imaging. We have documented the whole process of discovering NuRD asymmetry, although the asymmetry of NuRD complex transcripts does not necessarily imply protein asymmetry. We have revised statements related to "NuRD asymmetry being regulated by differential RNA amounts" and discussed this issue in the revised manuscript on page 14, line 2:

      " The transcript asymmetry detected by scRNA-seq may not correspond to the protein asymmetry detected by microscopic imaging. Our scRNA-seq data shows that 6487 out of 8624 genes were not detected in egl-1-positive cells, the putative apoptotic cells. Cells that are egl-1 positive may be undergoing apoptosis, rendering the asymmetry of NuRD complex transcripts insignificant in inferring protein asymmetry. Thus, the observed transcript asymmetry of the NuRD subunits between live and dead cells may be coincidental with NuRD protein asymmetry during asymmetric neuroblast division, rather than serving as a regulatory mechanism."

      (2) Regarding the NuRD protein asymmetry, there are still multiple issues. The most likely explanation of the asymmetry is purely daughter size asymmetry. Because one cell is much bigger than the other (3 times larger), NuRD components, which are not chromatin associated, would be inherited by the bigger cell 3 times more than by the smaller daughter. Then, upon nuclear envelope reformation, NuRD components will enter the nucleus, and there will be 3 times more NuRD components in the bigger daughter cell. It is possible that this is actually the underlying mechanism to regulate gene expression differentially, but this possibility is not properly acknowledged. Currently, the authors use a chromatin-associated protein (Mys-1) as a 'symmetric control', but this is not necessarily a fair comparison. For NuRD asymmetry to be meaningful, an example is needed of a protein that is non-chromatin associated in mitosis, is distributed to daughter cells in proportion to daughter cell size, and re-enters the nucleus after nuclear envelope formation to show symmetric distribution. And if daughter size asymmetry is the cause of NuRD asymmetry, other lineages that do not undergo apoptosis but exhibit daughter size asymmetry would also show NuRD asymmetry. The authors should comment on this (if such examples exist, that is fine: in those cell types, NuRD asymmetry may be used for differential gene expression, not necessarily to induce cell death, but such a comparison provides the explanation for NuRD asymmetry and puts the authors' finding in a better context).

      For more than one decade, we have meticulously explored the relationship between protein asymmetry and cell size asymmetry during ACDs of Q cells. A notable example of even protein distribution is the cytokinetic kinesin ZEN-4, as documented in our 2012 publication in the Journal of Cell Biology (Chai et al., JCB, 2012). This study, primarily focusing on the fate of the midbody post-cell division, also showcased the dynamics of GFP-tagged ZEN-4 during ACDs of QR.a cells in movie S1. Intriguingly, beyond its role in the cytokinetic ring, we observed a uniform dispersal of ZEN-4 throughout the cytoplasm. Remarkably, following cell division, ZEN-4 transitions evenly into the nuclei of the daughter cells, a phenomenon with implications yet to be fully understood. One hypothesis is that ZEN-4's nuclear localization may prevent the formation of ectopic microtubule bundles in the cytosol during interphase. Below, we present a snapshot from our original movie, clearly showing the symmetrical distribution of ZEN-4 into the nuclei of the two daughter cells.

      (3) For the analysis of protein asymmetry between two daughters in Fig S4C, the method of calibration is unclear, making it difficult to interpret the results.

      In Figure S4C, we quantified the relative total fluorescence of the Q cell, with the quantification method illustrated in Figure S4A. To further clarify our quantification approach, we have updated Figure S4A and the "Live-Cell Imaging and Quantification" section in the Materials and Methods:

      “…To determine the ratios of fluorescence intensities in the posterior to anterior half (P/A) of Q.a lineages or A/P of Q.p lineages, the cell in the mean intensity projection was divided into posterior and anterior halves. ImageJ software was used to measure the mean fluorescence intensities of two halves with background subtraction. The slide background's mean fluorescence intensity was measured in a region devoid of worm bodies. The background-subtracted mean fluorescence intensities of the two halves were divided to calculate the ratio. The same procedure was used to determine the fluorescence intensity ratios between two daughter cells. Total fluorescence intensity was the sum of the posterior and anterior fluorescence intensities or the sum of fluorescence intensities from two daughter cells (Figure S4A). …”
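
      The quantification described above can be sketched as follows. This is only an illustration of the arithmetic, not the authors' ImageJ workflow; the function names and intensity values are hypothetical.

```python
def background_subtracted(mean_intensity: float, background: float) -> float:
    """Subtract the slide background (measured away from worm bodies)."""
    return mean_intensity - background


def intensity_ratio(numerator_mean, denominator_mean, background):
    """Ratio of two background-subtracted mean intensities (e.g. P/A or A/P)."""
    num = background_subtracted(numerator_mean, background)
    den = background_subtracted(denominator_mean, background)
    if den <= 0:
        raise ValueError("denominator is at or below background")
    return num / den


def total_intensity(first_mean, second_mean, background):
    """Sum of the two background-subtracted intensities."""
    return (background_subtracted(first_mean, background)
            + background_subtracted(second_mean, background))


# Hypothetical measurements in arbitrary units.
print(intensity_ratio(300.0, 200.0, background=100.0))   # 2.0
print(total_intensity(300.0, 200.0, background=100.0))   # 300.0
```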

      (4) As for the pHluorin experiments, the authors were asked to test whether the changes in fluorescence observed are due to changes in pH or changes in the amount of pHluorin protein. They need to add a ratiometric method to this manuscript. A brief mention on Page 12, line 12 is insufficient to clarify this issue.

      We appreciate the concerns about potential changes in pH or pHluorin protein levels. While we cannot completely dismiss the impact of changes in the amount of pHluorin protein, it appears improbable that the asymmetry of pHluorin fluorescence is attributed to an asymmetric amount of pHluorin protein. This inference is supported by the observation that other fluorescent proteins, such as GFP or mCherry, did not exhibit any asymmetry during ACDs of Q cells. An example of GFP alone during the ACD of QL.p is illustrated in figure 5A from Ou and Vale, JCB, 2009. The fluorescence intensities in the large QL.pa cell and the small QL.aa are indistinguishable.

      Major #2:

      Some issues surrounding statistics must be resolved.

      (1) Fig. 1FG, 2D, 3BDEG, 5BD and 6B used either one-sample t-test or unpaired two-tailed parametric t-test for statistical comparison. These t-tests require a verification of each sample fitting to a normal distribution. The authors need to describe a statistical test used to verify a normal distribution of each sample.

      (2) Fig. 2D, 3D, and 3G have very small sample sizes (N=3-4, N=6, N=3, respectively), so it is possible that a normal distribution cannot be verified. How can the authors justify the use of the one-sample t-test and unpaired parametric t-test?

      (3) Statistical comparison in Fig. 2D and Fig. 6B should be re-assessed. For Fig. 2D, the authors need to compare the intensity ratio of HDA-1/LIN53 between sister cells dying within 35 min and those over 400 min. For Fig. 6B, they need to compare the intensity ratio of VHA-17 between DMSO- and BafA1- treated cells at the same time point after anaphase.

      We appreciate the reviewer's advice on the statistical analysis of our data. In response, we performed normality tests on the datasets presented in Figures 1F, 1G, 3B, 5B, 5D, and 6B, all of which passed the tests (as demonstrated in Figure S10). We also acknowledge the reviewer's comment on the inadequate sample sizes in Figures 2D, 3D, 3E, and 3G for fitting a normal distribution. Therefore, we have revised our statistical analysis methods for these figures and updated both the figures and their legends. The revised statistical results support the primary conclusions of this study.

      In response to the reviewer's observation regarding the small sample size in Figure 2D, which precluded normality verification, and the suggestion to compare sister cells that die within 35 minutes to those surviving over 400 minutes, we adapted our approach. We implemented the Kruskal-Wallis test to evaluate the differences among the groups. To assess the specific differences between each group and the 400 min MSpppaap group, we conducted Dunn’s multiple comparisons test. The revised Figure 2D illustrates the updated statistical significance.
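
      For readers unfamiliar with the test, the Kruskal-Wallis statistic is computed from ranks pooled across groups, which is why it does not require normality. The sketch below computes only the H statistic in plain Python on made-up data; the tie correction, the p-value, and Dunn's post hoc comparisons would in practice come from a statistics package.

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction) for two or more groups."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)

    # Average rank for each distinct value (ranks start at 1).
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j + 1 < n_total and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        i = j + 1

    sum_term = 0.0
    for group in groups:
        rank_sum = sum(rank_of[value] for value in group)
        sum_term += rank_sum ** 2 / len(group)

    return 12.0 / (n_total * (n_total + 1)) * sum_term - 3 * (n_total + 1)


# Hypothetical intensity ratios for three fully separated groups.
h = kruskal_wallis_h([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0])
print(round(h, 2))
```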

      For Figure 3D, due to the small sample size precluding normality verification, we applied the Wilcoxon test with 1 as the theoretical median. The revised Figure 3D illustrates the updated statistical significance.
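
      With samples this small, the attainable p-values are limited, which the following sketch makes concrete. It implements an exact two-sided one-sample Wilcoxon signed-rank test against a theoretical median by enumerating all sign assignments, on the assumption that this is the test referred to above; the data are invented and the function name is hypothetical.

```python
from itertools import product


def wilcoxon_signed_rank_exact(values, theoretical_median=1.0):
    """Return (W+, exact two-sided p) for a one-sample signed-rank test."""
    # Differences from the theoretical median; exact zeros are dropped.
    diffs = [v - theoretical_median for v in values if v != theoretical_median]
    n = len(diffs)

    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda idx: abs(diffs[idx]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    expected = sum(ranks) / 2
    observed_dev = abs(w_plus - expected)

    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - expected) >= observed_dev - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** n


# Hypothetical ratios, all above the theoretical median of 1.
print(wilcoxon_signed_rank_exact([1.5, 1.8, 2.1, 1.3, 1.9, 2.4]))
print(wilcoxon_signed_rank_exact([1.5, 1.8, 2.1]))
```

      Even when every value lies on the same side of the median, six observations give a two-sided p-value no smaller than 2/64 (about 0.031), and three observations no smaller than 2/8 (0.25), so significance at the 0.05 level is unreachable with N = 3.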

      For Figure 3E, where the sample size also hindered normality verification, we conducted the Kruskal-Wallis test to evaluate the overall effect. Additionally, Dunn’s multiple comparisons test was utilized to examine the differences between groups. The revised Figure 3E illustrates the updated statistical significance.

      For Figure 3G, the reviewer pointed out the small sample size and the limited statistical power due to having only three data points per group. To address this, we revised the figure to visually present each data point, aiming to more clearly illustrate the variation trends.

      For Figure 6B, following the reviewer's suggestion, we compared the DMSO group directly with the BafA1 group, updating Figure 6B to reflect this comparison as advised.

      These adjustments have been made to ensure the statistical analyses are robust and appropriate given the sample sizes and to align with the reviewer's recommendations, enhancing the clarity and accuracy of our findings.

      Recommendations for the authors:

      We recommend using grey scale (instead of 'heatmap' representation) to show the protein distribution of interest. Heatmap does not help at all, because 'total protein amount per cell' (instead of signal intensity on each pixel) is what matters in the context of this paper. Heatmap presentation does not allow readers to integrate signal intensity with their eyes.

      We thank the editor for pointing this out. We have changed heatmaps to inverted fluorescence images in grey scale.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study presents a valuable tool for searching molecular dynamics simulation data, making such data sets accessible for open science. The authors provide convincing evidence that it is possible to identify useful molecular dynamics simulation data sets and their analysis can produce valuable information.

      Public Reviews

      Reviewer #1 (Public Review):

      Summary:

      Tiemann et al. have undertaken an original study on the availability of molecular dynamics (MD) simulation datasets across the Internet. There is a widespread belief that extensive, well-curated MD datasets would enable the development of novel classes of AI models for structural biology. However, currently, there is no standard for sharing MD datasets. As generating MD datasets is energy-intensive, it is also important to facilitate the reuse of MD datasets to minimize energy consumption. Developing a universally accepted standard for depositing and curating MD datasets is a huge undertaking. The study by Tiemann et al. will be very valuable in informing policy developments toward this goal.

      Strengths:

      The study presents an original approach to addressing a growing concern in the field. It is clear that adopting a more collaborative approach could significantly enhance the impact of MD simulations in modern molecular sciences.

      The timing of the work is appropriate, given the current interest in developing AI models for describing biomolecular dynamics.

      Weaknesses:

      The study primarily focuses on one major MD engine (GROMACS), although this limitation is not significant considering the proof-of-concept nature of the study.

      We thank the reviewer for his/her comments. Moving forward, our plan includes expanding this research to encompass other MD engines used in biomolecular simulations and materials sciences, such as NAMD, CHARMM, AMBER, LAMMPS, etc. However, this requires parsing associated files to supplement the sparse metadata generally available for the related datasets.

      Reviewer #2 (Public Review):

      Summary:

      Molecular dynamics (MD) data is deposited in public, non-specialist repositories. This work starts from the premise that these data are a valuable resource as they could be used by other researchers to extract additional insights from these simulations; it could also potentially be used as training data for ML/AI approaches. The problem is that mining these data is difficult because they are not easy to find and work with. The primary goal of the authors was to discover and index these difficult-to-find MD datasets, which they call the "dark matter of the MD universe" (in contrast to data sets held in specialist databases).

      The authors developed a search strategy that avoided the use of ill-defined metadata but instead relied on the knowledge of the restricted set of file formats used in MD simulations as a true marker for the data they were looking for. Detection of MD data marked a data set as relevant with a follow-up indexing strategy of all associated content. This "explore-and-expand" strategy allowed the authors for the first time to provide a realistic census of the MD data in non-specialist repositories.
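
      The explore-and-expand idea can be illustrated with a small sketch: a dataset is flagged as relevant as soon as one of its files carries an MD-specific extension, and every file of a flagged dataset is then indexed. The record layout, dataset identifiers, and function name below are hypothetical and do not reflect the authors' code; the extensions listed are common GROMACS file types.

```python
from pathlib import PurePosixPath

# Common GROMACS file extensions used here as markers of MD data.
MD_EXTENSIONS = {".gro", ".xtc", ".trr", ".tpr", ".mdp", ".top", ".itp", ".edr"}


def explore_and_expand(datasets):
    """Return {dataset_id: all files} for datasets holding at least one MD file.

    `datasets` maps a dataset identifier to the list of its file names.
    """
    index = {}
    for dataset_id, files in datasets.items():
        # Explore: does any file look like MD data?
        has_md_file = any(
            PurePosixPath(name).suffix.lower() in MD_EXTENSIONS for name in files
        )
        if has_md_file:
            # Expand: index every file in the dataset, not only the MD ones.
            index[dataset_id] = list(files)
    return index


# Hypothetical repository listing.
listing = {
    "dataset-A": ["traj.xtc", "topol.tpr", "README.md", "analysis.py"],
    "dataset-B": ["figures.zip", "manuscript.pdf"],
}
print(explore_and_expand(listing))
```

      Keying the search on file types rather than on free-text descriptions is what keeps false positives low: a dataset cannot match unless it actually contains simulation files.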

      As a proof of principle, they analyzed a subset of the data (primarily related to simulations with the popular Gromacs MD package) to summarize the types of simulated systems (primarily biomolecular systems) and commonly used simulation settings.

      Based on their experience they propose best practices for metadata provision to make MD data FAIR (findable, accessible, interoperable, reusable).

      A prototype search engine that works on the indexed datasets is made publicly available. All data and code are made freely available as open source/open data.

      Strengths:

      The novel search strategy is based on relevant data to identify full datasets instead of relying on metadata and thus is likely to have many true positives and few false positives.

      The paper provides a first glimpse at the potential hidden treasures of MD simulations and force field parametrizations of molecules.

      Analysis of parameter settings of MD simulations from how researchers *actually* run simulations can provide valuable feedback to MD code developers for how to document/educate users. This approach is much better than analyzing what authors write in the Methods sections.

      The authors make a prototype search engine available.

      The guidelines for FAIR MD data are based on experience gained from trying to make sense of the data.

      Weaknesses:

      So far the work is a proof-of-concept that focuses on MD data produced by Gromacs (which was prevalent among all indexed and identified packages).

      As discussed in the manuscript, some types of biomolecules are likely underrepresented because different communities have different preferences for force fields/MD codes (for example: carbohydrates with AMBER/GLYCAM using AMBER MD instead of Gromacs).

      Materials sciences seem to be severely under-represented --- commonly used codes in this area such as LAMMPS are not even detected, and only very few examples could be identified. As it is, the paper primarily provides an insight into the *biomolecular* MD simulation world.

      The authors succeed in providing a first realistic view on what MD data is available in public repositories. In particular, their explore-expand approach has the potential to be customized for all kinds of specialist simulation data, whereby specific artifacts are used as fiducial markers instead of metadata. The more detailed analysis is limited to Gromacs simulations and primarily biomolecular simulations (even though MD is also widely used in other fields such as the materials sciences). This restricted view may simply be correlated with the user community of Gromacs and hopefully, follow-up studies from this work will shed more light on this shortcoming.

      The study quantified the number of trajectories currently held in structured databases as ~10k vs ~30k in generalist repositories. To go beyond the proof-of-principle analysis it would be interesting to analyze the data in specialist repositories in the same way as the data in the generalist ones, especially as there are now efforts underway to create a database for MD simulations (Grant 'Molecular dynamics simulation for biology and chemistry research' to establish MDDB, DOI 10.3030/101094651). One should note that structured databases do not invalidate the approach pioneered in this work; if anything they are orthogonal to each other and both will likely play an important role in growing the usefulness of MD simulations in the future.

      We thank the reviewer for his/her comments. As mentioned to Reviewer 1, we intend to extend this work to other MD engines in the near future to go beyond Gromacs and even biomolecular simulations. Furthermore, as the reviewer mentioned the value of accessing and indexing specialized MD databases such as MDDB, MemprotMD, GPCRmd, NMRLipids, ATLAS, and others, one of our next steps is indeed to continue expanding the MDverse catalog of MD data. This indexing may also increase the visibility and widespread adoption of these specific databases.

      Reviewer #3 (Public Review):

      Molecular dynamics (MD) simulations nowadays are an essential element of structural biology investigations, complementing experiments and aiding their interpretation by revealing transient processes or details (such as the effects of glycosylation on the SARS-CoV-2 spike protein, for example (Casalino et al. ACS Cent. Sci. 2020; 6, 10, 1722-1734 https://doi.org/10.1021/acscentsci.0c01056)) that cannot be observed directly. MD simulations can allow for the calculation of thermodynamic, kinetic, and other properties and the prediction of biological or chemical activity. MD simulations can now serve as "computational assays" (Huggins et al. WIREs Comput Mol Sci. 2019; 9:e1393. https://doi.org/10.1002/wcms.1393). Conceptually, MD simulations have played a crucial role in developing the understanding that the dynamics and conformational behaviour of biological macromolecules are essential to their function, and are shaped by evolution. Atomistic simulations range up to the billion atom scale with exascale resources (e.g. simulations of SARS-CoV-2 in a respiratory aerosol. Dommer et al. The International Journal of High Performance Computing Applications. 2023; 37:28-44. doi:10.1177/10943420221128233), while coarse-grained models allow simulations on even larger length- and timescales. Simulations with combined quantum mechanics/molecular mechanics (QM/MM) methods can investigate biochemical reactivity, and overcome limitations of empirical forcefields (Cui et al. J. Phys. Chem. B 2021; 125, 689 https://doi.org/10.1021/acs.jpcb.0c09898).

      MD simulations generate large amounts of data (e.g. structures along the MD trajectory) and increasingly, e.g. because of funder mandates for open science, these data are deposited in publicly accessible repositories. There is real potential to learn from these data en masse, not only to understand biomolecular dynamics but also to explore methodological issues. Deposition of data is haphazard and lags far behind experimental structural biology, however, and it is also hard to answer the apparently simple question of "what is out there?". This is the question that Tiemann et al. explore in this nice and important work, focusing on simulations run with the widely used GROMACS package. They develop a search strategy and identify almost 2,000 datasets from Zenodo, Figshare and Open Science Framework. This provides a very useful resource. For these datasets, they analyse features of the simulations (e.g. atomistic or coarse-grained), which provides a useful snapshot of current simulation approaches. The analysis is presented clearly and discussed insightfully. They also present a search engine to explore MD data, the MDverse data explorer, which promises to be a very useful tool.

      As the authors state: "Eventually, front-end solutions such as the MDverse data explorer tool can evolve being more user-friendly by interfacing the structures and dynamics with interactive 3D molecular viewers". This will make MD simulations accessible to non-specialists and researchers in other areas. I would envisage that this will also include approaches using interactive virtual reality for an immersive exploration of structure and dynamics, and virtual collaboration (e.g. O'Connor et al., Sci. Adv. 4, eaat2731 (2018). DOI:10.1126/sciadv.aat2731).

      The need to share data effectively, and to compare simulations and test models, was illustrated clearly in the COVID-19 pandemic, which also demonstrated a willingness and commitment to data sharing across the international community (e.g. Amaro and Mulholland, J. Chem. Inf. Model. 2020, 60, 6, 2653-2656 https://doi.org/10.1021/acs.jcim.0c00319; Computing in Science & Engineering 2020, 22, 30-36 doi: 10.1109/MCSE.2020.3024155). There are important lessons to learn here, for simulations to be reproducible and reliable, for rapid testing, for exploiting data with machine learning, and for linking to data from other approaches. Tiemann et al. discuss how to develop these links, providing good perspectives and suggestions.

      I agree completely with the statement of the authors that "Even if MD data represents only 1 % of the total volume of data stored in Zenodo, we believe it is our responsibility, as a community, to develop a better sharing and reuse of MD simulation files - and it will neither have to be particularly cumbersome nor expensive. To this end, we are proposing two solutions. First, improve practices for sharing and depositing MD data in data repositories. Second, improve the FAIRness of already available MD data notably by improving the quality of the current metadata."

      This nicely states the challenge to the biomolecular simulation community. There is a clear need for standards for MD data and associated metadata. This will also help with the development of standards of best practice in simulations. The authors provide useful and detailed recommendations for MD metadata. These recommendations should contribute to discussions on the development of standards by researchers, funders, and publishers. Community organizations (such as CCP-BioSim and HECBioSim in the UK, BioExcel, CECAM, MolSSI, learned societies etc) have an important part to play in these developments, which are vital for the future of biomolecular simulation.

      We thank the reviewer for his/her comments. Beyond the points mentioned to Reviewers 1 and 2, as the reviewer suggested, it would be of great interest to combine innovative and immersive approaches to visualize and possibly interact with the data collected. This is increasingly feasible thanks to technologies such as WebGL and programs such as Mol*, or even - as also pointed out by the reviewer - through virtual reality, for example with the mentioned Narupa framework or with the UnityMol software. For a comprehensive review on MD trajectory visualization and associated challenges, we refer to our recent review article https://doi.org/10.3389/fbinf.2024.1356659.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Some minor text editing would improve the readability of the manuscript.

      It would be very useful if the authors could share their perspectives on the best and most efficient approach to sharing datasets and code associated with a publication. My concern lies in the fact that Github, which is currently the dominant platform for sharing code, is not well-suited for hosting large MD datasets. As a result, researchers often need to adopt a workflow where code is shared on Github and datasets are stored elsewhere (e.g., Zenodo). While this is feasible, it adds extra work. Ideally, a transparent process could be developed to seamlessly share code and datasets linked to a study through a unified interface.

      We thank the reviewer for this excellent suggestion. To our knowledge, there is not yet an easy framework for jointly storing and sharing code and data linked to a scientific publication. Of course, code can be submitted to “generic” repositories along with the data, but at present these do not offer useful features such as collaborative work and history tracking to the extent that GitHub does.

      Although GitHub is indeed a suitable platform to deposit code, we strongly advise researchers to archive their code in Software Heritage. In addition to preserving source code, Software Heritage provides a unique identifier called SWHID that unambiguously makes reference to a specific version of the source code.

      So far, it is the responsibility of the authors of a scientific publication to link datasets and source code (whether in GitHub or Software Heritage) in their paper, and also to add the reverse link from the data and code sharing platforms to the paper after publication.

      As mentioned by the reviewer, a unified interface that could ease this process would significantly contribute to FAIR-ness in MD.

      Reviewer #2 (Recommendations For The Authors):

      L180: I am not aware that TRR files contain energy terms as stated here, my understanding was that EDR files primarily served that purpose.

      “…available in one dataset. Interestingly, we found 1,406 .trr files, which contain trajectory but also additional information such as velocities, energy of the system, etc. While the file is especially useful in terms of reusability, the large size (can go up to several 100GB) limits its deposition in most…”

      Indeed, our formulation was ambiguous. The EDR files contain the detailed information on energies, whereas TRR files contain numerous values from the trajectory such as coordinates, velocities, forces and to some extent also energies (https://manual.gromacs.org/current/reference-manual/file-formats.html#trr).

      L207: The text states that the total time was not available from XTC files, only the number of frames. However, XTC files record time stamps in addition to frame numbers. As long as these times are in the Gromacs standard of picoseconds, the simulation time ought to be available from XTCs.

      “…systems and the number of frames available in the files (Fig. 3-B). Of note, the frames do not directly translate to the simulation runtime - more information deposited in other files (e.g. .mdp files) is needed to determine the complete runtime of the simulation. The system was up…”.

      Thank you for the useful comment, we removed this sentence. We now mention that studying the simulation time would be of interest in the future, especially when we will perform an exhaustive analysis of XTC files.

      “Of note, as .xtc files also contain time stamps, it would be interesting to study the relationship between the time and the number of frames to get useful information about the sampling. Nevertheless, this analysis would be possible only for unbiased MD simulations. So, we would need to decipher if the .xtc file is coming from biased or unbiased simulations, which may not be trivial.”
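
      As a rough illustration of the point made in the original sentence, the planned length of a run can be estimated from the `dt` (time step, in ps) and `nsteps` parameters of a GROMACS .mdp file. The sketch below is not part of the MDverse code: the function names and the example file content are hypothetical, the parser handles only simple `key = value` lines, and the result is the planned length, which may differ from what was actually simulated if a run was extended or stopped early.

```python
def parse_mdp(text):
    """Parse simple 'key = value' lines of an .mdp file into a dict.

    Text after ';' is treated as a comment. Dashes in keys are mapped to
    underscores, since GROMACS accepts both spellings.
    """
    params = {}
    for raw_line in text.splitlines():
        line = raw_line.split(";", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip().lower().replace("-", "_")] = value.strip()
    return params


def planned_time_ns(params):
    """Planned simulation length in ns, computed as nsteps * dt."""
    dt_ps = float(params.get("dt", 0.001))   # GROMACS default time step: 0.001 ps
    nsteps = int(params.get("nsteps", 0))    # GROMACS default: 0 steps
    if nsteps < 0:
        # nsteps = -1 means "no maximum", so no length can be derived
        raise ValueError("nsteps is negative: run length is not fixed")
    return nsteps * dt_ps / 1000.0


# Hypothetical .mdp content, for illustration only
EXAMPLE_MDP = """
; production run
integrator = md
dt         = 0.002   ; ps
nsteps     = 50000000
"""

if __name__ == "__main__":
    # Roughly 100 ns for these example values
    print(planned_time_ns(parse_mdp(EXAMPLE_MDP)))
```

      For biased simulations or runs continued from checkpoints, this estimate would need to be cross-checked against the time stamps stored in the trajectory itself.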

      Analysis of MDP files: Were these standard equilibrium MD or can you distinguish biased MD or free energy calculations?

      Currently we do not distinguish between biased and unbiased MD, but in the future we may attempt to do so, e.g. by correlating it with standard equilibration force-fields/parameters, timesteps or similar. Nevertheless, a true distinction will remain challenging.

      L336: typo: pikes -> spikes (or peaks?)

      “…simulations of Lennard-Jones models (Jeon et al., 2016). Interestingly, we noticed the appearance of several pikes at 400K, 600K and 800K, which were not present before the end of the year 2022. These peaks correspond to the same study related to the stability of hydrated crystals (Dybeck et al., 2023). Overall, this analysis revealed that a wide range of temperatures have been explored,…”

      Thank you. We have corrected this typo.

      Make clear how multiple versions of data sets are handled, e.g., if v1, v2, and v3 of a dataset are provided in Zenodo then which one is counted or are all counted?

      We collected only the latest version of each dataset, as exposed by default by the Zenodo API. To reflect this, we added the following sentence to the Methods and Materials section, Initial data collection sub-section:

      “By default, the last version of the datasets was collected.”

      L248 Analysis of GRO files seems fairly narrow because PDB files are very often used for exactly the same purpose, even in the context of Gromacs simulations, not the least because it is familiar to structural biologists that may be interested in representative MD snapshots. Despite all the shortcomings of abusing the PDB format for MD, it is an attempt at increased interoperability. Perhaps the authors can make sure that readers understand that choosing GRO for analysis may give a somewhat skewed picture, even within Gromacs simulations.

      Thanks for this comment. We collected about 12,000 PDB files that could indeed be output from Gromacs simulations and easily be shared due to the universality of this format, but that could as well come from different sources (like other MD packages or the PDB database itself). We purposely decided to limit our study to files strictly associated with the Gromacs package, like MDP and XTC file types. However, we will extend our survey to all other structure-like formats and especially the PDB file type. We reflected this purpose in the following sentence (after line 281)

      “Beyond .gro files, we would like to analyze the ensemble of the ~12,000 .pdb files extracted in this study (see Figure 2-B) to better characterize the types of molecular structures deposited.”

      A simple template metadata file would be welcome (e.g., served from a GitHub/GitLab repository so that it can be improved with community input).

      Thank you for this suggestion, with which we fundamentally agree. However, generating such a file is a major task, and we believe that creating a metadata file template requires far-reaching considerations; it is therefore beyond the scope of this paper and should not be decided by a small group of researchers. Indeed, this topic requires a broad consensus among different stakeholders, from users to MD program developers and journal editors. It would be especially useful to organize dedicated workshops with representatives of all these communities to tackle this specific issue, as mentioned by Reviewer 3 in his/her public review. As a basis for this discussion, we humbly proposed at the end of this manuscript a few non-constraining guidelines based on our experience retrieving the data.

      To emphasize this statement, we added the following sentence at the end of the “Guidelines for better sharing of MD simulation data” section (line 420):

      “Converging on a set of metadata and format requires a large consensus of different stakeholders from users, to MD program developers, and journal editors. It would be especially useful to organize specific workshops with representatives of all these communities to collectively tackle this specific issue.”

      In "Data and code availability" it would be good to specify licenses in addition to stating "open source". Thank you for pointing out that GitLab/GitHub are not archives and that everyone should be strongly encouraged to submit data to stable archival repositories.

      We added the corresponding licenses for code and data in the “Data and code availability” section.

      Reviewer #3 (Recommendations For The Authors)

      The paper is well written, with very few typographical or other minor errors.

      Minor points:

      Line 468-9 "can evolve being more user-friendly" should be "can evolve to being more user-friendly", I think.

      Thank you, we have changed the wording accordingly.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable study reports on the packing of molecules in cellular compartments, such as actin-based protrusions. The study provides solid evidence for parameters that enable the building of a biophysical model of filopodia, which is required to gain a complete understanding of these important actin-based structures. Some areas of the manuscript require further clarification.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript proposes an alternative method by SDS-PAGE calibration of Halo-Myo10 signals to quantify myosin molecules at specific subcellular locations, in this specific case filopodia, in epifluorescence datasets compared to the more laborious and troublesome single molecule approaches. Based on these preliminary estimates, the authors developed further their analysis and discussed different scenarios regarding myosin 10 working models to explain intracellular diffusion and targeting to filopodia.

      Strengths:

      Overall, the paper is elegantly written and the data analysis is appropriately presented.

      Weaknesses:

      While the methodology is intriguing in its descriptive potential and could be the beginning of an interesting story, a good portion of the paper is dedicated to the discussion of hypothetical working mechanisms to explain myosin diffusion, localization, and decoration of filopodial actin that is not accompanied by the mandatory gain/loss of function studies required to sustain these claims.

      To be fair, the detailed mechanisms that we raise related to diffusion, localization, and decoration are based on extensive work by others. Many prior papers use domain deletions of Myo10 and fall in the category of gain/loss-of-function studies. It is true that we have not repeated those extensive studies, but it seems appropriate to connect with and cite their work where appropriate.

      Reviewer #2 (Public Review):

      Summary:

      The paper sought to determine the number of myosin 10 molecules per cell and localized to filopodia, where they are known to be involved in formation, transport within, and dynamics of these important actin-based protrusions. The authors used a novel method to determine the number of molecules per cell. First, they expressed HALO tagged Myo10 in U2OS cells and generated cell lysates of a certain number of cells and detected Myo10 after SDS-PAGE, with fluorescence and a stain-free method. They used a purified HALO tagged standard protein to generate a standard curve which allowed for determining Myo10 concentration in cell lysates and thus an estimate of the number of Myo10 molecules per cell. They also examined the fluorescence intensity in fixed cell images to determine the average fluorescence intensity per Myo10 molecule, which allowed the number of Myo10 molecules per region of the cell to be determined. They found a relatively small fraction of Myo10 (6%) localizes to filopodia. There are hundreds of Myo10 in each filopodium, which suggests some filopodia have more Myo10 than actin binding sites. Thus, there may be crowding of Myo10 at the tips, which could impact transport, the morphology at the tips, and dynamics of the protrusions themselves. Overall, the study forms the basis for a novel technique to estimate the number of molecules per cell and their localization to actin-based structures. The implications are also broad for understanding the role of myosins in actin protrusions, which is important for cancer metastasis and wound healing.

      Strengths:

      The paper addresses an important fundamental biological question about how many molecular motors are localized to a specific cellular compartment and how that may relate to other aspects of the compartment such as the actin cytoskeleton and the membrane. The paper demonstrates a method of estimating the number of myosin molecules per cell using the fluorescently labeled HALO tag and SDS-PAGE analysis. There are several important conclusions from this work in that it estimates the number of Myo10 molecules localized to different regions of the filopodia and the minimum number required for filopodia formation. The authors also establish a correlation between the number of filopodia-localized Myo10 molecules and the number of filopodia in the cell. Only a small percentage of Myo10 is tip localized relative to the total amount in the cell, suggesting Myo10 has to be activated to enter the filopodia compartment. The localization of Myo10 is log-normal, which suggests that clustering is a feature of this motor.

      Weaknesses:

      One main critique of this work is that the Myo10 was overexpressed. Thus, the amount in the cell body compared to the filopodia is difficult to compare to physiological conditions. The amount in the filopodia was relatively small - 100s of molecules per filopodia so this result is still interesting regardless of the overexpression. However, the overexpression should be addressed in the limitations.

      This is a reasonable perspective and we now note this caveat in the Limitations section so that readers will take note. Our goal here was to understand a system in which Myo10 is the limiting reagent for filopodia, rather than a native system that expresses high Myo10 on its own. Because U2OS cells do not express detectable levels of Myo10 (see below), the natural perturbation here is overexpressing Myo10 to stimulate filopodial growth.

      The authors have not addressed the potential for variability in transfection efficiency. The authors could examine the average fluorescence intensity per cell and if similar this may address this concern.

      Indeed, cells are heterogeneous and will naturally express different levels of Myo10 not only due to transfection efficiency, but also due to their state (cell cycle stage, motile behavior, and more). In fact, we measure the transfection efficiency of each bioreplicate and account for it in our calibration procedure. We also measure the fluorescence intensity per cell, which lets us calculate the total Myo10s per cell and the cell-to-cell variability. These Myo10 distributions across cells are shown in Fig. 1D-E.

      We note here an error that we made in applying this transfection efficiency correction in the first submission. When we obtain the total Myo10 molecules by SDS-PAGE, we should divide by the total number of transfected cells. However, due to an operator precedence error, the transfection efficiency appeared in the numerator rather than the denominator. We have now corrected this error, which has the effect of increasing the number of molecules in all of our measurements. The effect of this correction has strengthened one of the paper’s main conclusions, that Myo10 is frequently overloaded at filopodial tips.
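
      To make the nature of this error concrete, the following is a minimal sketch of how such an operator precedence slip can arise. It is an illustration only, not the analysis code used in the study: the function names and the example numbers are hypothetical.

```python
def molecules_per_transfected_cell(total_molecules, cells_loaded, transfection_efficiency):
    """Average number of molecules per transfected cell.

    The total measured by SDS-PAGE is divided by the number of transfected
    cells, i.e. cells_loaded * transfection_efficiency.
    """
    if not 0 < transfection_efficiency <= 1:
        raise ValueError("transfection_efficiency must be in (0, 1]")
    transfected_cells = cells_loaded * transfection_efficiency
    return total_molecules / transfected_cells


def molecules_per_cell_buggy(total_molecules, cells_loaded, transfection_efficiency):
    """Same calculation without parentheses.

    Division and multiplication are evaluated left to right, so the
    efficiency ends up in the numerator instead of the denominator.
    """
    return total_molecules / cells_loaded * transfection_efficiency


if __name__ == "__main__":
    # Hypothetical values, for illustration only
    total, cells, efficiency = 1e11, 1e5, 0.5
    correct = molecules_per_transfected_cell(total, cells, efficiency)
    buggy = molecules_per_cell_buggy(total, cells, efficiency)
    # The buggy value is too low by a factor of efficiency squared
    print(correct, buggy, correct / buggy)
```

      Because the efficiency is at most 1, the unparenthesized form always underestimates the count, which is consistent with the corrected values being higher.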

      The SDS PAGE method of estimating the number of molecules is quite interesting. I really like this idea. However, I feel there are a few more things to consider. The fraction of HALO tag standard and Myo10 labeled with the HALO tagged ligand is not determined directly. It is suggested that since excess HALO tagged ligand was added we can assume nearly 100% labeling. If the HALO tag standard protein is purified it should be feasible to determine the fraction of HALO tagged standard that is labeled by examining the absorbance of the protein at 280 and fluorophore at its appropriate wavelength.

      This is a fair point raised by the reviewer, and we have now measured a labeling efficiency of 90% in Supplementary Figure 2A-C. We have adjusted all values according to this labeling efficiency.

      The fraction of HALO tagged Myo10 labeled may be more challenging to determine, since it is in a cell lysate, but there may be some potential approaches (e.g. mass spec, HPLC).

      As noted, this value is considerably more challenging. Instead, we determined conditions under which labeling in cells is saturated. We have now stained with a concentration range for both fixed and live cell samples. Saturation occurs with ~0.5 μM HaloTag ligand-TMR in fixed/permeabilized cells and in live cells (Supplementary Figure 2D-E). This comparison of live cells vs. permeabilized cells allows us to say that the intact plasma membrane is not limiting labeling under these conditions.

      In Figure 1B, the stain free gel bands look relatively clean. The Myo10 is from cell lysates so it is surprising that there are not more bands. I am not surprised that the bands in the TMR fluorescence gel are clean, and I agree the fluorescence is the best way to quantitate.

      Figure 1B shows the focused view at high MW, and there is not much above Myo10. The full gel lanes shown in Supp. Fig. 1C show the expected number of bands from a cell lysate.

      In Figure 3C, the number of Myo10 molecules needed to initiate a filopodium was estimated. I wonder if the authors could have looked at live cell movies to determine that these events started with a puncta of Myo10 at the edge of the cell, and then went on to form a filopodia that elongated from the cell. How was the number of Myo10 molecules that were involved in the initiation determined? Please clarify the assumptions in making this conclusion.

      We thank the reviewer (and the other reviewers) for this excellent suggestion. We have now carried out these live cell experiments. These experiments were quite challenging, because we needed to collect snapshots of ~50 cells to measure the mean fluorescence intensity of transfected cells and then acquire movies of several cells for analysis. The U2OS cells were also highly temperature-sensitive and would retract their filopodia without objective heating.

      We have now analyzed filopodial initiation events and measured considerably more Myo10 at the first signs of accumulation: in the 100s of molecules. The dimmer spots that we measured in the first draft were likely unrelated to filopodial initiation, and we have corrected the discussion on this point.

      We now also track further growth from a stable filopodial tip (the phased-elongation mechanism from Ikebe and coworkers) and find approximately 500 molecules bud off in those events. We also track filopodial elongation rates as a function of Myo10 numbers. We have added additional live cell imaging sections that include these results.

      It is stated in the discussion that the amount of Myo10 in the filopodia exceeds the number of actin binding sites. However, since Myo10 contains membrane binding motifs and has been shown to interact with the membrane, it should be pointed out that the excess Myo10 at the tips may be interacting with the membrane and not actin, which may prevent traffic jams.

      This is also an excellent point to consider, and we have expanded the relevant discussion along these lines. We agree that the Myo10 at the filopodial tip is likely membrane-bound. We now estimate the 2D membrane area occupied by Myo10, and find that it reaches nearly full packing in many cases (under a number of assumptions that we spell out more fully in the manuscript).

      Reviewer #3 (Public Review):

      Summary:

      The unconventional myosin Myo10 (aka myosin X) is essential for filopodia formation in a number of mammalian cells. There is a good deal of interest in its role in filopodia formation and function. The manuscript describes a careful, quantitative analysis of Myo10 molecules in U2OS cells, a widely used model for studying filopodia: how many are present in the cytosol versus filopodia, and the distribution of filopodia and molecules along the cell edge. Rigorous quantification of Myo10 protein amounts in a cell and cellular compartment is critical for ultimately deciphering the cellular mechanism of Myo10 action as well as understanding the molecular composition of a Myo10-generated filopodium.

      Consistent with what is seen in images of Myo10 localization in many papers, the vast majority of Myo10 is in the cell body with only a small percentage (approximately 5%) present in filopodia puncta. Interestingly, Myo10 is not uniformly distributed along the cell edge, but rather it is unevenly localized along the cell edge with one region preferentially extending filopodia, presumably via localized activation of Myo10 motors. Calculation of total molecules present in puncta based on measurement of puncta size and measured Halo-Myo10 signal intensity shows that the concentration of motor present can vary from 3 to 225 µM. Based on an estimation of available actin binding sites, it is possible that Myo10 can be present in excess over these binding sites.
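
      For orientation, converting a molecule count in a small volume to a molar concentration is a one-line calculation. The sketch below only illustrates the unit conversion; the molecule count and volume in the example are hypothetical and are not values from the manuscript.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
LITERS_PER_CUBIC_MICRON = 1e-15


def concentration_uM(n_molecules, volume_um3):
    """Concentration in micromolar for n_molecules in a volume given in cubic microns."""
    if volume_um3 <= 0:
        raise ValueError("volume must be positive")
    moles = n_molecules / AVOGADRO
    liters = volume_um3 * LITERS_PER_CUBIC_MICRON
    return moles / liters * 1e6


if __name__ == "__main__":
    # Hypothetical punctum: 1000 molecules in 0.01 cubic microns
    print(concentration_uM(1000, 0.01))
```

      With these example inputs, the result falls within the range quoted above, which shows how a few hundred to a thousand molecules in a sub-femtoliter volume reach high micromolar concentrations.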

      Strengths:

      The work represents an important first step towards defining the molecular stoichiometry of filopodial tip proteins. The observed range of Myo10 molecules at the tip suggests that it can accommodate a fairly wide range of Myo10 motors. There is great value in studies such as this and the approach taken by the authors gives one good confidence that the numbers obtained are in the right range.

      Weaknesses:

      One caveat (see below) is that these numbers are obtained for overexpressing cells and the relevance to native levels of Myo10 in a cell is unclear.

      A similar concern was raised by Reviewer 2; please see above.

      An interesting aspect of the work is quantification of the fraction of Myo10 molecules in the cytosol versus in filopodia tips showing that the vast majority of motors are inactive in the cytosol, as is seen in images of cells. This has implications for thinking about how cells maintain this large population in the off-state and what is the mechanism of motor activation. One question raised by this work is the distinction between cytosolic Myo10 and the population found at the ‘cell edge’ and the filopodia tip. The cortical population of Myo10 is partially activated, so to speak, as it is targeted to the cortex/membrane and presumably ready to go. Providing quantification of this population of motors, that one might think of as being in a waiting room, could provide additional insight into a potential step-by-step pathway where recruitment or binding to the cortical region/plasma membrane is not by itself sufficient for activation.

      As mentioned in our response to Reviewer 2, we have now carried out quantitation in live cells to capture Myo10 transitions from cell body into filopodial movement. We attempted to identify this membrane-bound population of motors in our new live cell experiments but were unable to make convincing measurements. Notably, we see no noticeable enrichment of Myo10 at the cortex relative to the cytosol. Although we believe there is a membrane-bound waiting room (akin to the 3D-2D-1D mechanism of Molloy and Peckham), we suspect that the 2D population is diffusing too rapidly to be detected under our imaging conditions.

      Specific comments:

      (1) It is not obvious whether the analysis of numbers of Myo10 molecules in a cell that is ectopically overexpressing Myo10 is relevant for wild type cells. It would appear to be a significant excess based on the total protein stained blot shown in Fig S1E where a prominent band the size of tagged Myo10 seen in the transfected sample is almost absent in the WT control lane.

      Even “wildtype” cells vary considerably in their Myo10 expression levels. For example, melanoma cells often heavily upregulate Myo10, while these U2OS cells produce nearly none (Supplementary Figure 1E). Thus, there is no single, widely acceptable target for Myo10 expression in wildtype cells.

      Please note that the new Supplementary Figure 1E is a Myo10 Western blot, not total protein staining as before.

      Ideally, and ultimately an important approach, would be to work with a cell line expressing endogenously tagged Myo10 via genome engineering. This can be complicated in transformed cells that often have chromosomal duplications.

      Indeed, we chose U2OS cells for this work because they do not express detectable levels of Myo10, and thus we can avoid all of these complications. Here we can examine how Myo10 levels control filopodial production through ectopic expression.

      However, even though there is an excess of Myo10, it would appear that activation is still under some type of control, as the cytosolic pool is quite large and its localization to the cell edge is not uniform. But it is difficult to gauge whether the number of molecules in the filopodium is the same as would be seen in untransfected cells. Myo10 can readily walk up a filopodium, and if excess numbers of this motor are activated they would accumulate in the tip in large numbers, possibly creating a bulge, and indeed it does appear that some tips are unusually large. Then how would that relate to the normal condition?

      As noted above, the normal condition depends on the cellular system. However, endogenous Myo10 also accumulates in bulges at filopodial tips, so this is not a phenotype unique to Myo10 overexpression. For example, the images from Figure 1 of the Berg and Cheney (2002) citation show bulges from endogenous Myo10 in endothelial cells.

      (2) Measurements of the localization of Myo10 focuses in large part on ‘Myo10 punctae’. While it seems reasonable to presume that these are filopodia tips, the authors should provide readers with a clear definition of a puncta. Is it only filopodia tips, which seems to be the case? Does it include initiation sites at the cell membrane that often appear as punctae?

      We define puncta as any clusters/spots of Myo10 signal detected by segmentation, not limited to any location within the surface-attached filopodia. We exclude puncta that appear in the cell interior (~5 of which appear in Fig. 1A). These are likely dorsal filopodia, but there are few of these compared to the surface attached filopodia of U2OS cells. In Figure 2, “puncta” includes all Myo10 clusters along the filopodia shaft, though a majority happen to be tip-localized (please see Supplementary Figure 4B). We have edited the main text for clarification.

      Along those lines, the position of dim punctae along the length of a filopodium is measured (Fig 3D). The findings suggest that a given filopodium can have more than one puncta which seems at odds if a puncta is a filopodia tip. How frequently is a filopodium with two puncta seen? It would be helpful if the authors provided an example image showing the dim puncta that are not present at the tip.

      We have now provided an example image of dim puncta along filopodia in Supplementary Figure 4C.

      (3) The concentration of actin available to Myo10 is calculated based on the deduction from Nagy et al (2010) that only 4/13 of the actin monomers in a helical turn are accessible to the Myo10 motor (discussion on pg 9; Fig S4). Subsequent work (Ropars et al, 2016) has shown that the heads of the antiparallel Myo10 dimer are flattened, but the neck is rather flexible, meaning that the motor can have a variable reach (36 - 52 nm). Wouldn’t this mean that more actin could be accessible to the Myo10 motor than is calculated here?

      Although we see why the reviewer might believe otherwise, the 4/13 fraction of accessible actin holds. This fraction is obtained from consideration of the fascin-actin bundle structure alone, independent of the reach of any particular myosin motor. Every repeating layer of 13 actin subunits (or 36 nm) has 4 accessible myosin binding-sites. The remaining 9 sites are rejected because a single myosin motor domain will have a steric clash with a neighboring actin filament in the bundle. A myosin with an exceptionally long reach might reach the next 13 subunit layer, but that layer also has only 4 binding sites. Thus, we can calculate the number of binding sites per unit length along the filopodium. This number would hold for a dimeric myosin with any reach, including myosin-5 or myosin-2.
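
      The arithmetic behind this argument can be sketched as follows. The repeat length (36 nm), the 13 subunits per repeat, and the 4 accessible sites per repeat come from the text above; applying them per filament and multiplying by a filament count are simplifying assumptions made for this illustration, and the example length and filament number are hypothetical.

```python
ACCESSIBLE_SITES_PER_REPEAT = 4   # accessible myosin binding sites per repeat
SUBUNITS_PER_REPEAT = 13          # actin subunits per repeat
REPEAT_LENGTH_NM = 36.0           # length of one repeat along the filament


def accessible_sites(length_nm, n_filaments):
    """Accessible binding sites in a bundle segment.

    Assumes (for illustration) that every filament contributes 4 accessible
    sites per 36 nm repeat, independent of the reach of the myosin.
    """
    repeats = length_nm / REPEAT_LENGTH_NM
    return ACCESSIBLE_SITES_PER_REPEAT * repeats * n_filaments


def total_subunits(length_nm, n_filaments):
    """Total actin subunits in the same bundle segment."""
    return SUBUNITS_PER_REPEAT * (length_nm / REPEAT_LENGTH_NM) * n_filaments


if __name__ == "__main__":
    # Hypothetical segment: 360 nm long, 10 filaments
    sites = accessible_sites(360.0, 10)
    print(sites, sites / total_subunits(360.0, 10))
```

      The ratio of accessible sites to total subunits stays at 4/13 for any segment length, which is why a longer reach does not change the number of sites per unit length.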

      (4) Quantification of numbers of Myo10 molecules in filopodial puncta (Fig 3C) leads the authors to conclude that ‘only ten or fewer Myo10 molecules are necessary for filopodia initiation’ (pg 7, top). While this is reasonable based on the assumption that the formation of a puncta ultimately results from an initiation event, little is known about initiation events and without direct observation of coalescence of Myo10 at the cell edge that leads to formation of a filopodium, this seems rather speculative.

      As noted above, we have now performed the necessary live cell imaging of filopodial nucleation events and have updated our conclusions accordingly.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I have made a series of comments that might help the authors improve their manuscript:

      - A full calibration of the methodology would require testing a wider range of protein amounts, to exhaustively detect the dynamic range of the technique. The authors acknowledge in the discussion that “Furthermore, our estimates of molecules are predicated on the calibration curve of the Halo Standard Protein on the SDS-PAGE gels, which is likely the highest source of error on our molecule counts”. A good way of convincing a nasty reviewer is to provide a calibration with more than 3 reference points. At least this will help exclude from the analysis cells where Myo10 estimates are not in the linear regime of detection.

      We completely agree with the reviewer’s suggestion to build a robust calibration curve. The SDS gel shown in Figure 1C originally contained 4 reference points, but the highest HaloTag standard protein point oversaturated the detector at the set exposure in the TMR channel and was omitted. We have now re-run the SDS gel to include a HaloTag standard protein curve comprising 5 points, alongside all three bioreplicates from the fixed cell experiments and all three bioreplicates from the live cell experiments (updated in Figure 1B-C). We had saved frozen lysates from the original fixed cell work, so we were able to reanalyze our data with the new set of standards. The Myo10 quantities are consistent, but with much tighter CIs from the standard curve.
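
      As a rough illustration of how a multi-point standard curve can be used, the sketch below fits a straight line to standard amounts versus band signal and refuses to convert signals that fall outside the range spanned by the standards. It assumes a linear response and ordinary least squares; the actual fitting procedure is described in the Methods, and the function names here are hypothetical.

```python
def fit_line(amounts, signals):
    """Ordinary least-squares fit of signal = slope * amount + intercept."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_signal(signal, slope, intercept, signal_range):
    """Invert the standard curve, rejecting signals outside the standards."""
    lowest, highest = signal_range
    if not lowest <= signal <= highest:
        raise ValueError("signal lies outside the range covered by the standards")
    return (signal - intercept) / slope
```

      Rejecting out-of-range signals is one simple way to keep estimates within the regime that the standards actually cover, which is the concern raised by the reviewer.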

      - As already said this methodology is intriguing, however, a correlative validation with a conventional SMLM approach to address the bona-fide of the method would be ideal.

      Unfortunately, single molecule approaches for validation are impractical for us. Due to the relatively high magnification of our TIRF microscope and the large spread area of the U2OS cells, single cells typically extend beyond the field of view. We acknowledge the benefits of SMLM quantitative techniques and other approaches cited in the introduction section. To avoid the use of special tools/instruments, we offer our methodology, based on the Pollard group’s quantitative Western blotting of GFP, as a simpler alternative accessible to anyone.

      - TMR is a small ligand likely interacting also with Halo in its denatured state. However, to clear any doubts a parallel Native-PAGE investigation should be included, or if existing a specific reference should be provided.

      Perhaps there is a misunderstanding here. One of the key advantages of the HaloTag labeling system is that the engineered dehalogenase is covalently modified by the ligand (the TMR-ligand is a suicide substrate). This means that the TMR remains bound even under denaturing conditions, which allows its detection in SDS-PAGE. Native gels are unnecessary here.

      - Moreover, SDS-PAGE is run at alkaline pH, have the authors considered these points when designing the methodology? Fluorescence images were taken in PBS, which has a different pH. Could the authors, or the literature, exclude these aspects as potential pitfalls in the methodology? Also temperature is affecting fluorescence emission, but it is easier to control with certain tolerance in the room-temperature regime.

      Our method does not compare fluorescence values that cross the experimental systems (SDS-PAGE vs. microscopy). Cellular proteins and HaloTag protein standards are compared in a single setting of SDS-PAGE to obtain the average number of Myo10s per transfected cell. Likewise, all measurements on intact (live or fixed) cells are conducted in that single setting to obtain average fluorescence per cell. Thus, there is no issue with the different buffers or temperatures affecting fluorescence emission.

      - The authors should test their approach also with truncation variants of Myosin10 (for instance lacking the PH or motor domain). This is a classical approach that might prove the potential of the technique when altering the capacity of the protein to interact with a main binding partner. Also, treatments that induced filopodia formation might prove useful (i.e., hypotonic media induce filopodia formation in some fibroblast cell lines in our hands).

      The reviewer raises interesting suggestions that we aim to address in future experiments, but truncation variants and environmental perturbations are beyond the focus of the current manuscript. Here, we report on the otherwise unperturbed state when we add exogenous full-length Myo10 to the U2OS cells. But indeed, experiments with Myo10 domain truncations, PI3K and PTEN inhibition, and cargo protein / activating cofactor knock-downs (among others) are on our drawing board.

      - Most of the mechanisms hypothesized in the discussion are sound and plausible. However, the authors have chosen an experimental model where transient transfection of exogenous Myo10 in U2OS is performed. This approach poses two main and fundamental questions that are not resolved by the data provided:

      A) how do different expression levels affect the Myo10 counting?

      Our counting procedure does not assume uniform expression across a population of cells– quite the opposite, in fact. We directly measure Myo10 expression levels on a cell-by-cell basis with microscopy, once we know the number of molecules in our total pool (see the Methods for details). As an example of the final output, Figs. 1D and 1E show the total number of Myo10 molecules per cell for fixed and live cells, respectively.

      B) how does endogenous and unlabeled Myo10 hamper the bonafide of counts? The authors claimed “U2OS cells express low levels of Myo10, so there is a small population of unlabeled endogenous Myo10 unaddressed by this paper”. As presented, the low levels of endogenous Myo10 sound an arbitrary parameter, and there are no data presented that can limit if not exclude this bias in the analysis. To produce data in a genetically modified cell line with Halo-tag on the endogenous protein will represent a much cleaner system. Alternatively, the authors should look for Myo10 KO cell lines where they can back-transfect their Halo-Tagged Myo10 construct in a more consistent framework, focusing on cells with low-to-mid levels of expression.

      We agree, this is an important point to nail down (and is often neglected in the literature). We have now measured the endogenous Myo10 levels in U2OS cells by Western blotting and found that it is undetectable compared to our HaloTagged construct expression. Please see Supp. Fig 1E. Thus, for all intents and purposes, every Myo10 molecule in these experiments came from our expression plasmid. Accordingly, we have removed this caveat from the paper.

      Minor points

      - Figure 1B. To help the reader SDS-PAGE gels annotations should be clearer already from the figure.

      We have updated the annotations for clarity.

      - Methods should be organized in sections. As it stands, it is hard for the reader to look for technical details.

      We have expanded and added subsections to the Methods as requested.

      - The good practice of indicating the gene and transcript entry numbers and the primer used to amplify and clone into the backbone vectors is getting lost in many papers. I would strongly encourage the authors to add this information to the methods.

      We have included the gene entries to the methods and will include a full FASTA file of the coding sequence as supplementary information to avoid any ambiguity here.

      The authors write “It is unclear how myosins navigate to the right place at the right time, but our results support an important interplay between Myo10 and the actin network.” It is a bit scholastic to say that Myo10 and actin have an important interplay, they are major binding partners. What is the new knowledge contained in this sentence?

      Agreed– we have deleted the sentence in question.

      Reviewer #2 (Recommendations For The Authors):

      The authors should address all the weaknesses indicated in the public review.

      There were a few other places that require clarification.

      On page 4, the last paragraph. It is stated that the targeting of Myo10 was reported/proposed based on previous work (ref 31). The next few sentences are not referenced and thus likely refer to ref 31. The authors did not measure the parameters discussed in these sentences, so it is important to clarify that they are referring to previous work and not the current study.

      Indeed, the next few sentences still refer to old reference 31, so we have now edited the paragraph for clarity.

      On page 7, the reference to Figure 3A indicates a trend of higher Myo10 correlating with more filopodia. However, the reference to Figure 3B indicates that total intracellular Myo10 weakly correlates with more filopodia, yet the x-axis in Figure 3B is filopodia molecules, not intracellular Myo10. Please clarify.

      We appreciate the reviewer for catching our mistake. Those plots are now in Fig. 2 and have been edited accordingly.

      Reviewer #3 (Recommendations For The Authors):

      The Discussion of results at the end of each section is rather brief and could be expanded on a bit more.

      Previously, we were operating under the constraints of an eLife Short Report. We have now expanded the discussion for a full article.

      The authors mention that actin filaments at the tips of filopodia could be frayed, citing Medalia et al, 2007 (ref 40). That paper describes an early cryoEM analysis of filopodia from the amoeba Dictyostelium. EM images of mammalian filopodia tips, e.g. Svitkina et al, 2003, JCB, do not show quite the same organization of actin as seen in the Dictyostelium filopodia tips. However, recent work from the Bershadsky lab, Li et al, 2023, presents a few cryoEM images of tips of left-bent filopodia that are tightly adhered to a substrate and there it looks like actin filaments become disorganized in tips, along with membrane bulging. The authors should consider expanding discussion of the filopodia tips to take into account what is known for mammalian filopodia.

      We thank the reviewer for bringing these enlightening papers to our attention. We have now included these citations in the discussion.

      Fig 1D - The x-axis is a bit odd, it goes from 0 then to 2.5e+06 with no indication of the bin size. Can this be re-labelled or the scale displayed a bit differently?

      We have double-checked the axis breaks, which are large because the underlying values are large. We have also provided the bin size as requested for all histograms.

      Fig 4A - What is the bin size for the histogram?

      As above, we have now updated the figure legends (now in Fig. 3) to include the bin size.

      Methods -

      - Please provide an accession number for the Myo10 nucleotide sequence used for this work as there are at least two known isoforms.

      Thank you for noting this. We are using the full-length, not the headless isoform. We have now updated the Methods accordingly.

      - No mention is made of the SDS sample buffer used, was that also added to the sample?

      We have now updated the Methods accordingly.

      - How are samples boiled at 70 deg C? Do the authors actually mean ‘heated’?

      Indeed. We have now corrected “boiled” to “heated.”

      - Could the authors please briefly explain the connected component analysis used to identify filopodia?

      We have now updated the Methods accordingly.

      - The intensity of filopodia was determined by dividing tip intensity by the total bioreplicate sum of intensities then multiplying it by the total pool, if this reviewer understands correctly. It sounds like intensities are being averaged across a whole cell population instead of cell-by-cell. Is that correct? If so, can the authors please provide the underlying rationale for this? If not, then please better describe what was actually done.

      We apologize for the confusion. Intensities are being averaged (summed) across a whole cell population, but importantly that step is only used to obtain a scale factor that converts the fluorescence signal at the microscope to the number of molecules. We then use that scale factor for all cells imaged in the bioreplicate, to both 1) find the total Myo10 in that cell, and 2) find the total amount of that Myo10 in any given location within that cell.

      To further clarify, each bioreplicate has a known total number of Myo10 molecules associated with the number of cells loaded onto the SDS gel. From the SDS gel, we have an average number of Myo10 molecules per positively transfected cell. If 50 cell images are analyzed, then there is a Myo10 ‘total pool’ of (50 cells) * (average Myo10 molecules/cell). The fluorescence signal intensities in microscopy were summed for all cells within the bioreplicate (50 cells in this example). However, due to variation in expression, not every cell has the same signal intensity when imaged under the same conditions. It would be inaccurate to assume each cell contains the average Myo10 molecules/cell. Therefore, to get the number of molecules within a given Myo10 cell (or punctum), the summed cell (punctum) intensity was divided by the bioreplicate fluorescence signal intensity sum and multiplied by ‘total pool.’
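
      The conversion described above can be sketched in a few lines. This is a minimal illustration of the stated arithmetic, not the authors' in-house scripts; the function names are hypothetical, and the inputs are assumed to be background-corrected summed intensities from cells imaged under identical conditions within one bioreplicate.

```python
def scale_factor(cell_intensities, avg_molecules_per_cell):
    """Molecules per unit of fluorescence for one bioreplicate.

    total pool = (number of imaged cells) * (average molecules per
    transfected cell from the SDS gel); dividing by the summed intensity
    of all imaged cells gives the conversion factor.
    """
    total_pool = len(cell_intensities) * avg_molecules_per_cell
    return total_pool / sum(cell_intensities)


def molecules_per_cell(cell_intensities, avg_molecules_per_cell):
    """Molecule count for every imaged cell in the bioreplicate."""
    scale = scale_factor(cell_intensities, avg_molecules_per_cell)
    return [intensity * scale for intensity in cell_intensities]


def molecules_in_region(region_intensity, cell_intensities, avg_molecules_per_cell):
    """Molecule count for a region (e.g. one punctum) within a cell."""
    return region_intensity * scale_factor(cell_intensities, avg_molecules_per_cell)
```

      The population average enters only through the scale factor, so brighter cells are assigned proportionally more molecules, and the per-cell counts still sum to the total pool.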

      - The authors quantify Myo10 protein amounts by western blotting using Halo tag fluorescence, a method that should provide good accuracy. The results depend on the transfection efficiency and it is rarely the case that it is 100%. The authors state that they use a ‘value correction for positively transfected cells’ (pg 11). It is likely that there was a range of expression levels in the cells, how was a cut-off for classifying a cell as non-expressing determined or set?

      As described in the Methods, “microscopy was used to count the percentage of transfected cells from ~105-190 randomly surveyed cells per bioreplicate.” Cells were labeled and located with DAPI. If no TMR signal could be visually detected by microscopy, then the cell was deemed to be non-Myo10 expressing. We did not set a cutoff fluorescence value, as untransfected cells have no detectable signal. Please see Supplementary Figure 1F for examples.

      - “In-house Python scripts” are used for image analysis. Will these be made publicly available?

      Yes, we will package these up on GitHub.

    1. Author response:

      Public Reviews: 

      Reviewer #1 (Public Review): 

      This study is convincing because they performed time-resolved X-ray crystallography under different pH conditions using active/inactive metal ions and PpoI mutants, as with the activity measurements in solution in conventional enzymatic studies. Although the reaction mechanism is simple and may be a little predictable, the strength of this study is that they were able to validate that PpoI catalyzes DNA hydrolysis through "a single divalent cation" because time-resolved X-ray study often observes transient metal ions which are important for catalysis but are not predictable in previous studies with static structures such as enzyme-substrate analog-metal ion complexes. The discussion of this study is well supported by their data. This study visualized the catalytic process and mutational effects on catalysis, providing new insight into the catalytic mechanism of I-PpoI through a single divalent cation. The authors found that His98, a candidate of proton acceptor in the previous experiments, also affects the Mg2+ binding for catalysis without the direct interaction between His98 and the Mg2+ ion, suggesting that "Without a proper proton acceptor, the metal ion may be prone for dissociation without the reaction proceeding, and thus stable Mg2+ binding was not observed in crystallo without His98". In future, this interesting feature observed in I-PpoI should be investigated by biochemical, structural, and computational analyses using other metal-ion dependent nucleases. 

      We appreciate the reviewer for the positive assessment as well as all the comments and suggestions.

      Reviewer #2 (Public Review): 

      Summary: 

      Most polymerases and nucleases use two or three divalent metal ions in their catalytic functions. The family of His-Me nucleases, however, use only one divalent metal ion, along with a conserved histidine, to catalyze DNA hydrolysis. The mechanism has been studied previously but, according to the authors, it remained unclear. By use of time-resolved X-ray crystallography, this work convincingly demonstrated that only one M2+ ion is involved in the catalysis of the His-Me I-PpoI nuclease, and proposed concerted functions of the metal and the histidine.

      Strengths: 

      This work performs mechanistic studies, including the number and roles of metal ions, pH dependence, and activation mechanism, all by structural analyses, coupled with some kinetics and mutagenesis. Overall, it is a highly rigorous work. This approach was first developed in Science (2016) for a DNA polymerase, in which Yang Cao was the first author. It has subsequently been applied to just 5 to 10 enzymes by different labs, mainly to clarify two versus three metal ion mechanisms. The present study is the first one to demonstrate a single metal ion mechanism by this approach.

      Furthermore, on the basis of the quantitative correlation between the fraction of metal ion binding and the formation of product, as well as the pH dependence, and the data from site-specific mutants, the authors concluded that the functions of Mg2+ and His are a concerted process. A detailed mechanism is proposed in Figure 6. 

      Even though there are no major surprises in the results and conclusions, the time-resolved structural approach and the overall quality of the results represent a significant step forward for the His-Me family of nucleases. In addition, since the mechanism is unique among different classes of nucleases and polymerases, the work should be of interest to readers in DNA enzymology, or even mechanistic enzymology in general.

      Thank you very much for your comments and suggestions.

      Weaknesses: 

      Two relatively minor issues are raised here for consideration: 

      p. 4, last para, lines 1-2: "we next visualized the entire reaction process by soaking I-PpoI crystals in buffer....". This is a little over-stated. The structures being observed are not reaction intermediates. They are mixtures of substrates and products in the enzyme-bound state. The progress of the reaction is limited by the progress of the soaking of the metal ion. Crystallography has just been used as a tool to monitor the reaction (and provide structural information about the product). It would be more accurate to say that "we next monitored the reaction progress by soaking....". 

      We appreciate the clarification regarding the description of our experimental approach. We agree that our structures do not represent reaction intermediates but rather mixtures of substrate and product states within the enzyme-bound environment. We will revise the text accordingly to more accurately reflect our methodology.

      p. 5, the beginning of the section. The authors on one hand emphasized the quantitative correlation between Mg ion density and the product density. On the other hand, they raised the uncertainty in the quantitation of Mg2+ density versus Na+ density, thus they repeated the study with Mn2+ which has distinct anomalous signals. This is a very good approach. However, there is still no metal ion density shown in the key Figure 2A. It will be clearer to show the progress of metal ion density in a figure (in addition to just plots), whether it is Mg or Mn. 

      Thank you for your insightful comments. We recognize the importance of visualizing metal ion density alongside product density data. As you commented, distinguishing between Mg2+ and Na+ is challenging, and in Fig 2A, no distinguishable density was observed at 20s. Mn2+, with its higher electron density, is detectable even at low occupancy. To address this, we will include figure panels in Figure 3 or supplementary figures to present Mn2+ and product densities concurrently.

    1. Author response:

      a) that the investigation is very interesting and inventive, and has the potential to reveal some novel insights.

      We thank the reviewers and are excited to improve upon the manuscript through their suggestions.

      b) that the problem of temporal autocorrelation in the fMRI and behavioral data has not been dealt with clearly and convincingly

      We agree that convincingly accounting for fMRI temporal autocorrelation is important to our claims. To reduce its effects, we used field standard methods: prewhitening and autocorrelation modeling with SPM’s FAST algorithm (shown by Olszowy et al. 2019 to be superior to SPM’s default setting), as well as a high-pass filter with a 128 s cutoff. There is still some first-order autocorrelation structure present across voxels in the left hippocampal beta series: across participants there is slightly positive autocorrelation between the betas of successive decision trials, which decays to ~0 at subsequent lags. We note that our task is a narrative, and some patterns over time are expected; instead of attempting to fully eliminate all temporal structure in the data, we aim to show that the temporal distance between trials is unlikely to explain our effects.

      In the within versus between social dimension representational similarity analysis, the average temporal distance between trials is the same within and between dimensions. The clustering analysis is a between subject analysis about individual differences–and the same overall temporal structure is experienced by all participants.

      The trajectory analysis does not focus on consecutive trials across characters, but rather on consecutive trials within characters, where the time gap between successive trials is relatively large and highly variable. An average of over a minute of time elapses between successive decision trials for a given character (versus ~20 seconds across characters), which is on average almost 11 narrative slides and 3 decision trials. Across characters, the temporal gap between decision trials ranges from 12 seconds to more than 10 minutes, reducing the likelihood that temporal autocorrelation drives character-related estimates. We also highlight the shuffled choices control model, which shares the same temporal autocorrelation structure as the model of interest but had significantly poorer social location decoding–a strong indication that temporal autocorrelation alone can’t explain these results. For each participant, we shuffled their choices and re-computed trajectories that preserved the origin and end locations but produced different locations along the way. Our model decoded location significantly better than this null model, and this difference in performance can’t be explained by differences in temporal autocorrelation in the neural or behavioral data.
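
      To illustrate why shuffling choices preserves the origin and end locations while changing the path in between, here is a minimal sketch. It assumes, for illustration only, that each choice moves a character's location by a fixed step in a two-dimensional space, so that a trajectory is the running sum of those steps; the function names and the step encoding are hypothetical and are not taken from the analysis code.

```python
import random


def trajectory(steps, origin=(0, 0)):
    """Cumulative locations visited, starting from the origin."""
    x, y = origin
    path = [(x, y)]
    for dx, dy in steps:
        x += dx
        y += dy
        path.append((x, y))
    return path


def shuffled_trajectory(steps, seed=None, origin=(0, 0)):
    """Null trajectory built from the same steps in a random order.

    The sum of the steps does not depend on their order, so the origin
    and the end location are unchanged; only the intermediate locations
    can differ.
    """
    rng = random.Random(seed)
    shuffled = list(steps)
    rng.shuffle(shuffled)
    return trajectory(shuffled, origin)
```

      Because the shuffled steps occupy the same trial positions, a null model built this way keeps the trial timing, and therefore the temporal autocorrelation profile, of the original sequence.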

      In the revision, we will further address this concern. For example, we will report more details on the task structure to aid in interpretation and will more precisely characterize the temporal autocorrelation profile. Where appropriate, we will also improve on and/or add more control analyses that preserve the autocorrelation structure.

      c) that a number of important interesting questions have not been addressed: Are the differences between social partners encoded in the hippocampus? Are the social dimensions encoded in a consistent manner across social partners?

      We believe that we should be able to decode other interesting task- and relationship-related features from the hippocampal patterns, as suggested by the reviewers. In the revision, we will attempt several such analyses, while taking care to control for temporal autocorrelation.

      d) that the cluster analysis in the brain-behavior correlation analysis is not well motivated or validated and should be clarified.

      We agree with the reviewers that this clustering analysis should be better described and validated. We aimed to ask whether less diverse and distinctive cognitive representations of the relationship trajectories relate to smaller real-world social networks. This question of impoverished cognitive maps was first raised by Edward Tolman; we think it is relevant here, as well. In the revision, we will clarify its motivations and implications, and better evaluate it for its robustness. Here, we address a few comments made by the reviewers.

      Reviewer 2 noted that other analyses could be used to ask whether social cognitive map complexity relates to real-world social network complexity. While the proposed alternatives are interesting (e.g., correlating decoding accuracy with social network size), we believe these analyses ask different questions. The current co-clustering analysis was intended to estimate map complexity jointly from the behavioral and neural signatures of the social map across characters. In contrast, the spline location decoding is within character; the accuracy of this decoding does not say much about representations across characters. And although we think character decoding is an interesting possible addition to this manuscript, its accuracy may reflect other aspects of the relationships, beyond just spatial representation. Thus, we will provide a clearer and better validated version of the current analysis to address this question.

      We would also like to clarify that we did not collect the Social Network Index questionnaire in the Initial sample; as such these results are more tentative than the other analyses, due to the inability to confirm them in a separate sample. Reviewer 2 also suggests that a single outlier could drive this effect; but estimating the effect with robust regression also returns a right-tailed p < 0.05, showing that the relationship is robust to outliers.

      References

      Olszowy, W., Aston, J., Rua, C. & Williams, W.B. Accurate autocorrelation modeling substantially improves fMRI reliability. Nature Communications. (2019).

    1. Author response:

      eLife assessment:

      This important work provides another layer of regulatory mechanism for TGF-beta signaling activity. The evidence supports the involvement of microtubules as a reservoir of Smad2/3, however, additional evidence to convincingly demonstrate the functional involvement of Rudhira in this process is highly appreciated. The work will be of broad interest to developmental biologists in general and molecular biologists in the field of growth factor signaling.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary

      This manuscript aimed to study the role of Rudhira (also known as Breast Carcinoma Amplified Sequence 3), an endothelium-restricted microtubule-associated protein, in regulating TGFβ signaling. The authors demonstrate that Rudhira is a critical signaling modulator for TGFβ signaling by releasing Smad2/3 from cytoskeletal microtubules and how Rudhira is a Smad2/3 target gene. Taken together, the authors provide a model of how Rudhira contributes to TGFβ signaling activity to stabilize the microtubules, which is essential for vascular development.

      Strengths

      The study used different methods and techniques to achieve aims and support conclusions, such as Gene Ontology analysis, functional analysis in culture, immunostaining analysis, and proximity ligation assay. This study provides an unappreciated additional layer of TGFβ signaling activity regulation after ligand-receptor interaction.

      We thank the reviewer for acknowledging the importance of our study and providing a clear summary of our findings.

      Weaknesses

      (1) It is unclear how current findings provide a better understanding of Rudhira KO mice, which the authors published some years ago.

      Our previous study demonstrated that Rudhira KO mice have a predominantly developmental cardiovascular phenotype that phenocopies TGFβ loss of function (Shetty, Joshi et al., 2018). Additionally, we found that at the molecular level, Rudhira regulates cytoskeletal organization (Jain et al., 2012; Joshi and Inamdar, 2019). Our current study builds upon these previous findings, showing an essential role of Rudhira in maintaining TGFβ signaling and controlling the microtubule cytoskeleton during vascular development. On one hand Rudhira regulates TGFβ signaling by promoting the release of Smads from microtubules, while on the other, Rudhira is a TGFβ target essential for stabilizing microtubules. Thus, our current study provides a molecular basis for Rudhira function in cardiovascular development.

      (2) Why do they use HEK cells instead of SVEC cells in Figure 2 and 4 experiments?

      Our earlier studies have characterized the role of Rudhira in detail using both loss and gain of function methods in multiple cell types (Jain et al., 2012; Shetty, Joshi et al., 2018; Joshi and Inamdar, 2019). As endothelial cells are particularly difficult to transfect, and because the function of Rudhira in promoting cell migration is conserved in HEK cells, it was practical and relevant to perform these experiments in HEK cells (Figures 2 and 4E).

      (3) A model shown in Figure 5E needs improvement to grasp their findings easily.

      We have modified Figure 5E for clarity.

      Reviewer #2 (Public Review):

      Summary

      It was first reported in 2000 that Smad2/3/4 are sequestered to microtubules in resting cells and TGF-β stimulation releases Smad2/3/4 from microtubules, allowing activation of the Smad signaling pathway. Although the finding was subsequently confirmed in a few papers, the underlying mechanism has not been explored. In the present study, the authors found that Rudhira/breast carcinoma amplified sequence 3 is involved in the release of Smad2/3 from microtubules in response to TGF-β stimulation. Rudhira is also induced by TGF-β and is probably involved in the stabilization of microtubules in the delayed phase after TGF-β stimulation. Therefore, Rudhira has two important functions downstream of TGF-β in the early as well as delayed phase.

      Strengths:

      This work aimed to address an unsolved question on one of the earliest events after TGF-β stimulation. Based on loss-of-function experiments, the authors identified a novel and potentially important player, Rudhira, in the signal transmission of TGF-β.

      We thank the reviewer for the critical evaluation and appreciation of our findings.

      Weaknesses:

      The authors have identified a key player that triggers Smad2/3 release from microtubules after TGF-β stimulation, probably via its association with microtubules. This is an important first step for understanding the regulation of Smad signaling, but underlying mechanisms as well as upstream and downstream events largely remain to be elucidated.

      We acknowledge that the mechanisms regulating cytoskeletal control of Smad signaling are far from clear, but these are out of scope of this manuscript. This manuscript rather focuses on Rudhira/Bcas3 as a pivot to understand vascular TGFβ signaling and microtubule connections.

      (1) The process of how Rudhira causes the release of Smad proteins from microtubules remains unclear. The statement that "Rudhira-MT association is essential for the activation and release of Smad2/3 from MTs" (lines 33-34) is not directly supported by experimental data.

      We agree with the reviewer’s comment. Although we provide evidence that the loss of Rudhira (and thereby the deduced loss of Rudhira-MT association) prevents release of Smad2/3 from MTs (Fig 3C), it does not confirm that Rudhira-MT association is required for this. In light of this, we have modified the statement to “Rudhira associates with MTs and is essential for the activation and release of Smad2/3 from MTs”.

      (2) The process of how Rudhira is mobilized to microtubules in response to TGF-β remains unclear.

      Our previous study showed that Rudhira associates with microtubules, and preferentially binds to stable microtubules (Jain et al., 2012; Joshi and Inamdar, 2019). Since TGFβ stimulation is known to stabilize microtubules, we hypothesize that TGFβ stimulation increases Rudhira binding to stable microtubules. We have mentioned this in our revised manuscript.

      (3) After Rudhira releases Smad proteins from microtubules, Rudhira stabilizes microtubules. The process of how cells return to a resting state and recover their responsiveness to TGF-β remains unclear.

      We show that dissociation of Smads from microtubules is an early response and stabilization of microtubules is a late TGFβ response. However, we agree that the sequence of these molecular events has not been characterized in depth in this or any other study, making it difficult to assign causal roles (e.g., whether release of Smads from MTs is a prerequisite for MT stabilization by Rudhira) or reversibility. That said, the TGFβ pathway is autoregulatory, leading to increased turnover of receptors and Smads and increased expression of inhibitory Smads, which may restore responsiveness to TGFβ. Additionally, the relatively short turnover time of stable microtubules (several minutes to hours) may also promote a quick return to the resting state.

      We have discussed this in our revised manuscript.

    1. Author response:

      eLife assessment

      This important study provides new insight into the dynamics that underlie the development of therapy resistance in prostate cancer by revealing that divergent tumor evolutionary paths occur in response to different treatment timing and that these converge on common resistance mechanisms. The use of barcoded lineage tracing and characterization of isolated tumor clonal populations provides compelling evidence supporting the importance of clonal dynamics in a tumor ecosystem for treatment resistance. Several open questions remain, however, raising the possibility of alternative interpretations of the data set in its current form. Overall, the findings deepen our understanding of prostate cancer evolution and hold promising implications for how drug resistance can be addressed or prevented.

      We are pleased the reviewers found our work reporting distinct evolutionary paths to resistance based on timing of treatment to be important and supported by compelling evidence. We also acknowledge the need for additional work to clarify some details, particularly regarding the mechanism of clonal cooperativity as a catalyst of resistance.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Lee, Eugine et al. use in vivo barcoded lineage tracing to investigate the evolutionary paths to androgen receptor signaling inhibition (ARSI) resistance in two different prostate cancer clinical scenario models: measurable disease and minimal residual disease. Using two prostate cancer cell lines, LNCaP/AR and CWR22Pc, the authors find that in their minimal residual disease models, the outgrowth of pre-existing resistant clones gives rise to ARSI-resistant tumors. Interestingly, in their measurable disease model or post-engraftment ARSI setting, these pre-existing resistant clones are depleted and rather a subset of clones that give rise to treatment-naïve tumors adapt to ARSI treatment and are enriched in resistant tumors. For the LNCaP/AR cell line, characterization of pre-existing resistant clones in treatment-naïve and ARSI treatment settings reveals increased baseline androgen receptor transcriptional output as well as baseline upregulation of glucocorticoid receptor (GR) as the primary driver of pre-existing resistance. Similarly, the authors found induction of high GR expression over long-term ARSI treatment in ARSI-sensitive clones during adaptive resistance to ARSI. For CWR22Pc cells, HER3/NRG1 signaling was the primary driver for ARSI resistance in both measurable disease and minimal residual disease models. Not only were these findings consistent with the authors' previous reports of GR and NRG1/HER3 as the molecular drivers of ARSI resistance in LNCaP/AR and CWR22Pc, respectively, but they also demonstrate conserved resistance mechanisms despite pre-existing or adaptive evolutionary paths to resistance. Lastly, the authors show adaptive ARSI resistance is dependent on interclonal cooperation, where the presence of pre-existing resistant clones or "helper" clones is required to promote adaptive resistance in ARSI-sensitive clones.

      Strengths:

      The authors employ DNA barcoding, a powerful tool already demonstrated by others to track the clonal evolution of tumor populations during resistance development, to study the effects of the timing of therapy as a variable on resistance evolution. The authors use barcoding in two cell line models of prostate cancer in two clinical disease scenarios to demonstrate divergent evolutionary paths converging on common resistance mechanisms. By painstakingly isolating clones with barcodes of interest to generate clonal cell lines from treatment-naïve cell populations, the authors are able to not only characterize pre-existing resistance but also show cooperativity between resistant and drug-sensitive populations for adaptive resistance.

      Weaknesses:

      While the finding that different evolutionary paths result in common molecular drivers of ARSI resistance is novel and unexpected, this work primarily confirms the authors' previously published work identifying the resistance mechanisms in these cell lines. The impact of the work would be greater with additional studies of the specific molecular/genetic mechanisms by which cells become resistant or cooperate within a population to give rise to resistant subclones.

      We agree that additional insights into the mechanism of adaptive resistance and the role of cell-cell cooperativity are clear next steps for this work. We propose to do so through single-cell characterization (RNA-seq, ATAC-seq) of tumor evolution in a time course experiment where we can track each clone using expressed barcodes. This will allow us to explore the dynamics of interaction between the "adaptable" and "helper" clones. Unfortunately, the barcode methodology used in this initial report is DNA-based; therefore, a follow-up study using a transcribable barcode library is needed to address these fascinating questions.

      This study would also benefit from additional explanation or exploration of why the two resistance driver pathways described (GR and NRG1/Her3) are cell line specific and if there are genetic or molecular backgrounds in which specific resistance signaling is more likely to be the predominant driver of resistance.

      In the case of NRG1/HER3 pathway-mediated resistance, we know that this mechanism requires that the PTEN/PIK3CA pathway be wildtype. This is the case for the CWR22Pc model described in the manuscript. Furthermore, we have data showing that PTEN deletion in these cells rescues the phenotype, meaning that CWR22Pc cells with PTEN deletion are no longer dependent on NRG1/HER3 signaling for ARSI resistance.

      In contrast, LNCaP/AR cells are PTEN null at baseline and therefore must evolve alternative mechanisms of ARSI resistance. Since our initial identification of the GR mechanism, we and others have extended the finding to additional models (VCaP, LAPC4) (PMID: 24315100; PMID: 28191869). Another recent insight is the importance of RB1 and TP53 status in maintenance of luminal lineage identity during ARSI therapy, and the recognition of lineage plasticity as a resistance mechanism in cell lines/tumor models that lack these two tumor suppressors. In summary, baseline genetics clearly plays a role in which ARSI resistance pathway is likely to emerge. We will clarify this point in the revision with additional discussion.

      Reviewer #2 (Public Review):

      Summary

      The authors aimed to characterise the evolutionary dynamics that occur during the resistance to androgen receptor signalling inhibition, and how this differs in established tumours vs. residual disease, in prostate cancer. By using a barcoding method, they aimed to both characterise the distribution of clones that support therapy resistance in these settings, while also then being able to isolate said clones from the pre-graft population via single-cell cloning to characterise the mechanisms of resistance and dependency on cooperativity.

      While, interestingly, the timing of combination therapies has been shown to be critical to avoid cross-resistance, the timing of therapy has not been specifically considered as a factor dictating resistance pathways. Additionally, the role of residual disease and dormant populations in driving relapse is of increasing interest, yet a lot remains to be understood of these populations. The question of whether different clinical manifestations of therapy resistance follow similar evolutionary pathways to resistance is therefore interesting and relevant for the field.

      The methods applied are elegant and the body of work is substantial. The proposed divergent evolutionary pathways pose interesting questions, and the findings on cooperativity provide insight. However, whether the model truly reflects minimal residual disease to the extent that the authors suggest may limit the relevance of the findings at this stage. Certain patterns in the DNA barcoding results also call into question whether the results fully support the strong claims of the authors, or whether alternative explanations could exist. While the potential to isolate individual clones in the pre-graft setting is a great strength of the method applied and the isolation of these clones is a huge body of work in itself, the limited number of clones that could be isolated also somewhat limits the validation of the findings.

      Strengths

      Very relevant and interesting question, clear clinical relevance, applying elegant methods that hold the potential to provide a novel understanding of multiple aspects of therapy resistance, from evolutionary patterns to intracellular and cooperative mechanisms of resistance.

      The text is clearly written, logical, and the structure is easy to follow.

      Weaknesses

      (1) The extent to which the model used truly mimics residual disease

      The main conclusions of the paper are built upon results using a model for minimal residual disease. However, the extent to which this truly recapitulates minimal residual disease, particularly with regard to their focus on the timings of therapy, could be discussed further. If in the clinical setting residual disease occurs following the existence of a tumour and its microenvironment, there might be many aspects of the process that are missed when coinciding treatment with engraftment of a xenograft tumour with pre-castration. If any characterisation of the minimal residual disease was possible (such as histologically or through RNA sequencing), this may help demonstrate in what ways this model recapitulates minimal residual disease.

      We appreciate the reviewer's feedback on this point and acknowledge that the pre-ARSI setting used in our studies is not precisely identical to minimal residual disease (MRD) seen clinically, where a patient typically undergoes primary treatment (radical prostatectomy surgery or local radiotherapy) then relapses with distant disease from micrometastases that were not initially detectable. Having uncovered a key difference in the path to resistance using our pre-ARSI model, we believe our data provide a strong rationale to invest additional effort in designing newer MRD models that more closely mimic the clinical scenario, perhaps through surgical resection of a primary tumor that could “seed” micrometastases prior to therapy. We will highlight this aspect in our revised manuscript and provide clarity on the limitations and scope of our study.

      (2) Whether the observed enrichment of pre-resistant clones is truly that

      The authors strongly make the case that their barcoding experiments provide evidence for pre-existing resistance in the context of minimal residual disease. However, it seems that the clones enriched in the ARSIR tumours are consistently the most enriched clones in the pregraft. Is it possible that the high selective pressure in the pre-engraftment ARSI condition simply leads to an enrichment of the most populous clones from the pregraft? Whereas in the control setting, the reduced selective pressure at the point of engraftment allows for a wider variety of clones to establish in the tumour?

      The reviewer raises an important point about enrichment of ARSI-resistant clones in the pregraft, but we do not believe that this explains the subsequent in vivo data, for the following reasons:

      (1) The two most enriched clones in the Pre-ARSIR tumors are the second and third most enriched clones in the pre-graft, not the first (Supplementary figure 1E). If clones were enriched in resistant tumors simply because of their abundance in the starting population, we would expect the most abundant pre-graft clone to be enriched in the tumor.

      (2) By varying the androgen concentration in the pregraft culture media, we could selectively deplete or enrich the same clones enriched in the Pre-ARSIR tumors in vivo, indicating the enrichment of these clones in the resistant tumors is unlikely to be solely based on their relative frequency in the pregraft (Supplementary figure 2).

      We will clarify these points in the revised manuscript.

      Additionally, is there the possibility that the clones highly enriched in the pregraft are in fact a heterogeneous group of cells bearing the same barcode due to stochastic events in the process of viral transduction? Addressing these questions would greatly improve the study.

      The barcode library was deep sequenced to confirm even distribution of the barcodes before it was transferred from Novartis (PMID: 258491301), and we intentionally used a low multiplicity of infection (MOI) to generate barcode lines to ensure single-copy insertion. That said, we cannot entirely rule out the possibility that the second and third most enriched clones in the pregraft originated from the same ancestral clone and subsequently acquired two different barcodes. We will clarify this point in the revised manuscript.

      (3) The robustness of the subsequent work based on 1-2 pre-resistant clones

      While appreciating the volume of work involved in isolating and culturing individual pre-resistant clones, given the previous point, the conclusions would benefit from very robust validations with these single-cell clones. There are only two clones, and the results seem to focus more on one than the other, for which the data is less convincing. For instance, the Enz IC50 data, which in the case for pre-ARSI R2 is restricted to the supplementary, compares the clones A-D. In Figure S8 B, pre-ARSI R2 is compared to clone B, which is, of the four clones shown in the main figure when compared to R1, the one with the lowest Enz IC50. Therefore, while the resistant clones seem to have a significantly higher Enz IC50, comparing both clones to clones A-D may not have achieved this significance. It would also be useful to know how abundant the resistant clones were in the original barcode experiments.

      We acknowledge that studies relying on 1-2 biological samples indeed have limitations. Given our extensive prior work into the role of GR in the development of ARSI resistance (and that of other labs), we focused on demonstrating that both pre-ARSIR1 and pre-ARSIR2 clones exhibit pre-existing GR expression and are primed to further upregulate GR levels under ARSI conditions, thereby relying on GR function to sustain resistance. Given the redundancy of resistance mechanisms of the two clones, we made efforts to isolate additional clones enriched in Pre-ARSIR tumors. However, despite our attempts, we were unable to identify further clones. Pre-ARSIR1 and pre-ARSIR2 are the second and third most enriched clones in the pre-graft (2.1% and 1.7%, respectively).

      (4) The logic used in the final section requires further explanation

      In the final section, the authors suggest that a pre-ARSIR clone is able to cooperate with a pre-Intact clone to aid adaptive ARSI resistance. If this is true, then could it not be that rare, pre-resistant clones support adaptive resistance in established tumours? And, therefore, the mechanism underlying resistance could be through pre-existing resistant clones in both settings. The work would benefit from a discussion to clarify this discrepancy in the interpretation of the findings. This is particularly necessary given the strong wording the authors use regarding their findings, such as that they have provided 'conclusive evidence' for acquired resistance.

      We agree that rare, pre-resistant clones could support adaptive resistance (and therefore resistance in this adaptive setting could technically be called “pre-existing”), but it is critical to recognize that these rare, pre-resistant “helper” clones are vastly outnumbered by pre-Intact clones that “acquire” resistance through their “help.” We find this to be fascinating biology, and in the resubmission we will clarify this logic and outline future experimental approaches to unravel the mechanism.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Chowdhury and co-workers provide interesting data to support the role of G4-structures in promoting chromatin looping and long-range DNA interactions. The authors achieve this by artificially inserting a G4-containing sequence in an isolated region of the genome using CRISPR-Cas9 and comparing it to a control sequence that does not contain G4 structures. Based on the data provided, the authors can conclude that G4-insertion promotes long-range interactions (measured by Hi-C) and affects gene expression (measured by qPCR) as well as chromatin remodelling (measured by ChIP of specific histone markers).

      Whilst the data presented is promising and partially supports the authors' conclusion, this reviewer feels that some key controls are missing to fully support the narrative. Specifically, validation of actual G4-formation in chromatin by ChIP-qPCR (at least) is essential to support the association between G4-formation and looping. Moreover, this study is limited to a genomic location and an individual G4-sequence used, so the findings reported cannot yet be considered to reflect a general mechanism/effect of G4-formation in chromatin looping.

      Strengths:

      This is the first attempt to connect genomics datasets of G4s and HiC with gene expression. The use of Cas9 to artificially insert a G4 is also very elegant.

      Weaknesses:

      Lack of controls, especially to validate G4-formation after insertion with Cas9. The work is limited to a single G4-sequence and a single G4-site, which limits the generalisation of the findings.

      In the revised version we validated G4 formation inside cells at the insertion site using the reported G4-selective antibody BG4. Significant BG4 binding (by ChIP-qPCR) was clear in the G4-array insert, and not in the G4-mutated insert, supporting formation of G4s by the inserted G4-array (included as Figure S4).

      To directly address the second point, we inserted the G4-sequence, or the mutated control, at a second relatively isolated locus (at the 10 millionth position on Chr12, denoted as 10M site in text). First, BG4 ChIP was done to confirm intracellular G4 formation by the inserted array. BG4 ChIP-qPCR binding was significant within the inserted region, and not in the negative control region (Figure S8), consistent with the 79M locus. Together these demonstrate intracellular G4 formation by inserted sequences at two different loci.

      We next checked the chromatin state of the G4-array inserted at the 10M locus, or its negative control. Histone marks H3K4Me1, H3K27Ac, H3K27Me3, H3K9Me3 and H3K4Me3 were tested at the G4-array, or the negative control locus. An increase in the enhancer histone marks was evident relative to the control sequence. This was largely similar to the 79M locus, supporting an enhancer-like state. Interestingly, here we further noted the presence of the H3K27Me3 histone mark. The presence of the H3K27Me3 repressor histone mark, along with the H3K4Me1/H3K27Ac enhancer histone marks, supports a poised enhancer-like status of the inserted G4 region, as has been observed earlier in other studies. Together, although data from the two distinct G4 insertion sites support the enhancer-like state, there are contextual differences, likely due to the sequence/chromatin of the sites adjacent to the inserted sequence.

      The 10M G4 insertion, but not the G4-mutant insert, activated most of the surrounding genes tested (10 Mb window). This is consistent with the enhancer-like state of the G4-array insert and in line with the 79M G4-array insert.

      These results have been added as the final section in the revised version; the data are shown in Figure 7.

      Reviewer #2 (Public Review):

      Summary:

      Roy et al. investigated the role of non-canonical DNA structures called G-quadruplexes (G4s) in long-range chromatin interactions and gene regulation. Introducing a G4 array into chromatin significantly increased the number of long-range interactions, both within the same chromosome (cis) and between different chromosomes (trans). G4s functioned as enhancer elements, recruiting p300 and boosting gene expression even 5 megabases away. The study proposes a mechanism where G4s directly influence 3D chromatin organization, facilitating communication between regulatory elements and genes.

      Strength:

      The findings are valuable for understanding the role of G4-DNA in 3D genome organization and gene transcription.

      Weaknesses:

      The study would benefit from more robust and comprehensive data, which would add depth and clarity.

      (1) Lack of G4 Structure Confirmation: The absence of direct evidence for G4 formation within cells undermines the study's foundation. Relying solely on in vitro data and successful gene insertion is insufficient.

      Using the reported G4-specific antibody, BG4, we performed BG4 ChIP-qPCR at the 79M locus. In addition, a second G4-insertion site was created and BG4 ChIP-qPCR was used to validate intracellular G4 formation. A brief summary follows; more details are given in the response above.

      In the revised version we validated G4 formation inside cells at the insertion site using the reported G4-selective antibody BG4. Significant BG4 binding (by ChIP-qPCR) was clear in the G4-array insert, and not in the G4-mutated insert, supporting formation of G4s by the inserted G4-array (included as Figure S4).

      Further, we inserted the G4-sequence, or the mutated control, at a second relatively isolated locus (at the 10 millionth position on Chr12, denoted as 10M site in text). First, BG4 ChIP was done to confirm intracellular G4 formation by the inserted array. BG4 ChIP-qPCR binding was significant within the G4-array inserted region, and not in the negative control region (Figure S8), consistent with the 79M locus. Together these demonstrate intracellular G4 formation by inserted sequences at two different loci. These results have been added to the second and final sections of Results; the data are shown in Figures 7, S4 and S8.

      (2) Alternative Explanations: The study does not sufficiently address alternative explanations for the observed results. The inserted sequences may not form G4s or other factors like G4-RNA hybrids may be involved.

      As mentioned in response to the previous comment, we confirmed that the inserted sequence indeed forms G4s inside the cells. RNA-DNA hybrid G4s can form within R-loops with two or more tandem G-tracks (G-rich sequences) on the nascent RNA transcript as well as the non-template DNA strand (Fay et al., 2017, 28554731). A recent study has observed that R-loop-associated G4 formation can enhance chromatin looping by strengthening CTCF binding (Wulfridge et al., 2023, 37552993). As pointed out by the reviewer, the possibility of G4-RNA hybrids remains; we have mentioned this possibility for readers in the second-to-last paragraph of the Discussion.

      (3) Limited Data Depth and Clarity: ChIP-qPCR offers limited scope and considerable variation in some data makes conclusions difficult.

      We noted variation with one of the primers in a few ChIP-qPCR experiments (in Figures 2 and 3D). The changes, however, were statistically significant across replicates and consistent with the overall trend of the experiments (Figures 2, 3 and 4). Enhancer function was confirmed not only by ChIP but also by complementary assays such as 3C and RNA expression.

      (4) Statistical Significance and Interpretation: The study could be more careful in evaluating the statistical significance and magnitude of the effects to avoid overinterpreting the results.

      We reconfirmed our statistical calculations from biological replicate experiments. We carefully looked at potential overinterpretations, and made appropriate changes in the manuscript (details of the changes given below in response to comment to authors).

      Reviewer #3 (Public Review):

      Summary:

      This paper aims to demonstrate the role of G-quadruplex DNA structures in the establishment of chromosome loops. The authors introduced an array of G4s spanning 275 bp, naturally found within a very well-characterized region of the hTERT promoter, in an ectopic region devoid of G-quadruplexes and annotated genes. As a negative control, they used a mutant version of the same sequence in which G4 folding is impaired. Due to the complexity of the region, with 3 G4s on the same strand and one on the opposite strand, 12 point mutations were made simultaneously (G to T and C to A). Analysis of the 3D genome organization shows that the WT array establishes more contacts within the TAD and throughout the genome than the control array. Additionally, a slight enrichment of H3K4me1 and p300, both enhancer markers, was observed locally near the insertion site. Based on this observation, the authors tested whether the expression of genes located either nearby or up to 5 Mb away was up-regulated. They found that four genes were up-regulated from 1.5- to 3-fold. An increased interaction of the G4 array compared to the mutant was confirmed by the 3C assay. For in-depth analysis of the long-range changes, they also performed Hi-C experiments and showed a genome-wide increase in interactions of the WT array versus the mutated form.

      Strengths:

      The experiments were well-executed and the results indicate a statistical difference between the G4 array inserted cell line and the mutated modified cell line.

      Weaknesses:

      The control non-G4 sequence contains 12 point mutations, making it difficult to draw clear conclusions. These mutations not only alter the formation of G4, but also affect at least three Sp1 binding sites that have been shown to be essential for the function of the hTERT promoter, from which the sequence is derived. The strong intermingling of G4 and Sp1 binding sites makes it impossible to determine whether all the observations made are dependent on G4 or Sp1 binding. As a control, the authors used Locked Nucleic Acid probes to prevent the formation of G4. Like the mutations, these probes also interfere with two Sp1 binding sites. Therefore, using this alternative method has the same drawback as point mutations. This major issue should be discussed in the paper. It is also possible that other unidentified transcription factor binding sites are affected in the presented point mutants.

      Since the sequence we used to test the effects of G4 structure formation is highly G-rich, we had to introduce at least 12 mutations to be sure that a stable G4 structure would not form in the mutated control sequence. Sp1 has been reported to bind to G4 structures (Raiber et al., 2012). Therefore, Sp1 binding is likely to be associated with the G4-dependent enhancer functions observed here. We also appreciate that apart from Sp1, other unidentified transcription factor binding sites might be affected by the mutations we introduced. We have discussed these possibilities in the fourth paragraph of the Discussion section in the revised manuscript.
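
      The reasoning that point mutations in G-runs remove G4-forming potential can be illustrated at the sequence level. The sketch below is only an illustration and is not part of the analysis described in this response: it applies a widely used heuristic for putative quadruplex motifs (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) to a toy telomeric-repeat sequence, not to the hTERT-derived insert. The names `G4_PATTERN`, `has_putative_g4`, `TOY_WT` and `TOY_MUT` are hypothetical, and a regular-expression match indicates sequence potential only, not folding in cells.

```python
import re

# Heuristic pattern for a putative G4 motif on the given strand:
# four G-runs (>= 3 G each) separated by three loops of 1-7 nucleotides.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def has_putative_g4(seq: str) -> bool:
    """Return True if the sequence contains a putative G4 motif."""
    return G4_PATTERN.search(seq.upper()) is not None


# Toy sequences (assumptions for illustration, not the study's insert).
# TOY_MUT carries one G-to-T substitution in the middle of each G-run.
TOY_WT = "GGGTTAGGGTTAGGGTTAGGG"
TOY_MUT = "GTGTTAGTGTTAGTGTTAGTG"

if __name__ == "__main__":
    print("wild-type toy sequence:", has_putative_g4(TOY_WT))
    print("mutated toy sequence:  ", has_putative_g4(TOY_MUT))
```

      In this toy case a single substitution per G-run is enough to remove the motif. A G-rich sequence with several overlapping G-runs on both strands offers many alternative run combinations, which is in line with the need for multiple simultaneous substitutions described above.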

      Reviewer #1 (Recommendations For The Authors):

      Whilst the data presented is promising and partially supports the authors' conclusion, this reviewer feels that some key controls are missing to fully support the narrative used. Below are my main concerns:

      (1) The main thing missing in the current manuscript is to validate the actual formation of G4 in chromatin context for the repeat inserted by CRISPR-Cas. Whilst I appreciate this will form promptly a G4 in vitro, to fully support the conclusions proposed the authors would need to demonstrate actual G4-formation in cells after insertion. This could be done by ChIP-qPCR using the G4-selective antibody BG4 for example. This is an essential piece of evidence to be added to link with confidence G4-formation to chromatin looping.

      To address the concern regarding whether the inserted G4 sequence forms G4s in cells, as suggested, we used the G4-selective antibody BG4. PCR primers in the study were designed with several points in mind: primers should not bind to any site of G/C alteration in the mutated control insert; either the forward or the reverse primer should come from the adjacent region for specificity; the amplicons should cover adjacent regions to allow study of any effects on chromatin; and the PCRs should be optimized to account for the repeats within the inserted sequence. Given these, primer pairs R1-R4 were chosen for further work following optimizations (Figure 2, top panel). For BG4 ChIP-qPCR we used primer pair R2, which covered >100 bases of the inserted G4-array, or the G4-mutated control. Significant BG4 binding was clear in the G4-array insert, and not in the G4-mutated insert, demonstrating formation of G4s by the inserted G4-array (Figure S4).

      In response to comment #3 below, we inserted the G4-forming sequence (or its mutated control) at a second locus. This insertion was near the 10 millionth position of chromosome 12 (10M insertion locus in text). Here also, BG4 binding was significant within the G4-array inserted region, and not in the negative control region (Figure S8). Together these demonstrate G4 formation by the inserted sequence at two different loci.

      (2) I found the LNA experiment very elegant. However, what would be the effect of LNA treatment on the control sequence that does not form G4s? This control is essential to disentangle the effect of LNA pairing to the sequence itself vs disrupting the G4-structure.

      As per the reviewer’s suggestion, we performed a control experiment where we treated the G4-mutated insert (control) cells with the G4-disrupting LNA probes. The changes in the expression of the surrounding genes in this case were not significant, indicating that the effects observed in the G4-array insert cells were possibly due to disruption of the inserted G4 structures. This data is presented in Figure S5.

      (3) The authors describe their work and present its conclusion as if this were a genome-wide study, whilst the work is focused on a specific genomic location, and the looping, along with the effect on histone acetylation and gene expression, is limited to this. The authors cannot conclude, therefore, that this is a generic effect and the discussion should be more focused on the specific G4s used and the genomic location investigated. Ideally, insertion of a different G4-forming sequence or of the same in a different genomic location is recommended to really claim a generic effect.

      To address this, we inserted the G4-array sequence, or the G4-mutated control sequence, at another relatively isolated locus (the 10 millionth position of chromosome 12, denoted as 10M). Intracellular G4 formation was confirmed using BG4 ChIP-qPCR. We observed that the enhancer-like features, in terms of enhancer histone marks and an increase in the expression of surrounding genes, were largely reproduced at the 10M locus on G4 insertion (Figure 7). These results have been added as the final section under Results.

      Reviewer #2 (Recommendations For The Authors):

      The study proposes a mechanism where G4s directly influence 3D chromatin organization, facilitating communication between regulatory elements and genes.

      While the present manuscript presents an interesting hypothesis, it would benefit from enhanced novelty and more robust data. The study complements existing G4 research (e.g., PMID: 31177910). While the conclusions hold biological relevance, they largely reiterate established knowledge. Furthermore, the presented data appear preliminary and still lack depth and clarity.

      Hou et al., 2019 (PMID: 31177910) showed that the presence of potential G4-forming sequences correlated with TAD boundaries, along with enrichment of architectural proteins and transcription factor binding sites. Other studies also noted enrichment of potential G4-forming sequences at enhancers, along with nucleosome depletion and higher transcription factor binding (Hou et al., 2021; Williams et al., 2020). These studies proposed a role for G4s in chromatin/TAD states based on correlative bioinformatics analyses of potential G4-forming sequences. Here we sought to directly test causality. Insertion of a G4 sequence, and formation of intracellular G4s, in an isolated, G4-depleted region resulted in altered chromatin characteristics; this was not seen with the negative control insertion, which does not form G4s. These results, in contrast to earlier studies, directly demonstrate the causal role of G4s as functional elements that impact local and distant chromatin.

      Major concerns:

      (1) Lack of G4 Structure Confirmation: Implement G4-specific antibodies or fluorescent probes to verify G4 structures inside the cells.

      Detailed response given above. Briefly, in the revised version we validated G4 formation inside cells at the insertion site using the reported G4-selective antibody BG4. Significant BG4 binding (by ChIP-qPCR) was clear in the G4-array insert, and not in the G4-mutated insert, supporting formation of G4s by the inserted G4-array (included as Figure S4).

      Further, we inserted the G4-sequence, or the mutated control, at a second relatively isolated locus (at the 10 millionth position on Chr12, denoted as 10M site in text). First, BG4 ChIP was done to confirm intracellular G4 formation by the inserted array. BG4 ChIP-qPCR binding was significant within the G4-array inserted region, and not in the negative control region (Figure S8), consistent with the 79M locus. Together these demonstrate intracellular G4 formation by inserted sequences at two different loci. Added in revised text in the second and the final sections of results, data shown in Figures 7, S4 and S8.

      (2) Alternative Explanations: Explore the possibility that the sequences may not form G4s or that other factors like G4-RNA hybrids are involved.

      Response provided in the public reviews section.

      (3) Limited Data Depth and Clarity: ChIP-qPCR offers limited scope. Consider employing G4 ChIP-seq for genome-wide analysis of G4 association with histone modifications. Address inconsistencies in data like H3K27me3 variation and incomplete H3K9me3 data sets.

      A recent study performed G4 CUT&Tag (Lyu et al., 2022, PMID: 34792172) and observed G4 formation at active promoters and at both active and poised enhancers. We have discussed this in the sixth paragraph of the Discussion. H3K27Me3 occupancy at the 79M locus insertions did not show any significant G4-dependent changes. However, at the second insertion site at the 10M locus (introduced in the revised manuscript, Figure 7) there was a significant G4-dependent increase in H3K27Me3 occupancy along with the H3K4Me1 and H3K27Ac enhancer histone marks, indicating formation of a poised enhancer-like element.

      We completed the H3K9me3 data sets for both insertion sites.

      (4) Statistical Significance and Interpretation: Re-evaluate the statistical significance of results and interpret them in the context of relevant biological knowledge. Avoid overstating the impact of minor changes.     

      We revised several lines to avoid overstating results. Some of the changes are shown below (changes marked by underline/strikethrough):

      - There was a relatively modest increase in the recruitment of p300 and a substantial increase in the recruitment of the more functionally active acetylated p300/CBP to the G4-array when compared against the mutated control.

      - As expected, a decrease in the H3K4Me1 and H3K27Ac enhancer histone modifications, although modest, was evident within the insert upon LNA treatment.

      - Moreover, the enhancer marks were relatively reduced, although not markedly, when the inserted G4s were specifically disrupted.

      (5) Unexplored Aspects: Investigate the relationship between G4 DNA and R-loops, and consider the role of CTCF and cohesin proteins in mediating long-range interactions. Integrate existing research to build a more comprehensive framework and draw more robust conclusions.

      As mentioned in response to one of the earlier comments, a recent publication extensively studied the association between G4s, R-loops, and CTCF binding (Wulfridge et al., 2023). While here we focused on the primary features of a potential enhancer, further work will be necessary to establish how G4s influence the coordinated action between cohesin and CTCF and the consequent chromatin looping. We have described this for readers in the second-to-last paragraph of the Discussion in the revised version.

      Minor Concern:

      (1) Enhancer Definition: The term "enhancer" requires specific criteria. Modify the section heading or provide evidence demonstrating the G4 sequence fulfills all conditions for being an enhancer, such as position independence and long-range effects.

      Although we checked some of the primary features of a potential enhancer, such as expression of surrounding genes, enhancer histone marks, chromosomal looping interactions, and recruitment of transcriptional coactivators, further aspects may need to be validated. As suggested, in the revised manuscript the section heading has been modified to ‘Enhancer-like features emerged upon insertion of G4s.’

      Reviewer #3 (Recommendations For The Authors):

      In addition to the points in my public review, I would like to mention some less significant points.

      The authors mention that "the array of G4-forming sequences used for insertion was previously reported to form stable G4s in human cells" (Lim et al., 2010; Monsen et al., 2020; Palumbo et al., 2009). However, upon reading the publications, I found that these observations were made in vitro. I may have missed something, but there are now several mappings of folded-G4 in human cells based on different approaches. It would be beneficial to investigate whether the hTERT promoter is a site of G-quadruplex formation in vivo. If confirmed, a similar analysis should be conducted on the 275 bp region inserted into the ectopic region to determine if it also has the ability to form a structured G4.

      We performed BG4 ChIP to confirm in vivo G4 formation by the inserted G4-array as suggested (Figures S4, S8). Detailed response given above. Briefly, in the revised version we validated G4 formation inside cells at the insertion site using the reported G4-selective antibody BG4. Significant BG4 binding (by ChIP-qPCR) was clear in the G4-array insert, and not in the G4-mutated insert, supporting formation of G4s by the inserted G4-array (included as Figure S4).

      Further, we inserted the G4-sequence, or the mutated control, at a second relatively isolated locus (at the 10 millionth position on Chr12, denoted as 10M site in text). First, BG4 ChIP was done to confirm intracellular G4 formation by the inserted array. BG4 ChIP-qPCR binding was significant within the inserted region, and not in the negative control region (Figure S8), consistent with the 79M locus. Together these demonstrate intracellular G4 formation by inserted sequences at two different loci. Added in revised text in the second and the final sections of results, data shown in Figures 7, S4 and S8.

      The inserted sequence originates from a well-characterized promoter. The authors suggest that placing it in an ectopic position creates an enhancer-like region, based on the observation of increased levels of H3K27Ac and H3K4me1 on the WT array. To provide a control that it is not a promoter, it would be useful to also analyze a specific mark of promoter activity, such as H3K4me3.

      As suggested by the reviewer, we also analysed the H3K4Me3 promoter activation mark at both the 79M and 10M (introduced in the revised manuscript, Figure 7) insertion loci. We did not observe any significant G4-dependent changes in the recruitment of H3K4Me3 (Figures 2, 7).

      In the discussion, the authors mention "it was proposed that inter-molecular G4 formation between distant stretches of Gs may lead to DNA looping". To investigate this further, it would be worthwhile to examine whether the promoter regions of activated genes (PAWR, PPP1R12A, NAV3, and SLC6A15) contain potentially forming G-quadruplexes (pG4). Additionally, sites that establish more contact with the G4 array described in Figure 6F could be analyzed for enrichment in pG4.

      Thank you for pointing this out. We found that the promoters of the four genes (PAWR, PPP1R12A, NAV3, and SLC6A15) harbour potential G4-forming sequences (pG4s). Also as suggested, we analysed the contact regions in Figure 6F, along with the whole locus, for pG4s. Relative enrichment in pG4s was seen, particularly within the significantly enhanced interacting regions, and at times extending beyond them. This is shown in the lower panel of Figure 6F in the revised version. We have described this in the Discussion for readers.

    1. Author response:

      eLife assessment

      This important study addresses the idea that defective lysosomal clearance might be causal to renal dysfunction in cystinosis. They observe that restoring expression of vATPase subunits and treatment with Astaxanthin ameliorate mitochondrial function in a model of renal epithelial cells, opening opportunities for translational application to humans. The data are convincing, but the description of methodologies is incomplete.

      Public Reviews:

      Reviewer #1 (Public Review):

      Cystinosis is a rare hereditary disease caused by biallelic loss of the CTNS gene, encoding two cystinosin protein isoforms; the main isoform is expressed in lysosomal membranes where it mediates cystine efflux whereas the minor isoform is expressed at the plasma membrane and in other subcellular organelles. Sur et al proceed from the assumption that the pathways driving the cystinosis phenotype in the kidney might be identified by comparing the transcriptome profiles of normal vs CTNS-mutant proximal tubular cell lines. They argue that key transcriptional disturbances in mutant kidney cells might not be present in non-renal cells such as CTNS-mutant fibroblasts.

      Using cluster analysis of the transcriptomes, the authors selected a single vacuolar H+ATPase (ATP6V0A1) for further study, asserting that it was the "most significantly downregulated" vacuolar H+ATPase (about 58% of control) among a group of similarly downregulated H+ATPases. They then showed that exogenous ATP6V0A1 improved CTNS(-/-) RPTEC mitochondrial respiratory chain function and decreased autophagosome LC3-II accumulation, characteristic of cystinosis. The authors then treated mutant RPTECs with 3 "antioxidant" drugs, cysteamine, vitamin E, and astaxanthin (ATX). ATX (but not the other two antioxidant drugs) appeared to improve ATP6V0A1 expression, LC3-II accumulation, and mitochondrial membrane potential. Respiratory chain function was not studied. RPTEC cystine accumulation was not studied.

      In this manuscript, as an initial step, we have studied the first step in respiratory chain function by performing the Seahorse Mito Stress Test to demonstrate that the genetic manipulation (knocking out the CTNS gene and plasmid-mediated expression correction of ATP6V0A1) impacts mitochondrial energetics. We did not investigate the respirometry-based assays that can identify locations of electron transport deficiency, which we plan to address in a follow-up paper.

      We would like to draw attention to Figure 3D, where cystine accumulation has been studied. This figure demonstrates an increased intracellular accumulation of cystine.

      The major strengths of this manuscript reside in its two primary findings.

      (1) Plasmid expression of exogenous ATP6V0A1 improves mitochondrial integrity and reduces aberrant autophagosome accumulation.

      (2) Astaxanthin partially restores suboptimal endogenous ATP6V0A1 expression.

      Taken together, these observations suggest that astaxanthin might constitute a novel therapeutic strategy to ameliorate defective mitochondrial function and lysosomal clearance of autophagosomes in the cystinotic kidney. This might act synergistically with the current therapy (oral cysteamine) which facilitates defective cystine efflux from the lysosome.

      There are, however, several weaknesses in the manuscript.

      (1) The reductive approach that led from transcriptional profiling to focus on ATP6V0A1 is not transparent and weakens the argument that potential therapies should focus on correction of this one molecule vs the other H+ ATPase transcripts that were equally reduced - or transcripts among the 1925 belonging to at least 11 pathways disturbed in mutant RPTECs.

      The transcriptional profiling studies on ATP6V0A1 have been fully discussed and publicly shared. Table 2 lists the v-ATPase transcripts that are significantly downregulated in cystinosis RPTECs. We have also clarified and justified the choice of further studies on ATP6V0A1, where we state the following: "The most significantly perturbed member of the V-ATPase gene family found to be downregulated in cystinosis RPTECs is ATP6V0A1 (Table 2). Therefore, further attention was focused on characterizing the role of this particular gene in a human in vitro model of cystinosis."

      (2) A precise description of primary results is missing -- the Results section is preceded by or mixed with extensive speculation. This makes it difficult to dissect valid conclusions from those derived from less informative experiments (eg data on CDME loading, data on whole-cell pH instead of lysosomal pH, etc).

      We appreciate the reviewer highlighting areas for further improving the manuscript's readership. In our resubmission, we have revised the results section to provide a more precise description of the primary findings and restrict the inferences to the discussion section only.

      (3) Data on experimental approaches that turned out to be uninformative (eg CDME loading, or data on whole-cell pH assessment with BCECF).

      We have provided these data regardless of whether they were informative. Although a lysosome-specific pH measurement would be important, it was not possible in our cells, as they were very sick and the assay did not work. Hence we provide data on pH assessment with BCECF, which reports whole-cell pH, that is, the overall pH of the cytoplasm and organelles combined.

      (4) The rationale for the study of ATX is unclear and the mechanism by which it improves mitochondrial integrity and autophagosome accumulation is not explored (but does not appear to depend on its anti-oxidant properties).

      We have provided the rationale for studying ATX in the Introduction and Results sections, where we state the following: “correction of ATP6V0A1 in CTNS-/- RPTECs and treatment with antioxidants specifically, astaxanthin (ATX) increased the production of cellular ATP6V0A1, identified from a custom FDA-drug database generated by our group, partially rescued the nephropathic RPTEC phenotype. ATX is a xanthophyll carotenoid occurring in a wide variety of organisms. ATX is reported to have the highest known antioxidant activity and has proven to have various anti-inflammatory, anti-tumoral, immunomodulatory, anti-cancer, and cytoprotective activities both in vivo and in vitro”.

      We are still investigating the mechanism by which ATX improves mitochondrial integrity and this will be the focus of a follow-on manuscript.

      (5) Thoughtful discussion on the lack of effect of ATP6V0A1 correction on cystine efflux from the lysosome is warranted, since this is presumably sensitive to intralysosomal pH.

      In the revised manuscript, we have added a discussion of some possible mechanisms that may underlie the effect of ATP6V0A1 correction on cystine efflux from the lysosome.

      (6) Comparisons between RPTECs and fibroblasts cannot take into account the effects of immortalization on cell phenotype (not performed in fibroblasts).

      The purpose of examining different tissue sources of primary cells in nephropathic cystinosis was to assess if any of the changes in these cells were tissue source specific. We used primary cells isolated from patients with nephropathic cystinosis—RPTECs from patients' urine and fibroblasts from patients' skin—these cells are not immortalized and can therefore be compared. This is noted in the results section - “Specific transcriptional signatures are observed in cystinotic skin-fibroblasts and RPTECs obtained from the same individual with cystinosis versus their healthy counterparts”.

      We next utilized the immortalized RPTEC cell line to create CRISPR-mediated CTNS knockout RPTECs as a resource for studying the pathophysiology of cystinosis. These cells were not compared to the primary fibroblasts.

      (7) This work will be of interest to the research community but is self-described as a pilot study. It remains to be clarified whether transient transfection of RPTECs with other H+ATPases could achieve results comparable to ATP6V0A1. Some insight into the mechanism by which ATX exerts its effects on RPTECs is needed to understand its potential for the treatment of cystinosis.

      In future studies we will further investigate the effect of ATX on RPTECs for the treatment of cystinosis; this will require Phase 1 and Phase 2 clinical studies, which are beyond the scope of the current manuscript.

      Reviewer #2 (Public Review):

      Sur and colleagues investigate the role of ATP6V0A1 in mitochondrial function in cystinotic proximal tubule cells. They propose that loss of cystinosin downregulates ATP6V0A1 resulting in acidic lysosomal pH loss, and adversely modulates mitochondrial function and lifespan in cystinotic RPTECs. They further investigate the use of a novel therapeutic Astaxanthin (ATX) to upregulate ATP6V0A1 that may improve mitochondrial function in cystinotic proximal tubules.

      The new information regarding the specific proximal tubular injuries in cystinosis identifies potential molecular targets for treatment. As such, the authors are advancing the field in an experimental model for potential translational application to humans.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Weaknesses:

      The authors fail to truly define codon optimality, rare codons, and stalling sequences in their work, all of which are distinct terminologies. They use reporters with rare codon usage but do not mention what metrics they use to determine this, such as cAI, codon usage bias, or tAI. The distinction between the type of codon sequences that DDX6 affects is very important to differentiate and should be done here as certain stretches of codons are known to lead to different quality control RNA decay pathways that are not reliant on canonical mRNA decay factors.

      We thank the reviewer for the feedback on our work. Clearly defining codon optimality, rare codons, and stalling sequences is indeed crucial. We will emphasize this distinction more in our revisions to help readers better understand our analysis and findings.

      Likewise, the authors sort their Ribo-seq data to determine genes that might exhibit a DDX6 specific mRNA decay effect but fail to go into great depth about common features shared among these genes other than GO term analysis, GC content, and coding sequence (CDS) length. The authors then sort out 35 genes that are both upregulated at the mRNA level and have increased local ribosome footprint along the ORF. They are then able to show that 6 out of 9 of those genes had a DDX6-dependent mRNA decay effect. There was no comment or effort as to why 2 out of those 6 genes tested did not show as strong of a DDX6-dependent decay effect relative to the other targets tested. Thus, the efforts to identify mRNA features at a global level that exhibited DDX6-dependent mRNA decay effects are lacking in this analysis.

      We appreciate the reviewer's insightful comments regarding the need to further characterize the genes influenced by DDX6-mediated mRNA decay. To address this, we carried out additional analyses to identify potential traits of these genes. Our findings revealed that DDX6-regulated coding sequences tend to be longer and exhibit lower predicted mRNA stability scores compared to the average across the transcriptome. This observation indicates a possible connection to codon optimality. It suggests that DDX6 could play a role in regulating a specific subset of mRNAs with inherently lower stability, potentially shedding light on why some genes may exhibit varied decay patterns when DDX6 is depleted.

      Overall, the work done by Weber et al. is sound, with the proper controls. The authors expand significantly on the knowledge of what we know about DDX6 in the process of mRNA decay in humans, confirming the evolutionary conservation of the role of this factor across eukaryotes. The analysis of the RNA-seq and Ribo-seq data could be more in-depth, however, the authors were able to show with certainty that some transcripts containing known repetitive sequences or polybasic sequences exhibited a DDX6-mRNA decay effect.

      We appreciate the reviewer’s acknowledgment of the soundness of our work and the inclusion of proper controls. We are committed to refining our manuscript to meet your expectations and ensure the accuracy and depth of our findings.

      Reviewer #2 (Public Review):

      The experiments were well-performed, and the results clearly demonstrated the requirement of DDX6 in mRNA degradation induced by slowed ribosomes. However, in some cases, the authors interpreted their data in a biased way, possibly influenced by the yeast study, and drew too strong conclusions. In addition, the authors should have cited important studies about codon optimality in mammalian cells. This lack of information hinders placing their important discoveries in a correct context.

      (1) Although the authors concluded that DDX6 acts as a sensor of the slowed ribosome, it is not clear if DDX6 indeed senses the ribosome speed. What the authors showed is a requirement of DDX6 for mRNA decay induced by rare codons, and DDX6 binds to the ribosome to exert this role. For example, DDX6 may bridge the sensor and decay machinery on the ribosome. Without structural or biochemical data on the recognition of the slowed ribosome by DDX6, the role of DDX6 as a sensor remains one of the possible models. It should be described in the discussion section.

      We greatly appreciate the reviewer’s comments and suggestions. We agree that our study does not directly establish that DDX6 senses ribosome speed. We also agree that without structural or biochemical data demonstrating recognition of the slowed ribosome by DDX6, the role of DDX6 as a sensor remains one of the possible models. We will incorporate this point into the discussion section and acknowledge it as an important direction for future research.

      (2) It is not clear if DDX6 directly binds the ribosome. The authors used ribosomes purified by sucrose cushion, but ribosome-associating and FDF motif-interacting factors might remain on ribosomes, even after RNaseI treatment. Without structural or biochemical data of the direct interaction between the ribosome and DDX6, the authors should avoid description as if DDX6 directly binds to the ribosome.

      We agree with the reviewer’s perspective that, even after RNase I treatment, factors associated with the ribosome and interacting with the FDF motif might still remain on the ribosomes that were purified via a sucrose cushion. In the revised manuscript, we will describe the relationship between DDX6 and the ribosome more cautiously, avoiding the depiction of DDX6 directly binding to the ribosome.

      (3) Although the authors performed rigorous reporter assays recapitulating the effect of ribosome-retardation sequences on mRNA stability, this is not the first report showing that codon optimality determines mRNA stability in human cells. The authors did not cite important previous studies, such as Wu et al., 2019 (PMID: 31012849), Hia et al., 2019 (PMID: 31482640), Narula et al., 2019 (PMID: 31527111), and Forrest et al., 2020 (PMID: 32053646). These milestone papers should be cited in the Introduction, Results, and Discussion.

      We thank the reviewer for this correction and apologize for the oversight in our references. In the revised manuscript, we will ensure these key studies are appropriately cited.

      (4) While both DDX6 and deadenylation by the CCR4-NOT complex were required for mRNA decay by the slowed ribosome, whether DDX6 is required for deadenylation was not investigated. Given that the CCR4-NOT deadenylase complex directly interacts with the empty ribosome E-site in yeast and humans (Buschauer et al., 2020 PMID: 32299921 and Absmeier et al., 2023 PMID: 37653243), whether the loss of DDX6 also affected the action of the CCR4-NOT complex is an important point to investigate, or at least should be discussed in this paper.

      We sincerely appreciate the reviewer's valuable suggestions. This point is indeed crucial, and we have addressed it in the revised version of our manuscript. We have included experimental results confirming that the knockout of DDX6 does not impact the CCR4-NOT complex’s deadenylation function. This addition will contribute to a more comprehensive discussion of the relevant issues and refine our manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The authors should explain what they use to determine rare codons in their system and distinguish this feature from codon optimality. Codon optimality is a distinct feature from rare codon usage, and both should be defined better in the context of the paper. The authors interchange between the use of codon optimality, rare codon usage, and translation stalling sequences frequently and should explain and clarify these terms or consider only referring to translation stalling sequences for their discussion.

      We appreciate the reviewer's valuable feedback, which has allowed us to improve the clarity and rigor of the relevant statements in the manuscript. In the revised manuscript, we have provided more explicit and detailed explanations of how rare codons are defined and used, and have differentiated this from codon optimality, to help readers better understand the basis of our analysis and findings. Furthermore, we now refer exclusively to 'translation stalling sequences' in our discussion for greater clarity.

      Reviewer #2 (Recommendations For The Authors):

      Interestingly, the translation efficiency of zinc-finger domain mRNAs was increased in DDX6 KO cells. This finding is consistent with the previous study reporting that mRNAs encoding zinc-finger domains are enriched with non-optimal codons and unstable. (Diez et al., 2022 PMID: 35840631). The authors might want to cite this paper and mention the consistency of the two studies.

      Thank you for noting the relevance of the increased translation efficiency of zinc-finger domain mRNAs in DDX6 KO cells. We will reference the study by Diez et al. (2022) and emphasize the consistency between their findings and ours, which supports the idea that DDX6 is involved in regulating the translation of mRNAs with these characteristics.

      A mutagenesis analysis of the poly-basic residues of BMP2 would further strengthen the authors' claim that this sequence is a primal cause of ribosome slowdown and mRNA decay.

      We greatly appreciate the reviewer’s suggestion to conduct a mutagenesis analysis of the poly-basic residues of BMP2. We agree that such an analysis could potentially strengthen our claim. However, given the constraints we currently face, and because our study already provides substantial evidence to support our findings, we believe that this analysis is not the most immediate priority at this stage of our research. We will consider undertaking a mutagenesis analysis in future studies to further validate our conclusions.

      In the Introduction, RQC is not commonly referred to as "ribosome-based quality control." Please consider the use of "ribosome-associated quality control."

      We appreciate the reviewer providing this suggestion. During the revision process, we corrected the relevant terminology to ensure more precise and appropriate usage.

      In the Introduction, the authors should avoid introducing NMD as a part of RQC. NMD was discovered and defined independently of RQC.

      Thank you for pointing out this important distinction. We recognize that NMD was discovered and defined independently from RQC, and should not be presented as an integral part of the RQC process. In the revised manuscript, we have made sure to avoid introducing nonsense-mediated decay (NMD) as a component of ribosome-associated quality control (RQC).

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1:

      Detection of early-stage colorectal cancer is of great importance. Laboratory scientists and clinicians have reported different exosomal biomarkers to identify colorectal cancer patients. This is a proof-of-principle study of whether exosomal RNAs, and particularly predicted lncRNAs, are potential biomarkers of early-stage colorectal cancer and its precancerous lesions.

      Strengths:

      The study provides a valuable dataset of the whole-transcriptomic profile of circulating sEVs, including miRNA, mRNA, and lncRNA. This approach adds to the understanding of sEV-RNAs' role in CRC carcinogenesis and facilitates the discovery of potential biomarkers.

      The developed 60-gene t-SNE model successfully differentiated T1a stage CRC/AA from normal controls with high specificity and sensitivity, indicating the potential of sEV-RNAs as diagnostic markers for early-stage colorectal lesions.

      The study combines RNA-seq, RT-qPCR, and modelling algorithms to select and validate candidate sEV-RNAs, maximising the performance of the developed RNA signature. The comparison of different algorithms and consideration of other factors enhance the robustness of the findings.

      Weaknesses:

      Validation in larger cohorts would be required to establish these sEV-RNAs as biomarkers, and to demonstrate whether the predicted lncRNAs implicated in these biomarkers are indeed present, and whether they are robustly predictive/prognostic.

      Thank you for your careful evaluation and valuable suggestions, which have provided valuable guidance for the improvement of our paper. In response to your feedback, we have implemented the following improvements.

      (1) More detail about how lncRNA and miRNA candidates were defined, and how this compares to previously published miRNA and lncRNA predictions. The Suppl Methods section for lncRNAs does not describe in detail how the "CPC/CNCI/Pfam" "methods" were combined to define lncRNAs here.

      Author response and action taken: Thanks for your comments. In the Supplementary Methods section titled "Selection of Predictive Biomarkers", we have provided a more detailed illustration regarding the screening process for candidate RNA biomarkers. The revised section is as follows: To ensure the predictive performance of the sEV-RNA signature, candidate sEV-RNAs were ultimately selected based on their fold change in colorectal cancer/precancerous advanced adenoma, absolute abundance, and module attribution. In detail, we initially selected the top 10 RNAs from each category (mRNA, miRNA, and lncRNA) with a fold change greater than 4. In cases where fewer than 10 RNAs met this criterion, all RNAs with a fold change greater than 4 were included. Subsequently, we filtered out RNAs with low abundance, and we selected the top-ranked RNAs from each module based on the fold change ranking for inclusion in the final model.
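
      The three-step screen described in this response can be sketched as follows. This is only an illustrative sketch: the record fields, the `min_abundance` threshold, and the `per_module` count are assumptions introduced here, since the response does not state the abundance cutoff or how many RNAs were kept per module.

```python
from collections import defaultdict


def select_candidates(rnas, fc_cutoff=4.0, top_n=10, min_abundance=1.0, per_module=1):
    """Screen candidate RNAs by fold change, abundance, and module membership.

    `rnas` is a list of dicts with the (hypothetical) keys
    name, category, fold_change, abundance, module.
    `min_abundance` and `per_module` are assumed values, not taken from the study.
    """
    # Step 1: per category, keep at most `top_n` RNAs with fold change above the cutoff.
    by_category = defaultdict(list)
    for rna in rnas:
        if rna["fold_change"] > fc_cutoff:
            by_category[rna["category"]].append(rna)
    shortlisted = []
    for members in by_category.values():
        members.sort(key=lambda r: r["fold_change"], reverse=True)
        shortlisted.extend(members[:top_n])

    # Step 2: drop RNAs whose abundance falls below the (assumed) threshold.
    abundant = [r for r in shortlisted if r["abundance"] >= min_abundance]

    # Step 3: within each module, keep the top-ranked RNAs by fold change.
    by_module = defaultdict(list)
    for rna in abundant:
        by_module[rna["module"]].append(rna)
    selected = []
    for members in by_module.values():
        members.sort(key=lambda r: r["fold_change"], reverse=True)
        selected.extend(members[:per_module])
    return selected
```

      When a category has fewer than `top_n` RNAs above the cutoff, the slice in step 1 simply returns all of them, which matches the fallback rule stated in the response.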

      Compared to most previous studies on EV biomarkers, the overall discriminative performance of the biomarker model we constructed is considerable, holding clinical value for practical application. In contrast, the supplementary merit of this study lies in uncovering the heterogeneity at the whole transcriptome level among samples of different categories, providing a more comprehensive insight into the dynamic changes of biological states. For instance, we inferred the cell subtypes of EV origins through ssGSEA and correlated them with the tumor microenvironment status. The regulatory relationships among different RNA categories were delineated, and their impacts on biological signaling pathways were analyzed, a feat challenging to accomplish solely through sequencing of a single RNA category.

      In the Supplementary Methods section titled "Identification of mRNAs and lncRNAs", we have provided a more detailed explanation regarding how the "CPC/CNCI/Pfam" methods were combined to define lncRNAs. The revised section is as follows: Three computational approaches including CPC (Coding Potential Calculator)/CNCI (Coding-Non-Coding Index)/Pfam were combined to sort non-protein coding RNA candidates from putative protein-coding RNAs in the unknown transcripts. CPC is a sequence alignment-based tool used to assess protein-coding capacity. By aligning transcripts with known protein databases, CPC evaluates the biological sequence characteristics of each coding frame of the transcript to determine its coding potential and identify non-coding RNAs (1). CNCI analysis is a method used to distinguish between coding and non-coding transcripts based on adjacent nucleotide triplets. This tool does not rely on known annotation files and can effectively predict incomplete transcripts and antisense transcript pairs (2). Pfam divides protein domains into different protein families and establishes statistical models for the amino acid sequences of each family through protein sequence alignment (3). Transcripts that can be aligned are considered to have a certain protein domain, indicating coding potential, while transcripts without alignment results are potential lncRNAs. Putative protein-coding RNAs were filtered out using a minimum length and exon number threshold. Transcripts above 200 nt with more than two exons were selected as lncRNA candidates and further screened by CPC/CNCI/Pfam. We distinguished lncRNAs from protein-coding genes by intersecting the results of the three determination methods mentioned above.
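
      The filtering logic described above (a length and exon-number filter followed by intersection of the three tools' calls) can be illustrated with a minimal sketch. The function name and input layout are hypothetical; the per-tool sets are assumed to have been produced beforehand by running CPC, CNCI, and a Pfam domain search, none of which is invoked here.

```python
def lncrna_candidates(transcripts, cpc_noncoding, cnci_noncoding, pfam_no_domain,
                      min_length=200, min_exons=2):
    """Return transcript IDs that pass the structural filter and all three tools.

    transcripts: dict mapping transcript ID -> (length_nt, exon_count)
    cpc_noncoding, cnci_noncoding: IDs each tool classified as non-coding
    pfam_no_domain: IDs with no Pfam domain alignment
    """
    # Structural filter as worded in the response: longer than 200 nt
    # and more than two exons (both comparisons are strict).
    structural = {
        t_id
        for t_id, (length, n_exons) in transcripts.items()
        if length > min_length and n_exons > min_exons
    }
    # A transcript is kept only if every method agrees it lacks coding potential.
    return structural & set(cpc_noncoding) & set(cnci_noncoding) & set(pfam_no_domain)
```

      Taking the intersection rather than the union is the conservative choice: a single tool flagging coding potential is enough to exclude a transcript.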

      (2) The role and function of many lncRNAs are unknown, and some lncRNA species may simply be the product of pervasive transcription. Although this is an exploratory and descriptive study of potential biomarkers, it would benefit from some discussion of potential mechanisms because the proposed prediction models include lncRNAs. Do the authors have a hypothesis as to why lncRNAs were informative and predictive in this study? Are these lncRNAs well-studied and/or known to be functional? Or are they markers for pervasive transcription, for example?

      Author response and action taken: Thanks for your comments. Whole transcriptome sequencing results facilitate the discussion of regulatory mechanisms between different biomarkers, supplying evidence for future investigations. Among the three lncRNAs involved in this study, lnc-MKRN2-42:1 is involved in the occurrence and development of Parkinson's disease (4). The other two lncRNAs, however, lack relevant reports. Therefore, we cannot confirm that these lncRNAs have specific biological functions. In the Supplementary Methods section titled "Identification of mRNAs and lncRNAs", we acknowledge the limited understanding of sEV-lncRNAs in current research. In contrast, many miRNAs in the model have been proven to participate in the occurrence and development of colorectal cancer, such as miR-3615 (5), miR-425-5p (6), and miR-106b-3p (7). These data provide biological support for the performance of the model, which is particularly valuable for model prediction.

      (3) In the Results section "Cell-specific features of the sEV-RNA profile indicated the different proportion of cells of sEV origin among different groups", the sEV-RNA profiles were correlated with existing transcriptome profiles from specific cell types (ssGSEA) and used to estimate "tumour microenvironment-associated scores". This transcriptomic correlation is a valuable observation, but there is no further evidence provided that the sEV-RNAs profiles truly reflect differential cell types of sEV origin between the sample subgroups.

      Could the authors clarify the strength of evidence for the cells-of-origin estimates, which are based only on sEV-RNA transcriptome profiles? Would sEV-RNA-derived cells-of-origin be expected to correlate with histopath-derived scores (tumour microenvironment; immune infiltrate) for example? Or is this section intended as an exploratory description of sEV-RNAs, perhaps a check on the plausibility of the sEV-RNA profiles, rather than an accurate estimation of cells-of-origin in each subgroup?

      Author response: Thanks for your comments. This section explores the proportional distribution of EVs from different cellular subgroups solely based on transcriptome profiles and algorithms, rather than providing precise estimates of cellular origins within each subgroup.

      (4) Software and R package version numbers should be provided.

      Author response and action taken: Thanks for your comments. We have added version information for relevant R packages at the first mention in the original text (e.g., WGCNA (version 1.61), Rtsne (version 0.15), GSVA (version 1.42.0), ESTIMATE (version 1.0.13), DOSE (version 3.8.0)).

      References

      (1) Kong L, et al. CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res. 35, W345-349 (2007).

      (2) Sun L, et al. Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts. Nucleic Acids Res. 41, e166 (2013).

      (3) Finn RD, et al. Pfam: the protein families database. Nucleic Acids Res. 42, D222-230 (2014).

      (4) Wang Q, et al. Integrated analysis of exosomal lncRNA and mRNA expression profiles reveals the involvement of lnc-MKRN2-42:1 in the pathogenesis of Parkinson's disease. CNS Neurosci Ther. 26, 527-537 (2020).

      (5) Zheng G, et al. Identification and validation of reference genes for qPCR detection of serum microRNAs in colorectal adenocarcinoma patients. PLoS One. 8, e83025 (2013).

      (6) Liu D, Zhang H, Cui M, Chen C, Feng Y. Hsa-miR-425-5p promotes tumor growth and metastasis by activating the CTNND1-mediated β-catenin pathway and EMT in colorectal cancer. Cell Cycle. 19, 1917-1927 (2020).

      (7) Liu H, et al. Colorectal cancer-derived exosomal miR-106b-3p promotes metastasis by down-regulating DLC-1 expression. Clin Sci (Lond). 134, 419-434 (2020).

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study provides important new insights into how multisensory information is processed in the lateral cortex of the inferior colliculus, a poorly understood part of the auditory midbrain. By developing new imaging techniques that provide the first optical access to the lateral cortex in a living animal, the authors provide convincing in vivo evidence that this region contains separate subregions that can be distinguished by their sensory inputs and neurochemical profiles, as suggested by previous anatomical and in vitro studies. Additional information and analyses are needed, however, to allow readers to fully appreciate what was done, and the comparison of multisensory interactions between awake and anesthetized mice would benefit from being explored in more detail.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this paper, the authors provide a characterisation of auditory responses (tones, noise, and amplitude-modulated sounds) and bimodal (somatosensory-auditory) responses and interactions in the higher-order lateral cortex (LC) of the inferior colliculus (IC) and compare these characteristics with the higher order dorsal cortex (DC) of the IC - in awake and anaesthetised mice. Dan Llano's group has previously identified GABAergic patches (modules) in the LC distinctly receiving inputs from somatosensory structures, surrounded by matrix regions receiving inputs from the auditory cortex. They here use 2P calcium imaging combined with an implanted prism to - for the first time - get functional optical access to these subregions (modules and matrix) in the lateral cortex of IC in vivo, in order to also characterise the functional difference in these subparts of LC. They find that both DC and LC of both awake and anaesthetised mice appear to be more responsive to more complex sounds (amplitude-modulated noise) compared to pure tones and that under anesthesia the matrix of LC is more modulated by specific frequency and temporal content compared to the GABAergic modules in LC. However, while both LC and DC appear to have low-frequency preferences, this preference for low frequencies is more pronounced in DC. Furthermore, in both awake and anesthetized mice, somatosensory inputs are capable of driving responses on their own in the modules of LC, but very little (possibly not at all) in the matrix. However, bimodal interactions may be different in awake and anesthetized conditions in LC, which warrants deeper investigation by the authors: They find, under anesthesia, more bimodal enhancement in modules of LC compared to the matrix of LC and bimodal suppression dominating the matrix of LC. In contrast, under awake conditions bimodal enhancement is almost exclusively found in the matrix of LC, and bimodal suppression dominates both matrix and modules of LC.

      The paper provides new information about how subregions with different inputs and neurochemical profiles in the higher-order auditory midbrain process auditory and multisensory information, and is useful for the auditory and multisensory circuits neuroscience community.

      Strengths:

      The major strength of this study is undoubtedly the fact that the authors for the first time provide optical access to a subcortical region (the lateral cortex of the inferior colliculus (i.e. higher order auditory midbrain)) which we know (from previous work by the same group) have optically identifiable subdivisions with unique inputs and neurotransmitter release, and plays a central role in auditory and multisensory processing. A description of basic auditory and multisensory properties of this structure is therefore very useful for understanding auditory processing and multisensory interactions in subcortical circuits.

      Weaknesses:

      I have divided my comments about weaknesses and improvements into major and minor comments, all of which I believe are addressable by the authors to provide a clearer picture of their characterisation of the higher-order auditory midbrain.

      Major comment:

      (1) The differences between multisensory interactions in LC in anaesthetised and awake preparations appear to be qualitatively different, though the authors claim they are similar (see also minor comment related to figure 10H for further explanation of what I mean). However, the findings in awake and anaesthetised conditions are summarised differently, and plotting of similar findings in the awake figures and anaesthetised figures are different - and different statistics are used for the same comparisons. This makes it very difficult to assess how multisensory integration in LC is different under awake and anaesthetised conditions. I suggest that the authors plot (and test with similar statistics) the summary plots in Figure 8 (i.e. Figure 8H-K) for awake data in Figure 10, and also make similar plots to Figures 10G-H for anaesthetised data. This will help the readers understand the differences between bimodal stimulation effects on awake and anaesthetised preparations - which in its current form, looks very distinct. In general, it is unclear to me why the awake data related to Figures 9 and 10 is presented in a different way for similar comparisons. Please streamline the presentation of results for anaesthetised and awake results to aid the comparison of results in different states, and explicitly state and discuss differences under awake and anaesthetised conditions.

      We thank the reviewer for the valuable suggestion. We only highlighted the similarities between the data obtained from anesthetized and awake preparations to indicate the ability to reproduce the technique in awake animals for future assessment. Identifying those similarities between the two experimental setups was based on the comparison between modules vs matrix or LC vs DC within each experimental setup (awake vs anesthetized). Therefore, the statistics were chosen differently for each setup based on the number of subjects (n) within each experimental preparation. However, we agree with the reviewer’s comment that there are differences between the anesthetized and awake data. To examine these differences, we ran the same statistics for Figure 5 (tonotopy of LC vs DC - anesthetized animals) and Figure 9 (tonotopy of LC vs DC - awake animals). In addition, we added a new figure after Figure 9 to separate the statistical analysis from the maps. Accordingly, Figures 4 and 5 (maps and analysis, respectively - anesthetized animals) now match Figures 9 and 10 (maps and analysis, respectively - awake animals). We also did the same thing for Figures 7 (microprism imaging of the LC - anesthetized animals), 8 (imaging of the LC from the dorsal surface - anesthetized animals) as well as Figure 11 or old Figure 10 (microprism imaging of the LC - awake animals) to address the similarities and differences of the multisensory data between awake and anesthetized animals. We edited the text accordingly in the result and discussion sections.

      (2) The claim about the degree of tonotopy in LC and DC should be aided by summary statistics to understand the degree to which tonotopy is actually present. For example, the authors could demonstrate that it is not possible/or is possible to predict above chance a cell's BF based on the group of other cells in the area. This will help understand to what degree the tonotopy is topographic vs salt and pepper. Also, it would be good to know if the GABAergic modules have a higher propensity of particular BFs or tonotopic structure compared to matrix regions in LC, and also if general tuning properties (e.g. tuning width) are different from the matrix cells and the ones in DC.

      Thank you for the reviewer’s suggestion. We have examined the tonotopy of LC and DC using two regression models (linear and quadratic polynomial) between the BFs of the cells and their location on the anatomical axis. Tonotopy is therefore indicated by a significant regression fit with a high R² between the BFs of the cells and their locations within each structure. For the DC, there was a significant regression fit between the BFs of the cells and their locations over the rostromedial to the caudolateral axis. Additionally, the R² of the quadratic polynomial fit was higher than that of the linear fit, which indicates a nonlinear distribution of cells based on their BFs, consistent with the presence of high-low-high tuning over the DC surface. Given that the microprism cannot image the whole area of the LC, and it images a slightly different area in each animal, it was very difficult to get a consistent map for the LC as well as a solid conclusion about the LC tonotopy. However, we have examined the regression fit between the BFs of cells and their location along the four main anatomical axes of the field of view obtained from each animal (dorsal to ventral, rostral to caudal, dorsocaudal to ventrorostral, and dorsorostral to ventrocaudal). Unlike the DC, the LC imaged via microprism showed a lower R² for both linear and quadratic regression, mostly in the dorsoventral axis. We show the fitting curves of these regressions in Figure 4-figure supplement 1 (anesthetized data) and Figure 9-figure supplement 1 (awake data). Despite the inconsistent tonotopy of the LC imaged via microprism, the modules were found to have a higher median BF (10 kHz) than the matrix (7.1 kHz), which was consistent across the anesthetized and awake animals. We have added these results in the corresponding spot in the results section (lines 193-197 and 361-364).

      We have examined the tuning width using the binarized receptive field sum (RFS) method, in which each neuron was given a value of 1 if it responded to a single frequency (narrow RF), with the value increasing if the neuron responded to more neighboring frequencies (wide RF). We did this calculation across all the sound levels. Both DC and LC of the anesthetized animals had a higher RFS mean and median than those of awake animals, as expected given that ketamine is known to broaden the RF. However, in both preparations (anesthetized and awake), the DC had a higher RFS mean than the LC, which could be consistent with the finding that the DC had a relatively lower SMI than the LC. To show these new data, we made a new Figure 10-figure supplement 1, and we edited the text accordingly [lines 372-379 & 527-531].
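
      The regression-based tonotopy check described in this response (comparing the R² of a linear and a quadratic fit of BF against position) can be sketched as below. This is an illustrative sketch only, not the authors' analysis code: the helper names are hypothetical, the fit uses plain least squares via the normal equations, the data are synthetic, and no significance test is included.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest order first)."""
    n = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y] for the Vandermonde matrix A.
    m = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [a - factor * b for a, b in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(xs, ys, coeffs):
    """Coefficient of determination of a polynomial fit."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum(
        (y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )
    return 1.0 - ss_res / ss_tot


# Synthetic example: BF (arbitrary units) follows a high-low-high pattern
# along a position axis (arbitrary units), as reported for the DC.
positions = list(range(9))
best_freqs = [(x - 4) ** 2 for x in positions]

r2_linear = r_squared(positions, best_freqs, polyfit(positions, best_freqs, 1))
r2_quadratic = r_squared(positions, best_freqs, polyfit(positions, best_freqs, 2))
```

      For a symmetric high-low-high profile such as this one, the linear fit explains almost none of the variance while the quadratic fit explains nearly all of it, which is the pattern the response interprets as nonlinear tonotopy.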

      (3) Throughout the paper more information needs to be given about the number of cells, sessions, and animals used in each panel, and what level was used as n in the statistical tests. For example, in Figure 4 I can not tell if the 4 mice shown for LC imaging are the only 4 mice imaged, and used in the Figure 4E summary or if these are just examples. In general, throughout the paper, it is currently not possible to assess how many cells, sessions, and animals the data shown comes from.

      Thank you for the reviewer’s comment. We do apologize for not adding this information. We added all the information regarding the size of the statistical subjects (number of cells or number of animals used) for every test outcome. To keep the flow of the text, we added the details of the statistical tests in the legends of the figures.

      (4) Throughout the paper, to better understand the summary maps and plots, it would be helpful to see example responses of the different components investigated. For example, given that module cells appear to have more auditory offset responses, it would be helpful to see what the bimodal, sound-only, and somatosensory responses look like in example cells in LC modules. This also goes for just general examples of what the responses to auditory and somatosensory inputs look like in DC vs LC. In general example plots of what the responses actually look like are needed to better understand what is being summarised.

      Thank you for the reviewer’s comment and suggestion. We modified Figure 6 and the text accordingly to include all the significant examples of cells discussed throughout the work.

      Reviewer #2 (Public Review):

      Summary:

      The study describes differences in responses to sounds and whisker deflections as well as combinations of these stimuli in different neurochemically defined subsections of the lateral and dorsal cortex of the inferior colliculus in anesthetised and awake mice.

      Strengths:

      The main achievement of the work lies in obtaining the data in the first place as this required establishing and refining a challenging surgical procedure to insert a prism that enabled the authors to visualise the lateral surface of the inferior colliculus. Using this approach, the authors were then able to provide the first functional comparison of neural responses inside and outside of the GABA-rich modules of the lateral cortex. The strongest and most interesting aspects of the results, in my opinion, concern the interactions of auditory and somatosensory stimulation. For instance, the authors find that a) somatosensory-responses are strongest inside the modules and b) somatosensory-auditory suppression is stronger in the matrix than in the modules. This suggests that, while somatosensory inputs preferentially target the GABA-rich modules, they do not exclusively target GABAergic neurons within the modules (given that the authors record exclusively from excitatory neurons we wouldn't expect to see somatosensory responses if they targeted exclusively GABAergic neurons), and that the GABAergic neurons of the modules (consistent with previous work) preferentially impact neurons outside the modules, i.e. via long-range connections.

      Weaknesses:

      While the findings are of interest to the subfield they have only rather limited implications beyond it. The writing is not as precise as it could be. Consequently, the manuscript is unclear in some places. For instance, the text is somewhat confusing as to whether there is a difference in the pattern (modules vs matrix) of somatosensory-auditory suppression between anesthetized and awake animals. Furthermore, there are aspects of the results which are potentially very interesting but have not been explored. For example, there is a remarkable degree of clustering of response properties evident in many of the maps included in the paper. Taking Figure 7 for instance, rather than a salt and pepper organization we can see auditory responsive neurons clumped together and non-responsive neurons clumped together and in the panels below we can see off-responsive neurons forming clusters (although it is not easy to make out the magenta dots against the black background). This degree of clustering seems much stronger than expected and deserves further attention.

      Thank you for the reviewer’s comment. We do apologize if some areas in the manuscript were imprecisely written. For anesthetized and awake data, we have only emphasized the similarities between the two setups to show the ability to use microprism in awake animals for future assessment. To highlight the differences between anesthetized and awake animals, we have now run uniform statistics for all the data collected from both setups. Accordingly, we have edited Figures 4 and 5 (tonotopy-anesthetized) to match Figures 9 and new Figure 10 (tonotopy-awake). Also, we edited Figures 7 and 8 (multisensory- anesthetized) to match Figure 11 or old Figure 10 (multisensory- awake). We edited the text accordingly in the results section and discussed the possible differences between anesthetized and awake data in the discussion section [lines 521-553].

      We agree with the reviewer’s comment that the cells were topographically clustered based on their responses. Some of these clusters include the somatosensory responsive cells, which were located mostly in the modules (Figures 7D and 8E). Also, the auditory responsive cells with offset responses were clustered mostly in the modules (Figures 7C and 8F). Accordingly, we have edited the text to emphasize this finding.

      We noticed also that some responsive cells to the tested stimulations were surrounded by nonresponsive cells. By comparing the response of the cells to different stimuli we found that while Figures 7 and 11 (old Figure 10) showed only the response of the cells to auditory stimulation (unmodulated broadband noise at 80 dB) and somatosensory stimulation (whisker deflection), some nonresponsive cells to these specific stimulations were found to be responsive to pure tones of different frequencies and amplitudes. As an indicator of the cells' viability, we additionally examined the spontaneous activity of the nonresponsive cells across different data sets. We note that spontaneous activity was rare for all cells even among the responsive cells to sound or somatosensory stimulations. This finding could be related to the possibility that the 2P imaging of calcium signals may not be sensitive enough to track spontaneous activity that may originate from single spikes. However, in some data sets, we have found that the cells that did not respond to any tested stimuli showed spontaneous activity when no stimulation was given indicating the viability of those cells. We have addressed the activity of the non-responsive cells in the text along with a new Figure 11-figure supplement 1.

      We changed the magenta into a green color to be suitable for the dark background. Also, we have completely changed the color palette of all of our images to be suitable for color-blind readers as suggested by reviewer 1.

      Reviewer #3 (Public Review):

      The lateral cortex of the inferior colliculus (LC) is a region of the auditory midbrain noted for receiving both auditory and somatosensory input. Anatomical studies have established that somatosensory input primarily impinges on "modular" regions of the LC, which are characterized by high densities of GABAergic neurons, while auditory input is more prominent in the "matrix" regions that surround the modules. However, how auditory and somatosensory stimuli shape activity, both individually and when combined, in the modular and matrix regions of the LC has remained unknown.

      The major obstacle to progress has been the location of the LC on the lateral edge of the inferior colliculus where it cannot be accessed in vivo using conventional imaging approaches. The authors overcame this obstacle by developing methods to implant a microprism adjacent to the LC. By redirecting light from the lateral surface of the LC to the dorsal surface of the microprism, the microprism enabled two-photon imaging of the LC via a dorsal approach in anesthetized and awake mice. Then, by crossing GAD-67-GFP mice with Thy1-jRGECO1a mice, the authors showed that they could identify LC modules in vivo using GFP fluorescence while assessing neural responses to auditory, somatosensory, and multimodal stimuli using Ca2+ imaging. Critically, the authors also validated the accuracy of the microprism technique by directly comparing results obtained with a microprism to data collected using conventional imaging of the dorsal-most LC modules, which are directly visible on the dorsal IC surface, finding good correlations between the approaches.

      Through this innovative combination of techniques, the authors found that matrix neurons were more sensitive to auditory stimuli than modular neurons, modular neurons were more sensitive to somatosensory stimuli than matrix neurons, and bimodal, auditory-somatosensory stimuli were more likely to suppress activity in matrix neurons and enhance activity in modular neurons. Interestingly, despite their higher sensitivity to somatosensory stimuli than matrix neurons, modular neurons in the anesthetized prep were far more responsive to auditory stimuli than somatosensory stimuli (albeit with a tendency to have offset responses to sounds). This suggests that modular neurons should not be thought of as primarily representing somatosensory input, but rather as being more prone to having their auditory responses modified by somatosensory input. However, this trend was reversed in the awake prep, where modular neurons became more responsive to somatosensory stimuli than auditory stimuli. Thus, to this reviewer, the most intriguing result of the present study is the dramatic extent to which neural responses in the LC changed in the awake preparation. While this is not entirely unexpected, the magnitude and stimulus specificity of the changes caused by anesthesia highlight the extent to which higher-level sensory processing is affected by anesthesia and strongly suggest that future studies of LC function should be conducted in awake animals.

      Together, the results of this study expand our understanding of the functional roles of matrix and module neurons by showing that responses in LC subregions are more complicated than might have been expected based on anatomy alone. The development of the microprism technique for imaging the LC will be a boon to the field, finally enabling much-needed studies of LC function in vivo. The experiments were well-designed and well-controlled, and the limitations of two-photon imaging for tracking neural activity are acknowledged. Appropriate statistical tests were used. There are three main issues the authors should address, but otherwise, this study represents an important advance in the field.

      (1) Please address whether the Thy1 mouse evenly expresses jRGECO1a in all LC neurons. It is known that these mice express jRGECO1a in subsets of neurons in the cerebral cortex, and similar biases in the LC could have biased the results here.

      Thank you for the reviewer’s comment. In the work published by Dana, et al, the expression of jRGECO1a in all Thy1 mouse lines was determined by the brightness of the jRGECO1a in the soma. Given that some cells do not show a detectable level of jRGECO1a fluorescence until activated, the difference in expression shown in different brain regions could be related to the level of neuronal activity at the time of sample processing and not the expression levels of the indicator itself. To the best of our knowledge, there is no antibody for jRGECO1a that can be used for detecting the expression levels of the indicator regardless of the neuronal activity. To test the hypothesis that DC and LC have different levels of jRGECO1a, we examined the expression levels of jRGECO1a after we perfused the mice with high potassium saline to elicit a general neuronal depolarization in the whole brain. Then we immunostained against NeuN (the neuronal marker) to quantify the percentage of neurons expressing jRGECO1a relative to the total number of neurons (indicated by NeuN). To have a fair comparison, we restricted our analysis to include the areas imaged only by 2P as some regions were not accessible by microprism such as the deep ventral regions of the LC. There is a similar % of cells expressing jRGECO1a in DC and LC. As expected, the neurons expressing jRGECO1a were only non-GABAergic cells. We addressed these findings in the new Figure 3-figure Supplement 1 as well as the corresponding text in the results [lines 178-184] and methods sections [lines 878-892].

      (2) I suggest adding a paragraph or two to the discussion to address the large differences observed between the anesthetized and awake preparations. For example, somatosensory responses in the modules increased dramatically from 14.4% in the anesthetized prep to 63.6% in the awake prep. At the same time, auditory responses decreased from 52.1% to 22%. (Numbers for anesthetized prep include auditory responses and somatosensory + auditory responses.) In addition, the tonotopy of the DC shifted in the awake condition. These are intriguing changes that are not entirely expected from the switch to an awake prep and therefore warrant discussion.

      Thank you for the reviewer’s comment. To determine if differences exist between anesthetized and awake data, we have now used the same statistics and edited Figures 4, 5, 7, 8, 9, and 10 as well as added a new Figure 11. Accordingly, we have edited the Results section and added a paragraph addressing the possible differences between the two preparations in the Discussion section [lines 521-553].

      (3) For somatosensory stimuli, the authors used whisker deflection, but based on the anatomy, this is presumably not the only somatosensory stimulus that affects LC. The authors could help readers place the present results in a broader context by discussing how other somatosensory stimuli might come into play. For example, might a larger percentage of modular neurons be activated by somatosensory stimuli if more diverse stimuli were used?

      We agree with the reviewer’s point. Indeed, the modules are receiving different inputs from different somatosensory sources such as somatosensory cortex and dorsal column nuclei, which could indicate that the activity of the cells in the modular areas could be evoked by different types of somatosensory stimulations, which is an open area for future studies. We have discussed this point in the revised Discussion section [lines 516-520].

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor comments:

      (1) Figure 3H: The lateral surface seems quite damaged by the prism. An example slice of the imaging area of each mouse would help the reader better understand the extent of damage the prism leaves in the area of interest.

      Thank you for the reviewer’s comment. We have already included such images in Figures 4A, 7A, and 9A to present the field of view of all prism experiments. However, we need to clarify the point of tissue damage. The insertion of the microprism may be associated with some tissue damage as a result of making the pocket into which the microprism is inserted, but it is not possible to get neuronal signals from a damaged field of view. Therefore, we do not believe that there is tissue damage to the parts of the LC imaged by microprism. However, there may be some areas where the microprism is not in direct contact with the LC surface. These areas are located mostly in the periphery of the field of view, and they are completely black as they are out of focus (e.g., the left side of Figure 3B). The right side of Figure 3B as well as Figure 3A have some black areas, which represent the vasculature, where there are no red signals because of the lack of jRGECO1a expression in those areas.

      (2) In relation to the data shown in Figure 4E it is claimed that LC is tuned to higher frequencies (lines 195-196). However, the majority of cells appear to be tuned to frequencies below 14kHz (with a median of 7.5 kHz), which is quite low for the mouse. I assume that the authors mean frequencies that are relatively higher than the DC, but it is worth mentioning in the text that the BFs found in the LC are quite low-frequency responses for the mouse.

      Thank you for the reviewer’s comment, which we agree with. We edited this part by acknowledging that around 50% of the LC cells had a low-frequency bias to 5 and 7.1 kHz. Then we mentioned that most of the LC cells are tuned to relatively higher frequencies than those of the DC [lines 215-218].

      (3) Figure 5A-C: Is it the tone-responsive cells plus an additional ~22% of cells that respond to AM, or are there also cells that respond to tones that do not respond to AM? Please break down the degree to which the tone and AM responsive cells overlap.

      Thank you for the reviewer’s comment and suggestion. We broke down the responsive cells into cells responsive only to pure tone (tone selective cells or Tone-sel) or to only AM-noise (noise selective cells or Noise-sel) as well as cells responding to both sounds (nonselective cells or Non-sel). We examined the fractions of these categories of cells in both LC and DC within all responsive neurons. Accordingly, we have edited Figure 5A-C as well as the text [lines 229-243].

      (4) Figure 5D. It is unclear to me how a cell is classified as SMI or TMI responsive after computing the SMI or TMI for each cell. What statistic was used to determine if the cell was responsive or not?

      Thank you for the reviewer’s comment. We do apologize for the confusion caused by Figures 5D and E. These figures do not show the values of SMI or TMI, respectively. Rather, the figures show the percentage of the spectrally or temporally modulated cells, respectively. At each sound level, the cells were categorized into two main types. The spectrally modulated cells are those responsive to pure tones or unmodulated noise, so they can detect the spectral features of the sound (old Figure 5D or new Figure 5E). The temporally modulated cells are those responsive to AM-noise, so they can detect the temporal features of sounds with complex spectra, such as broadband noise (old Figure 5E or new Figure 5F). To clear up this confusion, we removed the words SMI and TMI from the figures and renamed the x-axis labels to “% of spectrally modulated cells” and “% of temporally modulated cells” for Figures 5D (new 5E) and E (new 5F), respectively.

      (5) Figure 5 D, E: Is the decrease in SMI and TMI modulated cells in the modules a result of simply lower sensitivity to sounds (i.e. higher response thresholds)? If a cell responds to neither tone, AM, or noise it will have a low SMI and TMI index. If this is the case that affects the interpretation, as it is then not a decrease in sensitivity to spectral or temporal modulation, but instead a difference in overall sound sensitivity.

      Thank you for the reviewer’s comment. We apologize for the confusion about Figures 5D and E, which did not show the SMI and TMI values. Rather, they show the percentage of spectrally or temporally modulated cells, respectively, as explained in our previous response. Therefore, Figure 5D shows the percentage of cells that can detect the spectral features of sound, while Figure 5E shows the percentage of cells that can detect the temporal features of sounds with complex spectra, such as broadband noise. Accordingly, Figures 5D and E show the sensitivity to different features of sound and not the overall sound sensitivity.

      (6) Figure 7 and 8: What is the false positive rate expected of the responsive cells using the correlation cell flagging criteria? Especially given that the fraction of cells responsive to somatosensory stimulation in LC (matrix) is 0.88% and 1.3% in DC, it is important to know what the expected false positive rate is in order to be able to state that there are actually somatosensory responses there or if this is what you would expect from false positives given the inclusion test used. Please provide an estimate of the false positive rate given your inclusion test and show that the rate found is statistically significantly above that level - and show this rate with a line in Figure 7 H, I.

      Thank you for the reviewer’s comment. To test the efficiency of the correlation method in determining the responsive cells, we initially ran an ROC curve comparing the automated method to a blinded human interpretation. The AUC of the ROC curve was 0.88. This high AUC value indicates that the correlation method is likely to rank a randomly chosen responsive cell higher than a randomly chosen nonresponsive cell. At the correlation coefficient of 0.4, which was the cutoff value used to determine the responsive cells for somatosensory stimulation, the specificity was 87%, the sensitivity was 72%, the positive predictive value was 73%, and the negative predictive value was 86%. Although the above percentages indicate the efficiency of the correlation method, we excluded all the false responsive cells from the analysis. Therefore, the fractions of cells in the graphs are the true responsive cells with no contamination from non-responsive cells. We also modified Figures 7H and I to match the other data sets obtained from awake animals. Therefore, Figures 7H and I no longer show the average of the responsive cells. Instead, they show the % of different fractions of responsive cells within each cellular motif (modules and matrix). Accordingly, we believe that there is no need to include a rate line on the graph. We added the section describing the validation to the methods section [lines 808-815].
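
      The four percentages quoted above come from a two-by-two comparison of the automated flag against the blinded human call. A minimal Python sketch of that bookkeeping follows, assuming a cell is flagged when its correlation coefficient is at or above the cutoff (the response does not state whether the cutoff is inclusive). The function name and the example values are hypothetical and are not the study's data.

```python
def diagnostic_metrics(correlations, human_labels, cutoff=0.4):
    """Compare correlation-based flags with human calls (True = responsive)."""
    tp = fp = tn = fn = 0
    for r, is_responsive in zip(correlations, human_labels):
        flagged = r >= cutoff  # assumption: the cutoff is inclusive
        if flagged and is_responsive:
            tp += 1
        elif flagged and not is_responsive:
            fp += 1
        elif not flagged and is_responsive:
            fn += 1
        else:
            tn += 1
    # Each ratio is undefined if its denominator is zero; the sketch
    # assumes both classes and both flag outcomes are present.
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical coefficients and human calls for eight cells.
example_r = [0.9, 0.6, 0.45, 0.2, 0.5, 0.1, 0.3, 0.05]
example_labels = [True, True, True, True, False, False, False, False]
metrics = diagnostic_metrics(example_r, example_labels)
```

      Note that the positive and negative predictive values depend on how many cells are truly responsive in the sample, so they cannot be derived from sensitivity and specificity alone.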

      (7) Figure 7: Please clarify what is meant by a cell responding to 'both responding to somatosensory and auditory stimulation'. Does it mean that the cell has responses to both auditory and somatosensory stimulation when presented individually or if it responds to both presented together? If it is the former, I don't understand how the number to both can be higher than the number of somatosensory alone (as both requires it also to respond to somatosensory alone). If it is the latter (combined auditory and somatosensory) then it seems that somatosensory inputs remove the responsiveness of most cells that were otherwise responsive to auditory alone (e.g. in the module while 42% respond to sound alone, combined stimulation would leave only 10% of cells responsive). Please clarify what exactly the authors are plotting and stating here.

      Thank you for the reviewer’s comment. The responsive cells in Figure 7 are divided into three categories. Each category has a completely different group of cells. The first category is for the cells responding only to auditory stimulation (auditory-selective cells or Aud-sel). The second category is for the cells that respond only to somatosensory stimulation (somatosensory selective cells or Som-sel). The third category is for the cells that respond to both auditory and somatosensory stimulations when both stimulations are presented individually (auditory/somatosensory nonselective cells or Aud/Som-nonsel). Accordingly, the number of cells may be different across all these categories. We have clarified this part in the text [lines 299-303]. We have modified Figures 7, 8, and 11 (old Figure 10) to match the data from anesthetized and awake animals, so Figures 7H and I now show the collective % of the cells from all animals within modules vs matrix.
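
      Because the three categories are mutually exclusive, the count of Aud/Som-nonsel cells is not bounded by the count of Som-sel cells. A minimal Python sketch of the assignment, assuming each cell carries one yes/no flag per stimulus presented individually; the function name and the example cells are hypothetical:

```python
from collections import Counter


def classify_cell(responds_to_auditory, responds_to_somatosensory):
    """Assign one mutually exclusive label from two single-stimulus flags."""
    if responds_to_auditory and responds_to_somatosensory:
        return "Aud/Som-nonsel"
    if responds_to_auditory:
        return "Aud-sel"
    if responds_to_somatosensory:
        return "Som-sel"
    return "Non-responsive"


# Hypothetical cells: (responds to sound alone, responds to whisker deflection alone)
cells = [(True, False), (True, True), (True, True), (False, True), (False, False)]
counts = Counter(classify_cell(aud, som) for aud, som in cells)
```

      In this toy set, two cells are Aud/Som-nonsel and only one is Som-sel, which is the situation the reviewer asked about.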

      (8) Why are the inferential statistics used in Figure 9F (chi-square test) and Figure 5A-C (t-test) when it tests the same thing (the only difference is one is anaesthetised data and the other awake)? Indeed, all Figure 9 and 10 (awake data figures) plots use chi-square tests to test differences in percentages instead of t-tests used in earlier (anaesthetised data figures) plots to test differences in percentages between groups. Please clarify the reason for this change in statistics used for similar comparisons.

      Thank you for the reviewer’s comment. Imaging the LC via microprism from awake animals confirmed the ability to run this technique with no interference to the ambulatory functions of the animals. Therefore, the main goal was to highlight the similarities between the data obtained from awake and anesthetized setups by highlighting the comparison between the LC and DC or between modules and matrix within each preparation (anesthetized vs awake). Accordingly, the statistics used to run these comparisons were chosen based on the number of the tested animals at each setup (7 anesthetized animals and 3 awake animals for prism insertion). The low number of animals used for awake data made us use the number of cells collectively from all animals instead of the number of animals, so we used the Chi-square test to examine the differences in percentages.
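
      For readers unfamiliar with a chi-square test on pooled cell counts, a minimal Python sketch of the simplest two-by-two case follows. It assumes a Pearson test without continuity correction, which the response does not specify, and the counts are hypothetical rather than taken from the study.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) on a 2x2 count table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical pooled counts: rows = modules, matrix; columns = responsive, nonresponsive.
pooled_counts = [[10, 20], [20, 10]]
chi2, p_value = chi_square_2x2(pooled_counts)
```

      Pooling cells across animals treats each cell as an independent observation, which is the trade-off of this approach when few animals are available.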

      (9) Figure 10H: The main text describes the results shown here as similar to what was seen in anaesthetised animals. But it looks to me like the results in awake animals are qualitatively different from the multisensory interaction seen in anaesthetised animals. In anaesthetised animals the authors find that there is a higher chance of auditory responses being enhanced by somatosensory inputs when cells are in the modules compared to in the matrix. However, in awake data, this relationship is flipped, with more bimodal enhancement found in the matrix compared to the modules. Furthermore, almost all cells in the modules are suppressed by combined somatosensory input which looks like it is different from what is found in anaesthetised mice and what is described in the discussion: 'we observed that combined auditory-somatosensory stimulation generally suppressed neural responses to auditory stimuli and that this suppression was most prominent in the LC matrix'.

      Thank you for the reviewer’s comment. Our statement was meant to show how the data obtained from awake and anesthetized animals were generally similar. However, we agree that the statement may not be suitable due to the possible differences between awake and anesthetized animals. To address a fair comparison between the anesthetized and awake preparations, we ran similar statistics and graphs for Figures 7, 8, and 11 (old Figure 10). Given that the areas occupied by modules and matrix are different across animals due to the irregular shape of the modules, we chose to run a chi-square test for all the data to quantify the collective % of responding cells within modules vs matrix from all tested animals for each experimental setup (anesthetized vs awake). The anesthetized and awake animals similarly showed that modules and matrix had higher fractions of auditory responsive cells. However, matrix had more cells responding to auditory stimulations than modules, while modules had more cells responding to somatosensory stimulation than matrix. In contrast, while the anesthetized animals showed higher fractions of offset auditory-responsive cells, which were mostly clustered in the modules, the offset auditory-responsive cells were very rare in awake animals (6 cells/one animal).

      Based on the fractions of cells with suppressed or enhanced auditory response induced by bimodal stimulation, the data obtained from anesthetized and awake animals showed that the auditory response in the matrix was suppressed more than enhanced by bimodal stimulation. In contrast, modules had different profiles across the experimental setups and locations. For instance, the modules imaged via microprism in the anesthetized and awake animals showed suppressed more than enhanced auditory responses, but modules imaged from the dorsal surface in anesthetized animals showed enhanced more than suppressed auditory responses. Additionally, modules had less suppressed and more enhanced auditory responses compared to matrix in the anesthetized animals regardless of the location of the modules (microprism or dorsal surface). Yet, modules from awake animals had more suppressed and less enhanced auditory responses compared to matrix. We have addressed these differences in the results and discussion section.

      Additional minor comments that I think the authors could use to aid their manuscript clarity:

      (1) The figure colour selection - especially in Figures 7 and 8 - is really hard to tell apart. Please choose more distinct colours, and a colour scheme that is appropriate for colour blind readers.

      Thank you for the reviewer’s suggestion. We have noticed that the magenta color assigned for the cells with offset responses was very difficult to distinguish from the black background. We have changed the magenta color to green to be different from the color of other cells. Using Photoshop, we chose a color scheme that is suitable for color-blind readers in all our maps.

      (2) The sentence in lines 331-334 should be rephrased for clarity.

      Thank you for the reviewer’s suggestion. We have rephrased the statement for clarity [lines 364-371].

      Reviewer #2 (Recommendations For The Authors):

      As mentioned in the public review the strong clustering evident in some of the maps (some of which may be related to module/matrix differences but certainly not all of it) seems worth scrutinizing further. Would we expect such a strong spatial segregation of auditory responsive and non-responsive neurons? Would we expect response properties (e.g. off-responsiveness) other than frequency tuning to show evidence of a topographic arrangement in the IC? In addressing this it would, of course, be important to rule out that this clustering is not down to some trivial experimental variables and truly reflects functional organization. For instance, are the patches of non-responsive neurons found in parts of the field of view with poor visibility, poor labelling, etc which may explain why it is difficult to pick up responses there? Are the neurons in non-responsive areas otherwise active (i.e. do they show spontaneous activity) or could they be 'dead'? Could the way neuropil signals are dealt with play a role here (it is weighted by 0.4 which strikes me as quite low)? In relation to this, I am also wondering to what extent the extreme overrepresentation (Figure 4) of neurons with a BF of 5kHz (some of this is, of course, down to the fact that the lower end of the frequency range was 5kHz and that the step size was 0.5 octaves), especially in the DC, is to be interpreted.

      Thank you for the reviewer’s comment. Before analysis, the ROIs of all cells were set around the cell bodies using the jRGECO1a signals as a reference, so all cells (responsive and nonresponsive) were collected from areas of good visibility of jRGECO1a signals. In other words, no cells were collected from regions having poor jRGECO1a signals. In Figures 7, 8, and 11 (old Figure 10), the cells showed a response either only to unmodulated broadband noise at 80 dB as an auditory stimulus or to whisker deflection with specific speed and power as a somatosensory stimulus. Given that the two stimuli above had specific parameters, the remaining non-responsive cells may respond to auditory or somatosensory stimulations with other features. For instance, some cells that were nonresponsive to the unmodulated broadband noise responded to pure tones with different amplitudes and frequencies or to AM-noise with different amplitudes and modulation frequencies. Also, these nonresponsive cells may not respond to any of our tested stimuli and may respond to other sensory stimulations. Some of the non-responsive cells showed spontaneous activity when no stimulations were presented. However, we cannot rule out the possibility that some of these nonresponsive cells may not be viable. We have addressed the clustering properties in the revised version of the manuscript in the corresponding spots of the results and discussion sections. We have added a new supplementary figure (Figure 11-figure Supplement 1) to show how the cells that were nonresponsive to the unmodulated noise may respond to other types of sound and to show the spontaneous activity of some non-responsive cells.

      For the neuropil, previous reports used the contamination factor (r) in a range of 0.3-0.7 (we referenced these studies in the method section [line 776]) based on the tissue or cells imaged, vasculatures, and the objective used for imaging. Therefore, we optimized the contamination factor (r) to be 0.4 through a preliminary analysis based on the tissue we image (LC), and the objective used (16x with NA = 0.8 and 3 mm as a working distance).
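
      To make the role of the contamination factor concrete, here is a minimal Python sketch. It assumes the commonly used subtraction F_corrected = F_ROI - r * F_neuropil applied frame by frame; the response does not spell out the exact formula, and the function name and trace values are hypothetical.

```python
def subtract_neuropil(roi_trace, neuropil_trace, r=0.4):
    """Return the ROI trace with r times the surrounding neuropil trace removed."""
    if len(roi_trace) != len(neuropil_trace):
        raise ValueError("ROI and neuropil traces must have the same length")
    return [f_roi - r * f_np for f_roi, f_np in zip(roi_trace, neuropil_trace)]


# Hypothetical raw fluorescence values for three frames.
roi = [100.0, 120.0, 150.0]
neuropil = [50.0, 50.0, 100.0]
corrected = subtract_neuropil(roi, neuropil)
```

      A larger r removes more of the surrounding signal, so the choice within the 0.3-0.7 range directly affects how many weak responses survive the correction.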

      We agree that there is an overrepresentation of 5 kHz as the best tuning frequency for DC cells. The previous report (A. B. Wong & Borst, 2019) showed a large zone of the DC where cells were tuned to 2-8 kHz. Given that 5 kHz was the lowest tested frequency in our experiment, we think that the low-frequency bias of the DC surface is consistent between studies. This finding is also supported by electrophysiology data obtained by advancing recording electrodes through the IC tissue along the dorsoventral axis. In those experiments, the cells were tuned to lower frequencies at the dorsal surface of the IC.
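
      For reference, the tone frequencies implied by a 5 kHz lower bound and the 0.5-octave step size noted by the reviewer can be sketched in Python as below. The number of steps is an assumption made only for illustration, since the upper end of the tested range is not stated here.

```python
def half_octave_steps(lowest_khz=5.0, n_steps=7, step_octaves=0.5):
    """Frequencies spaced by a fixed octave step, starting at lowest_khz."""
    # n_steps is hypothetical; each step multiplies the frequency by 2 ** step_octaves.
    return [lowest_khz * 2 ** (i * step_octaves) for i in range(n_steps)]


frequencies_khz = half_octave_steps()
```

      The first two values, 5 and about 7.1 kHz, are the two lowest tested frequencies, so any cell tuned below 5 kHz would be assigned to the 5 kHz bin.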

      We have changed the magenta-colored cells to green ones, so it will be easier to identify the cells. As requested by another reviewer, we changed the color palettes of some images and cellular maps to be suitable for color-blind readers.

      The manuscript would benefit from more precise language in a number of places, especially in the results section.

      Line 220/221, for instance: "... a significant fraction of cells that did not respond to pure tones did respond to AM-noise" Strictly speaking, this sentence suggests that you considered here only the subset of neurons that did not respond to pure tones and then ran a test on that subset. The test that was done seems to suggest though that the authors tested whether the percentage of responsive cells was greater for pure tones or for AM noise.

      Thank you for the reviewer’s comment. We do apologize for the confusion. In the revised manuscript, we categorized the cells according to their response into cells responding to pure tone only (tone-selective cells or Tone-sel), AM-noise only (noise-selective cells or Noise-sel), and to both pure tone and AM-noise (nonselective cells or Non-sel). We have modified Figure 5 accordingly. We did the same for the data obtained from awake animals and showed that in a new figure to easily match the analysis done for the anesthetized animals.

      Please refer to the figure panels in the text in consecutive order. 2B, for instance, is mentioned after 2H.

      Thank you for the reviewer’s comment. Throughout the paper, we kept the figure panels within each figure in consecutive order so that they flow smoothly with the text. Figure 2 is the only exception, and for a good reason. Figure 2 is a complex figure that includes many panels to show a parallel comparison between LC imaged via microprism and DC through single-photon images, two-photon images, validating laser lesioning, and histology. Accordingly, we moved between many panels of the figure to efficiently highlight the aspects of this comparison. We prefer to keep Figure 2 as one figure in its current format to show this parallel comparison between LC and DC.

      The legend for Figure 2 could be clearer. For instance, there are two descriptions for panel D. Line 1009: "(C-E)" [i.e. C, D, E] and line 1010: "(D and F)".

      Thank you for the reviewer’s comment. It should be C and E, not C-E. We have fixed the mistake [line 1224].

      Line 275: What does 'with no preference' mean?

      Thank you for the reviewer’s comment. We do apologize for the confusion. There are three categories of cells. Some cells respond only to auditory stimulation, while others respond to only somatosensory stimulation. However, there is another group of cells that respond nonselectively to auditory and somatosensory stimulations or Aud/Som-nonsel cells. We edited the sentence to be clearer [lines 303-304].

      Line 281 (and other places): What does 'normalized against modules' mean?

      Thank you for the reviewer’s comment. This normalization was done by dividing the number of responsive cells of the same response type in the matrix by that in the modules. Therefore, the value taken by modules was always 1 and the value taken by the matrix is something around 1. Accordingly, the value for matrix could be > 1 if matrix had more cells than modules. In contrast, the value of matrix would be < 1 if matrix had fewer cells than modules. In the revised version, we used this normalization method to make the revised Figures 5C and 10C to describe the cell fractions responding to pure tone only, AM-noise only, or to both stimuli in the matrix vs modules. 

      Sentence starting on line 288. I don't find that point to be as obvious from the figures as the sentences seem to suggest. Are we to compare magenta points (auditory off cells) from 7C with green points in 7F?

      Thank you for the reviewer’s comment. We came to this conclusion based on our visual comparison of magenta points (now green in the revised version to increase the visibility) representing the auditory offset cells in Figure 7C and the green points in Figure 7F representing the cells responding to both somatosensory and auditory stimulations. In the revised manuscript, we statistically examined whether the percentages of onset and offset auditory responses differ among the cells responsive to both somatosensory and auditory stimulations in the modules vs matrix. We have found that most of the cells responding to both somatosensory and auditory stimulations inside the modules had offset auditory responses, which could indicate a level of multisensory integration between somatosensory input and the offset auditory responses in these cells. We have added the statistical results to the revised manuscript to address this effect [lines 312-317].

      Lines 300-302: "These data suggest that the module/matrix system permits preservation of distinct multimodal response properties in the face of massive integration of inputs in the LC". First, I'm not quite sure what that sentence means. Second, it would be more appropriate for the discussion. Third, the fact that we are more likely to find response enhancement in the modules than in the matrix is nicely consistent with the idea (supported by work from the senior author's lab and others) that excitatory somatosensory input predominantly targets neurons in the modules (which is why we see mostly response enhancement in the modules) and that this input targets GABAergic neurons which then project to and inhibit neurons both outside and inside of their module. Therefore, I would recommend that the authors replace the aforementioned sentence with one that interprets these results in light of what we know about this somatosensory-auditory circuitry.

      Thank you for the reviewer’s comment. Despite the massive multimodal inputs that the LC receives from auditory and nonauditory regions, the module/matrix system is a platform for distinct multimodal responses, as indicated by more somatosensory responsive cells in modules versus more auditory responsive cells in matrix, which matches the anatomical differences that were reported before. We edited the sentence in light of the comparison between the data obtained from awake and anesthetized animals and moved it to the discussion section [lines 503-506].

      The term 'LC imaged via microprism' is used dozens of times throughout the manuscript. Replacing it with a suitable acronym or initialism could improve the flow of the text and would make some of the sentences less cumbersome.

      Thank you for the reviewer’s suggestion. We changed the term “LC imaged via microprism” into LC (microprism) throughout the revised manuscript.

      5A-C: It is unclear what is being compared here. What are the Ns? Different animals?

      Thank you for the reviewer’s comment. We do apologize for this missing information. We have added the number of subjects used in every statistical test in each corresponding figure legend.

      5G: minus symbol missing on the y-axis.

      Thank you for the reviewer’s comment. We gladly have fixed that.

      Figure 6: Are these examples or population averages?

      Thank you for the reviewer’s question. Every figure panel of the old Figure 6 represents a single trace of an example cell. However, we modified Figure 6 to include more examples of cells showing different responses complying with another reviewer’s suggestion. Each panel of the new Figure 6 represents the average response of 5 stimulations of the corresponding stimulus type. We preferred to show the average signal because it was the one used for the subsequent analysis.

      How are module borders defined?

      Thank you for the reviewer’s question. The modules were defined based on the intensity of the green channel that shows the expression of the GFP signals. The boundaries of the modules were determined according to the distinction between regions of high and low GFP signal. This step was done before data analysis to avoid any bias.

      7JKL: How are these to be interpreted? Does panel 7K, for instance, indicate that the fraction of neurons showing 'on' responses was roughly twice as large in the matrix than in the modules and that the fraction of neurons showing 'off' responses was roughly 10 times larger in the modules than in the matrix (the mean seems to be at about 1/10).

      Thank you for the reviewer’s comment. The data represented by Figures 7J-L defined the normalization of the number of cells of the same response type in the matrix against the modules. This normalization was done per animal, and then the data of the matrix were plotted against the normalization line at 1 representing the modules. The matrix will be claimed to have more cells than modules if the median of the matrix values > 1. In contrast, the matrix will be claimed to have fewer cells than the modules if the median of the matrix values < 1. Finally, if the median of matrix values = 1, this means there is no difference between matrix and modules. However, to match the data obtained from anesthetized animals (Figures 7 and 8) with those obtained from awake animals (Figure 11 or old Figure 10), we ran all data through the Chi-square test in the revised manuscript. Therefore, the format of Figures 7K-L was changed in the revised manuscript, so they became new Figures 7I-K.
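
      The per-animal normalization described above can be sketched in Python as follows. The sketch assumes one pair of counts per animal for a given response type and skips animals with no such cells in the modules, where the ratio is undefined; the function name and the counts are hypothetical.

```python
from statistics import median


def normalize_matrix_against_modules(per_animal_counts):
    """Return matrix/modules ratios; modules sit at 1 by construction."""
    ratios = []
    for matrix_count, modules_count in per_animal_counts:
        if modules_count == 0:
            continue  # assumption: undefined ratios are left out
        ratios.append(matrix_count / modules_count)
    return ratios


# Hypothetical (matrix, modules) cell counts for three animals.
counts_by_animal = [(30, 15), (12, 10), (8, 10)]
ratios = normalize_matrix_against_modules(counts_by_animal)
matrix_median = median(ratios)
```

      Here the median ratio is above 1, which under this scheme would be read as the matrix having more cells of that response type than the modules.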

      10A suggests that significantly more than half the neurons shown here are not auditory responsive. Perhaps I am misinterpreting something here but isn't that in contrast to what is shown in panel 9F?

      Thank you for the reviewer’s comment. The data shown in Figure 10A (or revised Figure 11A) represents the cellular response to only one stimulus (broadband noise at 80 dB with no modulation frequency), while Figure 9F (revised 10B) represents the cells responding to varieties of auditory stimulations of different combinations of frequencies and amplitudes (pure tones) as well as to AM-noise of different amplitudes and modulation frequencies. Accordingly, the old Figure 9F or revised Figure 10B shows different cell types based on their responses. For instance, some cells respond only to pure tone. Others respond only to AM-noise or to both pure tones and AM-noise. This may also support that the nonresponsive cells in Figure 10A (revised 11A) can respond to other types of sound features.

      The way I understood panels 7L and 8K there were more suppressed neurons in the matrix than in the modules (line 296: "cells in the modules had a higher odds of having an enhancement response to bimodal stimulation than matrix, while cells in the matrix had a higher odds of having a suppressive response to bimodal stimulation"). Now, panel 10F indicates that in awake mice there is a greater proportion of suppressed neurons in the modules than in the matrix. I may very well have overlooked or misread something but I may not be the only reader confused by this so please clarify.

      Thank you for the reviewer’s comment. We do apologize for this confusion. The ambiguity between Figures 7 and 8 (anesthetized animals) as well as Figure 10 (awake animals) comes from the fact that different statistics have been used for each preparation. In the revised version, we have fixed that by running the same statistics for all the data, and we accordingly revised Figures 7, 8, and 10 (new Figure 11). In brief, the matrix preserves a higher percentage of cells with suppressed auditory responses than those with enhanced auditory responses induced by bimodal stimulation in all conditions (anesthetized vs awake). In contrast, modules act differently across all tested conditions. While modules had more cells with enhanced auditory responses induced by bimodal interaction in anesthetized animals, they had more cells with suppressed response in awake animals indicating that modules could be sensitive to the effect of anesthesia compared to matrix. We addressed this effect in the discussion of the revised manuscript [lines 521-553].

      Line 438: ...as early AS...

      Thank you for the reviewer’s comment. We have fixed that [line 512].

      Reviewer #3 (Recommendations For The Authors):

      My minor recommendations for the authors are as follows:

      (1) The text can be a bit difficult to follow in places. This is partly due to the convoluted nature of the results, but I suggest a careful read-through to look for opportunities to improve the prose. In particular, there is a tendency to use long sentences and long paragraphs. For example, the third paragraph of the introduction runs for almost fifty lines.

      Thank you for the reviewer’s comment and suggestion. We have fixed that.

      (2) This might be due to journal compression, but some of the bar graphs in the figures are difficult to read. For example, the individual data points, especially when filled with striped background colors get lost. Axes can become invisible, like the y-axis in 7L, and portions of bars, like in 7F, are not always rendered correctly. Error bars are sometimes hidden behind data points, as in 5C. Increasing line thickness and shifting individual data points away from error bars might help with this.

      Thank you for the reviewer’s comment and suggestion. We now plot all data points as filled black circles to make them visible, and we have placed them behind the main columns so that they do not obscure the error bars. We have fixed Figures 5 and 7.

      (3) Throughout the manuscript, the authors use a higher SMI to indicate a preference of cells for auditory stimuli with "greater spectral... complexity" (e.g., lines 219 and 372). I find this interpretation a bit challenging since SMI compares a neuron's preference for tones over noise, and to me, tones seem like the least spectrally complex of all auditory stimuli. Perhaps some clarification of what the authors mean by this would help. For example, is the assumption that a neuron that prefers tones over noise is, either directly or indirectly, receiving input sculpted by inhibitory processes?

      Thank you for the reviewer’s comment. In general, higher SMI values indicate a stronger preference of the cells for pure tones over unmodulated noise (that is, for less spectral complexity). We will clarify this statement throughout the manuscript. However, the SMI value was not mentioned in lines 219 and 372. The statement in line 219 describes revised Figure 5C (old 5B), where more cells in the matrix than in the modules respond specifically to AM-noise, which indicates a preference of the matrix for sounds of greater spectral and temporal complexity. The statement in line 372 of the discussion section refers to the findings in revised Figures 5E and F (old 5D and E). In revised Figure 5E (old 5D), the data show that the matrix has more cells responding to pure tones or unmodulated noise than the modules, so the matrix has a lower threshold for detecting the spectral features of sound. In revised Figure 5F (old 5E), the data show that the matrix has more cells responding to AM-noise than the modules, which indicates that the matrix is more involved in processing the temporal features of sound. As explained above, all of these findings relate to the percentage of cells responding to specific sound stimuli and not to the exact SMI values. We have revised the figures accordingly by removing the terms SMI and TMI, and we have clarified this in the text.

      (4) Lines 250-253: How does a decrease in SMI correspond to "an increase in pure tone responsiveness?" Doesn't a decrease suggest the opposite?

      Thank you for the reviewer’s comment, with which we agree. We apologize for the error. We have corrected this statement [lines 275-277] and any related findings accordingly.

      (5) Line 304: Add "imaged via microprism" or similar after "response profiles with the LC.".

      Thank you for the reviewer’s suggestion. We have fixed that. However, for simplicity, we changed the term “LC imaged via microprism” to “LC(microprism)”, as suggested by another reviewer [line 330].

      (6) Figure 5A and C: Both plots show that more neurons responded to AM-noise than tones, but it would be interesting to know how much the tone-responsive and AM-noise responsive populations overlapped. Were all tone-responsive neurons also responsive to AM-noise?

      Thank you for the reviewer’s comment. We have categorized the cells according to whether they respond to pure tones only, to AM-noise only, or to both pure tones and AM-noise when each stimulus is presented individually. We have modified Figures 5A and C, which are now Figures 5B and D.

      (7) Figure 5G: Missing negative sign before "0.5.".

      Thank you for the reviewer’s suggestion. We have fixed that. Note that old Figure 5G is now revised Figure 5H.

      (8) Figure 7 legend, Line 1102: Missing period after "(C and E)".

      Thank you for the reviewer’s suggestion. We think that the period should be placed before “(C and E)”, at the end of “respectively”, because the parentheses refer to the statements that follow them. We have fixed that accordingly [line 1394].